joshsh / sesametools

A collection of utilities for use with OpenRDF Sesame (as of recently, Eclipse RDF4J)

License: Other

Java 99.91% Shell 0.09%

sesametools's Issues

remove "leading statement" constraint from RdfListUtil methods

RdfListUtil.getList and RdfListUtil.addList could be useful in situations where an RDF List is known (only) by the URI of its head, but currently they expect a "leading statement" pointing to the list. For example, it would be nice to support patterns like the following:

Graph graph = ...
List<Value> values = ...

URI myList = vf.createURI("http://example.org/ns#mylist");
RdfListUtil.addList(myList, values, graph);

Would it make RdfListUtil less useful for current applications if we were to take out the leading-statement logic, changing the method signatures to the following:

public static void addList(Resource head, List<Value> values, Graph graph, Resource... contexts) and
public static List<Value> getList(Resource head, Graph graph, Resource... contexts)

Note that I am also suggesting a varargs "context" parameter in getList.

@ansell, please let me know what you think.
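
To make the proposal concrete, here is a minimal, self-contained sketch of head-based list handling. It is not the existing RdfListUtil code: Triple and HeadListSketch are hypothetical stand-ins, plain strings replace Sesame's Resource and Value, a java.util.List replaces Graph, and the varargs contexts parameter is left out to keep it short.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for an RDF statement; this is not Sesame's Statement. */
final class Triple {
    final String subject;
    final String predicate;
    final String object;

    Triple(String subject, String predicate, String object) {
        this.subject = subject;
        this.predicate = predicate;
        this.object = object;
    }
}

/** Sketch only: the method names follow the proposal, the types do not. */
final class HeadListSketch {
    static final String RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
    static final String FIRST = RDF + "first";
    static final String REST = RDF + "rest";
    static final String NIL = RDF + "nil";

    /** Writes values as an rdf:first/rdf:rest chain that starts at head. */
    static void addList(String head, List<String> values, List<Triple> graph) {
        String node = head;
        for (int i = 0; i < values.size(); i++) {
            graph.add(new Triple(node, FIRST, values.get(i)));
            boolean last = (i == values.size() - 1);
            // Assumption: ids derived from the graph size are unique enough for a sketch.
            String next = last ? NIL : "_:b" + graph.size();
            graph.add(new Triple(node, REST, next));
            node = next;
        }
    }

    /** Reads the chain back, starting from head, with no leading statement. */
    static List<String> getList(String head, List<Triple> graph) {
        List<String> values = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        String node = head;
        // Stop at rdf:nil, at a dangling node, or if the chain loops back on itself.
        while (node != null && !NIL.equals(node) && seen.add(node)) {
            String first = objectOf(node, FIRST, graph);
            if (first == null) {
                break;
            }
            values.add(first);
            node = objectOf(node, REST, graph);
        }
        return values;
    }

    /** Returns the first object found for (subject, predicate), or null. */
    private static String objectOf(String subject, String predicate, List<Triple> graph) {
        for (Triple t : graph) {
            if (t.subject.equals(subject) && t.predicate.equals(predicate)) {
                return t.object;
            }
        }
        return null;
    }
}
```

Calling addList("http://example.org/ns#mylist", values, graph) followed by getList with the same head returns the original values, without any statement pointing at the list.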

common.jar should not contain a service registry for parsers or writers

The current common-1.2.jar contains a service registry listing the parsers and writers that come with the full Sesame distribution. If one runs a customised Sesame package that imports only selected parsers and writers, errors about the missing parsers or writers are written to the logs. In my case, this is happening with the TriG and TriX parsers and writers, which I have not included.

The solution is to remove the META-INF/services/org.openrdf.rio.RDFParserFactory and META-INF/services/org.openrdf.rio.RDFWriterFactory files. common.jar does not include any parsers or writers itself, so it shouldn't declare that it does.

The jar files distributed with Sesame already contain these files where they are needed, so everything will function without the copies in common.jar.

Blank node format does not include necessary "_:" prefix

From the Talis spec:

"Blank node subjects are named using a string conforming to the nodeID production in Turtle. For example: _:A1"

With RDFJSONWriter, blank nodes are output without this prefix, for example "A1"

Fix in RDFJSON?:

if (object instanceof BNode) {
    valueObj.put("value", "_:" + object.stringValue());
} else {
    valueObj.put("value", object.stringValue());
}

if (subject instanceof BNode) {
    result.put("_:" + subject.stringValue(), predicateObj);
} else {
    result.put(subject.stringValue(), predicateObj);
}
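
Since the same prefixing is needed for both subjects and objects, it could be pulled into one helper. The following is only a sketch: Node, IriNode, BlankNode and BNodeLabelSketch are hypothetical stand-ins for Sesame's Value, URI and BNode, and it assumes, as in the snippets above, that stringValue() on a blank node returns the bare id without "_:".

```java
/** Hypothetical stand-in for Sesame's Value interface. */
interface Node {
    String stringValue();
}

/** Hypothetical stand-in for a URI value. */
final class IriNode implements Node {
    private final String iri;

    IriNode(String iri) {
        this.iri = iri;
    }

    @Override
    public String stringValue() {
        return iri;
    }
}

/** Hypothetical stand-in for a blank node; stringValue() is the bare id. */
final class BlankNode implements Node {
    private final String id;

    BlankNode(String id) {
        this.id = id;
    }

    @Override
    public String stringValue() {
        return id;
    }
}

final class BNodeLabelSketch {
    /** Returns the string to write to RDF/JSON, adding "_:" for blank nodes only. */
    static String toJsonLabel(Node node) {
        if (node instanceof BlankNode) {
            return "_:" + node.stringValue();
        }
        return node.stringValue();
    }
}
```

With a helper like this, both the subject key and the object "value" entry would go through the same code path, so the two cannot drift apart.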

Sesametools 1.5 release

Could you do a 1.5 release of sesametools so that all Sesame dependencies are the same for people using Sesame 2.5.0? I don't mind having snapshot dependencies during development, but I would prefer not to keep them for releases.

Thanks,

Peter

Test against Oracle JDK8 in Travis

Travis supports a recent Java-8 build. After testing locally to make sure our Javadoc is up to scratch [1], we should add it to Travis to enable regression testing against Java-8 to identify any incompatibilities.

[1] http://openjdk.java.net/jeps/172

Version bumping

I think we need to do a version bump, or at least switch to a -SNAPSHOT version for the develop branch.

Non-snapshot versions are generally thought to be immutable, or at most, only in limbo for a day or two.

I would prefer to release the current develop branch to master as 1.7 and then bump the develop branch to 1.8-SNAPSHOT.

Redesign Linked Data Server to extend Restlet Application

I was looking into creating a test case for the fix that Roland submitted in #39, but it is a little difficult because of the current use of singletons and the reliance on Component, which is slightly harder to test than Application. Application can be embedded inside a Component for testing, or embedded within other applications to use Linked-Data-Server as a library.

The calls to the singleton can be replaced by extending Application, and calling Resource.getApplication() from within Resources, casting it to the extended Application class to see the accessor methods such as getDatasetURI and getSail.

NQuads and RDFJSON modules are ported to Sesame Rio in Sesame-2.7

Both the NQuads [1] and RDF/JSON [2] modules now have implementations to be released in Sesame-2.7.

The RDF/JSON implementation in Sesame Rio is a Jackson-based streaming implementation that is faster than the org.json non-streaming approach in my tests. If you don't mind having Jackson as a dependency, it would be useful to deprecate the implementation here and switch to the Sesame Rio implementation.

The NQuads module is implemented in a fairly similar manner to here so there should not be any difficulties there.

[1] https://openrdf.atlassian.net/browse/SES-802
[2] https://openrdf.atlassian.net/browse/SES-1784

Sesame-2.8 / RDF-1.1 changes

As discussed in #37, now that Sesame-2.8.0 is released, we can think about the design of SesameTools-2 (assuming that it would be a useful strategy to keep the RDF-1.0 code separate from the RDF-1.1 code).

The two RDF-1.1 changes are the removal of plain literals, which are now always typed with xsd:string, and the addition of rdf:langString as the datatype internally for language tagged literals.
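
As a rough illustration of those two rules, the datatype a literal ends up with under RDF-1.1 can be worked out from its (possibly missing) datatype and language tag. This is a sketch with a hypothetical class and method name and plain strings for IRIs; it is not Sesame API.

```java
final class Rdf11DatatypeSketch {
    static final String XSD_STRING = "http://www.w3.org/2001/XMLSchema#string";
    static final String RDF_LANGSTRING = "http://www.w3.org/1999/02/22-rdf-syntax-ns#langString";

    /**
     * Returns the RDF-1.1 datatype IRI for a literal.
     * Both arguments may be null, as they could be for an RDF-1.0 literal.
     */
    static String effectiveDatatype(String datatype, String language) {
        if (language != null) {
            // Language-tagged literals are typed rdf:langString.
            return RDF_LANGSTRING;
        }
        if (datatype == null) {
            // Former plain literals are typed xsd:string.
            return XSD_STRING;
        }
        return datatype;
    }
}
```

Code that compares or serialises literals would need to apply the same mapping if RDF-1.0 and RDF-1.1 data are to be treated consistently.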

The main non-RDF-1.1 changes are the introduction of transaction isolation levels and changes in dependencies, including the apache httpclient library that may have other dependency effects.

RDF4J conversion

General issue for the conversion to RDF4J.

Other than the technical changes, which should be fairly straightforward, should we change the name to something that no longer includes the word "sesame"?

Implement JSON-LD for Sesame

This isn't an urgent issue, particularly as the JSON-LD format is still undergoing a lot of changes, but it would be nice to have a sesame parser and writer implemented in sesametools at some stage.

Current implementations in java (may not be up to date with the latest specification):

https://svn.apache.org/repos/asf/incubator/stanbol/trunk/commons/jsonld/src/main/java/org/apache/stanbol/commons/jsonld/JsonLd.java

https://code.google.com/p/iks-project/source/browse/sandbox/fise/trunk/jersey/src/main/java/eu/iksproject/fise/jsonld/JsonLd.java?spec=svn1048&r=1048

ValueComparator and SPARQL 1.1 ORDER BY

Our current ValueComparator basically matches the SPARQL 1.1 ORDER BY precedence rules.

http://www.w3.org/TR/sparql11-query/#modOrderBy

The current implementation follows it quite closely already, sorting NULL first, then BNode, URI and Literal, but the rules for Literal datatype precedence are a little more complicated than the textual sort on the datatype URI followed by a lexical sort that we are using so far.

Basically, the Operator Mapping table (link below) is sorted from top to bottom by precedence, and the actual values are sorted based on the native XML Schema datatypes, with other datatypes sorted last based, possibly, on lexical comparison of the datatype URIs.

http://www.w3.org/TR/sparql11-query/#OperatorMapping

We could have an extension method in ValueComparator that can be overridden to provide other sorting for literals with unknown datatypes, with the base ValueComparator sorting unknown datatypes by lexical comparison by default.

For instance, we could move the current implementation down to another method that may look like:

// Note: Override me to provide different behaviour
ValueComparator.sortUnknownDatatype(Literal literal1, Literal literal2)
{
    // By default sort based on literal1.getDatatype().stringValue().compareTo(...) and then
    // literal1.stringValue().compareTo(literal2.stringValue()) if the datatypes are equal
}

Alternatively, we could add a method sortLiterals(Literal literal1, Literal literal2) and subclass the current ValueComparator to SparqlCompatibleValueComparator to override sortLiterals and provide compatible behaviour with SPARQL 1.1.
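
A minimal sketch of the first option, under several assumptions: Lit and LiteralOrderSketch are hypothetical stand-ins rather than the real Literal and ValueComparator, only xsd:integer and xsd:decimal are treated as known datatypes (the full operator mapping covers more), numeric labels are assumed to be valid, and known numeric literals are placed before everything else so that the overall order stays consistent.

```java
import java.math.BigDecimal;
import java.util.Comparator;

/** Hypothetical stand-in for a typed literal; not Sesame's Literal. */
final class Lit {
    final String label;
    final String datatype;

    Lit(String label, String datatype) {
        this.label = label;
        this.datatype = datatype;
    }
}

/** Sketch of a literal comparator with an overridable fallback. */
class LiteralOrderSketch implements Comparator<Lit> {
    static final String XSD = "http://www.w3.org/2001/XMLSchema#";

    @Override
    public int compare(Lit a, Lit b) {
        boolean numericA = isNumeric(a);
        boolean numericB = isNumeric(b);
        if (numericA && numericB) {
            // Compare by value, so "9.5" sorts before "10".
            return new BigDecimal(a.label).compareTo(new BigDecimal(b.label));
        }
        if (numericA != numericB) {
            // Known numeric datatypes sort before all other datatypes.
            return numericA ? -1 : 1;
        }
        return sortUnknownDatatype(a, b);
    }

    /** Override me to provide different behaviour for unknown datatypes. */
    protected int sortUnknownDatatype(Lit a, Lit b) {
        // Default: datatype URI first, then the lexical form.
        int byDatatype = a.datatype.compareTo(b.datatype);
        return byDatatype != 0 ? byDatatype : a.label.compareTo(b.label);
    }

    private static boolean isNumeric(Lit lit) {
        return lit.datatype.equals(XSD + "integer") || lit.datatype.equals(XSD + "decimal");
    }
}
```

A subclass only needs to override sortUnknownDatatype to change how unknown datatypes are ordered; the second option would look the same with the override point moved up to a sortLiterals method.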