trellis's Introduction

Trellis Linked Data Server

A scalable platform for building linked data applications.

Trellis is a rock-solid, enterprise-ready linked data server. The quickest way to get started with Trellis is to use a pre-built docker container.

Trellis is built on existing Web standards. It is modular, extensible and fast.

All source code is open source and licensed as Apache 2. Contributions are always welcome.

Docker Containers

Docker containers for Trellis are published on Docker Hub. Container environments are published with every commit to the main branch and are available for all stable releases. More details are available on the Trellis Wiki.

Docker pull command

docker pull trellisldp/trellis-triplestore

Or, for the PostgreSQL-based persistence layer

docker pull trellisldp/trellis-postgresql

Building Trellis

In most cases, you won't need to compile Trellis. Released components are available on Maven Central, and the deployable application can be downloaded directly from the Trellis website. However, if you want to build the latest snapshot, you will need, at the very least, to have Java 11+ available. The software can be built with Maven using this command:

./mvnw install

trellis's People

Contributors

acoburn, ajs6f, christopher-johnson, dependabot-preview[bot], dependabot[bot], gregjan, jholleran, mend-bolt-for-github[bot]

trellis's Issues

Add an OSGi-based deployment option

It should be possible to deploy the Trellis triplestore-based implementation entirely in an OSGi container. This would involve writing up an OSGi-based wiring (e.g. Blueprint) and including some PaxExam tests.

Ideally, this will end up as a deployable kar file that one can just drop into an existing Karaf instance.

Add a Resource::isDeleted method

After a Resource is deleted, subsequent requests for the resource can result in either a 404 Not Found or a 410 Gone response, depending on what the ResourceService implementation supports. The 404 Not Found case is simple: a ResourceService::get request returns an empty Optional. For the 410 Gone case, however, the code currently goes through some contortions that involve checking for <> a ldp:Resource, trellis:DeletedResource triples. While this works, it is not a very clean design. It would be much better for the Resource interface to just include a method such as:

default Boolean isDeleted() {
    return false;
}

This way, if a ResourceService implementation has some special rules around deleted resources, those rules can be restricted to that part of the code (as opposed to leaking into the HTTP layer, as it does now).
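As a rough illustration of how the HTTP layer could then choose between the two responses, here is a minimal sketch. SketchResource and DeletedStatusSketch are hypothetical stand-ins, not Trellis types, and the sketch uses a primitive boolean where the proposal above uses Boolean.

```java
import java.util.Optional;

// Hypothetical, pared-down stand-in for the Trellis Resource interface.
interface SketchResource {
    String getIdentifier();

    // Proposed default: a resource is not deleted unless an implementation says so.
    default boolean isDeleted() {
        return false;
    }
}

final class DeletedStatusSketch {
    private DeletedStatusSketch() { }

    // Maps the result of a lookup onto the status code an HTTP layer might return.
    static int statusFor(Optional<? extends SketchResource> lookup) {
        if (!lookup.isPresent()) {
            return 404; // the service knows nothing about this identifier
        }
        return lookup.get().isDeleted() ? 410 : 200;
    }

    // Small factory used only to build example resources.
    static SketchResource resource(String id, boolean deleted) {
        return new SketchResource() {
            @Override public String getIdentifier() { return id; }
            @Override public boolean isDeleted() { return deleted; }
        };
    }
}
```

With this shape, the HTTP layer never inspects triples to detect deletion; it only asks the resource.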

`ResourceService::getIdentifierSupplier` vs. `IdentifierService`

ResourceService features Supplier<String> getIdentifierSupplier(), but IdentifierService features Supplier<String> getSupplier() and two other similar methods.

Is this an oversight in some sense, or are the semantics of the two distinct? If we can explicitly lay out those semantics, I'll enrich the Javadocs appropriately.

Rework Memento-related interfaces

The current Resource interface contains a method for retrieving mementos: getMementos. While this can certainly work (it works fine for the kafka-based implementation), it means that the resource subsystem is tied to the versioning subsystem, which may not necessarily be the case. At least, I can think of cases where the two could easily be separated. As such, I would like to propose removing the Resource::getMementos method, and adding something similar to the ResourceService interface.

Trellis Prefix

The TRELLIS_PREFIX value is currently set as trellis:. This leads to root containers with only that value. While Jena allows this as a valid IRI, not all commons-rdf implementations do. It would be much better to start using trellis:data as a root container. This will likely involve removing the TRELLIS_PREFIX constant and adding two new constants:

TRELLIS_SCHEME = "trellis:"
TRELLIS_DATA = TRELLIS_SCHEME + "data"

Ignore server-managed triples on PUT and POST

The LDP specification allows servers to ignore server-managed triples on PUT and POST. It would be convenient for Trellis to ignore ldp:contains triples on mutating requests, which would make it possible to retrieve a container resource, make changes locally and then PUT that resource back to the server (and to do so without having to filter out ldp:contains triples).
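A minimal sketch of the filtering step, assuming a simplified triple type: SimpleTriple and ServerManagedFilterSketch are hypothetical names used for illustration only, and a real implementation would work on commons-rdf triples instead.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified triple; all three terms are plain strings here.
final class SimpleTriple {
    final String subject;
    final String predicate;
    final String object;

    SimpleTriple(String subject, String predicate, String object) {
        this.subject = subject;
        this.predicate = predicate;
        this.object = object;
    }
}

final class ServerManagedFilterSketch {
    static final String LDP_CONTAINS = "http://www.w3.org/ns/ldp#contains";

    private ServerManagedFilterSketch() { }

    // Returns a copy of the incoming triples without any ldp:contains statements.
    static List<SimpleTriple> dropServerManaged(List<SimpleTriple> incoming) {
        List<SimpleTriple> kept = new ArrayList<>();
        for (SimpleTriple triple : incoming) {
            if (!LDP_CONTAINS.equals(triple.predicate)) {
                kept.add(triple);
            }
        }
        return kept;
    }
}
```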

Add support for WebSub

WebSub is a W3C recommendation and it would be very easy to add support for this to Trellis (as a "publisher"). This would involve generating an additional Link header in responses.

The WebSub specification requires two link headers in responses from publishers, (e.g. for the resource /container/resource and the hub at /pubsubhub/url):

Link: <https://example.com/pubsubhub/url>; rel="hub"
Link: <https://example.com/container/resource>; rel="self"

It would likely be easiest to add this as a JAX-RS filter, which would keep it entirely separate from all of the existing implementation code. The implementation would likely be quite similar to the current CacheControlFilter -- the constructor would accept the location of the WebSub hub and that would be used to build the header in the filter method.

The second part of this (which could be part of a second ticket) would be to use a JAX-RS HTTP client to notify the WebSub hub of resource changes. These notifications will take the form of a POST operation with the form values: hub.mode="publish" and hub.url=(the URL of the resource that was updated)
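The two pieces described above can be sketched without any JAX-RS dependency. WebSubSketch is a hypothetical helper: it only builds the header values and the form body, and leaves the filter wiring and the HTTP client call out.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

final class WebSubSketch {
    private WebSubSketch() { }

    // Values for the two Link headers a publisher adds to its responses.
    static List<String> linkHeaders(String hubUrl, String resourceUrl) {
        return Arrays.asList(
                "<" + hubUrl + ">; rel=\"hub\"",
                "<" + resourceUrl + ">; rel=\"self\"");
    }

    // Form-encoded body for the notification POST sent to the hub.
    static String publishForm(String resourceUrl) {
        return "hub.mode=publish&hub.url="
                + URLEncoder.encode(resourceUrl, StandardCharsets.UTF_8);
    }
}
```

A filter would add each value returned by linkHeaders as a separate Link header, and the notification client would POST the publishForm body with the content type application/x-www-form-urlencoded.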

Support random access to Binary resources via the BinaryService

The BinaryService interface offers a single method for fetching resources:

Optional<InputStream> get(IRI);

In the case where a client requests a partial resource (via Range header requests), the HTTP layer currently requests the entire Binary as an InputStream, and it then drops any non-relevant segments. While this works, it is not efficient for very, very large binaries which may be partitioned into multiple blocks across different storage locations. It would be much more flexible to extend this method into something like:

Optional<InputStream> get(IRI, Range...);

Where Range could be a type such as a pair of Integers. Or perhaps there is an appropriate built-in type or something from a Commons library. Or it may be necessary to add a new type to the Trellis API (or better: generalize an existing type, such as o.t.api.VersionRange).

I am suggesting here a variable number of Range objects since the HTTP specification for range requests supports the possibility of multiple ranges.
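To make the proposed signature concrete, here is a small in-memory sketch. ByteRange and InMemoryBinarySketch are hypothetical names, identifiers are plain strings rather than IRIs, and offsets are ints for brevity; a real implementation would more likely use longs and fetch each segment from wherever it is stored.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical range type: an inclusive pair of byte offsets.
final class ByteRange {
    final int from;
    final int to;

    ByteRange(int from, int to) {
        if (from < 0 || to < from) {
            throw new IllegalArgumentException("invalid range " + from + "-" + to);
        }
        this.from = from;
        this.to = to;
    }
}

// Hypothetical in-memory stand-in for a BinaryService.
final class InMemoryBinarySketch {
    private final Map<String, byte[]> store = new HashMap<>();

    void put(String id, byte[] content) {
        store.put(id, content.clone());
    }

    // With no ranges the whole binary is returned; otherwise only the requested
    // segments are concatenated, each clipped to the end of the content.
    Optional<InputStream> get(String id, ByteRange... ranges) {
        byte[] content = store.get(id);
        if (content == null) {
            return Optional.empty();
        }
        if (ranges.length == 0) {
            return Optional.of(new ByteArrayInputStream(content));
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (ByteRange range : ranges) {
            int start = Math.min(range.from, content.length);
            int end = (int) Math.min((long) range.to + 1, content.length);
            out.write(content, start, end - start);
        }
        return Optional.of(new ByteArrayInputStream(out.toByteArray()));
    }

    // Example helper: reads a stream fully into a UTF-8 string.
    static String drain(InputStream in) {
        try {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because the ranges are a varargs parameter, existing callers that pass only an identifier would keep working unchanged.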

Superclass for `*HandlerTest`?

There are *HandlerTests for all the various PutHandler, DeleteHandler, etc. I suspect (might be wrong!) that we could abstract over them and pull up some of the tests using some generics tricks, and this issue is a note to self to try it.

Add a WebSocket-based notification system

The SOLID specification recommends using WebSockets for notifications, and I think this would be a good way to provide an embedded message broker for Trellis.

There are a few outstanding questions to resolve w/r/t implementation. First, SOLID suggests making websocket endpoints available on each resource. This seems like a very interesting idea, as the notifications can be scoped to a particular resource or container of resources.

  1. If a websocket is initiated at resource /foo, should notifications from all contained resources be made available (i.e. recursively), or just those that relate to that resource/container?
  2. If 1. is answered in the negative, is there a resource location where all notifications can be accessed?

Add session information to mutating ResourceService methods

The mutating ResourceService methods (::create, ::replace and ::delete) will cause events to be emitted. Before the recent refactor in #31, this data had been available in the Audit-related triples, but now that data is only included in the ::add method. By including a Session value (or similar) in the mutating methods, it would be possible to get access to these data.

This would also be an opportunity to remove the use of the default graph in these methods, the purpose of which relates only to providing data to the event producer. This likely will require adding a field (e.g. baseURL) to the Session type. This will also lead to some cleaner code in the HTTP layer.

Upgrade to latest commons-rdf

Commons-RDF 0.5.0 is currently being voted on. Once released, the trellis code should be updated to use this. With commons-rdf-api/0.5.0, it will be possible to start offering OSGi deployment support in Trellis.

HTTP/2 over TLS support requires ALPN dependency

To make trellis-app work with Dropwizard's h2 configuration on JDK 9 or JDK 10, I added a dependency:

compile group: 'org.eclipse.jetty', name: 'jetty-alpn-java-server', version: '9.4.8.v20171121'

This version is built for JDK 9, but it works on JDK 10.

Add support for JPMS

I am experimenting with supporting the module system with trellis. I will use this as a tracking task.
One can view the progress here

Here is what I have discovered so far:

  1. Error occurred during initialization of boot layer java.lang.module.FindException: Unable to derive module descriptor for /home/christopher/.gradle/caches/modules-2/files-2.1/org.apache.geronimo.specs/geronimo-annotation_1.2_spec/1.0-alpha-1/804747c40f1145ae9cc13cb9e927fca82e6e3c1b/geronimo-annotation_1.2_spec-1.0-alpha-1.jar Caused by: java.lang.IllegalArgumentException: geronimo.annotation.1.2.spec: Invalid module name: '1' is not a Java identifier
    This seems to be because of an "illegal" artifactID ("geronimo-annotation_1.2_spec" has a "."). This is a dependency of apache tamaya. The module name is derived from the jar if there is not an automatic module name in the manifest. Not sure how to move forward with this...but it is a problem for all geronimo specs...maybe file an issue upstream?

  2. gradle jar tasks seem to break module-info resolution. Removing them allows the build to proceed.

  3. the servicemix bundle wrapper for javax.inject does not resolve as a module. A quick fix is to add javax.inject as a dependency and add it as a requirement instead. It is not clear what this will do in OSGi.

AS message produced without provenance / type

I am working on a new camel-kafka-elasticsearch integration and have noticed a possible issue related to the triplestore resource service event implementation. For some reason (as yet unknown) the event type (e.g. https://www.w3.org/ns/activitystreams#Create) is empty.

Here is an example AS message sourced from Kafka for reference:

{
  "@context" : "https://www.w3.org/ns/activitystreams",
  "id" : "urn:uuid:525573b5-f20f-489d-99dd-33668d3534bb",
  "type" : [ ],
  "object" : {
    "id" : "http://trellis:8080/ldp-test-6b56d629-3150-4ff5-8f69-1efa672f60fb",
    "type" : [ "http://www.w3.org/ns/ldp#RDFSource", "http://www.w3.org/ns/oa#TimeState", "http://www.w3.org/ns/activitystreams#Application", "http://xmlns.com/foaf/0.1/Person", "http://www.w3.org/ns/oa#Choice", "http://www.w3.org/ns/oa#HttpRequestState", "http://purl.org/dc/dcmitype/Sound", "http://www.w3.org/ns/oa#SpecificResource", "http://www.w3.org/ns/oa#TextualBody", "http://www.w3.org/ns/oa#Annotation", "http://www.w3.org/ns/oa#TextPositionSelector", "http://www.w3.org/ns/oa#FragmentSelector", "http://www.w3.org/ns/oa#CssStyle" ]
  },
  "published" : "2018-03-07T07:15:42.159590Z"
}

The target types are populated. The triple is created. Could have a simple explanation, I will keep looking at it.

Clarify semantics of Future<Boolean> in ResourceService responses

Mutating requests in the ResourceService respond with a Future<Boolean>. The HTTP layer currently handles these responses in the following way:

Create:

Future<true> => 201 Created
Future<false> => 500 Server Error
RuntimeException => 500 Server Error

Replace/Delete:

Future<true> => 204 No Content
Future<false> => 500 Server Error
RuntimeException => 500 Server Error

The Future<true> and RuntimeException cases seem correct, but it is unclear to me whether a Future<false> ought to return a 5xx error. Perhaps a 4xx error would be more appropriate. If so, does a Future<false> indicate some form of conflict (e.g. 409) or is it a generic 400 error?
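The current mapping can be written down as a short sketch, which also makes it easier to see where a 4xx alternative would slot in. FutureStatusSketch is a hypothetical helper, not the actual Trellis HTTP code; it reproduces the table above and nothing more.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

final class FutureStatusSketch {
    private FutureStatusSketch() { }

    // Create: true => 201 Created; false or an exception => 500.
    static int statusForCreate(Future<Boolean> result) {
        return resolve(result, 201);
    }

    // Replace/Delete: true => 204 No Content; false or an exception => 500.
    static int statusForReplaceOrDelete(Future<Boolean> result) {
        return resolve(result, 204);
    }

    private static int resolve(Future<Boolean> result, int successStatus) {
        try {
            // The false branch is the one under discussion: 500 today, possibly 4xx.
            return result.get() ? successStatus : 500;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return 500;
        } catch (ExecutionException | RuntimeException e) {
            return 500;
        }
    }
}
```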

Response headers in Create-on-PUT

At present, it is possible to create resources with HTTP PUT. As with all HTTP PUT operations in Trellis, the response code is 204 No Content, but for resource creation operations, it seems that responding with 201 Created would be more accurate.

Also, if the response is 201 Created, the Content-Location: header (with the value of the resource location) ought to be included.

building membership and containment messages with TriplestoreResourceService?

From what I can gather, the TripleStoreResourceService is publishing notifications to a single configured topic (default "trellis"). This is different than the FileResourceService that has an EventProducer to build containment and membership messages that are published to distinct topics read by the async processor.

Not clear to me with TripleStoreResourceService how this works. In brief evaluation today, there does not seem to be a "built-in" mechanism to do this in trellis-app yet. Is this accurate? Can you explain the design intent for this briefly? Thanks!

Simplify constraint service interface

The current constraint service interface accepts a baseUrl in the ::constrainedBy method, but in practice this is just the internal data IRI. That is, this argument could be removed without changing how the service functions.

Semantics of `Future<Boolean>` for persistence

If a persist call (which returns Future<Boolean>) completes unexpectedly, should the result be true or false? In my example in hand, the cause is an InterruptedException, which does not indicate that persistence failed, only that the thread monitoring persistence did, since the actual persistence is happening elsewhere on the network.

IOW, if the backend cannot give us definite information about the completion, what should it report forward?

I don't want to get into Future<Optional<Boolean>> weirdness. But maybe an enum Completion {Success, Failure, Unknown} or the like and return Future<Completion>? Or maybe there's a better way to say the same thing within the Java concurrency APIs…
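A minimal sketch of the enum idea follows. CompletionSketch is a hypothetical helper, the constants are upper-cased to follow Java convention, and treating an ExecutionException as a definite failure is an assumption of this sketch rather than a settled rule.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

enum Completion { SUCCESS, FAILURE, UNKNOWN }

final class CompletionSketch {
    private CompletionSketch() { }

    // Waits for the backend's acknowledgement and reports UNKNOWN when the
    // wait itself is disrupted, since persistence may still have succeeded.
    static Completion await(Future<Boolean> backendAck) {
        try {
            return backendAck.get() ? Completion.SUCCESS : Completion.FAILURE;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for callers
            return Completion.UNKNOWN;
        } catch (ExecutionException e) {
            // Assumption: the backend itself reported an error.
            return Completion.FAILURE;
        }
    }
}
```

A caller that receives UNKNOWN could then decide for itself whether to retry, to check the backend, or to surface an error.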

`MutableDataService::create` vs `::replace`

If ::replace is called with an identifier for a resource that does not already exist, should ::replace return without trying to mutate anything (and presumably with a false-valued Future)?

I think so, but wanted to make sure.
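The behaviour suggested above can be sketched with an in-memory map. InMemoryDataServiceSketch is hypothetical, resources are reduced to strings, and the methods return CompletableFuture (which is a Future) only to keep the example easy to call.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

final class InMemoryDataServiceSketch {
    private final Map<String, String> data = new ConcurrentHashMap<>();

    // Succeeds only when no resource with this identifier exists yet.
    CompletableFuture<Boolean> create(String id, String body) {
        return CompletableFuture.completedFuture(data.putIfAbsent(id, body) == null);
    }

    // Mutates nothing and completes with false when the resource does not exist.
    CompletableFuture<Boolean> replace(String id, String body) {
        return CompletableFuture.completedFuture(data.replace(id, body) != null);
    }

    boolean exists(String id) {
        return data.containsKey(id);
    }
}
```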

Reconsider Commons RDF

I'd like to understand the value being added to Trellis by the use of Commons RDF. Using it introduces a huge number of short-lived objects, the project isn't very responsive, and the only implementation that we actually use so far is commons-rdf-jena. Is the expectation that people writing new modules for Trellis might want to use RDF4J or some more obscure RDF framework?

audit info inclusion

Does Trellis take the stance of adding audit info to a response unless told otherwise, or only adding it if told to (by a Prefer header)? I know I could trace out the code and see what it currently does, but I want to make sure I understand why.

Support multiple range request segments

The HTTP specification on range requests allows for multiple, non-contiguous segments in a request. E.g.

Range: bytes=1-100,301-400

The Trellis HTTP layer currently only supports a single range (i.e. in the header parsing logic) though the underlying APIs could support an arbitrary number of ranges. It could be useful to support range requests that include multiple ranges.
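A sketch of header parsing that accepts several ranges might look like the following. RangeHeaderSketch is a hypothetical helper, not the existing Trellis parser. It handles only explicit first-last pairs; suffix forms such as bytes=-500 and open-ended forms such as bytes=500- are treated as unsupported, and any unsupported or malformed header yields an empty list.

```java
import java.util.ArrayList;
import java.util.List;

final class RangeHeaderSketch {
    private static final String PREFIX = "bytes=";

    private RangeHeaderSketch() { }

    // Parses "bytes=a-b,c-d" into inclusive {from, to} pairs.
    // Returns an empty list when the header cannot be handled by this sketch.
    static List<long[]> parse(String header) {
        List<long[]> ranges = new ArrayList<>();
        if (header == null || !header.startsWith(PREFIX)) {
            return ranges;
        }
        for (String part : header.substring(PREFIX.length()).split(",")) {
            String[] bounds = part.trim().split("-");
            if (bounds.length != 2) {
                return new ArrayList<>();
            }
            try {
                long from = Long.parseLong(bounds[0].trim());
                long to = Long.parseLong(bounds[1].trim());
                if (from < 0 || to < from) {
                    return new ArrayList<>();
                }
                ranges.add(new long[] {from, to});
            } catch (NumberFormatException e) {
                return new ArrayList<>();
            }
        }
        return ranges;
    }
}
```

For the header shown above, parse returns two pairs, {1, 100} and {301, 400}, which could then be handed to an API that accepts an arbitrary number of ranges.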

Relax constraints on rdf:type

At present, the constraints module forbids setting any rdf:type triples where the type is in the LDP type domain. The idea here was to encourage clients to use Link headers for setting the resource types.

However, this seems to be rather heavy-handed. I would rather remove any sort of "type restrictions"; that is, users should be able to put whatever RDF in a resource they want to, even if that RDF makes no sense -- it shouldn't be up to Trellis to enforce such things.

That is, if a client wants to create a ldp:BasicContainer resource (via link headers) that contains the triple <> rdf:type ldp:DirectContainer, so be it. The LDP spec is clear on this point: the Link header always wins.

A middle position would be to allow setting rdf:type triples in the LDP domain but only if that type is the same as the resource's interaction model or some subtype thereof. My counter-argument to that is just that doing so will add more complexity to the code and I really don't want Trellis to be in the business of policing rdf:type semantics.

Add Memento tests

The trellis-app module now contains some good LDP-related tests. It should also have some Memento-specific tests.
