

atomic-server's Issues

Make HTML an optional feature (maybe) in atomic-server

The server currently requires a /static directory and a /templates directory. These are required because of the HTML output of the server. This makes the project harder to install, adds complexity, and impacts bundle size. If people just want to host .AD3, they should perhaps be able to. Also, the default HTML resources are somewhat branded and probably not suitable for use cases other than atomicdata.dev.

However, I think that it makes sense to provide some HTML output by default, because this provides a human-friendly interface for checking the Atomic Data URLs.

Interaction between CLI and Server

The CLI and Server both use the same atomic_lib library. Currently, both even use the same store, and both can write to the local store directly. Since they both require a file lock, you can't run the server and the CLI on the same machine. That makes the CLI pretty much useless.

I think the CLI should be a companion tool, which does not use local storage. This means that running the delta command, for example, should change deltas on the server. To do this, the CLI should be linked to a specific server. This means the CLI needs some setup and authentication.

The Server has no authentication at this point (and no write capabilities at all), but I'm expecting to implement OAuth / OIDC sometime in the future. The CLI might get a token using this process, and store that in the config folder (combined with the server URL) for future use.

Collections, pagination, sorting

Collections are (dynamic) lists of items that add pagination and sorting.

A few things that I'd want to be able to do:

  • Visit https://atomicdata.dev/classes and see all the classes
  • Visit https://atomicdata.dev/collections and see a list of all collections
  • Create these collections using something like atomic-cli new collection - not requiring any programming
  • be able to browse large collections using pagination

Owned (local) and borrowed (external) resources

We can divide Resources on a Server in two categories: owned (local) and borrowed (external).

Owned Resources:

  • Can be updated locally, but these changes (Deltas) should be shared with subscribers
  • Should not be deleted during cleanup / to save disk space

Borrowed Resources:

  • Can only be updated by their owner. This can happen at any time, and the server might want to listen to these changes.
  • Can be useful to cache, but can be deleted at any time - they should still be accessible on their URL.

Currently, both Store implementations (on-disk and in-memory) store Owned resources using a special subject: _:somename, but this is not a fitting solution. This means that serializers need to be aware of the current domain, and they have to replace the subject.

I think the store should be aware of the current domain (where am I hosting my stuff?).

Also, maybe the owned resources should be stored in a different tree than the borrowed resources. This would simplify clearing the cache.

Long term goal: run atomic server on old smartphone

If we want people to take back control over their data, a nice solution would be for people to host their own server. Atomic-Server has been designed with this goal in mind, which means: no runtime dependencies, WASM / WASI-compatible, lightweight.

This would be the ultimate UX for hosting your personal data: grab an old phone, install Atomic-Server from the play store, choose your own domain name (or subdomain, or DID, or IPNS - anything that would resolve to your server), hook it up to your old charger.

Some challenges:

  • Make the server available without requiring port forwarding, a process that would make things way too difficult for the average consumer.
  • Use or build some tunneling server, which deals with serving using DDNS / dynamic IP addresses. This should probably also deal with subdomain registration. Perhaps use a tunnel broker.

Make hosting atomic data as easy as possible

Atomic Data is all about data ownership and control. To do this, we need to make it as easy as possible to host an atomic server somewhere. That's one of the reasons I've picked Rust as a language - it compiles to pretty much any target, including WASM. But if we really want to achieve maximum adoption, we need to eliminate as many barriers as possible. There are various approaches to this problem:

Server ease of use

Make it as simple as possible to set up and run atomic-server. Minimize the mental energy and time the setup process requires. Make it compatible with most machines. Eliminate runtime dependencies. Provide several options that people are familiar with. Use sensible defaults instead of asking too many questions.

  • Docker support #69
  • Docker ARM support #80
  • Atomic-Server compiles to ARM
  • Publish pre-built binaries for various targets (this is currently a manual process)
  • Publish to stores (brew / snap / mac app store / windows store / whatever) #117
  • GUI all the things
    • Tray icon for desktop #75
    • Zero-CLI setup (open browser, enter form, you're done) #215
  • #152

Cloudflare Workers + KV

Cloudflare has this really interesting product, called Workers, which lets you run JS and WASM in a serverless context. This means it is way easier to manage than a VPS. It offers storage using KV, which has no native support for Rust ATM, although it can be accessed using wasm-bindgen.

The logic currently present in Atomic-Server is kept as minimal as possible (most resides in atomic-lib), and could probably be made to run in Cloudflare Workers, judging from this todo example.

Running Atomic-Server on a smartphone

#25

Most people have an old smartphone in a drawer somewhere. These are powerful enough to run atomic-server. So first, we need some way of running it on smartphones. I'd rather not create new swift / java apps for both iOS and android platforms, so I prefer running the existing stuff on Android.

Luckily, atomic-server does compile to arm. This should open some possibilities. Couple of ideas:

  • Run Linux on Android (using GNURoot)
  • Use the Android JNI interface to bridge Java and Rust.
  • Use Flutter + WASM. This would be ideal (just one codebase to maintain, both for UI and server), but it still requires a solution to actix not having WASI compatibility.

Commits are not validated for required Properties

If I post a Commit for a resource that is an instance of some class, I might expect it to be validated for required properties. However, this is not yet the case. I think a store should guarantee that every commit does not result in incomplete resources. This also means that it is not always possible to remove a single property from a resource - it will fail if it results in a missing required prop.

Merging collections and the /tpf endpoint

Collections and the /tpf endpoint have some similarities. Both allow for filtering all triples by property / value. However, collections are more powerful, as they also enable pagination and sorting.

The /tpf endpoint currently uses a lot of custom logic, which might not be needed anymore if we switch to using collections.

Also, the subject field in TPF is not really useful in Atomic data, except when you want to find an Atom by subject and value.

remove &mut for regular operations (get)

When getting a resource from a store, the store might not have it locally, and will therefore try to fetch it by sending a request to its subject URL. When this happens, we want to save the newly fetched data to our store for caching. Because of this, any get call can also mutate the store. This is kind of not cool, because &mut references are exclusive. I'm kind of new to Rust, but I believe this will prevent multiple threads from doing anything with the same store. This could cause latency spikes that hurt performance.

atomic_lib uses Sled, an embedded Rust key-value database, and it works entirely without mutable references. There must be something to learn here... Interesting video by the creator of sled touching this subject here.

Atomic Data in the browser

I want to render Atomic Data in more fancy ways than what is currently possible. Currently, all resources use the same HTML renderer, which simply iterates over all Property/Value combinations. However, some classes of resources would be much nicer to look at (and easier to grasp) if they had their own views. A Class view, for example, would be helpful for explaining Atomic Data concepts. It would probably always show the title and description at the top, followed by a list of recommended and required properties - each shown with a human-friendly name instead of a URL.

How can I achieve this?

Using the current stack + Tera template engine

We could let users create views in Atomic-server, store them as Tera templates in atomic data.
Doing this might require various things:

  • Allowing custom views to be registered for certain Classes
  • A mechanism for selecting the right registered view for a specific resource
  • Traversing the graph when rendering (e.g. showing the title of a property)
  • Store Tera templates as atomic data

Doing these things in the current stack could be possible, but would require quite a bit of logic for each view. Especially graph traversal seems complex - Tera might have to call functions in Rust, which is not possible, or we should pass a deeply nested graph to account for nested items... Which might not even be a bad idea, btw.

In any case, it would not be very useful for most developers - maybe only for static site generation, but not for more dynamic apps.

Using existing JS libraries

I want to provide front-end developers with a simple way of constructing their own custom views. One approach would be to use my colleague Thom's Link-Redux library, which offers most of what I need. However, it's also quite tightly integrated with RDFS - it uses RDF:Type for determining a view, but we might be able to work around this.

Creating a new atomic-JS library

This will probably happen some day, but it will take a lot of work, and I'd rather focus my attention on getting the current library to a higher level.

Mapping atomic to JS using wasm_bindgen

Another approach would be to use this Rust-based library and use wasm_bindgen to make functions available in a JS context. The obvious advantage is that I can still use this Rust library, but I'm not sure what kind of problems I might encounter. All available functions need explicit JS mapping using wasm_bindgen, and I'm pretty sure some of the currently used libraries don't even compile to WASM in the browser. I'd also have to change the HTTP client to use the browser's, via web_sys.

Using a rust front-end framework

Rust in the browser isn't very popular, but it's possible. Some frameworks, such as Yew, seem to work pretty well.
However, this will limit how many people will want to work with it.

Stores need to be aware if they are responsible for hosting data on a domain or not

When a Commit is being made to a store (when data is changed), this store will need to know if this Commit has to be stored locally, or if it has to be sent to some external server.

I think the primary concern should be getting the names of these concepts really clear. Some of the terms I'm currently using or considering:

  • Base URL
  • Server
  • Pod
  • Companion server
  • Target store

Maybe this can be resolved by simply adding a self_url property for servers.

Perhaps related: it might be a good idea to refactor store.set_default_agent, Agent and Config. Config overlaps heavily with Agent.

Commit builder API - linking resources to commits

Atomic Commits describe how a resource is to be mutated. A Commit might mean the resource should be removed, it might mean some fields will be added, it might mean a single field is changed.

Constructing these Commits should be simple. Ideally, developers should not have to deal with commits - they should simply call "destroy" on some instance and it should be removed accordingly.

However, sometimes developers will need to manually create these commits. For example if a developer tries to batch various changes instead of sending the commit after a single change.

This asks for a nice API for building / constructing these commits.

let mut commit = Commit::new("mySubject");
commit.set("someprop", "someval");
commit.set("otherprop", "otherval");
commit.send();
// send makes sure the signature and timestamp are correct,
// and that the commit is sent to the right places.

Async support for atomic-lib

Since Atomic Data involves a lot of data fetching, some calls might take a while. Sometimes, multiple resources need to be fetched at once. Instead of executing each request serially, a parallelized async process would minimize execution time. Currently, every call in atomic_lib is blocking, and the HTTP library ureq is sync, too. Perhaps this will cause issues, but we can swap ureq for some other HTTP library.

I'm kind of new to rust, and even more so to async programming, so I need to write down my thoughts on this. First, let's identify which processes might have the highest 'blocking' time. E.g. serializing internal AD3 to JSON. This requires all properties to be fetched, and these might not have been loaded yet.

Now, if we'd start implementing async, a lot needs to happen:

  • ureq has to be replaced by some async HTTP(S) library, preferably with rustls support. Maybe reqwest or surf?
  • a lot of the current functions (serializers that depend on store, get_resource) will have to become async, which will also 'infect' all functions that might use these.
  • the Storelike trait (and perhaps more) need to use async_trait

Adding new data results in recursive process / infinite loop

Say we have an empty store, and we add the description Property. Before saving it, the store needs to know which Datatypes the Properties of description use, so it will fetch those Datatypes. But those Datatypes have a description themselves! We have a cycle, and the store cannot save anything, because both depend on each other.

How to fix this?

Get rid of ResourceString

Adding default collections

When you start an Atomic Server, there are no functioning collections; the only ones that work are hardcoded for AtomicData.dev.

Approaches

  • extend storelike.populate() to include some collections.
  • ?

Full-text search

Being able to search through data inside your personal atomic server seems like a nice feature to have. In this issue, I'd like to explore the requirements and some possible approaches for implementing a full text search service.

Wishes

  • Find specific resources and their URLs extremely fast. This enables use in things like semantic document editors: instead of simply typing a word, we insert a specific URL of some thing. Items that are often used should come out 'on top', so we need a sorting algorithm that uses historical searches and selections. We also need some form of autocompletion.
  • Be able to either ignore or include types of resources. Perhaps I'm looking for a specific person, and I don't want to see all documents with that person's name inside them.
  • A person's first name in a "firstName" property should weigh more than the same string in a "description". Perhaps make character count diminish the relevance score.
  • Iterative indexing on Commits / when resources are added to the store.

Approaches / implementation ideas

  • The API itself should use atomic data, too. The Collections model is probably useful here, since it introduces paginated content.
  • The Sonic crate offers performant search features, and simply returns a URL. The developer is planning on making it embeddable.
  • It would be nice if it works as an installable plugin. This would require that plugins function as some sort of middleware handler, offer a custom endpoint, be able to write / access data.

Auto renew Let's Encrypt HTTPS certificates

The current implementation does not renew certs. How to approach this?

First we need to decide how this process is initialized.

  • Check every time the server starts if the certs are still valid. This means that a running server will at some point have to be rebooted. Not great, but not horrible either.
  • Checking if the certs are valid can be done by checking when the cert was initialized.
  • Perhaps a simpler way to approach this is to always ask for a new HTTPS certificate when the server starts. This should work fine, as long as the server isn't rebooted too often in HTTPS mode (10x per 3 hours).

Update: tried some things, but renewing the certs while the server is running seems a bit difficult. Running some logic when the server starts is difficult too, because actix is running and basically uses all the threads for doing server stuff. I'd have to manually spin up a single thread for HTTPS checks and all that. Another way is to have a handler that responds to external requests for renewing the certs, but this seems cumbersome. Simply removing the .https folder and starting the server again works fine, for now.

Setting Root Agent

Any change to data needs to be signed by some Agent. As Agents themselves are just resources, they need to be created by some other Agent. I'll call the very first Agent the Root Agent. The Root Agent is a user that has the highest of rights - e.g. create admins, destroy everything. So how should a Root Agent be created? Some options:

  1. Create a root agent when instantiating a new server with a new database. Pass the private key and the URL to STDOUT, and let the user copy it somewhere safe. Seems a bit ugly (unclear, unsafe), but simple to implement.
  2. Provide a CLI command in atomic-server for creating Agents, e.g. atomic-server agent new. A bit more of a hassle, but I'm very likely to need more CLI tooling in server at some point anyhow.
  3. Read a secret seed from .env which generates the root Agent.
  4. Have some setup endpoint used for generating the first Agent. Seems user friendly for most, but hard to control by machine.
  5. Check the default config folder ~/.config/atomic, and check for a config file. If it does not exist, create one with some defaults and a newly created author.

I think multiple of these should be possible.

Relates to client server interaction #6 and authentication #13. Also to atomic-data authorization.

in-mem cache for constructed properties

Many calls rely on Properties - every time you set some value using a string, for example. This means that the get_property method in Storelike is called a ton of times. This method currently relies on fetching a resource and converting it to a Property.

It's pretty fast, but it can be way faster if we memoize the Properties in a Hashmap.

It might also be worth considering to change the Property URLs to u8 when storing to the database. We could have a map for every u8 to URL, which helps compress the data. However, it will make debugging harder and could introduce data loss if we ever have issues with this mapping table.

RDF serialization

Although the library has its own JSON-LD and AD3 serialization, and a basic N-Triples serializer, it lacks valid Turtle / RDF/XML serialization. A library such as Sophia should make this easier.

Repurpose Validate

The validate function used to be handy, as the store contained (unvalidated) string representations of atoms. Now, however, it only contains valid data, and Validate has kind of lost its purpose. Perhaps I should remove it entirely, but it also had its use: it was able to detect multiple issues with some graph and generate a report. Now, it throws an error as soon as it detects one.

Link Resources to Store

When a Resource is updated, the Store should be updated, too. The developer using atomic_lib should have minimal overhead while interacting with the resource - they should be able to simply update a prop of the resource, and it should be applied to the store, too.

Better view for Collections (tables? lists? cards?)

One of the Classes that you're likely to spend a lot of time in, is the Collection. Currently, it looks like this:

(screenshot of the current Collection view, 2020-12-26)

Improving on this can't be too hard, but what is the best type of view? For most views, the Class will matter a lot in what makes sense. If we render a list of TODO's, we might want to see their name and whether they're completed. If we render a list of people, we might want to see their profile pictures, but not read their entire biography.

In any case - I don't want to make highly domain specific views, such as the ones mentioned above. I think Tables are a good place to start, because they are versatile and powerful.

Class renderers

When a request for some collection is parsed and processed, the Server automatically starts rendering a Tera Resource template. Instead of this, we might need a match statement that checks for existing custom class renderers (#37)

Table

Rendering SQL data in a table is trivial, since the database enforces a strict schema. Atomic Data, however, also allows any other property on any resource. This means that any one resource can have an almost infinite number of properties. We could solve this by using the class selection in Collections (when they are available). We could render the class' required props first on the left, and the recommended ones on the right. Another approach is to render all encountered props, and let the table grow to the right.

ValidationReport

The store.validate function should:

  • (be able to) Check if the URLs resolve
  • Generate a report that provides users valuable feedback

So the server could have:

  • A text input field which accepts AD3, and on submit performs the validation checks and shows a useful report

Save reference to Store inside Resource

Many methods of a resource will require a store. Every time the resource is updated, the store needs to be updated. This currently means that a store has to be explicitly passed to every mutation to a resource. Not cool! We can fix this by saving a reference to the store.

Persisting Commits

Commits are atomic changes to a resource. If these are persisted, we can use these to generate logs and previous versions. Pretty cool.

I think it makes sense to simply store these as resources (since they are just that) - not as some special struct.

One thing I'm still pondering about, is how to deal with failed commits. For now, if a commit cannot be applied, it will not be stored. This works fine for synchronous APIs (e.g. posting a Commit to a /commit endpoint), but this would not work when these commits are sent in bulk.

  • Make Commit convertible to Resource (requires nested resources #16)
  • Persist them when a delta is successfully applied (in store.commit())

For the future:

  • Generate commits when importing atomic data (parsing ad3, for example)
  • Limit write access to the commit API (maybe)
  • Start thinking about how to generate older versions of resources, and how to query these resources (probably TPF)

Implement get

Fetch resource from cache with fallback to fetch from web

Get rid of `where Self: std::marker::Sized`

I'm still learning Rust, and that means that I'm still not fully understanding traits and object safety.

We have a Store struct based on the Storelike trait, which contains all the data. We also have Resource, which holds a reference to a Storelike.

pub struct Resource<'a> {
    propvals: PropVals,
    subject: String,
    classes: Option<Vec<Class>>,
    store: &'a dyn Storelike,
    commit: CommitBuilder,
}

This enables a more convenient API, so we can have:

resource.set_prop("prop", "val")

instead of requiring the store in every single call:

resource.set_prop(store, "prop", "val")

Since our Store is a trait, its size is not known at compile time. From what I understand, this means that in methods on the Storelike trait where we create Resources, we need to explicitly state that the Store (Self) is Sized:

fn create_agent(&self, name: &str) -> AtomicResult<(String, String)>
where
    Self: std::marker::Sized,
{
    let mut agent = Resource::new_instance(urls::AGENT, self)?;
    // ... (rest of the body elided)
}

This where clause is now needed in all functions that create resources, and I feel like that is wrong, although I can't say for sure that it is.

I thought it might make sense to require Sized in the Storelike trait, like this: pub trait Storelike: Sized, and although that removes the need for the where clauses, it introduces another problem. In every signature where a reference to the store is made:

impl Collection {
    /// Constructs a Collection, which is a paginated list of items with some sorting applied.
    pub fn new(
        store: &dyn Storelike,
        collection_builder: crate::collections::CollectionBuilder,
    )

...this error appears:

error[E0038]: the trait `storelike::Storelike` cannot be made into an object
storelike.rs(47, 22): ...because it requires `Self: Sized`

TPF indexing in Db

Although TPF queries are implemented, they are very slow and won't scale - a single TPF query iterates over all individual atoms in the store. To solve this, we need some type of index. Since we're using Sled, a key-value store, we can't use some SQL index, we need to build it ourselves.

One solution is to create two new Sled trees (new k-v stores). In the first one (for searching by Value), every key represents an Atomic Value and the value is a vector of all subjects. In the second one, for Properties, k = property, v = subject.

However, a very common TPF query will look like this: * isA SomeClass. If we only build the above indexes, this will still be a costly query, because we'll still iterate over many resources - pretty much all resources have the isA property.

We could improve performance if we'd also store the Property in the v fields mentioned above, instead of only storing the subjects. To prevent unnecessary data duplication / minimize storage impact, it might make sense to not store entire atoms, but to leave out the thing that's already known (the thing in the key).

A TPF query such as * isA SomeClass would probably start with the ValueIndex, which returns all Subject-Property combinations. The implementation then iterates over these, filtering by property and returning all subjects.

I think Atomic Collections will rely on this query quite a bit: make a list of all Persons (or some class), sorted by some thing. This will run such a TPF query using the indexes, then return all subjects.

Another possible optimization strategy is caching Collections (which internally use TPF queries). We could rebuild (or invalidate) them on Commits.

Change URL of a resource

Resources can get a new URL. The old one should redirect to the new one. I think this calls for a new type of method in Commits.

  • Add new method in Commits
  • Add move option to atomic-cli
  • Let old URL redirect to new one. This probably means creating commits for every (local?) resource that references the renamed one.

Storing & querying nested resources

All Atomic Data Resources that we've discussed so far have a URL as a subject.
Unfortunately, creating unique and resolvable URLs can be a bother, and sometimes not necessary.
If you've worked with RDF, this is what Blank Nodes are used for.
In Atomic Data, we have something similar: Nested Resources.

Let's use a Nested Resource in the example from the previous section:

["https://example.com/john", "https://example.com/lastName", "McLovin"]
["https://example.com/john https://example.com/employer", "https://example.com/description", "The greatest company!"]

By combining two Subject URLs into a single string, we've created a nested resource.
The Subject of the nested resource is https://example.com/john https://example.com/employer, including the space.

So how should we deal with these in atomic_lib?

Approaches

Store nested in their parent as Enum

In both Db and Store, this would mean a fundamental change to the internal model for storing data. In both, the entire store is a HashMap<String, HashMap<String, String>>.

We could change this to:

HashMap<String, HashMap<String, StringOrHashmap>>, where StringOrHashmap is some enum that is either a String or a HashMap. This would have a huge impact on the codebase, since the most used method (get_resource_string) changes. I don't think this is the way to go.

Store nested in parent as Value

An alternative is to not store string representations, but store Values in the store. Currently, all Atom values in the store are strings. We could change this to store Values. Some performance implications:

  • Serializing the string representations (ad3) would be slower.
  • Serializing to non-string representations would be faster (e.g. JSON)
  • Using the data in some structured way would be faster.
  • Adding data to the store from highly-optimized serialized formats (AD3) would be slower.


Store as new entities, with path as subject

In this approach, the nested resources are stored like all other resources, except that the subject contains two URLs separated by a space. This has a couple of implications:

  • When deleting the original resource, all its nested ones will not be deleted (but should be), so this requires some extra logic
  • When iterating over all resources, we can no longer assume that every single Key (subject) is a valid URL.

Store inside parent resource, with path in Property URL

Similar to the approach above, but in this approach we use the Property URL to store nested paths. Implications:

  • Iterating over the Properties will not result in valid Properties - these must be split up.
  • Finding some nested value needs a range query: select all properties that start with some string

Store all Atoms as BtreeMap<Path, Value>

Perhaps it makes sense to store all Atoms in something such as BtreeMap<Path, Value>, where the path is the subject followed by a property path (one or more property URLs). This should work by using BtreeMap's (and Sled's) range function to select all the right properties.

API design

And what should the API for the library user look like? Should a nested resource be a special Value? This seems sensible. However, in reality it is just a regular AtomicURL.

Serialization

Let's go over serialization by example. Let's assume a Resource of a person with some nested friends.

JSON

This is the easiest. Simply nest an object!

{
  "@id": "https://example.com/arthur",
  "name": "Arthur",
  "friends": [{
     "name": "John"
  }, {
    "name": "Suzan"
  }]
}

Note that these nested resources don't need to have an @id field, contrary to the root resource. Their identity is implicit.

AD3

JSON has nice nesting, but AD3 is originally designed to be very flat. If we use the Subject field to store paths, we get quite long subjects. This gets a bit awkward:

["https://example.com/arthur", "https://example.com/friends", ["https://example.com/arthur https://example.com/friends 0", "https://example.com/arthur https://example.com/friends 1"] ]
["https://example.com/arthur https://example.com/friends 0", "https://example.com/name", "John"]
["https://example.com/arthur https://example.com/friends 1", "https://example.com/name", "Suzan"]

The first Atom seems entirely redundant - it provides no more information than the second two. However, leaving it out might cause issues down the line: imagine if I'd GET https://example.com/arthur, but the first atom didn't exist. It would return no atoms - it would be empty. In order to prevent this, we could tweak the store a bit, so that a GET will search for all subjects that either are the URL, or start with the URL followed by a space.

Another approach might be to nest triples in AD3, too:

["https://example.com/arthur", "https://example.com/friends", [[["https://example.com/name", "John"]], [["https://example.com/name", "Suzan"]]]]

But this, too, is ugly and not human readable. JSON might be the way to go.

Authentication and restricting read access

docs: atomicdata-dev/atomic-data-docs#55
front-end: atomicdata-dev/atomic-data-browser#108

Write actions should only be possible for authenticated users. Initially, the server had no write capabilities at all because of the lack of authentication methods. Write actions are now done using signed Commits, so they're safe.

For reading items, it might be a good idea to start off with an OAuth 2.0 implementation (some nice Rust libraries exist, such as oxide-auth), but that still seems kind of complex. Perhaps, for now, it is good enough to work with some API key that is sent with every request to a protected endpoint.
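The API-key fallback would boil down to a check like the one below. This is a hypothetical sketch, not server code: in a real server the key would come from a request header, and the comparison should be constant-time to avoid timing attacks.

```rust
// Hypothetical gate for protected endpoints: the request is authorized only
// when the provided API key matches the configured one.
fn is_authorized(api_key_header: Option<&str>, expected_key: &str) -> bool {
    match api_key_header {
        // Note: production code should use a constant-time comparison here.
        Some(key) => key == expected_key,
        None => false,
    }
}
```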

Versioning, history, find commits from resource

One of the advantages of using and storing Commits, is that we have a full history of every single resource. However, we currently have no way of browsing this information.

Wishes

  • I want to click on a link on a resource page to see all versions.
  • I want to see a log of changes of some resource. This might be different from the versions.
  • I want to see a previous version of some resource, and I want to share its URL.
  • I want to restore a previous version of some resource, while keeping the history of changes intact.

Considerations

  • For any list of things (e.g. the Commits for some resource), we should use the Collections model.

Approaches

Versions endpoint

example.com/versions?subject=someresource lists all the versions for the resource passed in the subject query parameter. It returns a Collection. Perhaps first define the Endpoints feature?

  1. The user requests a list of versions for some resource.
  2. A Collection is constructed, containing all (or some subset of) the existing versions, one for each Commit. Each version points to a URL of a resource that needs to be constructed (e.g. example.com/versions?subject=someresource&version=signatureHashBlablabla).
  3. The user requests a specific version.
  4. That version is constructed by the server, using the Commits.

Introduce sequential TPF queries

Knowing which Commits are linked to a resource can be done using two TPF queries:

  1. Get me all the items where is-a = commit
  2. AND subject = mySubject
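The two sequential queries above can be sketched as in-memory filters. The `Atom` fields and the bare `"is-a"` / `"subject"` / `"Commit"` strings are simplified stand-ins for the real Atomic Data URLs.

```rust
struct Atom {
    subject: String,
    property: String,
    value: String,
}

// Two sequential TPF-style queries: (1) all resources that are Commits,
// (2) of those, the ones whose subject property points at the target.
fn commits_for(atoms: &[Atom], target: &str) -> Vec<String> {
    let commits = atoms
        .iter()
        .filter(|a| a.property == "is-a" && a.value == "Commit")
        .map(|a| a.subject.clone());
    commits
        .filter(|c| {
            atoms
                .iter()
                .any(|a| &a.subject == c && a.property == "subject" && a.value == target)
        })
        .collect()
}
```

The second query consumes the results of the first, which is exactly the "sequential TPF" pattern: each step is a simple triple-pattern match, and composition happens in code.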

Some time ago, the great Ruben Verborgh told me that you can actually convert (all?) SPARQL queries into a set of sequential TPF queries. That's pretty cool. How should we create an interface to facilitate this? It probably requires some form of nesting queries, and some form of combined queries.

But... this quickly turns into a difficult debate about which methods the newly designed query language should include, and will lead to some almost-Turing-complete (but not quite) language. Therefore, we might want to try a plugin-like approach, where the query logic resides in code.

Plugin (endpoint + extend resources)

Atomic Plugins are currently just a bunch of thoughts, but perhaps Versions is a logical first plugin. Plugins should be apps that can be added to an Atomic Server at runtime. They might be run using a WASM runtime. In any case, they need to add functionality to an Atomic Server.

Ideally, the Versioning plugin will have:

  • An endpoint (see above) for constructing a version of a resource, and listing all versions / commits of a resource
  • Maybe add some links to resources, so you can find the versions of a resource from the resource page.
  • A way to restore previous versions (by producing a Commit that reverts some changes)

Constructing a version

A Version is a representation of some Resource at some point in time. Constructing one can (theoretically) be done in several ways:

  • Get the initial commit of a resource, and apply all commits until you get to the requested version
  • Persist all versions and representations (costly!)
  • Persist reverse actions, which describe how you can do the inverse of a Commit.

I feel like the first approach is the most logical.
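The first approach amounts to a replay: start from an empty resource and apply each Commit in order, up to the requested one. The `Commit` shape below (a `set` map plus a `remove` list) is a simplification of real signed Commits, for illustration only.

```rust
use std::collections::HashMap;

// Simplified Commit: properties to set and properties to remove.
struct Commit {
    set: HashMap<String, String>,
    remove: Vec<String>,
}

type Resource = HashMap<String, String>;

// Reconstruct a version by applying every commit up to and including `upto`.
fn construct_version(commits: &[Commit], upto: usize) -> Resource {
    let mut resource = Resource::new();
    for commit in &commits[..=upto] {
        for (prop, val) in &commit.set {
            resource.insert(prop.clone(), val.clone());
        }
        for prop in &commit.remove {
            resource.remove(prop);
        }
    }
    resource
}
```

Since each Commit's signature covers its contents, a replayed version is as trustworthy as the Commits themselves - no separate version snapshots need to be persisted.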

I have a problem with this, though. Some resources don't have commits: for example, resources that are imported from a static AD3 file (such as the default atomic data).

I think this should simply be handled by the plugin. If you try to construct a version for a resource that has no Commits, just return an error. Or should it return the static version instead?
