All Atomic Data Resources that we've discussed so far have a URL as a subject.
Unfortunately, creating unique and resolvable URLs can be a bother, and is sometimes not necessary.
If you've worked with RDF, this is what Blank Nodes are used for.
In Atomic Data, we have something similar: Nested Resources.
Let's use a Nested Resource in the example from the previous section:
["https://example.com/john", "https://example.com/lastName", "McLovin"]
["https://example.com/john https://example.com/employer", "https://example.com/description", "The greatest company!"]
By combining the Subject URL of the parent and a Property URL into a single string, we've created a nested resource.
The Subject of the nested resource is https://example.com/john https://example.com/employer, including the space.
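A minimal sketch of how such a space-separated Subject could be built and taken apart again. The helper names nested_subject and split_nested_subject are hypothetical and not part of atomic_lib; the sketch assumes that URLs never contain an unescaped space, so the first space always marks the end of the parent Subject.

```rust
/// Joins a parent subject and a property URL into a nested subject.
/// Hypothetical helper, not part of atomic_lib.
fn nested_subject(parent: &str, property: &str) -> String {
    format!("{} {}", parent, property)
}

/// Splits a nested subject into the parent subject and the remaining path.
/// Returns `None` when there is no space, i.e. the subject is a plain URL.
/// Hypothetical helper, not part of atomic_lib.
fn split_nested_subject(subject: &str) -> Option<(&str, &str)> {
    subject.split_once(' ')
}
```

For the example above, splitting the nested Subject yields https://example.com/john as the parent and https://example.com/employer as the path.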
So how should we deal with these in atomic_lib?
Approaches
Store nested in their parent as Enum
In both Db and Store, this would mean a fundamental change to the internal model for storing data. In both, the entire store is a HashMap<String, HashMap<String, String>>. We could change this to HashMap<String, HashMap<String, StringOrHashMap>>, where StringOrHashMap is some Enum that is either a String or a HashMap. This would have a huge impact on the codebase, since the most used method (get_resource_string) changes. I don't think this is the way to go.
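To make the impact concrete, here is a rough sketch of what such an Enum could look like. The variant names, the type aliases and the get_string function are assumptions made for illustration; they do not reflect the actual atomic_lib code.

```rust
use std::collections::HashMap;

/// Hypothetical enum: a stored value is either a plain string
/// or a nested resource (a map of property URL to value).
#[derive(Debug, Clone, PartialEq)]
enum StringOrHashMap {
    String(String),
    Nested(HashMap<String, StringOrHashMap>),
}

type PropVals = HashMap<String, StringOrHashMap>;
type Store = HashMap<String, PropVals>;

/// Every read now has to handle both variants.
/// Returns `None` if the value is missing or is a nested resource.
fn get_string<'a>(store: &'a Store, subject: &str, property: &str) -> Option<&'a str> {
    match store.get(subject)?.get(property)? {
        StringOrHashMap::String(s) => Some(s.as_str()),
        StringOrHashMap::Nested(_) => None,
    }
}
```

Every caller that used to receive a plain String would need a match like the one in get_string, which is where the codebase-wide impact comes from.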
Store nested in parent as Value
An alternative is to not store the string representations, but store the Values in the store. Currently, all Atom values in the store are strings. We could change this to store Values. Some performance implications:
- Serializing the string representations (AD3) would be slower.
- Serializing to non-string representations would be faster (e.g. JSON).
- Using the data in some structured way would be faster.
- Adding data to the store from highly-optimized serialized formats (AD3) would be slower.
This would also
Store as new entities, with path as subject
In this approach, the nested resources are stored like all other resources, except that the subject consists of two URLs separated by a space. This has a couple of implications:
- When deleting the original resource, its nested resources are not deleted automatically (but should be), so this requires some extra logic.
- When iterating over all resources, we can no longer assume that every single Key (subject) is a valid URL.
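The extra logic for both implications could look roughly like this. It is a sketch against a plain HashMap with the same shape as the store described earlier; remove_with_nested and is_nested_subject are hypothetical names, not existing atomic_lib functions.

```rust
use std::collections::HashMap;

type Store = HashMap<String, HashMap<String, String>>;

/// Removes a resource and every nested resource whose subject starts with
/// the parent subject followed by a space. Returns the number of removed entries.
fn remove_with_nested(store: &mut Store, subject: &str) -> usize {
    // Including the space in the prefix avoids matching unrelated URLs
    // that merely share a prefix, such as ".../john" and ".../johnny".
    let prefix = format!("{} ", subject);
    let before = store.len();
    store.retain(|key, _| key.as_str() != subject && !key.starts_with(&prefix));
    before - store.len()
}

/// Returns true when the key is a nested subject rather than a single URL.
fn is_nested_subject(key: &str) -> bool {
    key.contains(' ')
}
```

Note that the deletion scans every key in the map, so it is linear in the size of the store.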
Store inside parent resource, with path in Property URL
Similar to the approach above, but in this approach we use the Property URL to store nested paths. Implications:
- Iterating over the Properties will not result in valid Properties; these must be split up.
- Finding some nested value needs a range query: select all properties that start with some string.
Store all Atoms as BTreeMap<Path, Value>
Perhaps it makes sense to store all Atoms in something such as BTreeMap<Path, Value>, where the path is the subject followed by a property path (one or more property URLs). This should work by using the range function of BTreeMap (and of Sled) to select all the right properties.
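A small sketch of the range idea, using only the standard library BTreeMap. It assumes that Path is a space-separated String and that Value is a plain String; the Atoms alias and the atoms_under function are made up for this example.

```rust
use std::collections::BTreeMap;

/// Path = subject followed by one or more property URLs, joined by spaces.
/// Values are plain strings here to keep the sketch small.
type Atoms = BTreeMap<String, String>;

/// Returns all (path, value) pairs whose path starts with the subject
/// followed by a space, in sorted path order.
fn atoms_under<'a>(atoms: &'a Atoms, subject: &str) -> Vec<(&'a str, &'a str)> {
    let prefix = format!("{} ", subject);
    atoms
        // Jump straight to the first key that is >= the prefix...
        .range(prefix.clone()..)
        // ...and stop as soon as keys no longer share the prefix.
        .take_while(|(path, _)| path.starts_with(&prefix))
        .map(|(path, value)| (path.as_str(), value.as_str()))
        .collect()
}
```

Because the keys are sorted, all atoms of a resource (including deeper nested ones) sit next to each other, so the lookup only touches the matching entries instead of scanning the whole map.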
API design
And what should the API for the library user look like? Should a nested resource be a special Value? This seems sensible. However, in reality it is just a regular AtomicURL.
Serialization
Let's go over serialization by example. Let's assume a Resource of a person with some nested friends.
JSON
This is the easiest. Simply nest an object!
{
  "@id": "https://example.com/arthur",
  "name": "Arthur",
  "friends": [{
    "name": "John"
  }, {
    "name": "Suzan"
  }]
}
Note that these nested resources don't need to have an @id field, unlike the root resource. Their identity is implicit.
AD3
JSON has nice nesting, but AD3 was originally designed to be very flat. If we use the Subject field to store paths, we get quite long subjects. This gets a bit awkward:
["https://example.com/arthur", "https://example.com/friends", ["https://example.com/arthur https://example.com/friends 0", "https://example.com/arthur https://example.com/friends 1"] ]
["https://example.com/arthur https://example.com/friends 0", "https://example.com/name", "John"]
["https://example.com/arthur https://example.com/friends 1", "https://example.com/name", "Suzy"]
The first Atom seems entirely redundant: it provides no more information than the other two. However, leaving it out might cause issues down the line. Imagine a GET of https://example.com/arthur when the first atom doesn't exist: it would return no atoms at all. To prevent this, we could tweak the store a bit, so that a GET searches for all subjects that either are the URL, or start with the URL followed by a space.
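Such a tweaked GET could be sketched as follows. The Store shape matches the HashMap described earlier; get_with_nested is a hypothetical function and returns atoms as plain (subject, property, value) tuples for simplicity.

```rust
use std::collections::HashMap;

type Resource = HashMap<String, String>;
type Store = HashMap<String, Resource>;

/// Collects all atoms for the URL itself and for every nested subject
/// that starts with the URL followed by a space. The result is sorted
/// so that the output order is deterministic.
fn get_with_nested(store: &Store, url: &str) -> Vec<(String, String, String)> {
    let prefix = format!("{} ", url);
    let mut atoms = Vec::new();
    for (subject, resource) in store {
        if subject.as_str() == url || subject.starts_with(&prefix) {
            for (property, value) in resource {
                atoms.push((subject.clone(), property.clone(), value.clone()));
            }
        }
    }
    atoms.sort();
    atoms
}
```

With this, a GET of https://example.com/arthur returns the two name atoms even when the redundant first atom is absent. On a HashMap this is a full scan of all subjects; a sorted map would allow a range query instead.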
Another approach might be to nest triples in AD3, too:
["https://example.com/arthur", "https://example.com/friends", [[["https://example.com/name", "John"]],[["https://example.com/name", "Suzy"]]]]
But this, too, is ugly and not human readable. JSON might be the way to go.