jazzdotdev / contentdb

The Jazz Database which Stores Documents in Files and Supports Version Control and Sharing

Lua 100.00%

database document-oriented file-based robust pelican nosql jazz

contentdb's Introduction

ContentDB

ContentDB is a document-store database library similar to CouchDB and MongoDB.

This project follows the general "NoSQL" trend away from table-oriented databases, usually toward document-oriented or graph-oriented ones. It will also support outline-oriented databases, like OPML and YAML, which allow data to be managed in simple text editors as well as in more powerful UIs like WorkFlowy and OmniOutliner, or open source solutions like Drupal Paragraphs Drag & Drop Mode and Drupal Draggable Views with Sections (draggableviews).

What makes ContentDB significantly different from other databases is that it uses a file to store each "record", called a document, in a specified format. This is similar to RDF-style XML databases.

With this approach, you can easily manage the storage and retrieval of both your data and its metadata. Because every document is a file, you can use version control for history tracking and deployment. Relationships are represented using reference fields. See Pelican and Grav for similar systems, especially Pelican because of its reference plugin.

To improve performance and scalability, Tantivy integration can provide immediate indexing and querying. This plays a role similar to materialized views or MongoDB collections.

Format

ContentDB combines one structured and one unstructured data format. Headers use SCL, and bodies use Markdown or Lua.
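
As a rough illustration of the header-plus-body layout, here is a minimal Python sketch. It assumes a simplified header of `key: value` lines ended by the first blank line; the real header format is SCL, which this sketch does not attempt to parse. The function name `parse_document` and the sample fields are hypothetical.

```python
def parse_document(text):
    """Split a document into (header dict, body string).

    Assumption: the header is a run of `key: value` lines ended by the
    first blank line. Real ContentDB headers are SCL, not this format.
    """
    header_part, _, body = text.partition("\n\n")
    header = {}
    for line in header_part.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines that are not key/value pairs
            header[key.strip()] = value.strip()
    return header, body


sample = "model: article\ntitle: Hello\n\n# Hello\n\nBody text in Markdown.\n"
header, body = parse_document(sample)
print(header)  # {'model': 'article', 'title': 'Hello'}
```

Everything after the first blank line is returned untouched, so the Markdown (or Lua) body can be handed to whatever renderer or interpreter is appropriate.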

Header field values provide a mechanism to represent attributes of world entities in virtual document texts. Relationship fields work by using the target's model for its field. You can see this implemented in Python in the Pelican plugin mentioned above, and there is more background information available about using antisymmetric relationships.
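
One possible reading of relationship fields can be sketched in Python as follows. It assumes that a field whose name matches a model name, and whose value is the ID of a document of that model, is a reference to that document. The in-memory `store` dictionary, the `resolve_references` function, and the sample IDs are all hypothetical and only serve the illustration.

```python
def resolve_references(doc, store):
    """Return a copy of `doc` with reference fields replaced by their targets.

    Assumption: a header field whose name equals a model name and whose
    value is a known document ID is treated as a reference to that document.
    """
    resolved = dict(doc)
    for field, value in doc.items():
        target = store.get(value)
        if target is not None and target.get("model") == field:
            resolved[field] = target
    return resolved


store = {
    "a1": {"model": "author", "name": "Ada"},
    "p1": {"model": "post", "title": "Hello", "author": "a1"},
}
post = resolve_references(store["p1"], store)
print(post["author"]["name"])  # Ada
```

Checking the target's model against the field name keeps an ordinary value that happens to equal some document ID from being mistaken for a reference.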

Default ID is UUIDv4, but this will soon shift to NanoID. See #17
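
For comparison, the two ID styles can be generated with the Python standard library alone. This is only a sketch: the NanoID-style function below uses the same 64-character URL-safe character set and default length of 21 as NanoID, but it is not the `nanoid` package, and the function names are hypothetical.

```python
import secrets
import uuid

# The 64 URL-safe characters NanoID draws from (order is irrelevant here).
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789_-"


def new_uuid_id():
    """Random UUIDv4 as a 36-character string."""
    return str(uuid.uuid4())


def new_nanoid_style_id(size=21):
    """NanoID-style ID built with the standard library (not the nanoid package)."""
    return "".join(secrets.choice(ALPHABET) for _ in range(size))


print(new_uuid_id())          # 36 characters, including hyphens
print(new_nanoid_style_id())  # 21 characters
```

The shorter ID is a noticeable difference when IDs end up in file names and reference fields.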

Short names are a missing feature. #12

Model documents originally followed a JSON document schema (which schema was not recorded).

Stores are folders with documents from a single source.

contentdb's Issues

find Damien Tournoud (damz)'s DrupalCon 2011 London Lightning Talk

that video and slideshow explain document-oriented databases very well

add outline support

After #20
Should support folder based hierarchies
Need outline-oriented model format, so need to clarify current document-oriented model format first
This will also work nicely for logs, not just opml-like texts

log documents should use the same document functions

add config file (immature content)

similar to #21

see bottom of https://news.ycombinator.com/item?id=19137259

currently found patterns "issue-labels" are in yaml, so they are inaccessible as content for referencing and working off their models (the prefix)
see also jazzdotdev/rolling-projects#3

url field in content module

url::parse ?

"Actix-web uses http::Uri for uri parsing."
https://docs.rs/actix-web/0.7.14/actix_web/error/enum.UrlParseError.html
https://github.com/hyperium/http/blob/master/src/uri/tests.rs

web log model

kind: request / response
from/to: (can "server" and "client" be united?) - incoming: 1, or outgoing 1

function renames etc

reviews appreciated

schema polymorphism / model inheritance / object oriented data types

https://github.com/jazzdotdev/lighttouch/blob/364b3e3cb1be8caac90ab212ce3db4593d7d4b78/loaders/models.lua#L22

Lighttouch loaders preprocess everything, so this would make sense to do that way too..

While loading the models, check for this type of syntax and setup the data interfaces to Classes instead of database objects

cleanup

remove luvent from write document
remove log traces for running

Reading files from missing stores

To read files, lighttouch walks through the files in the store's directories, but when given a missing store it breaks. Make it detect a missing store and just walk through an empty list of files.
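
The requested behaviour can be sketched in Python as below. The function name `walk_store` is hypothetical, and the actual code is Lua; this only illustrates treating a missing store as an empty one.

```python
from pathlib import Path


def walk_store(store_path):
    """Return the files under a store directory, or [] if the store is missing."""
    store = Path(store_path)
    if not store.is_dir():
        return []  # missing store: behave like an empty one
    return sorted(p for p in store.rglob("*") if p.is_file())


for path in walk_store("stores/does-not-exist"):
    print(path)  # never reached for a missing store
```

Callers can then loop over the result without a separate existence check.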

Lighttouch Entity Relationship Diagrams GUI

https://www.drupal.org/project/erd
https://www.devart.com/dbforge/mysql/studio/ (database design)

move home-store.txt variable to contentdb.scl

or maybe torchbear.scl / contentdb /

save git history

fix upstream markdown ambiguities, inconsistencies, and ordered listing antipatterns

needs: 1) feature complete structured writing, automatic list item numbering 2) unambiguous syntax 3) round-tripping between markup and markdown

numbering # for list ordering
portioning % for text sectioning, paragraphs
bulleting * for unordered listing

~~
raw notes:
~~

bug: ordered list syntax uses ambiguous, cumbersome, hardcoded symbols

https://spec.commonmark.org/0.28/#ordered-list-marker
https://en.wikipedia.org/wiki/Hard_coding (antipattern)

using unique, hardcoded characters for each list item, and then an ambiguous ending character is breaking my content. I'd like to change this syntax to a single character syntax.

the number sign, #, would make the most sense to me, since we're numbering things. this would make for an easy fix with nesting ordered lists, because then we could use multiple markers for list depth.

related: # [ ]
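
A small Python sketch of how the proposed single-character syntax could be rendered. It assumes each item starts with one or more `#` followed by a space, with the number of `#` characters giving the nesting depth. The function name `number_items` is hypothetical, and this is not part of any Markdown implementation. Note that the syntax collides with ATX headings as they exist today, which is why the proposal pairs it with a different heading marker.

```python
def number_items(lines):
    """Render '#'-marked items as numbered lines.

    Assumption: each item starts with one or more '#' followed by a space,
    and the number of '#' characters gives the nesting depth.
    """
    counters = []  # counters[d] is the current number at depth d + 1
    out = []
    for line in lines:
        marker, _, text = line.partition(" ")
        if not marker or set(marker) != {"#"}:
            out.append(line)  # not a list item: pass through unchanged
            counters = []     # and restart numbering for the next list
            continue
        depth = len(marker)
        del counters[depth:]          # leaving deeper levels resets them
        while len(counters) < depth:  # entering a new level starts at 0
            counters.append(0)
        counters[depth - 1] += 1
        out.append("   " * (depth - 1) + f"{counters[depth - 1]}. {text}")
    return out


print("\n".join(number_items(["# first", "## nested", "## nested again", "# second"])))
```

Because the numbers are computed at render time, items can be reordered or inserted in the source without renumbering anything by hand.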

~~

bug: heading syntax uses a number symbol, instead of a portion symbol

https://spec.commonmark.org/0.28/#atx-heading
https://en.wikipedia.org/wiki/Number_sign

this bug also breaks numbering things. see []#

percentage is a portion calculated in terms of the rest of the portions, but it's the closest general character I can find to indicate a portion

the only other one I can think of is + a plus sign, but that's part of the problem I'm trying to solve. I would like to better organize content across documents for a combinatorial value whose mechanisms work differently than simple addition.

~~

use data geometry for transparent viewstore lay-up

https://en.wikipedia.org/wiki/Data_compression

shapes and symmetries of groups

http://www.coloring-book.co/
https://news.ycombinator.com/item?id=15326480
https://hn.algolia.com/?query=group%20theory&sort=byPopularity&prefix&page=0&dateRange=all&type=story

file systems

geometries

spatial: Advances in Spatial and Temporal Databases: 10th International Symposium p424
temporal: https://www.percona.com/blog/2013/08/29/considering-tokudb-as-an-engine-for-timeseries-data/ https://www.percona.com/doc/percona-tokudb/ft-index.html
model: https://ieeexplore.ieee.org/abstract/document/6378372 Parallel Implementation of Model Based Data Compression Algorithms?

preprocessed field type

so Lighttouch interfaces can:

when document viewed
  if document model has preprocessed field
    trigger model's / document's CRUD events (can be processed async, but before return)

write document unescaped syntax doesn't always work

I updated the url of a slide, and it broke it

model model

to read model meta-documents as documents

switch from uuid to nanoid

first: jazzdotdev/jazz#253

date field in content module

add write document logic

see jazzdotdev-packages/json-interface#14

and make it predictable, preferably sorted with name first
lighttouch document witness has this logic already
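
A minimal Python sketch of a predictable field order. It reads "sorted with name first" as: emit the `name` field first, then the remaining fields alphabetically. That reading is an assumption, as are the function name `write_header` and the `key: value` output, which stands in for the real SCL header syntax.

```python
def write_header(fields):
    """Serialize header fields in a predictable order: `name` first, then sorted.

    Assumption: `key: value` lines stand in for the real SCL header syntax.
    """
    # The sort key puts "name" ahead of everything else (False sorts before True).
    keys = sorted(fields, key=lambda k: (k != "name", k))
    return "\n".join(f"{k}: {fields[k]}" for k in keys)


print(write_header({"title": "Hello", "name": "hello", "date": "2019-02-11"}))
# name: hello
# date: 2019-02-11
# title: Hello
```

A stable order keeps diffs small when documents are rewritten under version control.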

(async + parallel processing) switch to promise thing

should make error handling like https://github.com/foundpatterns/contentdb-lua/blob/c1606735a2194a23866d4f032513ef62bc59eaf2/walk_documents.lua#L7-L9 easier

color code document hash/id-prefix as prime color

use tantivy only for searching

and work with https://github.com/foundpatterns/lighttouch/issues/155

design notes

models: https://json-schema.org/

shortname

slug
might need to error if multiple documents with same shortname exist in same data store, and warn if same across stores
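
The error-versus-warning rule could look roughly like this in Python. The `stores` layout (a store name mapped to a list of shortnames) and the function name `check_shortnames` are hypothetical.

```python
import warnings
from collections import defaultdict


def check_shortnames(stores):
    """Raise on duplicate shortnames within a store; warn on duplicates across stores.

    `stores` maps a store name to a list of shortnames (hypothetical layout).
    """
    seen_in = defaultdict(set)  # shortname -> names of stores that contain it
    for store_name, shortnames in stores.items():
        local = set()
        for shortname in shortnames:
            if shortname in local:
                raise ValueError(
                    f"duplicate shortname {shortname!r} in store {store_name!r}"
                )
            local.add(shortname)
            seen_in[shortname].add(store_name)
    for shortname, names in seen_in.items():
        if len(names) > 1:
            warnings.warn(f"shortname {shortname!r} appears in stores {sorted(names)}")


check_shortnames({"home": ["about", "contact"], "shared": ["about"]})  # warns about 'about'
```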
