
pyramid's Introduction

pyramid

A library for storing and querying graph data in Clojure.

Features:

  • Graph query engine that works on any in-memory data store
  • Algorithm for taking trees of data and storing them in normal form in Clojure data.

Install & docs

Available on Clojars; API documentation on cljdoc.

Why

Clojure is well known for its graph databases like Datomic and DataScript, which implement a Datalog-like query language. There are contexts where the power vs performance tradeoffs of these query languages don't make sense, which is where pyramid can shine.

Pyramid focuses on essentials such as selecting data from entities and traversing the relationships between them, while eschewing the arbitrary logic that SQL and Datalog provide. What it lacks in features it makes up for in read performance when combined with a data store that has fast in-memory lookups of entities by key, such as Clojure maps. It can also be extended to databases like Datomic, DataScript and Asami.

What

Pyramid can be useful at each evolutionary stage of a program where one needs to traverse and make selections out of graphs of data.

Selection

Pyramid starts by working with Clojure data structures. A simple program that uses pyramid can use a query to select specific data out of a large, deeply nested tree, like a supercharged select-keys.

(def data
  {:people [{:given-name "Bob" :surname "Smith" :age 29}
            {:given-name "Alice" :surname "Meyer" :age 43}]
   :items {}})

(def query [{:people [:given-name]}])

(pyramid.core/pull data query)
;; => {:people [{:given-name "Bob"} {:given-name "Alice"}]}
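
For comparison, the same selection can be written by hand with select-keys. This is a sketch in plain clojure.core with no pyramid involved, and the name `selected` is only illustrative. It shows what the query form replaces: each level of nesting needs its own select-keys call and its own mapping step.

```clojure
;; Plain clojure.core only; hand-written equivalent of [{:people [:given-name]}].
(def data
  {:people [{:given-name "Bob" :surname "Smith" :age 29}
            {:given-name "Alice" :surname "Meyer" :age 43}]
   :items {}})

(def selected
  (-> data
      (select-keys [:people])                    ; keep only the top-level key
      (update :people                            ; then descend one level by hand
              (fn [people]
                (mapv #(select-keys % [:given-name]) people)))))

selected
;; => {:people [{:given-name "Bob"} {:given-name "Alice"}]}
```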

Transformation

Pyramid combines querying with the Visitor pattern in a powerful way, allowing one to easily perform transformations of selections of data. Simply annotate parts of your query with metadata {:visitor (fn visit [data selection] ,,,)} and the visit function will be used to transform the data in a depth-first, post-order traversal (just like clojure.walk/postwalk).

(def data
  {:people [{:given-name "Bob" :surname "Smith" :age 29}
            {:given-name "Alice" :surname "Meyer" :age 43}]
   :items {}})

(defn fullname
  [{:keys [given-name surname] :as person}]
  (str given-name " " surname))

(def query [{:people ^{:visitor fullname} [:given-name :surname]}])

(pyramid.core/pull data query)
;; => {:people ["Bob Smith" "Alice Meyer"]}
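
The traversal order mentioned above can be observed with clojure.walk/postwalk itself. This sketch uses only clojure.walk; `visit-order` is an illustrative name. Children are handed to the function before their parent, which is why a visitor attached to a join sees its nested data already transformed.

```clojure
(require '[clojure.walk :as walk])

;; Record each form in the order postwalk hands it to the function.
(def visit-order (atom []))

(walk/postwalk (fn [form]
                 (swap! visit-order conj form)
                 form)                       ; return the form unchanged
               [1 [2 3]])

@visit-order
;; => [1 2 3 [2 3] [1 [2 3]]]
```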

Accretion

A more complex program may need to keep track of that data over time, or query data that contains cycles, which can be done by creating a pyramid.core/db. "Pyramid dbs" are maps that have a particular structure:

;; for any entity identified by `[key id]`, it follows the shape:
{key {id {,,,}}}

Adding data to a db normalizes it into a flat structure. This makes it easy to update entities as new data is obtained and allows relationships that are hard to represent in trees, such as cycles. Queries can traverse the references inside this data.
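
To make the idea of normalization concrete, here is a minimal sketch in plain Clojure. It is not pyramid's implementation: `ident-of` and `normalize` are hypothetical names, only maps and vectors are handled, and an entity is assumed to be any map with a keyword key named "id".

```clojure
;; Illustrative names only; this is not pyramid's implementation.
(defn ident-of
  "Returns [key id] for the first keyword key named \"id\", or nil."
  [m]
  (some (fn [[k v]]
          (when (and (keyword? k) (= "id" (name k)))
            [k v]))
        m))

(defn normalize
  "Returns [db' x'], where db' has every entity found in x stored under
  its [key id] path and x' is x with nested entities replaced by idents."
  [db x]
  (cond
    (map? x)
    (let [[db m] (reduce-kv (fn [[db m] k v]
                              (let [[db v'] (normalize db v)]
                                [db (assoc m k v')]))
                            [db {}]
                            x)]
      (if-let [ident (ident-of m)]
        [(update-in db ident merge m) ident]   ; store the entity, return its ref
        [db m]))                               ; plain map: keep it in place

    (vector? x)
    (reduce (fn [[db out] item]
              (let [[db item'] (normalize db item)]
                [db (conj out item')]))
            [db []]
            x)

    :else [db x]))

(def tree
  {:person/id 0
   :person/name "Rachel"
   :friend/list [{:person/id 1 :person/name "Marco"}]})

(first (normalize {} tree))
;; => {:person/id {0 {:person/id 0, :person/name "Rachel", :friend/list [[:person/id 1]]}
;;                 1 {:person/id 1, :person/name "Marco"}}}
```

Because each entity ends up at exactly one path, later data about the same entity merges into that one place instead of being duplicated across the tree.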

See docs/GUIDE.md.

Durability

A program may grow to need durable storage and other features that more full-featured in-memory databases provide. Pyramid provides a protocol, IPullable, which can be extended to allow queries to run over any store in which data can be looked up by a tuple, [primary-key value]. This generalizes to most databases, such as Datomic, DataScript, Asami and SQLite.
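
The general shape of such an extension point can be sketched with a stand-in protocol. Everything here is an assumption made for illustration: `EntityLookup`, `lookup-entity` and `RowStore` are hypothetical names, and the method signature is not claimed to match IPullable (see the library's docs for the real protocol). The sketch targets JVM Clojure.

```clojure
;; Hypothetical protocol for illustration; this is NOT pyramid's IPullable.
(defprotocol EntityLookup
  (lookup-entity [store ident]
    "Returns the entity stored under ident, a [primary-key value] tuple, or nil."))

;; A pyramid-style db map: the ident is simply a path into the map.
(extend-protocol EntityLookup
  clojure.lang.IPersistentMap
  (lookup-entity [store ident]
    (get-in store ident)))

;; A stand-in for an external store that holds unindexed rows.
(deftype RowStore [rows]
  EntityLookup
  (lookup-entity [_ ident]
    (let [[k v] ident]
      (first (filter #(= v (get % k)) rows)))))

(def map-store {:person/id {1 {:person/id 1 :person/name "Bill"}}})
(def row-store (RowStore. [{:person/id 1 :person/name "Bill"}]))

(lookup-entity map-store [:person/id 1])
(lookup-entity row-store [:person/id 1])
;; both => {:person/id 1, :person/name "Bill"}
```

A query engine written against such a protocol only needs the tuple lookup, so the backing store can change without the queries changing.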

Full stack

The above shows the evolution of a single program, but many programs never grow beyond the accretion stage. Pyramid has been used primarily in user interfaces where data is stored in a data structure and queried over time to show different views on a large graph. Through its protocols, it can now be extended to be used with durable storage on the server as well.

Concepts

Query: A query is written using EQL, a query language implemented inside Clojure. It provides the ability to select data in a nested, recursive way, making it ideal for traversing graphs of data. It does not provide arbitrary logic like SQL or Datalog.

Entity map: a Clojure map which contains information that uniquely identifies the domain entity it is about. E.g. {:person/id 1234 :person/name "Bill" :person/age 67} could be uniquely identified by its :person/id key. By default, any map that contains a key for which (= "id" (name key)) is true is an entity map and can be normalized using pyramid.core/db.

Ident function: a function that takes a map and returns a tuple [:key val] that uniquely identifies the entity the map describes.

Lookup ref: a 2-element Clojure vector that has a keyword and a value which together act as a pointer to a domain entity. E.g. [:person/id 1234]. pyramid.core/pull will attempt to look them up in the db value if they appear in the result at a location where the query expects to do a join.
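
Since a db has the shape {key {id entity}}, a lookup ref doubles as a path into the db map. The sketch below uses plain clojure.core on a hand-written db; the :person/friends attribute and the second person are made up for the example. It shows by hand the dereferencing that a join performs.

```clojure
;; A hand-written db in the {key {id entity}} shape.
(def db
  {:person/id {1234 {:person/id 1234
                     :person/name "Bill"
                     :person/friends [[:person/id 5678]]}
               5678 {:person/id 5678
                     :person/name "Ann"}}})

;; The lookup ref [:person/id 5678] is a valid get-in path.
(get-in db [:person/id 5678])
;; => {:person/id 5678, :person/name "Ann"}

;; Following the refs stored under :person/friends by hand.
(def friends
  (mapv #(get-in db %)
        (get-in db [:person/id 1234 :person/friends])))

friends
;; => [{:person/id 5678, :person/name "Ann"}]
```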

Usage

See docs/GUIDE.md

Prior art

Copyright

Copyright © 2023 Will Acton. Distributed under the EPL 2.0.

pyramid's People

Contributors

aiba, dependabot[bot], eneroth, green-coder, lilactown, phronmophobic, souenzzo, wardle


pyramid's Issues

Recursive unions

Hi there 👋

I've been tinkering with pyramid in a language project that supports deeply nested logical expressions. Expressions themselves can be either some form of binary expression (e.g. and, or, etc), or a terminal. I've modeled these as two separate entity types.

They normalize just fine, but I'm having trouble pulling the data back out using a "recursive union" query. I've forked and committed a failing test of what I was expecting the output to be. I could very well be doing something obviously wrong here but I can't seem to massage this query into a shape that does what I want...

https://github.com/cjsauer/pyramid/blob/29071e88072737260d46b3d1ddc5310f590a92a1/test/pyramid/core_test.cljc#L294-L320

I'll copy the test here as well:

  (t/testing "recursive union"
    (let [data {:expr/id       0
                :expr/operator :add
                :expr/operands [{:expr/id       1
                                 :expr/operator :mult
                                 :expr/operands [{:term/id    0
                                                  :term/value 42}
                                                 {:term/id    1
                                                  :term/value 100}]}
                                {:expr/id       2
                                 :expr/operator :sub
                                 :expr/operands [{:term/id    3
                                                  :term/value -10}
                                                 {:term/id    4
                                                  :term/value 17}]}]}
          db1 (p/db [data])
          query [{[:expr/id 0] {:expr/id [:expr/operator
                                          {:expr/operands '...}]
                                :term/id [:term/value]}}]]
      (t/is (= {:expr/operator :add
                :expr/operands [{:expr/operator :mult
                                 :expr/operands [{:term/value 42}
                                                 {:term/value 100}]}
                                {:expr/operator :sub
                                 :expr/operands [{:term/value -10}
                                                 {:term/value 17}]}]}
               (get (p/pull db1 query) [:expr/id 0])))))

This results in:

recursive union
Expected:
  {:expr/operator :add,
   :expr/operands
   [{:expr/operator :mult,
     :expr/operands [{:term/value 42} {:term/value 100}]}
    {:expr/operator :sub,
     :expr/operands [{:term/value -10} {:term/value 17}]}]}
Actual:
  {:expr/operator :add, :expr/operands []}
Diff:
  - {:expr/operands
     [{:expr/operator :mult,
       :expr/operands [{:term/value 42} {:term/value 100}]}
      {:expr/operator :sub,
       :expr/operands [{:term/value -10} {:term/value 17}]}]}
  + {:expr/operands nil}

Thanks for an excellent project.

pull-report

Create a function pull-report which provides a collection containing the queried entities

`lambdaisland/uri`

I don't know if this is expected behavior or not, but it is an oddity that I encountered:

(pyramid/db [{:my/id  "hello"
              :a-uri  (lambdaisland.uri/uri "http://www.example.com")}])

This expands to,

{:my/id {"hello" {:my/id "hello",
                  :a-uri ([:fragment nil]
                          [:query nil]
                          [:path nil]
                          [:port nil]
                          [:host "www.example.com"]
                          [:password nil]
                          [:user nil]
                          [:scheme "http"])}}}

Rather than the…

{:my/id {"hello" {:my/id "hello",
                  :a-uri #lambdaisland/uri "http://www.example.com"}}}

… one might hope for.

Do you have any guidance on how to deal with a case like this?

Edit: Tracked down and fixed the problem in #18.

Improve support for partial data sets

I find myself sometimes working with data that is not entirely complete. For example:

;; OK

(def db (pyramid/db [{:id   0
                      :refs [[:id 1]]}
                     {:id    1
                      :hello :world}]))

(pyramid/pull db [{[:id 0]
                   [{:refs [:id :hello]}]}])

;; => {[:id 0] {:refs [{:id 1, :hello :world}]}}

I.e., I have a number of entities (for example, retrieved from persistent storage), and I'd like to throw them at Pyramid and ask some questions, ad-hoc. This works well, as long as the data set is complete (no dangling idents).

In the following example, it doesn't work as well:

;; Not OK

(def db (pyramid/db [{:id   0
                      :refs [[:id 1] [:id 2]]}
                     {:id    1
                      :hello :world}]))

(pyramid/pull db [{[:id 0]
                   [{:refs [:id :hello]}]}])

;; => Execution error (NullPointerException) at pyramid.pull/visit$fn (pull.cljc:157).

When working with Pathom in a similar scenario, the ident itself will be treated as a minimal entity (i.e., just converted to a map: {:id 2}), with the result of the pull (which may be nil) merged to that map. This guarantees that a minimal result will be returned where data is missing.

The result will be something like:

{[:id 0] {:refs [{:id 1, :hello :world}
                 {:id 2}]}}
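
The proposed fallback can be sketched in plain Clojure on a hand-written normalized db. This is only an illustration of the behaviour described above, not a patch for pyramid; `entity-or-stub` is a hypothetical helper.

```clojure
;; Hand-written normalized form of the "Not OK" data set:
;; [:id 2] is referenced but absent.
(def db {:id {0 {:id 0 :refs [[:id 1] [:id 2]]}
              1 {:id 1 :hello :world}}})

;; Hypothetical helper: treat the ident as a minimal entity and merge in
;; whatever the db holds for it (possibly nothing).
(defn entity-or-stub
  [db ident]
  (let [[k v] ident]
    (merge {k v} (get-in db ident))))

(mapv #(select-keys (entity-or-stub db %) [:id :hello])
      (get-in db [:id 0 :refs]))
;; => [{:id 1, :hello :world} {:id 2}]
```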

What do you think of this?

Dynamic map keys

Support something like

(p/pull
  (p/db
    [{:classroom/id "c1" :school [:school/id 1]}
     {:classroom/id "c2" :school [:school/id 1]}
     {:classroom/id "c3" :school [:school/id 2]}
     
     {:school/id 1
      :allocations {[:classroom/id "c1"] [[:course/id 1] [:course/id 2]]
                    [:classroom/id "c2"] [[:course/id 1]]}}
     {:school/id 2
      :allocations {[:classroom/id "c3"] [[:course/id 1] [:course/id 3]]}}

     {:course/id 1 :course/name "A"}
     {:course/id 2 :course/name "B"}
     {:course/id 3 :course/name "C"}])

  [{[:school/id 1] [{:allocations [{???? [:course/name]}]}]}])
;; => {[:school/id 1]
;;     {:allocations
;;      {[:classroom/id "c1"] [#:course{:name "A"} #:course{:name "B"}],
;;       [:classroom/id "c2"] [#:course{:name "A"}]}}}

We'd want to support this for joins and unions.

A few ideas:

  1. Use _ as the key, e.g. [{'_ [:course/name]}]
  2. Support some kind of regex a la spec, perhaps use spec itself. E.g. [{(s/spec (s/tuple keyword? string?)) [:course/name]}]
  3. Accept a predicate fn, such as [{(fn [x] (and (vector? x) (= :classroom/id (first x)))) [:course/name]}]
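
The semantics of the third idea can be sketched outside the query language. The helper below is hypothetical and not part of pyramid; the course maps are inlined instead of being resolved from refs, and the :note entry is added only to show a non-matching key being dropped.

```clojure
;; Hypothetical helper, not part of pyramid: apply `f` to the value of every
;; entry whose key satisfies `pred`, dropping the other entries.
(defn select-entries
  [pred f m]
  (into {}
        (keep (fn [[k v]] (when (pred k) [k (f v)])))
        m))

(defn classroom-ident? [x]
  (and (vector? x) (= :classroom/id (first x))))

(def allocations
  {[:classroom/id "c1"] [{:course/id 1 :course/name "A"}
                         {:course/id 2 :course/name "B"}]
   [:classroom/id "c2"] [{:course/id 1 :course/name "A"}]
   :note "not a classroom key"})

(select-entries classroom-ident?
                (fn [courses] (mapv #(select-keys % [:course/name]) courses))
                allocations)
;; => {[:classroom/id "c1"] [{:course/name "A"} {:course/name "B"}]
;;     [:classroom/id "c2"] [{:course/name "A"}]}
```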

add-report

Should also report all entities added 😄

Replacing an entity using `delete` and `add` throws exception if entity does not exist

When there is already data in the database, this works:

(-> {}
    (pyr/add {:person/id 2 :person/name "Mark"})
    (pyr/delete [:person/id 2])
    (pyr/add {:person/id 2 :person/name "Will"}))

=> #:person{:id {2 #:person{:id 2, :name "Will"}}}

If one avoids the delete, there is also no issue:

(-> {}
    (pyr/add {:person/id 2 :person/name "Mark"})
    #_(pyr/delete [:person/id 2])
    (pyr/add {:person/id 2 :person/name "Will"}))

=> #:person{:id {2 #:person{:id 2, :name "Will"}}}

But if there is no initial data in the database at all for that entity:

(-> {}
    #_(pyr/add {:person/id 2 :person/name "Mark"})
    (pyr/delete [:person/id 2])
    (pyr/add {:person/id 2 :person/name "Will"}))

Execution error (NullPointerException) at pyramid.core/update-ref (core.cljc:85).
Cannot invoke "clojure.lang.Associative.assoc(Object, Object)" because "em" is null

Of course, it is fine if there is at least one other entity in the database:

(-> {}
    (pyr/add {:person/id 1 :person/name "Mark"})
    (pyr/delete [:person/id 2])
    (pyr/add {:person/id 2 :person/name "Will"}))

=> #:person{:id {1 #:person{:id 1, :name "Mark"}, 2 #:person{:id 2, :name "Will"}}}

Data loss in p/add with nested maps

Hi,
I've noticed that in some cases pyramid normalisation replaces an entity with a reference but loses the data of that entity.
Example:

(p/add {} {:a/id 1 :b [{:c {:d/id 1 :d/txt "a"}}]})
=>  #:a{:id
      {1 {:a/id 1, :b [{:c [:d/id 1]}]}}}

Reverse order for lists?

Hey @lilactown !

Thanks for this great library.

I was wondering whether lists being reversed was a known/expected behavior or if this could be a bug or misuse on my side:

(let [entity {:id 9
              :my-list [{:thing {:id 1}}  ;; ← vector
                        {:thing {:id 2}}
                        {:thing {:id 3}}]}
      db (pyramid.core/add {} entity)]
  (pyramid.core/pull db [{[:id 9] [:id {:my-list [{:thing [:id]}]}]}]))
;; Order is preserved: {[:id 9] {:id 9, :my-list [{:thing {:id 1}} {:thing {:id 2}} {:thing {:id 3}}]}}


(let [entity {:id 9
              :my-list '({:thing {:id 1}}  ;; ← list
                         {:thing {:id 2}}
                         {:thing {:id 3}})}
      db (pyramid.core/add {} entity)]
  (pyramid.core/pull db [{[:id 9] [:id {:my-list [{:thing [:id]}]}]}]))
;; Reverse order: {[:id 9] {:id 9, :my-list ({:thing {:id 3}} {:thing {:id 2}} {:thing {:id 1}})}}

I noted the README says:

Collections like vectors, sets and lists should not mix entities and non-entities. Collections are recursively walked to find entities.

Could this be the explanation? I'm not sure I fully understand what exactly should be avoided.

Thanks for your time.
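
One possible explanation, offered as an assumption and not confirmed against pyramid's source: conj adds to the front of a list but to the end of a vector, so any code that rebuilds a collection with into and empty reverses lists while leaving vectors intact. The helper name `rebuild` is illustrative.

```clojure
;; Rebuild a collection element by element into an empty one of the same type.
(defn rebuild [coll]
  (into (empty coll) (map identity) coll))

(rebuild [1 2 3])   ;; => [1 2 3]   conj appends to a vector
(rebuild '(1 2 3))  ;; => (3 2 1)   conj prepends to a list
```

If this is the cause, passing vectors instead of lists would sidestep the reversal.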
