levelgraph / levelgraph
Graph database JS style for Node.js and the Browser. Built upon LevelUp and LevelDB.
License: MIT License
Hi, this is just a question. I am trying out LevelGraph and wondered: is it possible to add multiple properties to the vertices (which I believe are subjects and objects) and multiple properties to the edges (predicates)? Thank you very much for bringing graph databases to Node.js! All the best.
Testling reports that it does not work in Firefox :(
Hello!
It would be cool if the .get and .search functions (and the streams) supported the start, end, and reverse options that LevelUp provides, analogous to the limit and offset options already implemented in #49.
A possible use case would be getting the latest triple when the subject has a value that makes sense for sorting (e.g. contains a timestamp). To get the latest triple I would supply an options object like this:
{
limit: 1,
reverse: true
}
My current workaround is to keep a second database with the subject name as the key and the value in order to find the latest subject name.
Would it be possible to use level-live-stream on the joins so that you can get a changes feed of the relations as they are added?
Also, is it possible to do open-ended joins, perhaps:
db.join([{
subject: db.v('author1'),
predicate: 'maintains',
object: db.v('module'),
},
{
subject: db.v('module'),
predicate: 'depends',
object: db.v('module2')
},
{
subject: db.v('author2'),
predicate: 'maintains',
object: db.v('module2')
}
], function (err, join) {
console.log(join)
})
which might return:
{author1: 'mcollina', module: 'levelgraph', author2: 'rvagg', module2: 'levelup'}
Is that correct?
The Navigator API should allow predicates to be variables, too.
There might be cases where this is handy, e.g. to get all the 'neighbors' of a vertex (noted by @Marketcentric in #9).
We need a way to stream results back into the triple store, for #5.
Happy to see the move from " to ' in 19f89e4.
I wonder if you could also consider moving away from putting commas first?
https://github.com/rwaldron/idiomatic.js#comma-first
If you look at the README you can see that commas sometimes occur at the end of the line and sometimes go first. I myself find commas-first a bit confusing, especially in JSON objects.
Of course, if you feel strongly about putting commas first, please just close this issue! Maybe after updating the README to use it consistently 😉
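For concreteness, the two styles under discussion look like this (a neutral illustration, not code copied from the repo):

```javascript
// comma-first, as currently used in parts of the codebase
var commaFirst =
  { limit: 1
  , reverse: true
  };

// comma-last, as recommended by idiomatic.js
var commaLast = {
  limit: 1,
  reverse: true
};

console.log(commaFirst.limit === commaLast.limit); // true
```

Both parse identically; the difference is purely visual diff noise versus familiarity.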
This will allow us to ditch the readable-stream dependency entirely, and possibly reduce the library size.
According to http://caniuse.com/indexeddb, LevelGraph "should" work on IE 10.
However, it gives us a laconic "syntax error".
The following combination produces some errors on search when levelgraph is put on a sublevel:
{ "dependencies": {
"level-sublevel": "^6.3.8",
"leveldown": "^1.0.0",
"levelgraph": "^0.8.2",
"levelup": "^0.19.0" } }
In particular, approximateSize doesn't end up on db for queryplanner here:
https://github.com/mcollina/levelgraph/blob/37e2f8d0/lib/queryplanner.js#L27
This band-aid seems to work:
var level = require('level')(process.env.DATABASE);
var sub = require('level-sublevel')(level);
var objects = sub.sublevel('objects');
var graphLevel = sub.sublevel('graph');
graphLevel.db = level.db;
graphLevel.approximateSize = level.db.approximateSize.bind(level.db);
var graph = require('levelgraph')(graphLevel);
Is there a roadmap for rolling in compat with the new level-sublevel? Need a hand?
I think you could use https://github.com/maxogden/node-concat-stream instead of the callback stream lib; that would make your codebase smaller.
Wow. Nice work so far. It would be great to add another level of abstraction such as that provided by Gremlin (https://github.com/tinkerpop/gremlin/wiki) and the very nice node implementation of gremlin (https://github.com/entrendipity/gremlin-node) either via the BluePrints api or a direct implementation.
If you have not been watching Gremlin's recent activity, it is emerging as the most popular method of querying/traversing/pattern matching most of the leading graph databases - for good reason. Marko Rodriguez, Gremlin's author, is very approachable and helpful (http://thinkaurelius.com/team/).
Perhaps borrowing some of the pieces from gremlin-node would make this effort easier (although they interface directly to Java(!) libraries from node). Maybe they would be interested in a collaboration.
Adopting the Gremlin approach to graph traversal would enhance the awesomeness of Levelgraph as a performant backend for Node. My assessment of graphDB alternatives for Node is that there are few high speed options (i.e., those not accessing the graphdb via REST). Levelgraph could fill this void.
LevelGraph should run as is in the Browser, on top of level-js.
Some browserify-fu is needed.
I'm quite new to semantic technologies. I want to test the database with a dataset from the Finnish tax authorities; it is about taxes paid by companies. How should I go about transforming the data into triples? Is there some CSV2RDF tool I could use so that levelgraph-n3 would understand it? I just found Rasqal but didn't test it.
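In the meantime, a minimal hand-rolled conversion is possible: treat each CSV row as a subject and each column as a predicate. This is a hypothetical sketch (the column names and the rowToTriples helper are made up for illustration), not a substitute for a real CSV2RDF tool:

```javascript
// Sketch: turn one CSV row (already parsed into an object) into levelgraph triples.
// The row identifier and column names here are hypothetical examples.
function rowToTriples(id, row) {
  return Object.keys(row).map(function (column) {
    return { subject: id, predicate: column, object: row[column] };
  });
}

var triples = rowToTriples('company:123', { name: 'Acme Oy', taxPaid: '1000' });
// triples can then be passed to db.put(triples, callback)
```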
Is it possible to do traversals with levelgraph?
for example, follow the "friend" link up to 3 hops from X...
I've been drawing up some ideas here...
https://gist.github.com/dominictarr/6043557
But I need to think more on this.
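Until a dedicated traversal API exists, a fixed-depth hop can be expressed with the existing search conditions. A sketch, where the variable factory (db.v) is passed in so the helper stays database-agnostic; the hopConditions name is made up:

```javascript
// Build search conditions that follow `predicate` for `hops` steps from `start`.
// `v` is expected to be levelgraph's db.v variable factory, passed in here.
function hopConditions(v, start, predicate, hops) {
  var conditions = [];
  var previous = start;
  for (var i = 1; i <= hops; i++) {
    var variable = v('hop' + i);
    conditions.push({ subject: previous, predicate: predicate, object: variable });
    previous = variable;
  }
  return conditions;
}

// Usage against a real db (not run here):
// db.search(hopConditions(db.v, 'X', 'friend', 3), console.log);
```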
Not sure what to do; the hook is set up as instructed, and with testling -u it works smoothly in Chrome.
here is some output from my chrome dev tools resources indexedDB menu
0
"spo::6654115829151124::api_key::aaaa"
"{"subject":"6654115829151124","predicate":"api_key","object":"aaaa"}"
1
"spo::6654115829151124::connected::"
"{"subject":"6654115829151124","predicate":"connected","object":false}"
2
"spo::6654115829151124::connected::false"
"{"subject":"6654115829151124","predicate":"connected","object":"false"}"
3
"spo::6654115829151124::connected::true"
"{"subject":"6654115829151124","predicate":"connected","object":true}"
38
"spo::9274266117718071::name::k"
"{"subject":"9274266117718071","predicate":"name","object":"k"}"
39
"spo::9274266117718071::name::ko"
"{"subject":"9274266117718071","predicate":"name","object":"ko"}"
40
"spo::9274266117718071::name::kok"
"{"subject":"9274266117718071","predicate":"name","object":"kok"}"
41
"spo::9274266117718071::name::koko"
"{"subject":"9274266117718071","predicate":"name","object":"koko"}"
42
"spo::9274266117718071::name::kokok"
"{"subject":"9274266117718071","predicate":"name","object":"kokok"}"
43
"spo::9274266117718071::type::project"
"{"subject":"9274266117718071","predicate":"type","object":"project"}"
Sorry if the above is a bit ugly, but I would like to draw attention to the top and bottom lines.
It looks like when I overwrite the triple {"subject":"9274266117718071","predicate":"name","object":"kokok"}, the old object values such as "koko" remain.
Also, there is a quirk near the top where the value false is not represented in the index key "spo::6654115829151124::connected::".
This quirk is actually preventing me from looking up true/false triples properly (wrong values are returned when I update a value from false to true).
To test this library in order to use it for a bigger project, I've tried to ingest DBPedia. It crashes however at ~1% of the triples saying that the process is out of memory. It only happens when I try to write to a leveldb. If I comment the write, the memory usage stays stable.
var db = levelgraph(levelup(dbname));
var dbputstream = db.putStream();
var filename = "pathtodbpedia.nt";

fs.createReadStream(filename).on("data", function (data) {
  N3Parser().parse(data, function (error, triple) {
    dbputstream.write(triple);
  });
});
Any ideas on how to limit memory usage?
Hello, it's me.
I'm trying to use levelgraph with level-sublevel, in the way described in the readme. Inserting data with .put() and reading with .get() works fine, but using .search() always results in a crash:
/tmp/levelgraph/node_modules/level-sublevel/sub.js:239
var r = root(db)
^
ReferenceError: db is not defined
at SubDB.SDB.approximateSize (/tmp/levelgraph/node_modules/level-sublevel/sub.js:239:16)
at async.each.result.forEach.q.stream (/tmp/levelgraph/node_modules/levelgraph/lib/queryplanner.js:27:10)
at /tmp/levelgraph/node_modules/levelgraph/node_modules/async/lib/async.js:111:13
at Array.forEach (native)
at _each (/tmp/levelgraph/node_modules/levelgraph/node_modules/async/lib/async.js:32:24)
at Object.async.each (/tmp/levelgraph/node_modules/levelgraph/node_modules/async/lib/async.js:110:9)
at planner (/tmp/levelgraph/node_modules/levelgraph/lib/queryplanner.js:23:11)
at Object.searchStream (/tmp/levelgraph/node_modules/levelgraph/lib/levelgraph.js:136:5)
at Object.search (/tmp/levelgraph/node_modules/levelgraph/lib/utilities.js:21:27)
at /tmp/levelgraph/app.js:16:9
The following little demo app will produce the crash.
app.js:
var levelup = require("level");
var sublevel = require("level-sublevel");
var levelWriteStream = require("level-writestream");
var levelgraph = require("levelgraph");
var db = sublevel(levelWriteStream(levelup("likes.db")));
var notgraph = db.sublevel('notgraph');
var graph = levelgraph(db.sublevel('graph'));
var graphdata = [
{ subject: 'fred', predicate: 'likes', object: 'emma' },
{ subject: 'betty', predicate: 'likes', object: 'fred' },
{ subject: 'emma', predicate: 'likes', object: 'betty' }
];
graph.put(graphdata, function () {
graph.search([
{ subject: graph.v('who'), predicate: 'likes', object: 'betty' }
], function (err, results) {
console.log(results);
});
});
package.json:
{
"name": "levelgraph-sublevel-test",
"version": "0.0.0",
"main": "app.js",
"dependencies": {
"level": "^0.18.0",
"level-sublevel": "^5.2.0",
"level-writestream": "^0.1.1",
"levelgraph": "^0.8.2"
}
}
just a heads up that i had a lot of problems with:
npm install levelgraph
on windows 7 service pack 1 (x64) with Visual Studio 2013 Express installed.
in the end i installed Windows SDK 7.1 and from its command prompt was able to get it working without a node-gyp rebuild error.
i got the version i needed from:
https://www.microsoft.com/en-us/download/details.aspx?displayLang=en&id=8279
i also had to uninstall all Microsoft C++ 2010 Redistributable Packages prior to installing Windows SDK as install would otherwise fail.
the SDK command prompt sets all environment variables to use what seems like 2010 version compilers.
hope that helps someone.
Is it possible to get the properties from a triple with a search?
I know you can do this with a 'get', since it returns the whole triple, but in some cases, it would be useful to return the properties when performing a search.
Inside a join condition we might want to "negate" it, meaning we want the triples that do not satisfy that condition. Example:
var stream = db.joinStream([{
subject: "matteocollina",
predicate: 'maintains',
object: db.v('module'),
},
{
subject: db.v('module'),
predicate: 'depends',
object: "levelgraph",
filter: { type: "negation" }
}]);
Hi,
I was scanning the docs, and I would find it helpful if there was a note how the concepts in Levelgraph differs from the one in Neo4J and its query language Cypher. Many people playing with graphs may come out of that area.
There are some discussions on how Gremlin differs here: #9.
I would gladly issue a PR if you point out some thoughts/links that show what LevelGraph does and how it differs from Neo4J (esp. from Cypher).
Thanks!
As @dominictarr suggested in #3:
Agree, we need more generic join modules. Also, we can use db.getApproximateSize(range) to estimate which side of the join is smaller, and thus which should be in memory.
We need to support a getStream() that only fetches and processes the keys.
Instead of decoding the JSON value, we can use a regexp for parsing the key. Then we need to leverage it inside search/join, as body decoding is never needed there.
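A sketch of what key-only parsing could look like, assuming the "index::subject::predicate::object" key layout visible in the IndexedDB dumps elsewhere in these issues (the parseKey helper is hypothetical, and this naive split breaks if a value itself contains '::'):

```javascript
// Parse a levelgraph index key back into a triple without touching the JSON value.
function parseKey(key) {
  var parts = key.split('::'); // [index, subject, predicate, object]
  return { subject: parts[1], predicate: parts[2], object: parts[3] };
}

var triple = parseKey('spo::9274266117718071::name::koko');
// → { subject: '9274266117718071', predicate: 'name', object: 'koko' }
```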
Hi,
can this be used with applications on a service like Heroku? I am wondering, since the dyno mechanics might delete files in the file system while the files are needed for the database.
It would be great to hear your thoughts on this.
Thanks,
Patrick
Example: when you have 1000 results total and you request a limit of 100 and an offset of 800, 0 results are returned. If you instead request a limit of 900 and an offset of 800, 100 results are returned starting from offset 800. The expected behaviour, however, would be that limit 100 and offset 800 returns exactly this.
In your code, a quick fix might be to do this: limit = limit + offset;. Even nicer would be not to call that "limit" at all ;)
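The described behaviour and the quick fix can be sketched with pure functions standing in for the range read (the names are illustrative only):

```javascript
// Buggy behaviour described above: limit is applied before offset,
// so offset 800 against the 100 fetched results yields nothing.
function buggyPage(results, limit, offset) {
  return results.slice(0, limit).slice(offset);
}

// Quick fix: fetch limit + offset rows, then skip the offset.
function fixedPage(results, limit, offset) {
  return results.slice(0, limit + offset).slice(offset);
}

var all = new Array(1000).fill(0).map(function (_, i) { return i; });
console.log(buggyPage(all, 100, 800).length); // 0
console.log(fixedPage(all, 100, 800).length); // 100
```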
I think Navigator belongs to its own repo/npm instead of this one, because I would like to keep this small and tight.
What do you think?
Implement pagination by implementing the option startTriple.
This way, the next page after a {limit: 100} query can be requested as follows:
{limit: 100, startTriple: {subject: "...", predicate: "...", object: "..."}}
It should also be possible to give only the beginning of the triple, not the entire triple.
I regret having called those in that way, and I prefer the LevelUp terminology.
Does anybody have an opinion on this?
Use-case:
user.batch(userOps);
log.batch(logOps);
graph.put(graphTriples); // there's no batch right now, right?
These are all sublevels – I want to roll them up into one batch operation so they can fail atomically:
db.batch(allOps);
I want to pass my triples and get back an array of ops.
var allOps = userOps.concat(logOps, graph.getBatch(graphTriples));
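A sketch of what such a getBatch() could return, assuming the proposed name, only the spo index, and the '::' key separator seen elsewhere in these issues (the real store maintains several index permutations, so this is illustrative only):

```javascript
// Hypothetical getBatch: map triples to raw LevelUp batch operations
// for a single index, so they can be merged with other sublevels' ops.
function getBatch(triples) {
  return triples.map(function (t) {
    return {
      type: 'put',
      key: ['spo', t.subject, t.predicate, t.object].join('::'),
      value: JSON.stringify(t)
    };
  });
}

var ops = getBatch([{ subject: 'fred', predicate: 'likes', object: 'emma' }]);
// ops[0].key === 'spo::fred::likes::emma'
```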
I can think of a few use cases where I would use named graphs; possibly it could work similar to level-sublevel?
I would like it mostly for provenance, so that when I aggregate data from various sources I can keep each source in its own named graph!
For example, if someone publishes a list of their friends using the foaf:knows predicate, they could also add a triple stating that https://twitter.com/pontifex knows them. I might like to check such claims against, for example, http://vatican.va, and would prefer not to put everything into the same graph...
Hi,
When I do a search for triples, I get back all triples that start with the supplied criteria. I would expect to see only triples that exactly match the criteria.
Illustrated by this repl session:
$ node
> var db = require('levelgraph')('test');
> db.put({subject:'a', predicate:'b', object:'c'});
> db.put({subject:'a1', predicate:'b1', object:'c1'});
> db.put({subject:'a2', predicate:'b2', object:'c2'});
> db.get({subject:'a'}, console.log);
[ { subject: 'a1', predicate: 'b1', object: 'c1' },
{ subject: 'a2', predicate: 'b2', object: 'c2' },
{ subject: 'a', predicate: 'b', object: 'c' } ]
However, I would expect (like!) this behaviour:
> db.get({subject:'a'}, console.log);
[ { subject: 'a', predicate: 'b', object: 'c' } ]
Is there a way to configure this?
Thanks,
Richard.
I encounter a bug(?) when doing a get request with an id, expecting one result but getting more than one (it is not doing an exact match).
var db = require('levelgraph')('test')
db.put({ subject: 'asset_1', predicate: 'is', object: 'Asset' }, console.log);
db.put({ subject: 'asset_11', predicate: 'is', object: 'Asset' }, console.log);
db.put({ subject: 'asset_12', predicate: 'is', object: 'Asset' }, console.log);
db.get({subject: 'asset_1', predicate: 'is', object: 'Asset'}, console.log);
actually returns :
null [ { subject: 'asset_1', predicate: 'is', object: 'Asset' },
{ subject: 'asset_11', predicate: 'is', object: 'Asset' },
{ subject: 'asset_12', predicate: 'is', object: 'Asset' } ]
It should return:
null [ { subject: 'asset_1', predicate: 'is', object: 'Asset' } ]
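Until exact matching is the default, a client-side filter works as a workaround. A sketch (the exactMatches helper is made up for illustration):

```javascript
// Keep only triples whose fields strictly equal the pattern's fields.
function exactMatches(triples, pattern) {
  return triples.filter(function (t) {
    return Object.keys(pattern).every(function (k) {
      return t[k] === pattern[k];
    });
  });
}

var results = [
  { subject: 'asset_1', predicate: 'is', object: 'Asset' },
  { subject: 'asset_11', predicate: 'is', object: 'Asset' },
  { subject: 'asset_12', predicate: 'is', object: 'Asset' }
];
console.log(exactMatches(results, { subject: 'asset_1' }));
// → [ { subject: 'asset_1', predicate: 'is', object: 'Asset' } ]
```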
The new version is broken because index.js is missing :(
From a fresh npm install:
$ ls -alh node_modules/levelgraph{,/lib,/examples,/benchmarks,/test} | awk '{print $9}' | grep -e '^\.' | grep -ve '^\.\+$'
..jshintrc.un~
..travis.yml.un~
.CONTRIBUTING.md.un~
.HISTORY.md.un~
.Makefile.un~
.README.md.un~
.Vagrantfile.un~
.browser.js.un~
.index.js.un~
.jshintrc
.npmignore
.package.json.un~
.travis.yml
.contrainedJoin.js.un~
.joinStream.js.un~
.readStream.js.un~
.readStreamLeveldb.js.un~
.reads.js.un~
.writeStream.js.un~
.writes.js.un~
.writesLeveldb.js.un~
.writesStream.js.un~
.foaf-web.html.un~
.foaf.js.un~
.foafNavigator.js.un~
.callbackstream.js.un~
.contextbinderstream.js.un~
.editstream.js.un~
.getdb-browser.js.un~
.getdb.js.un~
.joinstream.js.un~
.keyfilterstream.js.un~
.levelgraph.js.un~
.livejoinstream.js.un~
.materializerstream.js.un~
.navigator.js.un~
.queryplanner.js.un~
.sortjoinstream.js.un~
.utilities.js.un~
.variable.js.un~
.writestream.js.un~
.abstract_join_algorithm.js.un~
.abstract_join_algorithm_spec.js.un~
.basic_graph_spec.js.un~
.basic_join_spec.js.un~
.common.js.un~
.creation_api_spec.js.un~
.foaf.js.un~
.join_spec.js.un~
.live_join_spec.js.un~
.navigator_spec.js.un~
.properties_spec.js.un~
.queryplanner_spec.js.un~
.sort_join_spec.js.un~
.triple_store_spec.js.un~
.variable_spec.js.un~
You should be able to add an .npmignore that contains *.un~ and republish to get rid of the unwanted artifacts.
basically i have an app that is putting/getting/navigating the DB in chrome, and when i use nav() (may be the same for join, haven't tried) i get some exceptions thrown. the exception is from line 10055:
preventSuccessCallback || this.onStoreReady();
it is only happening for nav(), not for get() or put().
i am wondering if there is some simple check i could make to see whether i am able to invoke nav()?
not ok 16 navigator should follow multiple archs, in and out a path
Error: Uncaught TypeError: Cannot call method 'end' of null (http://localhost:58589/__testling?show=true:26647)
at window.onerror (http://localhost:58589/__testling/node_modules/mocha/mocha.js:5297:10)
at IDBTransaction.cursorTransaction.oncomplete (http://localhost:58589/__testling?show=true:17794:11)
This is related to the recent @substack patch of ReadStream, at this specific line:
https://github.com/rvagg/node-levelup/blob/master/lib/read-stream.js#L96-L99
where _iterator is null.
I think it happens because I close the database before the actual iterator is closed.
I'm also pulling in @maxogden, as this is a Level.js issue.
As @dominictarr suggested in #4 :
a join is also a relationship on its own; maybe it could return results in the {subject, predicate, object} triple form as well?
what about:
{ subject: 'daniele', object: 'marco', predicate: 'friend-of-friend' },
{ subject: 'daniele', object: 'matteo', predicate: 'friend-of-friend' },
{ subject: 'lucio', object: 'marco', predicate: 'friend-of-friend' },
{ subject: 'lucio', object: 'matteo', predicate: 'friend-of-friend' }
then, potentially, you could materialize these, or use them as parts of other joins?
An option might be something like:
db.materialize(conditions,
{ subject: db.var("s"), predicate: db.var("p"), object: db.var("o") } , ....
here is some output from my chrome dev tools resources indexedDB menu
0
"spo::6654115829151124::api_key::aaaa"
"{"subject":"6654115829151124","predicate":"api_key","object":"aaaa"}"
1
"spo::6654115829151124::connected::"
"{"subject":"6654115829151124","predicate":"connected","object":false}"
2
"spo::6654115829151124::connected::false"
"{"subject":"6654115829151124","predicate":"connected","object":"false"}"
3
"spo::6654115829151124::connected::true"
"{"subject":"6654115829151124","predicate":"connected","object":true}"
There is a quirk near the top where the value false is not represented in the index key "spo::6654115829151124::connected::".
In order to enhance performance for certain types of relationships, it should be possible to add a value to the relationship that sorts it. Adding range capabilities and limits to the querying will then keep operations from loading more than is needed.
This takes advantage of leveldb's key sorting. A great example of where this feature would be used is in an activity stream–really any large list of sorted data. You only want the newest 20 items, or the next newest 20 items (for paging). You shouldn't have to load every event node for every friend a user is connected to. Think every tweet node belonging to every user node that you are following. You have to store the created_date in the relationship anyway so you don't have to load every tweet for the lookup. We can just include it in the keys as well.
I will try and make time for this soon-ish, but I can't this month. But I wanted to get it written down. Anyone is welcome to take it. I'll comment here again when I start on it.
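The key layout this implies can be sketched as follows, relying on leveldb's lexicographic key sorting; the key format and the eventKey helper are hypothetical:

```javascript
// Zero-pad the timestamp so lexicographic key order matches numeric order.
function eventKey(user, timestampMs) {
  var padded = String(timestampMs).padStart(13, '0');
  return 'events::' + user + '::' + padded;
}

// Newer events sort after older ones, so a reverse range read
// (e.g. { reverse: true, limit: 20 } in LevelUp) yields the newest 20
// without loading the whole list.
console.log(eventKey('alice', 5) < eventKey('alice', 10)); // true
```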
Hi,
as follow up to discussions in
I found this graph drawing library - http://sigmajs.org/
Maybe we can use it for visualizing triples.
If I find time, I will try.
Hi,
is it possible to get a list of the unique nodes in the db?
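There is no built-in call for this as far as I know, but it can be derived from a full triple scan. A sketch over an in-memory array (with a real db you would consume the triple stream the same way; the uniqueNodes helper is made up):

```javascript
// Collect every distinct subject and object across a set of triples.
function uniqueNodes(triples) {
  var seen = new Set();
  triples.forEach(function (t) {
    seen.add(t.subject);
    seen.add(t.object);
  });
  return Array.from(seen);
}

var nodes = uniqueNodes([
  { subject: 'fred', predicate: 'likes', object: 'emma' },
  { subject: 'emma', predicate: 'likes', object: 'fred' }
]);
console.log(nodes.sort()); // [ 'emma', 'fred' ]
```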
As @dominictarr reported, it misleads people who want to dive into the code.
The error message is:
FATAL ERROR: CALL_AND_RETRY_2 Allocation failed - process out of memory
The data comes from http://downloads.dbpedia.org/3.9/en/geo_coordinates_en.nt.bz2 and the file size is around 274M.
var fs = require('fs');
var lg = require('levelgraph');
var lgN3 = require('levelgraph-n3');
var db = lgN3(lg('mydb'));
var stream = fs.createReadStream('./geo_coordinates_en.nt').pipe(db.n3.putStream());
stream.on('finish', function() {
return console.log('Import completed');
});
my dependencies:
{
"dependencies": {
"levelgraph": "^0.8.2",
"levelgraph-n3": "^0.3.3",
"levelup": "^0.18.6"
}
}
The new LevelUp 0.9 release is huge: http://r.va.gg/2013/05/levelup-v0.9-released.html and http://r.va.gg/2013/05/levelup-v0.9-some-major-changes.html.
$ npm install levelup leveldown level-sublevel levelgraph
then running code
var LevelUp = require('levelup'),
Sublevel = require('level-sublevel'),
LevelGraph = require('levelgraph');
var db = Sublevel(LevelUp('dev.ldb')),
graphdb = LevelGraph(db.sublevel('graph'));
results in
/code/play/levelgraph-sublevel/node_modules/levelgraph/lib/levelgraph.js:75
, close: leveldb.close.bind(leveldb)
^
TypeError: Cannot call method 'bind' of undefined
at levelgraph (/code/play/levelgraph-sublevel/node_modules/levelgraph/lib/levelgraph.js:75:28)
at Object.<anonymous> (/code/play/levelgraph-sublevel/index.js:6:15)
at Module._compile (module.js:456:26)
at Object.Module._extensions..js (module.js:474:10)
at Module.load (module.js:356:32)
at Function.Module._load (module.js:312:12)
at Function.Module.runMain (module.js:497:10)
at startup (node.js:119:16)
at node.js:901:3