levelgraph / levelgraph
Graph database JS style for Node.js and the Browser. Built upon LevelUp and LevelDB.
License: MIT License
Hi, this is just a question. I am trying out LevelGraph and wondered: is it possible to add multiple properties to the vertices (which I believe are subjects and objects) and multiple properties to the edges (predicates)? Thank you very much for bringing graph databases to Node.js! All the best.
Testling reports that it does not work in Firefox :(
Hello!
It would be cool if the .get and .search functions (and the streams) supported the start, end, and reverse options that LevelUp provides, analogous to the limit and offset options already implemented in #49.
A possible use case would be getting the latest triple when the subject has a value that makes sense for sorting (e.g. contains a timestamp). To get the latest triple I would supply an options object like this:
{
limit: 1,
reverse: true
}
My current workaround is to keep a second database with the subject name as the key and the value in order to find the latest subject name.
Would it be possible to use level-live-stream on the joins so that you can get a changes feed of the relations as they are added?
Also, is it possible to do open-ended joins, perhaps:
db.join([{
subject: db.v('author1'),
predicate: 'maintains',
object: db.v('module'),
},
{
subject: db.v('module'),
predicate: 'depends',
object: db.v('module2')
},
{
subject: db.v('author2'),
predicate: 'maintains',
object: db.v('module2')
}
], function (err, join) {
console.log(join)
})
which might return:
{author1: 'mcollina', module: 'levelgraph', author2: 'rvagg', module2: 'levelup'}
Is that correct?
The Navigator API should allow predicates to be variables, too.
There might be cases where this is handy, e.g. to get all the 'neighbors' of a vertex (noted by @Marketcentric in #9).
We need a way to stream results back into the triple store, for #5.
Happy to see the move from " to ' in 19f89e4.
I wonder if you could also consider moving away from putting commas first?
https://github.com/rwaldron/idiomatic.js#comma-first
If you look at the README you can see that commas sometimes occur at the end of the line and sometimes go first. I myself find commas-first a bit confusing, especially in JSON objects.
Of course, if you feel strongly about putting commas first, please just close this issue! Maybe after updating the README to use it consistently 😉
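For concreteness, the two styles under discussion look like this (a neutral illustration, not code copied from the repo):

```javascript
// comma-first, as currently used in parts of the codebase
var commaFirst =
  { limit: 1
  , reverse: true
  };

// comma-last, as recommended by idiomatic.js
var commaLast = {
  limit: 1,
  reverse: true
};

console.log(commaFirst.limit === commaLast.limit); // true
```

Both parse identically; the difference is purely visual diff noise versus familiarity.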
This will allow us to ditch the readable-stream dependency entirely, and possibly reduce the library size.
According to http://caniuse.com/indexeddb, LevelGraph "should" work on IE 10.
However, it gives us a laconic "syntax error".
The following combination produces some errors on search when levelgraph is put on a sublevel:
{ "dependencies": {
"level-sublevel": "^6.3.8",
"leveldown": "^1.0.0",
"levelgraph": "^0.8.2",
"levelup": "^0.19.0" } }
In particular, approximateSize doesn't end up on db for queryplanner here:
https://github.com/mcollina/levelgraph/blob/37e2f8d0/lib/queryplanner.js#L27
This band-aid seems to work:
var level = require('level')(process.env.DATABASE);
var sub = require('level-sublevel')(level);
var objects = sub.sublevel('objects');
var graphLevel = sub.sublevel('graph');
graphLevel.db = level.db;
graphLevel.approximateSize = level.db.approximateSize.bind(level.db);
var graph = require('levelgraph')(graphLevel);
Is there a roadmap for rolling in compat with the new level-sublevel? Need a hand?
I think you could use https://github.com/maxogden/node-concat-stream instead of the callback stream lib; that would make your codebase smaller.
Wow. Nice work so far. It would be great to add another level of abstraction such as that provided by Gremlin (https://github.com/tinkerpop/gremlin/wiki) and the very nice node implementation of gremlin (https://github.com/entrendipity/gremlin-node) either via the BluePrints api or a direct implementation.
If you have not been watching Gremlin's recent activity, it is emerging as the most popular method of querying/traversing/pattern matching most of the leading graph databases - for good reason. Marko Rodriguez, Gremlin's author, is very approachable and helpful (http://thinkaurelius.com/team/).
Perhaps borrowing some of the pieces from gremlin-node would make this effort easier (although they interface directly to Java(!) libraries from node). Maybe they would be interested in a collaboration.
Adopting the Gremlin approach to graph traversal would enhance the awesomeness of Levelgraph as a performant backend for Node. My assessment of graphDB alternatives for Node is that there are few high speed options (i.e., those not accessing the graphdb via REST). Levelgraph could fill this void.
LevelGraph should run as is in the Browser, on top of level-js.
Some browserify-fu is needed.
I'm quite new to semantic technologies. I want to test the database with a dataset from the Finnish tax authorities; it is about taxes paid by companies. How should I go about transforming the data into triples? Is there some CSV2RDF tool I could use so that levelgraph-n3 would understand it? I just found Rasqal but didn't test it.
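In the meantime, a minimal hand-rolled conversion is possible: treat each CSV row as a subject and each column as a predicate. This is a hypothetical sketch (the column names and the rowToTriples helper are made up for illustration), not a substitute for a real CSV2RDF tool:

```javascript
// Sketch: turn one CSV row (already parsed into an object) into levelgraph triples.
// The row identifier and column names here are hypothetical examples.
function rowToTriples(id, row) {
  return Object.keys(row).map(function (column) {
    return { subject: id, predicate: column, object: row[column] };
  });
}

var triples = rowToTriples('company:123', { name: 'Acme Oy', taxPaid: '1000' });
// triples can then be passed to db.put(triples, callback)
```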
Is it possible to do traversals with levelgraph?
for example, follow the "friend" link up to 3 hops from X...
I've been drawing up some ideas here...
https://gist.github.com/dominictarr/6043557
But I need to think more on this.
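Until a dedicated traversal API exists, a fixed-depth hop can be expressed with the existing search conditions. A sketch, where the variable factory (db.v) is passed in so the helper stays database-agnostic; the hopConditions name is made up:

```javascript
// Build search conditions that follow `predicate` for `hops` steps from `start`.
// `v` is expected to be levelgraph's db.v variable factory, passed in here.
function hopConditions(v, start, predicate, hops) {
  var conditions = [];
  var previous = start;
  for (var i = 1; i <= hops; i++) {
    var variable = v('hop' + i);
    conditions.push({ subject: previous, predicate: predicate, object: variable });
    previous = variable;
  }
  return conditions;
}

// Usage against a real db (not run here):
// db.search(hopConditions(db.v, 'X', 'friend', 3), console.log);
```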
Not sure what to do; the hook is set up as instructed, and with testling -u it works smoothly in Chrome.
here is some output from my chrome dev tools resources indexedDB menu
0
"spo::6654115829151124::api_key::aaaa"
"{"subject":"6654115829151124","predicate":"api_key","object":"aaaa"}"
1
"spo::6654115829151124::connected::"
"{"subject":"6654115829151124","predicate":"connected","object":false}"
2
"spo::6654115829151124::connected::false"
"{"subject":"6654115829151124","predicate":"connected","object":"false"}"
3
"spo::6654115829151124::connected::true"
"{"subject":"6654115829151124","predicate":"connected","object":true}"
38
"spo::9274266117718071::name::k"
"{"subject":"9274266117718071","predicate":"name","object":"k"}"
39
"spo::9274266117718071::name::ko"
"{"subject":"9274266117718071","predicate":"name","object":"ko"}"
40
"spo::9274266117718071::name::kok"
"{"subject":"9274266117718071","predicate":"name","object":"kok"}"
41
"spo::9274266117718071::name::koko"
"{"subject":"9274266117718071","predicate":"name","object":"koko"}"
42
"spo::9274266117718071::name::kokok"
"{"subject":"9274266117718071","predicate":"name","object":"kokok"}"
43
"spo::9274266117718071::type::project"
"{"subject":"9274266117718071","predicate":"type","object":"project"}"
Sorry if the above is a bit ugly, but I would like to draw attention to the top and bottom lines.
It looks like when I overwrite the triple {"subject":"9274266117718071","predicate":"name","object":"kokok"}, the old object values such as "koko" remain.
Also, there is a quirk near the top where the value false is not represented in the index key "spo::6654115829151124::connected::".
This quirk is actually preventing me from looking up true/false triples properly (wrong values are returned when I update a value from false to true).
To test this library in order to use it for a bigger project, I've tried to ingest DBPedia. It crashes however at ~1% of the triples saying that the process is out of memory. It only happens when I try to write to a leveldb. If I comment the write, the memory usage stays stable.
var db = levelgraph(levelup(dbname));
var dbputstream = db.putStream();
var filename = "pathtodbpedia.nt";

fs.createReadStream(filename).on("data", function (data) {
  N3Parser().parse(data, function (error, triple) {
    dbputstream.write(triple);
  });
});
Any ideas on how to limit memory usage?
Hello, it's me.
I'm trying to use levelgraph with level-sublevel, in the way described in the readme. Inserting data with .put() and reading with .get() works fine, but using .search() always results in a crash:
/tmp/levelgraph/node_modules/level-sublevel/sub.js:239
var r = root(db)
^
ReferenceError: db is not defined
at SubDB.SDB.approximateSize (/tmp/levelgraph/node_modules/level-sublevel/sub.js:239:16)
at async.each.result.forEach.q.stream (/tmp/levelgraph/node_modules/levelgraph/lib/queryplanner.js:27:10)
at /tmp/levelgraph/node_modules/levelgraph/node_modules/async/lib/async.js:111:13
at Array.forEach (native)
at _each (/tmp/levelgraph/node_modules/levelgraph/node_modules/async/lib/async.js:32:24)
at Object.async.each (/tmp/levelgraph/node_modules/levelgraph/node_modules/async/lib/async.js:110:9)
at planner (/tmp/levelgraph/node_modules/levelgraph/lib/queryplanner.js:23:11)
at Object.searchStream (/tmp/levelgraph/node_modules/levelgraph/lib/levelgraph.js:136:5)
at Object.search (/tmp/levelgraph/node_modules/levelgraph/lib/utilities.js:21:27)
at /tmp/levelgraph/app.js:16:9
The following little demo app will produce the crash.
app.js:
var levelup = require("level");
var sublevel = require("level-sublevel");
var levelWriteStream = require("level-writestream");
var levelgraph = require("levelgraph");
var db = sublevel(levelWriteStream(levelup("likes.db")));
var notgraph = db.sublevel('notgraph');
var graph = levelgraph(db.sublevel('graph'));
var graphdata = [
{ subject: 'fred', predicate: 'likes', object: 'emma' },
{ subject: 'betty', predicate: 'likes', object: 'fred' },
{ subject: 'emma', predicate: 'likes', object: 'betty' }
];
graph.put(graphdata, function () {
graph.search([
{ subject: graph.v('who'), predicate: 'likes', object: 'betty' }
], function (err, results) {
console.log(results);
});
});
package.json:
{
"name": "levelgraph-sublevel-test",
"version": "0.0.0",
"main": "app.js",
"dependencies": {
"level": "^0.18.0",
"level-sublevel": "^5.2.0",
"level-writestream": "^0.1.1",
"levelgraph": "^0.8.2"
}
}
just a heads up that i had a lot of problems with:
npm install levelgraph
on windows 7 service pack 1 (x64) with Visual Studio 2013 Express installed.
in the end i installed Windows SDK 7.1 and from its command prompt was able to get it working without a node-gyp rebuild error.
i got the version i needed from:
https://www.microsoft.com/en-us/download/details.aspx?displayLang=en&id=8279
i also had to uninstall all Microsoft C++ 2010 Redistributable Packages prior to installing Windows SDK as install would otherwise fail.
the SDK command prompt sets all environment variables to use what seems like 2010 version compilers.
hope that helps someone.
Is it possible to get the properties from a triple with a search?
I know you can do this with a 'get', since it returns the whole triple, but in some cases, it would be useful to return the properties when performing a search.
Inside a join condition we might want to "negate" it, meaning we want the triples that do not satisfy that condition. Example:
var stream = db.joinStream([{
subject: "matteocollina",
predicate: 'maintains',
object: db.v('module'),
},
{
subject: db.v('module'),
predicate: 'depends',
object: "levelgraph",
filter: { type: "negation" }
}]);
Hi,
I was scanning the docs, and I would find it helpful if there was a note how the concepts in Levelgraph differs from the one in Neo4J and its query language Cypher. Many people playing with graphs may come out of that area.
There are some discussions on how Gremlin differs here: #9.
I would gladly issue a PR if you point out some thoughts/links that show what LevelGraph does and how it differs from Neo4J (esp. from Cypher).
Thanks!
As @dominictarr suggested in #3:
Agree, we need more generic join modules. Also, we can use db.getApproximateSize(range) to estimate which side of the join is smaller, and thus which should be in memory.
We need to support a getStream() that only fetches and processes the keys.
Instead of decoding the JSON value, we can use a regexp for parsing the key. Then we need to leverage it inside search/join, as body decoding is never needed there.
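A sketch of what key-only parsing could look like, assuming the "index::subject::predicate::object" key layout visible in the IndexedDB dumps elsewhere in these issues (the parseKey helper is hypothetical, and this naive split breaks if a value itself contains '::'):

```javascript
// Parse a levelgraph index key back into a triple without touching the JSON value.
function parseKey(key) {
  var parts = key.split('::'); // [index, subject, predicate, object]
  return { subject: parts[1], predicate: parts[2], object: parts[3] };
}

var triple = parseKey('spo::9274266117718071::name::koko');
// → { subject: '9274266117718071', predicate: 'name', object: 'koko' }
```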
Hi,
can this be used with applications on a service like Heroku? I am wondering, since the dyno mechanics might delete files in the file system while the files are needed for the database.
It would be great to hear your thoughts on this.
Thanks,
Patrick
Example: when you have 1000 results total and you request a limit of 100 and an offset of 800, 0 results are returned. If you instead request a limit of 900 and an offset of 800, 100 results are returned starting from offset 800. The expected behaviour, however, would be that limit 100 and offset 800 returns exactly this.
In your code, a quick fix might be to do this: limit = limit + offset;. Even nicer would be not to call that "limit" at all ;)
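The described behaviour and the quick fix can be sketched with pure functions standing in for the range read (the names are illustrative only):

```javascript
// Buggy behaviour described above: limit is applied before offset,
// so offset 800 against the 100 fetched results yields nothing.
function buggyPage(results, limit, offset) {
  return results.slice(0, limit).slice(offset);
}

// Quick fix: fetch limit + offset rows, then skip the offset.
function fixedPage(results, limit, offset) {
  return results.slice(0, limit + offset).slice(offset);
}

var all = new Array(1000).fill(0).map(function (_, i) { return i; });
console.log(buggyPage(all, 100, 800).length); // 0
console.log(fixedPage(all, 100, 800).length); // 100
```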
I think Navigator belongs to its own repo/npm instead of this one, because I would like to keep this small and tight.
What do you think?
Implement pagination by implementing the option startTriple.
This way, the next page after a {limit: 100} query can be requested as follows:
{limit: 100, startTriple: {subject: "...", predicate: "...", object: "..."}}
It should also be possible to give only the beginning of the triple, not the entire triple.
I regret having called those in that way, and I prefer the LevelUp terminology.
Does anybody have an opinion on this?
Use-case:
user.batch(userOps);
log.batch(logOps);
graph.put(graphTriples); // there's no batch right now, right?
These are all sublevels – I want to roll them up into one batch operation so they can fail atomically:
db.batch(allOps);
I want to pass my triples and get back an array of ops.
var allOps = userOps.concat(logOps, graph.getBatch(graphTriples));
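A sketch of what such a getBatch() could return, assuming the proposed name, only the spo index, and the '::' key separator seen elsewhere in these issues (the real store maintains several index permutations, so this is illustrative only):

```javascript
// Hypothetical getBatch: map triples to raw LevelUp batch operations
// for a single index, so they can be merged with other sublevels' ops.
function getBatch(triples) {
  return triples.map(function (t) {
    return {
      type: 'put',
      key: ['spo', t.subject, t.predicate, t.object].join('::'),
      value: JSON.stringify(t)
    };
  });
}

var ops = getBatch([{ subject: 'fred', predicate: 'likes', object: 'emma' }]);
// ops[0].key === 'spo::fred::likes::emma'
```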
I can think of a few use cases where I would use named graphs; possibly it could work similar to level-sublevel?
I would like it mostly for provenance, so that when I aggregate data from various sources I can keep each source in its own named graph!
For example, if someone publishes a list of their friends using the foaf:knows predicate, they could also add a triple stating that https://twitter.com/pontifex knows them. I might like to check such claims against, for example, http://vatican.va, and would prefer not to put everything into the same graph...
Hi,
When I do a search for triples, I get back all triples that start with the supplied criteria. I would expect to see only triples that exactly match the criteria.
Illustrated by this repl session:
$ node
> var db = require('levelgraph')('test');
> db.put({subject:'a', predicate:'b', object:'c'});
> db.put({subject:'a1', predicate:'b1', object:'c1'});
> db.put({subject:'a2', predicate:'b2', object:'c2'});
> db.get({subject:'a'}, console.log);
[ { subject: 'a1', predicate: 'b1', object: 'c1' },
{ subject: 'a2', predicate: 'b2', object: 'c2' },
{ subject: 'a', predicate: 'b', object: 'c' } ]
However, I would expect (like!) this behaviour:
> db.get({subject:'a'}, console.log);
[ { subject: 'a', predicate: 'b', object: 'c' } ]
Is there a way to configure this?
Thanks,
Richard.
I encounter a bug(?) when doing a get request with an id, expecting one result but getting more than one (it is not doing an exact match).
var db = require('levelgraph')('test')
db.put({ subject: 'asset_1', predicate: 'is', object: 'Asset' }, console.log);
db.put({ subject: 'asset_11', predicate: 'is', object: 'Asset' }, console.log);
db.put({ subject: 'asset_12', predicate: 'is', object: 'Asset' }, console.log);
db.get({subject: 'asset_1', predicate: 'is', object: 'Asset'}, console.log);
actually returns :
null [ { subject: 'asset_1', predicate: 'is', object: 'Asset' },
{ subject: 'asset_11', predicate: 'is', object: 'Asset' },
{ subject: 'asset_12', predicate: 'is', object: 'Asset' } ]
It should return:
null [ { subject: 'asset_1', predicate: 'is', object: 'Asset' } ]
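Until exact matching is the default, a client-side filter works as a workaround. A sketch (the exactMatches helper is made up for illustration):

```javascript
// Keep only triples whose fields strictly equal the pattern's fields.
function exactMatches(triples, pattern) {
  return triples.filter(function (t) {
    return Object.keys(pattern).every(function (k) {
      return t[k] === pattern[k];
    });
  });
}

var results = [
  { subject: 'asset_1', predicate: 'is', object: 'Asset' },
  { subject: 'asset_11', predicate: 'is', object: 'Asset' },
  { subject: 'asset_12', predicate: 'is', object: 'Asset' }
];
console.log(exactMatches(results, { subject: 'asset_1' }));
// → [ { subject: 'asset_1', predicate: 'is', object: 'Asset' } ]
```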
The new version is broken because index.js is missing :(
From a fresh npm install:
$ ls -alh node_modules/levelgraph{,/lib,/examples,/benchmarks,/test} | awk '{print $9}' | grep -e '^\.' | grep -ve '^\.\+$'
..jshintrc.un~
..travis.yml.un~
.CONTRIBUTING.md.un~
.HISTORY.md.un~
.Makefile.un~
.README.md.un~
.Vagrantfile.un~
.browser.js.un~
.index.js.un~
.jshintrc
.npmignore
.package.json.un~
.travis.yml
.contrainedJoin.js.un~
.joinStream.js.un~
.readStream.js.un~
.readStreamLeveldb.js.un~
.reads.js.un~
.writeStream.js.un~
.writes.js.un~
.writesLeveldb.js.un~
.writesStream.js.un~
.foaf-web.html.un~
.foaf.js.un~
.foafNavigator.js.un~
.callbackstream.js.un~
.contextbinderstream.js.un~
.editstream.js.un~
.getdb-browser.js.un~
.getdb.js.un~
.joinstream.js.un~
.keyfilterstream.js.un~
.levelgraph.js.un~
.livejoinstream.js.un~
.materializerstream.js.un~
.navigator.js.un~
.queryplanner.js.un~
.sortjoinstream.js.un~
.utilities.js.un~
.variable.js.un~
.writestream.js.un~
.abstract_join_algorithm.js.un~
.abstract_join_algorithm_spec.js.un~
.basic_graph_spec.js.un~
.basic_join_spec.js.un~
.common.js.un~
.creation_api_spec.js.un~
.foaf.js.un~
.join_spec.js.un~
.live_join_spec.js.un~
.navigator_spec.js.un~
.properties_spec.js.un~
.queryplanner_spec.js.un~
.sort_join_spec.js.un~
.triple_store_spec.js.un~
.variable_spec.js.un~
You should be able to add an .npmignore that contains *.un~ and republish to get rid of the unwanted artifacts.
basically i have an app that is putting/getting/navigating the DB in chrome, and when i use nav() (may be the same for join, haven't tried) i get some exceptions thrown. the exception is from line 10055:
preventSuccessCallback || this.onStoreReady();
it is only happening for nav(), not for get() or put().
i am wondering if there is some simple check i could make to see whether i am able to invoke nav()?
not ok 16 navigator should follow multiple archs, in and out a path
Error: Uncaught TypeError: Cannot call method 'end' of null (http://localhost:58589/__testling?show=true:26647)
at window.onerror (http://localhost:58589/__testling/node_modules/mocha/mocha.js:5297:10)
at IDBTransaction.cursorTransaction.oncomplete (http://localhost:58589/__testling?show=true:17794:11)
This is related to the recent @substack patch of ReadStream, at this specific line:
https://github.com/rvagg/node-levelup/blob/master/lib/read-stream.js#L96-L99
where _iterator is null.
I think it happens because I close the database before the actual iterator is closed.
I'm also pulling in @maxogden, as this is a Level.js issue.
As @dominictarr suggested in #4 :
a join is also a relationship on its own; maybe it could return results in the {subject, predicate, object} triple form as well?
what about:
{ subject: 'daniele', object: 'marco', predicate: 'friend-of-friend' },
{ subject: 'daniele', object: 'matteo', predicate: 'friend-of-friend' },
{ subject: 'lucio', object: 'marco', predicate: 'friend-of-friend' },
{ subject: 'lucio', object: 'matteo', predicate: 'friend-of-friend' }
then, potentially, you could materialize these, or use them as parts of other joins?
An option might be something like:
db.materialize(conditions,
{ subject: db.var("s"), predicate: db.var("p"), object: db.var("o") } , ....
here is some output from my chrome dev tools resources indexedDB menu
0
"spo::6654115829151124::api_key::aaaa"
"{"subject":"6654115829151124","predicate":"api_key","object":"aaaa"}"
1
"spo::6654115829151124::connected::"
"{"subject":"6654115829151124","predicate":"connected","object":false}"
2
"spo::6654115829151124::connected::false"
"{"subject":"6654115829151124","predicate":"connected","object":"false"}"
3
"spo::6654115829151124::connected::true"
"{"subject":"6654115829151124","predicate":"connected","object":true}"
There is a quirk near the top where the value false is not represented in the index key "spo::6654115829151124::connected::".
In order to enhance performance for certain types of relationships, it should be possible to add a value to the relationship that sorts it. Adding range capabilities and limits to the querying will then keep operations from loading more than is needed.
This takes advantage of leveldb's key sorting. A great example of where this feature would be used is in an activity stream–really any large list of sorted data. You only want the newest 20 items, or the next newest 20 items (for paging). You shouldn't have to load every event node for every friend a user is connected to. Think every tweet node belonging to every user node that you are following. You have to store the created_date in the relationship anyway so you don't have to load every tweet for the lookup. We can just include it in the keys as well.
I will try and make time for this soon-ish, but I can't this month. But I wanted to get it written down. Anyone is welcome to take it. I'll comment here again when I start on it.
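The key layout this implies can be sketched as follows, relying on leveldb's lexicographic key sorting; the key format and the eventKey helper are hypothetical:

```javascript
// Zero-pad the timestamp so lexicographic key order matches numeric order.
function eventKey(user, timestampMs) {
  var padded = String(timestampMs).padStart(13, '0');
  return 'events::' + user + '::' + padded;
}

// Newer events sort after older ones, so a reverse range read
// (e.g. { reverse: true, limit: 20 } in LevelUp) yields the newest 20
// without loading the whole list.
console.log(eventKey('alice', 5) < eventKey('alice', 10)); // true
```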
Hi,
as follow up to discussions in
I found this graph drawing library - http://sigmajs.org/
Maybe we can use it for visualizing triples.
If I find time, I will try.
Hi,
is it possible to get a list of the unique nodes in the db?
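There is no built-in call for this as far as I know, but it can be derived from a full triple scan. A sketch over an in-memory array (with a real db you would consume the triple stream the same way; the uniqueNodes helper is made up):

```javascript
// Collect every distinct subject and object across a set of triples.
function uniqueNodes(triples) {
  var seen = new Set();
  triples.forEach(function (t) {
    seen.add(t.subject);
    seen.add(t.object);
  });
  return Array.from(seen);
}

var nodes = uniqueNodes([
  { subject: 'fred', predicate: 'likes', object: 'emma' },
  { subject: 'emma', predicate: 'likes', object: 'fred' }
]);
console.log(nodes.sort()); // [ 'emma', 'fred' ]
```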
As @dominictarr reported, it misleads people who want to dive into the code.
The error message is:
FATAL ERROR: CALL_AND_RETRY_2 Allocation failed - process out of memory
The data comes from http://downloads.dbpedia.org/3.9/en/geo_coordinates_en.nt.bz2 and the file size is around 274M.
var fs = require('fs');
var lg = require('levelgraph');
var lgN3 = require('levelgraph-n3');
var db = lgN3(lg('mydb'));
var stream = fs.createReadStream('./geo_coordinates_en.nt').pipe(db.n3.putStream());
stream.on('finish', function() {
return console.log('Import completed');
});
my dependencies:
{
"dependencies": {
"levelgraph": "^0.8.2",
"levelgraph-n3": "^0.3.3",
"levelup": "^0.18.6"
}
}
The new LevelUp 0.9 release is huge: http://r.va.gg/2013/05/levelup-v0.9-released.html and http://r.va.gg/2013/05/levelup-v0.9-some-major-changes.html.
$ npm install levelup leveldown level-sublevel levelgraph
then running code
var LevelUp = require('levelup'),
Sublevel = require('level-sublevel'),
LevelGraph = require('levelgraph');
var db = Sublevel(LevelUp('dev.ldb')),
graphdb = LevelGraph(db.sublevel('graph'));
results in
/code/play/levelgraph-sublevel/node_modules/levelgraph/lib/levelgraph.js:75
, close: leveldb.close.bind(leveldb)
^
TypeError: Cannot call method 'bind' of undefined
at levelgraph (/code/play/levelgraph-sublevel/node_modules/levelgraph/lib/levelgraph.js:75:28)
at Object.<anonymous> (/code/play/levelgraph-sublevel/index.js:6:15)
at Module._compile (module.js:456:26)
at Object.Module._extensions..js (module.js:474:10)
at Module.load (module.js:356:32)
at Function.Module._load (module.js:312:12)
at Function.Module.runMain (module.js:497:10)
at startup (node.js:119:16)
at node.js:901:3