
sql2graph's People

Contributors

redapple

sql2graph's Issues

Adjust the batchimport to the new features

Hi there,
I imported the MusicBrainz database into Neo4j using the following approach, with help from @jexp:

Define two indexes in batch.properties: mbid (exact) for MBIDs, and mb (fulltext) for everything else:

batch_import.keep_db=false
batch_import.mapdb_cache.disable=true
batch_import.node_index.mb=fulltext
batch_import.node_index.mbid=exact
batch_import.csv.quotes=false
cache_type=none
use_memory_mapped_buffers=true
neostore.nodestore.db.mapped_memory=300M
neostore.relationshipstore.db.mapped_memory=3G
neostore.propertystore.db.mapped_memory=500M
neostore.propertystore.db.strings.mapped_memory=500M
neostore.propertystore.db.arrays.mapped_memory=0M
neostore.propertystore.db.index.keys.mapped_memory=15M
neostore.propertystore.db.index.mapped_memory=15M

Then put the indexing instructions directly in the header rows of nodes.csv and rels.csv, so the ...index.csv files are no longer needed. See the "automatic indexing" section of https://github.com/jexp/batch-import

kind:string:mb  comment status  position    name:string:mb  area    gender  format  barcode number  ended   length  end_date_year   begin_date_year mbid:string:mbid    type:string:mb  pk
artist              Talkshow Boy                        f               e8d94cf5-fafa-48fc-a6fa-aa50cf54d7f3        288762
artist              Vibulator                       f               735bfaad-6eb1-4f9c-b21d-cbaef7c79a92        97944
artist              Eat Me                      f               c38a93e8-2ecf-4848-b1d2-364202d9dc0c    Group   499198
artist              Uffe Andersen                       f               a7f3c871-3ba3-40b1-ba58-d08b40312789    Person  514886
artist              Headust                     f               eda60727-7036-437b-b53d-ae472818ee3a        212148
artist              Sons Of The Subway                      f               232d5716-c2b2-47e1-aa0c-264ec69e6a18        100774
artist              The Poe Boy Family                      f               672d599e-6a6c-456e-98ba-dac5a45e3ed8        43132
artist              Ralph Gusovius  Germany Male                f           1950    6ecfcea1-677d-427b-a38b-9c76ce92e313    Person  295052
artist              Elastik Band                        f               46e0639c-1ccf-45f5-b886-4cbf5549a2a1        61467
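The header convention shown above (name:string:index for indexed columns, bare names for the rest) could be produced by a small helper along these lines. This is only a sketch: the INDEXED mapping, annotate() and write_nodes() are hypothetical names, not part of sql2graph or batch-import, and the column subset is shortened for illustration.

```python
import io

# Hypothetical mapping of column name -> index name, read off the sample header.
INDEXED = {"kind": "mb", "name": "mb", "type": "mb", "mbid": "mbid"}


def annotate(column, indexed=INDEXED):
    """Return 'column:string:index' for indexed columns, else the bare name."""
    index = indexed.get(column)
    return f"{column}:string:{index}" if index else column


def write_nodes(rows, columns, out):
    """Write a tab-separated nodes file with an annotated header row.

    Assumes no field contains a tab or newline (no quoting is applied,
    in line with batch_import.csv.quotes=false).
    """
    out.write("\t".join(annotate(c) for c in columns) + "\n")
    for row in rows:
        out.write("\t".join(str(row.get(c, "")) for c in columns) + "\n")


# Shortened column list and a single row taken from the sample data.
columns = ["kind", "name", "mbid", "type", "pk"]
rows = [{
    "kind": "artist",
    "name": "Eat Me",
    "mbid": "c38a93e8-2ecf-4848-b1d2-364202d9dc0c",
    "type": "Group",
    "pk": 499198,
}]

buf = io.StringIO()
write_nodes(rows, columns, buf)
print(buf.getvalue(), end="")
```

Keeping the index choice in one mapping means the node and relationship exporters can share it, and the separate index files never have to be generated.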

Then import the two files with something like:

java -Xmx10G -server -Dfile.encoding=UTF-8 -jar ~/neo/batch-import/target/batch-import-jar-with-dependencies.jar ./graph.db nodes.csv rels.csv 
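If the import is driven from a script, the command above could be assembled as an argument list, for example as below. This is a sketch only: build_import_command() is a hypothetical helper, the jar path is the one from the command above, and the heap size should be adjusted to the machine.

```python
import os
import shlex


def build_import_command(jar, db_dir, nodes, rels, heap="10G"):
    """Build the batch-import invocation as an argument list (hypothetical helper)."""
    return [
        "java",
        f"-Xmx{heap}",
        "-server",
        "-Dfile.encoding=UTF-8",
        "-jar",
        os.path.expanduser(jar),  # expand a leading ~ ourselves, since no shell is involved
        db_dir,
        nodes,
        rels,
    ]


cmd = build_import_command(
    "~/neo/batch-import/target/batch-import-jar-with-dependencies.jar",
    "./graph.db",
    "nodes.csv",
    "rels.csv",
)
print(shlex.join(cmd))
# To actually run the import: subprocess.run(cmd, check=True)
```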

WDYT? It would make the output a lot simpler, and the import took about 10 minutes on my machine for 160M properties and 75M relationships.

creating csv files

Hi

I have been following the steps to create the CSV files for MusicBrainz. So far my script has only printed the following lines:

UPDATE 0
UPDATE 0
NOTICE: table "entity_mapping" does not exist, skipping
DROP TABLE

I am not sure whether the script simply takes a very long time to execute or has stalled. When I check the running processes, the script appears to be sleeping. Any insight, please?
