Code Monkey home page Code Monkey logo

graph-databases-testsuite's Introduction

Graph Databases Benchmarking Suite

This repository contains the code for the following publication:

Beyond Macrobenchmarks: Microbenchmark-based Graph Database Evaluation

by Matteo Lissandrini, Martin Brugnara, Yannis Velegrakis

published in the Proceedings of the Conference on Very Large Databases (PVLDB), 2018

A guide to reproduce the experiments in the paper is provided in the REPRODUCE.md file.


Comparison of Graph Databases
Graph databases (GDBs) are grounded on the concepts of graph theory: abstracting data in the form of nodes, edges and properties.

A variety of GDB implementations has been appearing, and most of the systems are usually accompanied with proprietary tests showcasing their performance in specific use-cases.

There are also some experimental comparisons that focus on particular aspects and some benchmarking suites have been proposed. Most of them are outdated, others are just extremely partial in their scope. A broad study of scalability and applicability of existing solutions is required in order to highlight in which context (type of data, management of metadata, and type of queries) each existing implementation performs better and at which scale. In particular we set the goal to identify the actual advantages or limitations of specific design choices, operator implementations, and processing techniques in different GDB systems.

This repository
Thi repository contains the framework we developed to carry out such study. We made it open-source such that 1) it can be used at the current state to verify our findings and 2) be extended via the addition of queries or database images to carry out new tests.

In particular, we provide docker images for the following databases:

  • ArangoDB
  • Blazegraph
  • Neo4j (Versions 1.9, and 3.0 )
  • OrientDB
  • PostgreSQL (via sqlg)
  • Sparksee
  • Titan (Versions 0.5.4, and 1.0.0)

Other may be added as explained in SETUP.md.

Quick Setup

  1. Clone this repository.

  2. Install the dependencies (Python 2.7):

    virtualenv .venv && source .venv
    pip install -r requirements.txt
    
  3. Prepare the database image ( in this example we will use neo4j v1.9 for Tinkerpop 2): cd images/ && make gremlin-neo4j.dockerfile

  4. Provide the test dataset[s]. Let's use the provided toy graph:

    # Make sure the directory exists.
      mkdir -p runtime/data
    # Provided the data.
      cp ~/freebase_small.json{,2} runtime/data/
  5. Configure the experiment with the select query-set.
    Queries are stored in runtime/tp[2,3]/queries and their meta-parmeters can be found in runtime/meta.

    # Configuration example
    cat<<EOF > test.json
    {
      "datasets": [
         "freebase_0000.json2"
      ],
       "queries": [
          "count-edges.groovy",
          "count-nodes.groovy"
      ]
    }
    EOF
  6. Run the tests:

    # Execute three times (-r) the tests defined in 'test.json' (-s) 
    # on neo4j (-i) providing JAVA_OPTS as extra environment variable (-e).
    python test.py -r 3 \
    		-i dbtrento/gremlin-neo4j \
    		-e JAVA_OPTS="-Xms1G -Xmn128M -Xmx8G" \
    		-s test.json

Personalize your experiments

How to use the test suite: RUN.md

Insights on the folder structure and scripts role: FILES.md

Guide to include a new database system in the framework: SETUP.md

Authors

Matteo Lissandrini [email protected] (@Kuzeko).
Martin Brugnara [email protected] (@MartinBrugnara).

Contributors
Liviu Bogdan (UniTN), Allan Nielsen (UniTN), Denis Gallo (UniTN).

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.