
ogp's Introduction

This project is intended to measure the performance of OrientDB under specific conditions. Those conditions are similar to those of a much larger and more complex proprietary application that cannot be shared.

Usage

Build and run with Gradle:

./gradlew run [ <options> ]

-Pmodel=radial|scatter|sprawl|mixed|light|heavy    (defaults to radial)
-Pdbpath=[ memory:... | plocal:... | remote:... ]  (defaults to memory:test)
-Pindexes=[ query | graph ]  (defaults to no edge indexes)
-POV=<orient version>
-PGV=<groovy version>

or to run a series of models under varying conditions:

./batch-run-models

This has been tested with Orient versions 1.7.8 and 2.0.7, and Groovy versions 1.8.9 and 2.4.3.

To run as a standalone application (for profiling):

./gradlew installApp [ -POV=<orient version> ] [ -PGV=<groovy version> ]
build/install/orient-graph-performance/bin/orient-graph-performance [ radial|scatter|sprawl|mixed|light|heavy ] [ <dbpath> ] [ <indexes> ]

Detailed data will be written to CSV files in the results directory. The CSV files are manually combined into Excel worksheets with graphs for better visualization of the data. Logs are displayed and written to performance.log.

Performance Observations

  • Orient 2.0.7 performs better than 1.7.8, in general.
    • Orient 1.7.8 performs better than 2.0.7 when using only lightweight edges.
    • Orient 2.0.7 performs equally well with both lightweight and heavyweight edges, so there's no apparent penalty for using heavyweight edges.
  • Groovy 2.4.3 performs better than 1.8.9, when run without invokedynamic.
  • The average time to ingest a single node appears to be O(1), regardless of sub-graph size -- this is good.
  • The average time to ingest a single edge appears to be O(E), where E is the number of edges already connected to the source node.
  • Based upon profiling, the ingester is spending 58% or more of its time in findOrCreateEdge(), evaluating the iterator returned by OrientVertex::getEdges(). This iterator should only return 0 or 1 edges, but it takes an inordinate amount of time to do so.
  • It appears that the OrientVertex::getEdges() iterator does a linear search through all of the edges for a source node in order to return the matching edges.
  • The first dozen ingest operations perform poorly while the Java JIT compiler is warming up and optimizing. (This is not a problem for our application, but it matters for performance testing.)
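The O(E) observation can be illustrated with a small model. The sketch below is not OrientDB code: the Vertex class, both find_or_create_edge_* functions, and the per-vertex edge_index are hypothetical names invented for this example. It only counts how many existing edges are inspected when a find-or-create step scans linearly, compared with a keyed lookup, assuming the linear-search behavior described above.

```python
# Illustrative model only; none of these names come from OrientDB.
class Vertex:
    def __init__(self, key):
        self.key = key
        self.out_edges = []      # (label, target_key) pairs, in insertion order
        self.edge_index = set()  # hypothetical per-vertex index on (label, target_key)


def find_or_create_edge_scan(source, label, target_key):
    """Linear search through the source's edges; returns how many were inspected."""
    inspected = 0
    for edge in source.out_edges:
        inspected += 1
        if edge == (label, target_key):
            return inspected
    source.out_edges.append((label, target_key))
    return inspected


def find_or_create_edge_indexed(source, label, target_key):
    """Keyed lookup; counts as one probe no matter how many edges exist."""
    edge = (label, target_key)
    if edge not in source.edge_index:
        source.edge_index.add(edge)
        source.out_edges.append(edge)
    return 1


def total_work(find_or_create, fan_out):
    """Build a radial graph with `fan_out` spokes and sum the edges inspected."""
    hub = Vertex("hub")
    return sum(find_or_create(hub, "links", f"spoke-{i}") for i in range(fan_out))
```

With a linear scan, ingesting E edges from one source node inspects 0 + 1 + ... + (E - 1) edges in total, so the cost per sub-graph grows quadratically even though each individual lookup is only O(E).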

Theory of Operation

This test suite creates a series of sub-graphs and then ingests them into a single (in-memory) graph database, with no sharding, clustering, or replication.

  • Application node classes are descended from the Node class.
  • Application nodes are indexed by key, in separate indexes for each class.
  • Application edge classes are descended from the Edge class.
  • Most edges are lightweight, but the Edge class includes properties.
  • For any sub-graph, all of the nodes are ingested first, and then all of the edges are ingested.
  • Each sub-graph is ingested in its own transaction.
  • If a given node already exists, then its properties will be updated from the sub-graph.
  • There can be at most one edge of any given type between any two nodes. (There can be multiple edges of different types between a pair of nodes, but at most one of each type.)
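The ingest rules above can be sketched as follows. This is a minimal in-memory stand-in, not the project's GraphPerformance code: GraphStore and its tuple layouts are hypothetical, transactions are omitted, and edges are assumed to be directed (keyed by an ordered node pair).

```python
class GraphStore:
    """In-memory stand-in for the graph database (hypothetical, for illustration)."""

    def __init__(self):
        self.nodes = {}     # (class_name, key) -> properties dict
        self.edges = set()  # (edge_type, source_id, target_id)

    def ingest(self, sub_graph_nodes, sub_graph_edges):
        """Ingest one sub-graph: all of the nodes first, then all of the edges."""
        for class_name, key, props in sub_graph_nodes:
            # Find or create the node by key, then update its properties.
            self.nodes.setdefault((class_name, key), {}).update(props)
        for edge_type, source_id, target_id in sub_graph_edges:
            # A set keeps at most one edge of each type per ordered node pair.
            self.edges.add((edge_type, source_id, target_id))
```

Ingesting the same sub-graph twice leaves the node and edge counts unchanged; only the node properties are refreshed.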

The suite measures the time, in milliseconds, needed to ingest all of the nodes and all of the edges for a given sub-graph, and calculates the average time to ingest a single node and a single edge for that transaction. The time to create the sub-graph is not included in the measurements.
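As a rough sketch of that calculation, assuming wall-clock timing around each phase (the function names timed_ms and averages are invented for this example and are not taken from PerfCounter):

```python
import time


def timed_ms(operation):
    """Run `operation` and return the elapsed wall-clock time in milliseconds."""
    start = time.perf_counter()
    operation()
    return (time.perf_counter() - start) * 1000.0


def averages(node_ms, node_count, edge_ms, edge_count):
    """Average per-node and per-edge ingest time for one transaction."""
    avg_node = node_ms / node_count if node_count else 0.0
    avg_edge = edge_ms / edge_count if edge_count else 0.0
    return avg_node, avg_edge
```

For example, a transaction that spends 50 ms on 10 nodes and 90 ms on 30 edges averages 5 ms per node and 3 ms per edge.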

Data Models

There are a number of models that represent different types of sub-graphs that can be ingested by the application. Each run will ingest 500 sub-graphs, based upon the selected data model. While the sub-graphs are randomly generated, each run uses a fixed seed, so each set of graphs will be repeatable. The data models for individual graphs are:

  • radial -- This produces a single central node, and then a number of nodes, each connected to the central node.
  • scatter -- This produces a collection of small graphs, not connected.
  • sprawl -- This produces a chain of nodes, with various branches along the way.
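To illustrate how a fixed seed makes randomly generated sub-graphs repeatable, here is a minimal sketch of a radial generator. The function names, the max_spokes limit, and the node naming are assumptions for this example; the project's Data class may generate its graphs differently.

```python
import random


def radial_sub_graph(rng, max_spokes=20):
    """One central node plus a random number of nodes, each connected to it."""
    size = rng.randint(1, max_spokes)
    nodes = ["hub"] + [f"spoke-{i}" for i in range(size)]
    edges = [("hub", f"spoke-{i}") for i in range(size)]
    return nodes, edges


def generate_run(seed, count=500):
    """Generate `count` sub-graphs from a dedicated, seeded generator."""
    rng = random.Random(seed)  # the same seed yields the same sequence of graphs
    return [radial_sub_graph(rng) for _ in range(count)]
```

Two calls to generate_run with the same seed return identical lists of sub-graphs, which is what allows results from different runs to be compared.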

There are specific patterns of data available as well:

  • mixed -- Starts with a large radial sub-graph, followed by a series of increasingly large scatter sub-graphs.
  • light -- Increasingly large radial graphs built with lightweight edges. This model demonstrates the O(E) performance of the getEdges() method.
  • heavy -- Increasingly large radial graphs built with heavyweight edges. This model demonstrates the O(E) performance of the getEdges() method.
  • special -- Constant size radial graphs built with heavyweight edges.

Classes

  • GraphPerformance -- This class drives everything else and contains the logic for ingesting graphs.
  • Database -- This class creates a fresh database, including schema classes for nodes and edges, and indexes for each class.
  • Data -- This class randomly creates sub-graphs, using a seeded random number generator, so every run will result in the same data.
  • SubGraph, MyNode, MyEdge -- These classes are POGOs to model a sub-graph.
  • PerfCounter -- This class captures and formats metrics about each ingest.

Profiling

Oracle has a reasonably good profiling tool, Java Mission Control, that is bundled with the Oracle JDK (7 & 8). If you want to use it to profile the application, I've included a script (flight_recorder) to make that easier. Use it as follows:

./flight_recorder build/install/orient-graph-performance/bin/orient-graph-performance radial memory:test graph

It will create a Java Flight Recorder (.jfr) file that can be analyzed with Java Mission Control ($JAVA_HOME/bin/jmc).

ogp's People

Contributors

wcraigtrader
