TAOBench

A distributed database benchmark based on Meta's social graph workload.

Building & Schema Setup

Follow the build and schema setup instructions in each database's directory.

For SQL databases, TAOBench uses an objects and an edges table to represent TAO's graph data model:

CREATE TABLE objects (
    id BIGINT PRIMARY KEY,
    timestamp BIGINT,
    value VARCHAR(150));
CREATE TABLE edges (
    id1 BIGINT,
    id2 BIGINT,
    type BIGINT,
    timestamp BIGINT,
    value VARCHAR(150),
    PRIMARY KEY CLUSTERED (id1, id2, type));
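
To make the data model concrete, here is a minimal sketch that loads the schema into an in-memory SQLite database and follows one edge. It is only an illustration: SQLite is not one of the benchmark's targets, the `CLUSTERED` keyword is dropped because SQLite does not accept it, and the sample rows and the lookup query are hypothetical rather than queries TAOBench itself issues.

```python
import sqlite3

# In-memory database used purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE objects (
    id BIGINT PRIMARY KEY,
    timestamp BIGINT,
    value VARCHAR(150));
CREATE TABLE edges (
    id1 BIGINT,
    id2 BIGINT,
    type BIGINT,
    timestamp BIGINT,
    value VARCHAR(150),
    PRIMARY KEY (id1, id2, type));
""")

# Two objects (graph nodes) and one typed edge from object 1 to object 2.
conn.executemany(
    "INSERT INTO objects VALUES (?, ?, ?)",
    [(1, 100, "user:alice"), (2, 101, "post:hello")],
)
conn.execute("INSERT INTO edges VALUES (?, ?, ?, ?, ?)", (1, 2, 7, 102, "authored"))

# Follow all edges of type 7 that start at object 1 and fetch the target objects.
rows = conn.execute(
    "SELECT e.id2, o.value FROM edges e "
    "JOIN objects o ON o.id = e.id2 "
    "WHERE e.id1 = ? AND e.type = ?",
    (1, 7),
).fetchall()
print(rows)
```

The composite primary key on (id1, id2, type) means at most one edge of a given type can exist between an ordered pair of objects.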

Configuration

`experiments.txt` file:

TAOBench supports running multiple experiments in a single run via a configurable experiments.txt file. Each line of that file specifies a different experiment and should be of the format:

num_threads,num_ops,target_throughput

num_threads specifies the number of client threads concurrently making requests during the experiment; num_ops specifies the total number of DB operations to run; and target_throughput specifies a maximum throughput, beyond which the benchmark throttles itself by putting threads to sleep. As a guideline, num_ops and target_throughput can be set to very high values (around 1 billion) and effectively ignored: experiments time out automatically, and throughput can instead be tuned through num_threads.
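
As a sketch of the format above, the following Python snippet generates experiment lines that sweep num_threads while leaving num_ops and target_throughput at a very high value, following the guideline. The helper names and the thread counts are hypothetical; only the three-field, comma-separated line format comes from this README.

```python
from typing import Iterable, List, Tuple

# "Very high" placeholder for fields that should effectively be ignored.
EFFECTIVELY_UNLIMITED = 1_000_000_000


def experiment_lines(
    thread_counts: Iterable[int],
    num_ops: int = EFFECTIVELY_UNLIMITED,
    target_throughput: int = EFFECTIVELY_UNLIMITED,
) -> List[str]:
    """Return one 'num_threads,num_ops,target_throughput' line per thread count."""
    return [f"{threads},{num_ops},{target_throughput}" for threads in thread_counts]


def parse_experiment_line(line: str) -> Tuple[int, int, int]:
    """Split a line back into (num_threads, num_ops, target_throughput)."""
    num_threads, num_ops, target_throughput = (int(x) for x in line.strip().split(","))
    return num_threads, num_ops, target_throughput


lines = experiment_lines([10, 50, 100])
print("\n".join(lines))
```

Writing these lines to experiments.txt, one per line, would describe three experiments that differ only in client thread count.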

`src/constants.h`

Other benchmark-level attributes can be tweaked in this file. In particular, different values of READ_BATCH_SIZE and WRITE_BATCH_SIZE might improve performance for batch inserts and batch reads.

Prepping the Database

This phase populates the DB tables with an initial set of edges and objects. We batch insert data into the DB and batch read them into memory to be used when running experiments. To run the batch insert phase, use the following command:

./benchmark -threads <num_threads> -db <db> -P path/to/database_properties.properties -C path/to/config.json -load -n <num_edges>

Ideal values for num_threads and num_edges will vary by database and by use-case, but 50 and 165,000,000 should be good starting points, respectively.

Running Experiments

./benchmark -threads <num_threads> -db <db> -P path/to/database_properties.properties -C path/to/config.json -run -E path/to/experiments.txt

This command first batch reads all the keys that were inserted during the batch insert phase and then runs the experiments. Note that the batch read phase runs only for the first experiment and can take several hours, depending on the number of keys in the DB. Here, num_threads specifies the number of threads used for batch reading, not for the experiments, and it must be less than or equal to the number of shards. The default value is 50.
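
For scripted runs, the load and run invocations can be assembled programmatically. The sketch below only builds argument lists using the flags shown in this README; it does not execute anything. The function names, the `num_shards` parameter, and the example database name and paths are hypothetical placeholders, and the shard check simply encodes the constraint stated above.

```python
from typing import List


def build_load_command(db: str, properties: str, config: str,
                       num_threads: int = 50,
                       num_edges: int = 165_000_000) -> List[str]:
    """Argument list for the batch insert phase."""
    return ["./benchmark", "-threads", str(num_threads), "-db", db,
            "-P", properties, "-C", config, "-load", "-n", str(num_edges)]


def build_run_command(db: str, properties: str, config: str, experiments: str,
                      num_shards: int, num_threads: int = 50) -> List[str]:
    """Argument list for the experiment phase; num_threads here drives batch reads."""
    if num_threads > num_shards:
        raise ValueError("num_threads must be <= the number of shards")
    return ["./benchmark", "-threads", str(num_threads), "-db", db,
            "-P", properties, "-C", config, "-run", "-E", experiments]


# Hypothetical database name and paths, for illustration only.
load_cmd = build_load_command("mydb", "mydb.properties", "config.json")
run_cmd = build_run_command("mydb", "mydb.properties", "config.json",
                            "experiments.txt", num_shards=64)
print(" ".join(load_cmd))
print(" ".join(run_cmd))
```

Either list could be passed to `subprocess.run` once the placeholders are replaced with real values.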

Interpreting Results

Statistics for each experiment run are printed to standard output. Here's a sample:

Total runtime(sec): 612.332
Runtime excluding warmup (sec): 552.331
Total completed operations excluding warmup: 955070
Throughput excluding warmup: 1729.16
Number of overtime operations excluding warmup: 958438
Number of failed operations excluding warmup: 3378
862657 operations; [INSERT: Count=31525 Max=212570.51 Min=3928.44 Avg=7536.34] [READ: Count=606680 Max=212879.86 Min=1483.02 Avg=2546.12] [UPDATE: Count=167828 Max=394803.53 Min=3993.65 Avg=7885.27] [READTRANSACTION: Count=53338 Max=998148.18 Min=5130.46 Avg=41861.58] [WRITETRANSACTION: Count=3286 Max=240072.81 Min=10341.05 Avg=37818.67] [WRITE: Count=199353 Max=394803.53 Min=3928.44 Avg=7830.09]

A few clarifications:

For throughput, each read/write/read transaction/write transaction counts as a single completed operation.
The last line describes operation latencies in microseconds. The WRITE operation category is an aggregate of inserts/updates/deletes.
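
The sample output can be cross-checked with a short script. The sketch below parses the latency line with a regular expression written against the sample shown above (the exact output format may differ between versions, so treat the pattern as an assumption) and confirms three relationships that hold in this sample: the four top-level categories sum to the reported operation count, WRITE equals INSERT plus UPDATE because the sample lists no deletes, and the throughput matches completed operations divided by runtime excluding warmup.

```python
import re

# Latency summary line copied from the sample output above.
SAMPLE = (
    "862657 operations; "
    "[INSERT: Count=31525 Max=212570.51 Min=3928.44 Avg=7536.34] "
    "[READ: Count=606680 Max=212879.86 Min=1483.02 Avg=2546.12] "
    "[UPDATE: Count=167828 Max=394803.53 Min=3993.65 Avg=7885.27] "
    "[READTRANSACTION: Count=53338 Max=998148.18 Min=5130.46 Avg=41861.58] "
    "[WRITETRANSACTION: Count=3286 Max=240072.81 Min=10341.05 Avg=37818.67] "
    "[WRITE: Count=199353 Max=394803.53 Min=3928.44 Avg=7830.09]"
)

# Pattern inferred from the sample; latencies are in microseconds.
STAT = re.compile(r"\[(\w+): Count=(\d+) Max=([\d.]+) Min=([\d.]+) Avg=([\d.]+)\]")


def parse_latency_line(line):
    """Map each operation category to its count and latency statistics."""
    stats = {}
    for name, count, max_us, min_us, avg_us in STAT.findall(line):
        stats[name] = {
            "count": int(count),
            "max_us": float(max_us),
            "min_us": float(min_us),
            "avg_us": float(avg_us),
        }
    return stats


stats = parse_latency_line(SAMPLE)
reported_total = int(SAMPLE.split(" ", 1)[0])

# Each read, write, read transaction, and write transaction counts once.
top_level = sum(
    stats[name]["count"]
    for name in ("READ", "WRITE", "READTRANSACTION", "WRITETRANSACTION")
)
writes_from_parts = stats["INSERT"]["count"] + stats["UPDATE"]["count"]

# Completed operations divided by runtime, both excluding warmup.
throughput = 955070 / 552.331

print(top_level, reported_total, writes_from_parts, round(throughput, 2))
```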
