Code Monkey home page Code Monkey logo

nancy's Introduction

Postgres.AI   GitHub code size in bytes     CircleCI

About screen shot 2018-09-18 at 03 04 09

Nancy helps to conduct automated database experiments.

The Nancy Command Line Interface is a unified way to manage automated database experiments either in clouds or on-premise.

What is a Database Experiment?

Database experiment is a set of actions performed to test

  • (a) specified SQL queries ("workload")
  • (b) on specified machine / OS / Postgres version ("environment")
  • (c) against specified database ("object")
  • (d) with an optional change – some DDL or config change ("target" or "delta").

Two main goals for any database experiment:

  • (1) validation – check that the specified workload is valid,
  • (2) benchmark – perform deep SQL query analysis.

Database experiments are needed when you:

  • add or remove indexes;
  • for a new DB schema change, want to validate it and estimate migration time;
  • want to verify some query optimization ideas;
  • tune database configuration parameters;
  • do capacity planning and want to stress-test your DB in some environment;
  • plan to upgrade your DBMS to a new major version;
  • want to train ML model related to DB optimization.

Currently Supported Features

  • Works anywhere where Docker can run (checked: Linux Ubuntu/Debian, macOS)
  • Experiments are conducted in a Docker container with extended Postgres setup
  • Supported Postgres versions: 9.6, 10
  • Postgres config specified via options, may be partial
  • Supported locations for experimental runs:
    • Any machine with Docker installed
    • AWS EC2:
      • Run on AWS EC2 Spot Instances (using Docker Machine)
      • Allow to specify EC2 instance type
      • Auto-detect and use current lowest EC2 Spot Instance prices
      • Support i3 instances (with NVMe SSD drives)
      • Support arbitrary-size EBS volumes
  • Support local or remote (S3) files – config, dump, etc
  • The object (database) can be specified in various ways:
    • Plain text
    • Dump file (.sql, .gz, .bz2) – ⚠️ only plain, single-file dumps are currently supported
  • What to test (a.k.a. "target" or "delta"):
    • Test Postgres parameters change
    • Test DDL change (specified as "do" and "undo" SQL to return state)
  • Supported types of workload:
    • Use custom SQL as workload
    • Use "real workload" prepared using Postgres logs
  • For "real workload", allow replaying it with increased speed
  • Allow to keep container alive for specified time after all steps are done
  • Collected artifacts:
    • Workload SQL logs
    • Deep SQL query analysis report

Requirements

  1. To use Nancy CLI you need Linux or MacOS with installed Docker.

  2. To run on AWS EC2 instances, you also need:

Installation

In the minimal configuration, only two steps are needed:

  1. Install Docker (for Ubuntu/Debian: sudo apt-get install docker)

  2. Clone this repo and adjust $PATH:

git clone https://github.com/startupturbo/nancy
echo "export PATH=\$PATH:"$(pwd)"/nancy" >> ~/.bash_profile
source ~/.bash_profile

Additionally, to allow use of AWS EC2 instances:

  1. Follow instructions https://docs.aws.amazon.com/cli/latest/userguide/installing.html

  2. Follow instructions https://docs.docker.com/machine/install-machine/

  3. install jq (for Ubuntu/Debian: sudo apt-get install jq)

Getting started

Start with these commands:

nancy help
nancy run help

"Hello World!"

Locally, on any Linux or macOS machine:

echo "create table hello_world as select i from generate_series(1, (10^6)::int) _(i);" \
  | bzip2 > ./sample.dump.bz2

# "Clean run": w/o index
# (seqscan is expected, total time ~150ms, depending on resources)
nancy run \
  --db-dump file://$(pwd)/sample.dump.bz2 \
  --workload-custom-sql "select i from hello_world where i between 10 and 20;"

# Now check how a regular btree index affects performance
# (expected total time: ~0.05ms)
nancy run \
  --db-dump file://$(pwd)/sample.dump.bz2 \
  --workload-custom-sql "select i from hello_world where i between 10 and 20;" \
  --delta-sql-do "create index i_hello_world_i on hello_world(i);" \
  --delta-sql-undo "drop index i_hello_world_i;"

AWS EC2:

nancy run \
  --run-on aws \
  --aws-ec2-type "i3.large" \
  --aws-keypair-name awskey \
  --aws-ssh-key-path file://$(echo ~)/.ssh/awskey.pem  \
  --db-dump "create table hello_world as select i from generate_series(1, (10^6)::int) _(i);" \
  --workload-custom-sql "select i from hello_world where i between 10 and 20;"

Additional notes

If you experience issues with running (locally) nancy run inside screen or tmux, double-check that Docker is running and add your user to the docker group:

usermod -aG docker ${USER}

On MacOS, it is recommended to specify --tmp-path explicitly, similar to this:

mkdir ./tmp
nancy run ... --tmp-path "$(pwd)/tmp"

nancy's People

Contributors

dmius avatar michelp avatar nikolays avatar startupturbo avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.