
flashback's Introduction

What is Flashback

How do you measure how good your MongoDB (or other database with a similar interface) performance is? Easy: you benchmark it. The usual way is to use a benchmark tool that generates queries with random contents under some random distribution.

But sometimes randomly generated queries aren't satisfying, because you can't be confident that they resemble your real workload.

The difficulty compounds when one MongoDB instance hosts completely different databases, each with its own unique and complicated access patterns.

That is why we came up with Flashback, a MongoDB benchmark framework that lets us benchmark with "real" queries. It consists of a set of scripts that fall into two categories:

  1. recording the operations (ops) that occur during a stretch of time;
  2. replaying the recorded ops.

The two parts are not tied to each other and can be used independently for different purposes.

How it works

Record

How do you know which ops are performed by MongoDB? There are many ways to find out, but Flashback records the ops by enabling MongoDB's profiling.

By setting the profiling level to 2 (profile all ops), we can fetch op information detailed enough for later replay -- except for insert ops.

MongoDB does not log insertion details to the system.profile collection. However, if a MongoDB instance is running as a member of a replica set, we can capture insert information by reading the oplog.

Thus, we record the ops with the following steps:

  1. The script starts multiple threads to pull the profiling results and oplog entries for the collections and databases we are interested in. Each thread works independently.
  2. After fetching the entries, we merge the results from all sources to get a full picture of all operations (a minimal sketch of this loop follows).
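For illustration, here is a stripped-down sketch of that recording loop, using the PyMongo 2.x API the record module targets. Database and collection names are placeholders; the real record.py adds threading, filtering, and merging.

import pymongo

client = pymongo.MongoClient("mongodb://localhost:27017")

# Profiled ops land in the per-database system.profile collection
# once the profiling level is set to 2 ("mydb" is a placeholder).
ops = list(client["mydb"]["system.profile"].find())

# Inserts are missing from the profiler, so tail the oplog for them.
# This is why the server must run as a replica set.
oplog = client["local"]["oplog.rs"]
for entry in oplog.find({"op": "i"}, tailable=True, await_data=True):
    ops.append(entry)  # record.py merges these with the profile entries

The tailable cursor blocks waiting for new entries, so in practice the loop runs until the configured recording duration elapses.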

Replay

With the ops recorded, the replayer can play them back in different ways:

  • Replay ops with "best effort". The replayer diligently sends the ops to the database as fast as possible. This style helps us measure the limits of a database. Note that, to reduce the overhead of loading ops, we preload them into memory and replay from there, which can limit the number of ops played back per session to the available memory on the replay host.
  • Replay ops in accordance with their original timestamps, which lets us imitate regular traffic (see the pacing sketch below).

The replay module is written in Go because Python doesn't do a good job with concurrent, CPU-intensive tasks.
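To make the timestamp-based style concrete, here is an illustrative Python sketch of the pacing logic (the actual replayer implements this in Go, with concurrency); it assumes each op carries its original ts timestamp as a datetime:

import time

def replay_real(ops, execute):
    # ops: op dicts sorted by their "ts" datetimes
    # execute: a callable that sends one op to the database
    start = time.time()
    first_ts = ops[0]["ts"]
    for op in ops:
        # How far into the recording this op originally occurred.
        offset = (op["ts"] - first_ts).total_seconds()
        # Sleep until the same offset has elapsed in the replay.
        delay = offset - (time.time() - start)
        if delay > 0:
            time.sleep(delay)
        execute(op)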

How to use it

Record

Prerequisites

  • The "record" module is written in python. You'll need to have pymongo, mongodb's python driver installed.
  • Set MongoDB profiling level to be 2, which captures all the ops.
  • Run MongoDB in a replica set mode (even there is only one node), which allows us to access the oplog.
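For example, the profiling level can be set from PyMongo with the profile command. The database name below is a placeholder; level 2 must be set per database and adds some overhead on the server:

import pymongo

client = pymongo.MongoClient("mongodb://localhost:27017")
# Level 2 profiles all operations on this database.
client["mydb"].command("profile", 2)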

Configuration

  • If you are a first-time user, run cp config.py.example config.py.
  • Modify config.py based on your needs. Some notes:
    • We intentionally separate the servers for oplog retrieval and profiling-results retrieval. As good practice, pull the oplog from secondaries; profiling results, however, must be pulled from the primary server.
    • duration_secs sets how long the recording runs (a rough illustration of the file follows).
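As a rough illustration, a trimmed config.py might look like the following. Only mongodb_uri and duration_secs are confirmed by this page; the other field names are hypothetical, so treat config.py.example as the source of truth.

DB_CONFIG = {
    # Hypothetical option names -- check config.py.example for the real ones.
    # Pull the oplog from a secondary to keep load off the primary ...
    "oplog_servers": [
        {"mongodb_uri": "mongodb://secondary.example.com:27017"},
    ],
    # ... but profiling results must come from the primary.
    "profiler_servers": [
        {"mongodb_uri": "mongodb://primary.example.com:27017"},
    ],
    "target_databases": ["mydb"],
    "duration_secs": 3600,  # record for one hour
}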

Start Recording

After configuration, simply run python record.py.

Replay

Prerequisites

  • Go 1.4
  • PyMongo 2.9.x (earlier 2.x versions may work. 3.x does NOT currently work)

Installation

$ go get github.com/ParsePlatform/flashback/cmd/flashback

Command

Required options:

flashback \
    --style=[real|stress] \
    --ops_filename=<file_name> \ # Operations file, such as generated by the Record tool

To use a specific host/port and/or to use authentication, specify a mongodb:// url:

flashback \
    --url=mongodb://myuser:[email protected]:27017
    ... 

For a full list of options:

flashback --help

Misc

pcap_converter

pcap_converter is an experimental way to build a recorded ops file from a pcap of mongo traffic.

Note: 'getmore' operations are not yet supported by pcap_converter

$ go get github.com/ParsePlatform/flashback/cmd/pcap_converter
$ tcpdump -i lo0 -w some_mongo_cap.pcap 'tcp and dst port 27017'
$ pcap_converter -f some_mongo_cap.pcap -o ops_filename.bson

flashback's People

Contributors

agfeldman, charity, dbmurphy, igorcanadi, jameswahlin, liukai, skinp, tmc, wojcikstefan


flashback's Issues

The order of $hint and $orderby is not preserved

Since a JS object preserves key order while Python dicts and Go maps don't, we need to make sure that the keys in $hint and $orderby stay in the right order.

Right now for example, an operation in the profile collection can look like this:

{
  "op": "query",
  "ns": "closeio.activity",
  "query": {
    "$query": {
      "organization": DBRef("organization", "orga_awqeiorqwejrlkqwerklqjwe"),
      "lead": DBRef("lead", "lead_asdlkjflaksjdflkajsdlfqiwueroiqu")
    },
    "$hint": {
      "lead": 1,
      "date_created": -1
    },
    "$orderby": {
      "date_created": -1
    }
  },
  ...
  "ntoreturn": 51,
  "ntoskip": 0,
}

... and is recorded as:

{"ns": "closeio.activity", "ts": {"$date": 1443222746092}, "ntoreturn": 51, "query": {"$hint": {"date_created": -1, "lead": 1}, "$orderby": {"date_created": -1}, "$query": {"organization": {"$ref": "organization", "$id": "orga_awqeiorqwejrlkqwerklqjwe"}, "lead": {"$ref": "lead", "$id": "lead_asdlkjflaksjdflkajsdlfqiwueroiqu"}}}, "ntoskip": 0, "op": "query"}

... and when you try to replay it, MongoDB returns:

2015-09-25T16:21:55.279-0700 I QUERY    [conn5792] assertion 17007 Unable to execute query: error processing query: ns=closeio.activity limit=51 skip=0
Tree: $and
    lead == { $ref: "lead", $id: "lead_asdlkjflaksjdflkajsdlfqiwueroiqu" }
    organization == { $ref: "organization", $id: "orga_awqeiorqwejrlkqwerklqjwe" }
Sort: { date_created: -1 }
Proj: {}
 planner returned error: bad hint ns:closeio.activity query:{ $hint: { date_created: -1, lead: 1 }, $orderby: { date_created: -1 }, $query: { organization: { $ref: "organization", $id: "orga_awqeiorqwejrlkqwerklqjwe" }, lead: { $ref: "lead", $id: "lead_asdlkjflaksjdflkajsdlfqiwueroiqu" } } }
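On the Python side, one way to preserve ordering is to decode these subdocuments into bson.SON (PyMongo's order-preserving dict) rather than plain dicts. A sketch of the idea with the $hint from the example above (placeholder values):

from bson.son import SON

# SON preserves insertion order, so the $hint matches the index definition.
query = SON([
    ("$query", {"organization": "...", "lead": "..."}),
    ("$hint", SON([("lead", 1), ("date_created", -1)])),
    ("$orderby", SON([("date_created", -1)])),
])

The recorder and replayer would need to apply this wherever ops are serialized and parsed.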

Feature Request: Allow for adjustment of replay rate

When load testing a database cluster, it is important to determine to what degree your current system can scale. The current "real" mode is useful, but I would love to see the ability to replay real traffic with a replay-rate multiplier.

If, for example, I expect my production traffic to triple over the next 12 months, I would like to run the current production load at 3 times the current rate to confirm we can handle it. I would then scale that up to the point where performance is no longer acceptable, to find the breaking point.
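Building on the pacing logic sketched in the Replay section above, the change would roughly amount to dividing each recorded time offset by a speedup factor. A hypothetical sketch (the real replayer is Go):

import time

def replay_scaled(ops, execute, speedup=3.0):
    # Like "real" replay, but compress the recorded gaps by `speedup`,
    # e.g. speedup=3.0 replays at triple the original rate.
    start = time.time()
    first_ts = ops[0]["ts"]
    for op in ops:
        offset = (op["ts"] - first_ts).total_seconds() / speedup
        delay = offset - (time.time() - start)
        if delay > 0:
            time.sleep(delay)
        execute(op)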

pcap_converter (possibly) does not output compatible bson file

Hey guys,

It's very possible I'm lost or did something incorrectly, but using HEAD today I could not get a usable file out of pcap_converter for replay via the 'flashback --ops_filename=' option.

It seems the 'pcap_converter' Go code outputs a JSON file of operations, whereas the Go 'flashback' tool expects BSON, as seen here: https://github.com/ParsePlatform/flashback/blob/master/ops_reader.go#L67. My guess is that JSON was the way of doing things before 'replay' moved to Go?

When I use the JSON file output by pcap_converter as flashback's --ops_filename= option, flashback seems to ignore my file and executes zero operations, although I have many (100s) in the file.

If this can be resolved by moving pcap_converter to output BSON instead of JSON, I am happy to make a PR for that, but I wanted to sanity-check what didn't work here first, and whether more than just BSON/JSON changed.

Thanks!

How to use replay module

Hello, is it possible to add more information to the documentation?
I have a lot of questions, like:
How do I interpret the content of the file db_name_host:port?
How do I save the result of a replay to a file? And what is the result of a replay?
Here is what I have:
[screenshot omitted]

Am I using replay the wrong way?

main.go should have better output

So I am working on a better password handler for the Go replay system and noticed that the --help output is painful.

I propose we switch to something a little nicer.
OLD:
[screenshot: current --help output]

NEW:
[screenshot: proposed --help output]

NEW with "real" sub-command:
[screenshot: proposed --help output with a "real" sub-command]

This was just a mockup with kingpin, but using groups and subcommands seems like it would be nicer, since so many options apply to real vs. stress testing.

These were added by my patch, so don't worry about them yet ;)
--auth Enable Auth mode
--authdb="admin" Database to authenticate against! If --auth used.
--username=USERNAME Username to authenticate with if --auth used.
--password=PASSWORD Password to authenticate with if --auth used.

pcap_converter: add mongoproto.OpDelete, mongoproto.OpUpdate and mongoproto.OpGetMore operations once available

Delete, Update and GetMore operations don't work with pcap_converter because github.com/tmc/mongoproto does not have a complete set of methods (.FromReader(), .OpCode() and .String()) for these types.

I've started to fix this on tmc/mongoproto on this branch of mongoproto: https://github.com/timvaillancourt/mongoproto/tree/update_delete_fromreaders.

This issue is a placeholder to remind me (or someone) to add Delete, Update and GetMore (if that's even possible) to pcap_converter once tmc/mongoproto supports it.

pcap having parse issues

See the error lines in the output below; this was a simple tcpdump -i eth0 'dst port 27017', and it appears only GLEs (getLastError commands) are parseable. @tredman, have any thoughts?

12028 packets captured
12066 packets received by filter
38 packets dropped by kernel
[host/user hidden]$ ./gocode/bin/pcap_converter -f mongod2.pcap
2015/11/24 11:29:46 starting stream 35337->27017 35337->27017
2015/11/24 11:29:46 starting stream 36688->27017 36688->27017
2015/11/24 11:29:46 starting stream 41306->27017 41306->27017
2015/11/24 11:29:46 starting stream 47213->27017 47213->27017
2015/11/24 11:29:46 starting stream 53872->27017 53872->27017
2015/11/24 11:29:46 starting stream 37393->27017 37393->27017
2015/11/24 11:29:46 starting stream 40384->27017 40384->27017
2015/11/24 11:29:46 starting stream 55352->27017 55352->27017
2015/11/24 11:29:46 starting stream 43669->27017 43669->27017
2015/11/24 11:29:46 starting stream 56853->27017 56853->27017
2015/11/24 11:29:46 starting stream 34560->27017 34560->27017
2015/11/24 11:29:46 starting stream 51858->27017 51858->27017
2015/11/24 11:29:46 starting stream 41928->27017 41928->27017
2015/11/24 11:29:46 starting stream 36132->27017 36132->27017
2015/11/24 11:29:46 starting stream 56186->27017 56186->27017
2015/11/24 11:29:46 starting stream 47873->27017 47873->27017
2015/11/24 11:29:46 starting stream 49007->27017 49007->27017
2015/11/24 11:29:46 starting stream 58107->27017 58107->27017
2015/11/24 11:29:46 error parsing op: unexpected EOF
2015/11/24 11:29:46 error parsing op: mongoproto: got invalid document size
discarded 719272

{"command":{"getlasterror":1,"j":false,"fsync":false,"wtimeout":null},"ns":"apache.$cmd","ntoreturn":-1,"ntoskip":0,"op":"command","ts":{"$date":"1970-01-17T18:19:45.634Z"}}
{"command":{"getlasterror":1,"j":false,"fsync":false,"wtimeout":null},"ns":"apache.$cmd","ntoreturn":-1,"ntoskip":0,"op":"command","ts":{"$date":"1970-01-17T18:19:45.634Z"}}
{"command":{"getlasterror":1,"j":false,"fsync":false,"wtimeout":null},"ns":"apache.$cmd","ntoreturn":-1,"ntoskip":0,"op":"command","ts":{"$date":"1970-01-17T18:19:45.634Z"}}
{"command":{"getlasterror":1,"j":false,"fsync":false,"wtimeout":null},"ns":"apache.$cmd","ntoreturn":-1,"ntoskip":0,"op":"command","ts":{"$date":"1970-01-17T18:19:45.634Z"}}
{"command":{"getlasterror":1,"j":false,"fsync":false,"wtimeout":null},"ns":"apache.$cmd","ntoreturn":-1,"ntoskip":0,"op":"command","ts":{"$date":"1970-01-17T18:19:45.634Z"}}
{"command":{"getlasterror":1,"j":false,"fsync":false,"wtimeout":null},"ns":"apache.$cmd","ntoreturn":-1,"ntoskip":0,"op":"command","ts":{"$date":"1970-01-17T18:19:45.634Z"}}

python record gives configuration error unknown option slaveOk

Traceback (most recent call last):
File "record.py", line 527, in <module>
main()
File "record.py", line 516, in main
recorder = MongoQueryRecorder(db_config)
File "record.py", line 144, in __init__
self.oplog_clients[server_string] = self.connect_mongo(server)
File "record.py", line 274, in connect_mongo
client = MongoClient(server_config['mongodb_uri'], slaveOk=True)
File "/Users/yayati/CodePlay/python/venv/lib/python2.7/site-packages/pymongo/mongo_client.py", line 342, in __init__
for k, v in keyword_opts.items())
File "/Users/yayati/CodePlay/python/venv/lib/python2.7/site-packages/pymongo/mongo_client.py", line 342, in <genexpr>
for k, v in keyword_opts.items())
File "/Users/yayati/CodePlay/python/venv/lib/python2.7/site-packages/pymongo/common.py", line 465, in validate
value = validator(option, value)
File "/Users/yayati/CodePlay/python/venv/lib/python2.7/site-packages/pymongo/common.py", line 107, in raise_config_error
raise ConfigurationError("Unknown option %s" % (key,))
pymongo.errors.ConfigurationError: Unknown option slaveOk

Using Python 2.7 and PyMongo 3.2.1, under virtualenv 14.0.0.
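The Replay prerequisites above pin PyMongo to 2.9.x, and PyMongo 3.x removed the slaveOk option, which is exactly what this error reports. Downgrading to 2.9.x should resolve it; a 3.x port of the failing call would swap slaveOk for a read preference, roughly (untested sketch):

from pymongo import MongoClient

# Stand-in for the mongodb_uri that record.py reads from config.py.
server_config = {"mongodb_uri": "mongodb://localhost:27017"}

# PyMongo 2.x, as record.py does today:
#   client = MongoClient(server_config['mongodb_uri'], slaveOk=True)

# PyMongo 3.x: slaveOk is gone; request a read preference that
# allows reading from secondaries instead.
client = MongoClient(server_config['mongodb_uri'],
                     readPreference='secondaryPreferred')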

flashback and pcap_converter: add getmore support

It looks like getmore isn't really supported in flashback and pcap_converter; tracking this here.

The executor currently does nothing in 'flashback': https://github.com/ParsePlatform/flashback/blob/master/ops_executor.go#L106-L108.

I'm curious what thoughts are out there on how to add this. Right now my best idea is tracking query -> cursorId mappings so that getmores can be called on real cursors, but I'm guessing this would introduce some memory usage problems in the right scenarios. I might start a branch to try this out unless there are other ideas.
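To make the mapping idea concrete, here is a rough Python sketch (the real executor is Go; the recorded cursor_id field is an assumption about what the capture side would have to emit, and the unbounded map is exactly the memory concern above):

live_cursors = {}  # recorded cursor id -> live cursor from this replay

def exec_query(collection, op):
    cursor = collection.find(op["query"])
    if "cursor_id" in op:  # assumed field on recorded ops
        live_cursors[op["cursor_id"]] = cursor

def exec_getmore(op):
    cursor = live_cursors.get(op.get("cursor_id"))
    if cursor is None:
        return  # originating query wasn't replayed; nothing to do
    for _ in range(op.get("ntoreturn", 101)):
        try:
            next(cursor)
        except StopIteration:
            live_cursors.pop(op["cursor_id"], None)
            break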

Binary fields cause user assertion

We were using flashback in a troubleshooting session and we observed problems when using Flashback on MongoDB 2.6 when saving binary fields.

The error message we see in the logs is this:

2014-12-19T13:08:17.404-0800 [conn69] User Assertion: 52:The dollar ($) prefixed field '$binary' in 'enc_key.$binary' is not valid for storage.

I suspect though haven't confirmed that this assertion was triggered by FlashBack when it attempted to save a query which included a binary field.

I'm reporting this now for consideration and transparency, but I won't be able to troubleshoot further. I invite others to do so.

Add support for multi-updates

Currently in my testing I am unable to get record.py (or pcap_converter) to properly record multi-updates, i.e. {multi: 1}. Multi-updates are currently logged as regular, single updates.

This issue is to track that multi-updates are currently unsupported.

I may have some time to assist with implementing this if we can agree on an approach (a sketch of what a faithful replay would need follows).
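For reference, a faithful replay of a recorded update would need the flag preserved end to end; in PyMongo 2.x terms (sketch; the real executor is Go, and the multi field on the recorded op is precisely what isn't captured today):

# op is one recorded update entry; "query" and "updateobj" are the
# field names the MongoDB profiler uses for update operations.
collection.update(
    op["query"],
    op["updateobj"],
    multi=op.get("multi", False),  # currently always missing
)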

panic: interface conversion: interface is int32, not float64

I am getting this error when starting replay with recorded OUTPUT json and go 1.4.2:

panic: interface conversion: interface is int32, not float64

goroutine 113 [running]:
github.com/ParsePlatform/flashback.(*OpsExecutor).execQuery(0xc20866b2c0, 0xc2085d0120, 0xc208697530, 0x0, 0x0)
        /home/ubuntu/gocode/src/github.com/ParsePlatform/flashback/ops_executor.go:50 +0x20e
github.com/ParsePlatform/flashback.*OpsExecutor.(github.com/ParsePlatform/flashback.execQuery)·fm(0xc2085d0120, 0xc208697530, 0x0, 0x0)
        /home/ubuntu/gocode/src/github.com/ParsePlatform/flashback/ops_executor.go:35 +0x4d
github.com/ParsePlatform/flashback.func·004(0x0, 0x0)
        /home/ubuntu/gocode/src/github.com/ParsePlatform/flashback/ops_executor.go:144 +0x2b5
github.com/ParsePlatform/flashback.retryOnSocketFailure(0xc208815e70, 0xc2086621a0, 0xc20803c420, 0x0, 0x0)
        /home/ubuntu/gocode/src/github.com/ParsePlatform/flashback/ops_executor.go:116 +0x47
github.com/ParsePlatform/flashback.(*OpsExecutor).Execute(0xc20866b2c0, 0xc208608050, 0x0, 0x0)
        /home/ubuntu/gocode/src/github.com/ParsePlatform/flashback/ops_executor.go:146 +0xc6
main.func·003(0xc20866b2c0, 0x68ab90, 0x7)
        /home/ubuntu/gocode/src/github.com/ParsePlatform/flashback/cmd/flashback/main.go:317 +0x6b
created by main.func·005
        /home/ubuntu/gocode/src/github.com/ParsePlatform/flashback/cmd/flashback/main.go:328 +0x725

normalizeObj in replay not fully working

Before normalizeObj:  map[ntoreturn:10000 query:map[$hint:map[_id:1] $query:map[ag_id:map[$oid:5433f7254d32daa1e3000198] _id:map[$in:[map[$oid:547af4cf62194fd252005d9a]]] $and:[map[up.pu_at:<nil> pt.t:map[$ne:<nil>]]]]] ns:XXXX.users op:query ntoskip:0 ts:map[$date:1.431009423378e+12]]
After normalizeObj:  map[ts:2015-05-07 07:37:03.378 -0700 PDT ntoreturn:10000 query:map[$hint:map[_id:1] $query:map[ag_id:ObjectIdHex("5433f7254d32daa1e3000198") _id:map[$in:[map[$oid:547af4cf62194fd252005d9a]]] $and:[map[up.pu_at:<nil> pt.t:map[$ne:<nil>]]]]] ns:XXXX.users op:query ntoskip:0]

As you can see, it is not parsing inside $or, $in, and other operators. My thought is that rather than just a top-level type switch, it needs to recurse into nested values (or, wherever a key starts with $, descend and cast the value)?

I honestly feel very weak at Golang, so I could be off base, but I am seeing this with $and, $or, and $in type operators.
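The likely shape of the fix is a recursive walk rather than a one-level type switch; in Python terms the idea looks like this (the actual normalizeObj is Go):

from bson import ObjectId

def normalize(value):
    # Recurse into dicts and lists so extended-JSON markers such as
    # {"$oid": ...} get converted even inside $and/$or/$in operators.
    if isinstance(value, dict):
        if set(value) == {"$oid"}:
            return ObjectId(value["$oid"])
        return {k: normalize(v) for k, v in value.items()}
    if isinstance(value, list):
        return [normalize(v) for v in value]
    return value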

Allow sharded clusters to self discover

If using a sharded system, it could be useful to have it self-discover nodes, with something like

auto_config: true,
// profiler could be [ all, secondaries, primary]
// additional options could be added later
auto_config_options: { profiler : "all" }

One question: given the PyMongo changes in 3.0, should we support only the new MongoClient (which absorbed the old replica-set connection class), or should we detect the version and then build the connection based on which driver version is installed? (A detection sketch follows.)
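If we go the detection route, the branch could be as simple as checking pymongo.version_tuple (sketch; host names and replica set name are placeholders):

import pymongo

uri = "mongodb://host1:27017,host2:27017"

if pymongo.version_tuple >= (3, 0):
    # PyMongo 3.x: MongoClient handles replica-set discovery itself.
    client = pymongo.MongoClient(uri, replicaSet="rs0")
else:
    # PyMongo 2.x: use the dedicated replica-set connection class.
    client = pymongo.MongoReplicaSetClient(uri, replicaSet="rs0")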

@charity : Thoughts on this?

Which Versions?

Which versions of python and pymongo were used for these scripts?

fatal error: runtime: out of memory

Hello,

I've been stress testing a database using flashback, and it seems to me that the style "stress" is not working. I am using go1.4 and get no data in my "statsfilename" and similarly no data in the "stdout" file.

I have been getting the "out of memory" error that is in the title of this post. Any ideas on what might be causing this?

Unable to install flashback replay module on Ubuntu

Steps:

  • Check out this repo
  • go get github.com/ParsePlatform/flashback/cmd/flashback
ubuntu@ip-172-31-24-xxx:~/flashback/cmd/flashback$ go get github.com/ParsePlatform/flashback/cmd/flashback
# github.com/mongodb/mongo-tools/common/json
../../../gocode/src/github.com/mongodb/mongo-tools/common/json/encode.go:245: undefined: sync.Pool

Environment:

ubuntu@ip-172-31-24-xxx:~/flashback$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 14.04.2 LTS
Release:    14.04
Codename:   trusty
