
reap

A tool for parsing Ruby heap dumps by analyzing the reference graph.

Supports drilldown into just the memory retained by a given object, and optional graphical output.

When to use reap

This tool is intended to be useful for optimizing memory usage as well as debugging memory leaks. If you have a snapshot of the heap of a Ruby process (see below for tips on getting one), reap can help you understand the contents of that snapshot.

To do so, we build a dominator tree from the reference graph showing which objects are holding on to large quantities of memory. (Node v "dominates" node w in a directed graph if all paths from a given root to w run through v. In the context of memory references, this implies that object w is only live because object v is live.)
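
As a rough illustration of the dominator idea (not reap's actual implementation, which is written in Rust), the sketch below computes dominators for a tiny made-up reference graph using the simple iterative data-flow method, then sums the sizes of everything each node dominates. The graph, the sizes, and the helper names `dominators` and `retained_size` are all hypothetical.

```ruby
require 'set'

# Toy reference graph: each key references the objects in its array.
# All node names and byte sizes are made up for illustration.
REFS = {
  root: [:a, :b],
  a:    [:c],
  b:    [:c, :d],
  c:    [],
  d:    [],
}.freeze
SIZES = { root: 0, a: 10, b: 20, c: 30, d: 40 }.freeze

# Returns { node => Set of nodes that dominate it }, using the simple
# iterative formulation: dom(n) = {n} + intersection of dom(p) over all
# predecessors p. Fine for a toy graph; not an optimized algorithm.
def dominators(refs, root)
  nodes = refs.keys
  preds = Hash.new { |h, k| h[k] = [] }
  refs.each { |from, tos| tos.each { |to| preds[to] << from } }

  # Start from "everything dominates everything" and shrink to a fixed point.
  dom = nodes.to_h { |n| [n, n == root ? Set[root] : Set.new(nodes)] }
  changed = true
  while changed
    changed = false
    (nodes - [root]).each do |n|
      common = preds[n].map { |p| dom[p] }.reduce(:&) || Set.new
      updated = common | Set[n]
      next if updated == dom[n]

      dom[n] = updated
      changed = true
    end
  end
  dom
end

# Retained size of v: total size of every node v dominates (including v).
def retained_size(dom, sizes, v)
  dom.sum { |node, doms| doms.include?(v) ? sizes[node] : 0 }
end

dom = dominators(REFS, :root)
puts retained_size(dom, SIZES, :a)    # a alone: c is also reachable via b
puts retained_size(dom, SIZES, :b)    # b plus d
puts retained_size(dom, SIZES, :root) # everything
```

Here `c` is referenced by both `a` and `b`, so neither dominates it and its size is attributed only to `root`; `d` is reachable only through `b`, so it counts toward `b`'s retained size.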

Limitations & comparisons

reap does not currently understand garbage collection "generations", which can also be useful for finding leaks.

The disadvantage of analyzing GC generations is that in order to collect the necessary data, you need to trace object allocations, which can be prohibitively expensive in production. If this is not a problem for you, you may want to try another tool such as heapy instead of, or in addition to, reap.

reap is intended to provide useful data even when allocations are not being traced. It can also analyze fairly large (gigabyte-plus) heaps in seconds, thanks to being written in Rust rather than Ruby.

How to use reap

Run with --help for full documentation.

Basic usage:

$ cargo run -q --release -- /tmp/heap.json -f flamegraph.svg -c 3
Object types using the most live memory:
Thread: 2.1 MB (40 objects)
String: 462.6 KB (9235 objects)
Class: 223.7 KB (287 objects)
...: 653.0 KB (5909 objects)

Objects retaining the most live memory:
root: 3.4 MB (15472 objects)
Thread[0x7f83df87dc40]: 1.1 MB (25 objects)
Thread[0x7f83e107cd78]: 1.0 MB (7 objects)
...: 4.6 MB (59857 objects)

Object types retaining the most live memory:
ROOT: 3.4 MB (15472 objects)
Thread: 2.1 MB (70 objects)
ARRAY: 949.3 KB (13053 objects)
...: 3.6 MB (46766 objects)

Objects unreachable from root:
Class: 189.6 KB (617 objects)
String: 81.8 KB (1174 objects)
ARRAY: 38.6 KB (298 objects)
...: 91.5 KB (1422 objects)

Wrote 15471 nodes to flamegraph.svg

Dig into a subtree (in this case, the larger Thread):

$ cargo run -q --release -- /tmp/heap.json -d out.dot -c 3 -r 0x7f83df87dc40
Object types using the most live memory:
Thread: 1.0 MB (1 objects)
Class: 1.6 KB (3 objects)
Hash: 1.3 KB (7 objects)
...: 980 B (14 objects)

Objects retaining the most live memory:
Thread[0x7f83df87dc40]: 1.1 MB (25 objects)
Hash[0x7f83e10452d8][size=5]: 1.2 KB (6 objects)
Object[0x7f83df8d62c8][CLASS]: 992 B (8 objects)
...: 3.0 KB (24 objects)

Object types retaining the most live memory:
Thread: 1.1 MB (25 objects)
Hash: 2.2 KB (12 objects)
Class: 1.9 KB (10 objects)
...: 1.1 KB (16 objects)

Objects reachable from, but not dominated by, 0x7f83df87dc40:
String: 352.3 KB (6604 objects)
Class: 220.6 KB (283 objects)
Regexp: 108.8 KB (139 objects)
...: 465.2 KB (5716 objects)

Wrote 1 nodes & 0 edges to out.dot

Installation

Ensure you have Rust's cargo package manager installed, then run cargo install reap.

Getting a heap dump

If you have rbtrace installed, and required in the process you're planning to trace, you can run:

rbtrace -p $PID -e "Thread.new{require 'objspace';f=open('/tmp/heap.json','w');ObjectSpace.dump_all(output: f, full: true);f.close}"

Otherwise, you can connect to the Ruby process with gdb, then run:

call rb_eval_string_protect("Thread.new{require 'objspace';f=open('/tmp/heap.json','w');ObjectSpace.dump_all(output: f, full: true);f.close}", 0)
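
If you can run code inside the process rather than attaching to it, the same `ObjectSpace.dump_all` call can be made from a small script. The sketch below is only an illustration: the helper names `dump_heap` and `memsize_by_type` are hypothetical, and it assumes the dump is newline-delimited JSON whose records carry `type` and (for most objects) `memsize` fields. It writes a dump of the current process to a temporary file and tallies bytes per raw type as a quick sanity check before handing the file to reap.

```ruby
require 'objspace'
require 'json'
require 'tempfile'

# Write a full heap dump of the current process to `path`,
# using the same call as the rbtrace/gdb one-liners.
def dump_heap(path)
  File.open(path, 'w') do |f|
    ObjectSpace.dump_all(output: f, full: true)
  end
end

# Sum the "memsize" field per "type" across every line of the dump.
# Records without a memsize (such as ROOT entries) count as 0 bytes;
# lines that fail to parse as JSON are skipped.
def memsize_by_type(path)
  totals = Hash.new(0)
  File.foreach(path) do |line|
    begin
      obj = JSON.parse(line)
    rescue JSON::ParserError
      next
    end
    totals[obj['type']] += obj['memsize'].to_i
  end
  totals
end

Tempfile.create('heap') do |tmp|
  dump_heap(tmp.path)
  top = memsize_by_type(tmp.path).max_by(3) { |_type, bytes| bytes }
  top.each { |type, bytes| puts "#{type}: #{bytes} bytes" }
end
```

This tally is by raw dump type only and says nothing about what retains the memory; that is the part reap's dominator analysis adds.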

reap's Issues

More memory retained by hashes than by roots?

Hi, I'm looking at output like this:

Object types retaining the most live memory:
Hash: 1010.1 MB (10510995 objects)
Array: 706.5 MB (7566193 objects)
ROOT: 615.8 MB (5365383 objects)
Thread: 494.9 MB (4833542 objects)
[...]

... and I'm having trouble interpreting this. I'd have thought that any memory retained by a hash would also have to be retained by a GC root, so ROOT would always be at the top of this list. Is this because, since multiple objects can have pointers to the same object, the same memory can be counted multiple times?

Version 1.0

Other people are using this and I have no plans for actively developing it, so I should release a version 1.0 to reflect that expectation of stability.

Other cleanup I'd like to do as part of this:

  • Set up CI with GitHub Actions;
  • Improve command-line help & README;
  • Maybe improve test coverage?

How to read/interpret the flame graph?

Any tips on how to interpret the heap dump flame graphs? I am guessing root objects at the top, with each layer down being a traversal down the dominance tree? Is the width the amount of memory referenced? Number of objects further down the tree?

Pointers and tips would be greatly appreciated. Thank you!
