svenstaro / cargo-profiler

Cargo subcommand to profile binaries

License: MIT License

cargo-profiler

To install

NOTE: This subcommand can only be used on Linux machines.

First install valgrind:

$ sudo apt-get install valgrind

Then you can install cargo-profiler via cargo install.

$ cargo install cargo-profiler

Alternatively, you can clone this repo and build the binary from the source.

$ cargo build --release

Now, copy the built binary to the same directory as cargo.

$ sudo cp ./target/release/cargo-profiler $(dirname $(which cargo))/

To run

Cargo profiler currently supports callgrind and cachegrind.

You can call cargo profiler from anywhere inside a Rust project directory that has a Cargo.toml.

$ cargo profiler callgrind
$ cargo profiler cachegrind --release

You can also specify a binary directly:

$ cargo profiler callgrind --bin $PATH_TO_BINARY

To specify command line arguments to the executable being profiled, append them after a --:

$ cargo profiler callgrind --bin $PATH_TO_BINARY -- -a 3 --like this

You can choose to keep the callgrind/cachegrind output files using the --keep option:

$ cargo profiler callgrind --keep

You can limit the number of functions you'd like to look at:

$ cargo profiler callgrind --bin ./target/debug/rsmat -n 10

Profiling rsmat with callgrind...

Total Instructions...198,466,456

78,346,775 (39.5%) dgemm_kernel.rs:matrixmultiply::gemm::masked_kernel
-----------------------------------------------------------------------
23,528,320 (11.9%) iter.rs:_..std..ops..Range..A....as..std..iter..Iterator..::next
-----------------------------------------------------------------------
16,824,925 (8.5%) loopmacros.rs:matrixmultiply::gemm::masked_kernel
-----------------------------------------------------------------------
10,236,864 (5.2%) mem.rs:core::mem::swap
-----------------------------------------------------------------------
7,712,846 (3.9%) memset.S:memset
-----------------------------------------------------------------------
7,197,344 (3.6%) ???:core::cmp::impls::_..impl..cmp..PartialOrd..for..usize..::lt
-----------------------------------------------------------------------
6,979,680 (3.5%) ops.rs:_..usize..as..ops..Add..::add
-----------------------------------------------------------------------
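The percentages in the listing above are each function's share of the total instruction count. As a minimal sketch of that arithmetic (the helper names `parse_count` and `percent_of` are made up for illustration and are not taken from cargo-profiler's source):

```rust
/// Parse a count printed with thousands separators, e.g. "78,346,775".
fn parse_count(s: &str) -> Option<u64> {
    s.trim().replace(',', "").parse().ok()
}

/// Share of `part` in `total`, as a percentage rounded to one decimal place.
fn percent_of(part: u64, total: u64) -> f64 {
    if total == 0 {
        return 0.0;
    }
    let pct = part as f64 / total as f64 * 100.0;
    (pct * 10.0).round() / 10.0
}

fn main() {
    // Figures copied from the sample callgrind output above.
    let total = parse_count("198,466,456").expect("valid count");
    let kernel = parse_count("78,346,775").expect("valid count");
    println!("masked_kernel: {:.1}% of all instructions", percent_of(kernel, total));
}
```

With the sample figures, the first row works out to 39.5%, matching the value shown in the listing.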

With cachegrind, you can also sort the data by a particular metric column:

$ cargo profiler cachegrind --bin ./target/debug/rsmat -n 10 --sort dr

Profiling rsmat with cachegrind...

Total Memory Accesses...320,385,356

Total L1 I-Cache Misses...371 (0%)
Total LL I-Cache Misses...308 (0%)
Total L1 D-Cache Misses...58,549 (0%)
Total LL D-Cache Misses...8,451 (0%)

 Ir  I1mr ILmr  Dr  D1mr DLmr  Dw  D1mw DLmw
0.40 0.18 0.21 0.35 0.93 1.00 0.38 0.00 0.00 dgemm_kernel.rs:matrixmultiply::gemm::masked_kernel
-----------------------------------------------------------------------
0.08 0.04 0.05 0.12 0.00 0.00 0.02 0.00 0.00 loopmacros.rs:matrixmultiply::gemm::masked_kernel
-----------------------------------------------------------------------
0.12 0.02 0.02 0.10 0.00 0.00 0.15 0.00 0.00 iter.rs:_std..ops..RangeAasstd..iter..Iterator::next
-----------------------------------------------------------------------
0.05 0.01 0.01 0.07 0.00 0.00 0.08 0.00 0.00 mem.rs:core::mem::swap
-----------------------------------------------------------------------
0.03 0.00 0.00 0.05 0.00 0.00 0.00 0.00 0.00 ???:core::cmp::impls::_implcmp..PartialOrdforusize::lt
-----------------------------------------------------------------------
0.03 0.01 0.01 0.04 0.00 0.00 0.03 0.00 0.00 ops.rs:_busizeasops..Addausize::add
-----------------------------------------------------------------------
0.04 0.01 0.01 0.04 0.00 0.00 0.03 0.00 0.00 ptr.rs:core::ptr::_implconstT::offset
-----------------------------------------------------------------------
0.02 0.01 0.00 0.03 0.00 0.00 0.01 0.00 0.00 ???:_usizeasops..Add::add
-----------------------------------------------------------------------
0.01 0.01 0.01 0.02 0.00 0.00 0.01 0.00 0.00 mem.rs:core::mem::uninitialized
-----------------------------------------------------------------------
0.02 0.01 0.01 0.02 0.00 0.00 0.04 0.00 0.00 wrapping.rs:_XorShiftRngasRng::next_u32
-----------------------------------------------------------------------
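In the sample above the rows appear in descending order of the Dr column, which is what --sort dr asks for. A rough sketch of such a sort, assuming each row is held as nine fractions plus a name (the `Row` type, the `COLUMNS` table and `sort_by_column` are hypothetical, not the tool's own data structures):

```rust
/// Column names in the order the sample output prints them (lowercased).
const COLUMNS: [&str; 9] = [
    "ir", "i1mr", "ilmr", "dr", "d1mr", "dlmr", "dw", "d1mw", "dlmw",
];

/// One function's row: nine metric fractions plus the function name.
#[derive(Debug, Clone)]
struct Row {
    metrics: [f64; 9],
    name: String,
}

/// Sort rows in descending order of the named column.
/// Returns false, leaving the rows untouched, if the column name is unknown.
fn sort_by_column(rows: &mut [Row], column: &str) -> bool {
    let wanted = column.to_lowercase();
    match COLUMNS.iter().position(|c| *c == wanted) {
        Some(idx) => {
            rows.sort_by(|a, b| b.metrics[idx].total_cmp(&a.metrics[idx]));
            true
        }
        None => false,
    }
}

fn main() {
    // Three rows from the sample output, deliberately out of order.
    let mut rows = vec![
        Row {
            metrics: [0.12, 0.02, 0.02, 0.10, 0.00, 0.00, 0.15, 0.00, 0.00],
            name: "iter.rs".to_string(),
        },
        Row {
            metrics: [0.40, 0.18, 0.21, 0.35, 0.93, 1.00, 0.38, 0.00, 0.00],
            name: "dgemm_kernel.rs".to_string(),
        },
        Row {
            metrics: [0.08, 0.04, 0.05, 0.12, 0.00, 0.00, 0.02, 0.00, 0.00],
            name: "loopmacros.rs".to_string(),
        },
    ];
    sort_by_column(&mut rows, "dr");
    for row in &rows {
        println!("{:.2} {}", row.metrics[3], row.name);
    }
}
```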

What are the cachegrind metrics?

Ir -> Total Instructions
I1mr -> Level 1 I-Cache misses
ILmr -> Last Level I-Cache misses
Dr -> Total Memory Reads
D1mr -> Level 1 D-Cache read misses
DLmr -> Last Level D-cache read misses
Dw -> Total Memory Writes
D1mw -> Level 1 D-Cache write misses
DLmw -> Last Level D-cache write misses
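To show how these counters combine, here is a small sketch that turns raw read/write counts into L1 data-cache miss rates. The `DataCacheCounts` struct and its methods are invented for this example; the numbers are the D refs and D1 misses from the Valgrind summary quoted in the "Regex error with callgrind for paths containing numbers" issue on this page.

```rust
/// Raw data-cache counters, named after the cachegrind columns.
struct DataCacheCounts {
    dr: u64,   // total memory reads
    d1mr: u64, // level 1 D-cache read misses
    dw: u64,   // total memory writes
    d1mw: u64, // level 1 D-cache write misses
}

/// Misses as a percentage of accesses; zero accesses yields 0.0.
fn rate(misses: u64, accesses: u64) -> f64 {
    if accesses == 0 {
        0.0
    } else {
        misses as f64 / accesses as f64 * 100.0
    }
}

impl DataCacheCounts {
    fn d1_read_miss_rate(&self) -> f64 {
        rate(self.d1mr, self.dr)
    }

    fn d1_write_miss_rate(&self) -> f64 {
        rate(self.d1mw, self.dw)
    }

    /// Combined rate over reads and writes.
    fn d1_miss_rate(&self) -> f64 {
        rate(self.d1mr + self.d1mw, self.dr + self.dw)
    }
}

fn main() {
    let counts = DataCacheCounts {
        dr: 141_498,
        d1mr: 3_346,
        dw: 58_145,
        d1mw: 1_546,
    };
    println!(
        "D1 miss rate: {:.1}% ({:.1}% rd + {:.1}% wr)",
        counts.d1_miss_rate(),
        counts.d1_read_miss_rate(),
        counts.d1_write_miss_rate()
    );
}
```

Rounded to one decimal place, these come out at 2.5%, 2.4% and 2.7%, the same figures Valgrind reports in that summary.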

TODO

cmp subcommand - compare binary profiles
profiler macros
better context around expensive functions
support for more profiling tools


cargo-profiler's Issues

breaking PR

@regexident Did your pull request re cleanup compile successfully on your machine? I found that there were a bunch of move errors in main after you refactored the clap assembly. I reverted the merge, if you can look into it and re-submit the PR I'd really appreciate it. Travis would make this much easier, but I didn't have time yesterday to integrate it.

catch sigint

implement compilation --verbose flag

add unit tests

It is difficult to get maximum coverage for the full application, but we should write some unit tests for the core functions.

Doc Improvements

Answering these questions will probably just need doc improvements rather than other fixes, but I'm not sure so here goes:

Using cargo profiler callgrind, which executable is run?
How does this work with cargo bench? Can I run cargo profiler --bench callgrind (or something like that) to get profile stats for the different benchmarks? For now I've been creating a main.rs building that (with debug symbols) and then using cargo profiler callgrind -n 20 --bin ./target/release/executable_name. I swap out the contents of main with my benchmarks to get results. This seems suboptimal. I'm probably missing something.
It's probably worth calling out specifically that -n <count> limits the shown instructions. That wasn't immediately clear.

Thanks.

Cachegrind regex parse error

Here is the cachegrind.out:
https://pastebin.com/cr8bKr0s

Detect invocation from outside of Rust project

If you try to run, say cargo profiler callgrind --release from a directory that's not a cargo project root (as in "does not contain a Cargo.toml file") you get an exception:

thread '<main>' panicked at 'Error in encoding manifest string into JSON: SyntaxError("EOF While parsing value", 1, 1)', ../src/libcore/result.rs:785
stack backtrace:
   1:        0x10b3f4c18 - std::sys::backtrace::tracing::imp::write::h9fb600083204ae7f
   2:        0x10b3f82a5 - std::panicking::default_hook::_$u7b$$u7b$closure$u7d$$u7d$::hca543c34f11229ac
   3:        0x10b3f7ede - std::panicking::default_hook::hc2c969e7453d080c
   4:        0x10b3e8de0 - std::panicking::rust_panic_with_hook::hfe203e3083c2b544
   5:        0x10b3f8866 - std::panicking::begin_panic::h4889569716505182
   6:        0x10b3e9a28 - std::panicking::begin_panic_fmt::h484cd47786497f03
   7:        0x10b3f84bf - rust_begin_unwind
   8:        0x10b41ee30 - core::panicking::panic_fmt::h257ceb0aa351d801
   9:        0x10b32a4a9 - core::result::unwrap_failed::h359024e7ee7aee6d
  10:        0x10b3159a7 - cargo_profiler::real_main::hfa05bdcb6b661735
  11:        0x10b313fd3 - cargo_profiler::main::hbb5817797bfb8430
  12:        0x10b3f7acd - std::panicking::try::call::hc5e1f5b484ec7f0e
  13:        0x10b3fb39b - __rust_try
  14:        0x10b3fb335 - __rust_maybe_catch_panic
  15:        0x10b3f78f1 - std::rt::lang_start::h61f4934e780b4dfc

It would be nice if it caught these cases and printed something akin to

Error: Your Cargo.toml is missing. Are you sure you're in a Rust project?
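One way to get that behaviour is to look for the manifest before doing anything else and return an error instead of unwrapping. The sketch below walks up the parent directories, as Cargo itself does when locating a manifest. The function names are hypothetical, and this is not how cargo-profiler currently reads the manifest.

```rust
use std::path::{Path, PathBuf};

/// Walk from `start` up through its ancestors and return the first Cargo.toml found.
fn find_manifest(start: &Path) -> Option<PathBuf> {
    start
        .ancestors()
        .map(|dir| dir.join("Cargo.toml"))
        .find(|candidate| candidate.is_file())
}

/// Turn a missing manifest into the message suggested above instead of a panic.
fn manifest_or_message(start: &Path) -> Result<PathBuf, String> {
    find_manifest(start).ok_or_else(|| {
        "Error: Your Cargo.toml is missing. Are you sure you're in a Rust project?".to_string()
    })
}

fn main() {
    let cwd = std::env::current_dir().expect("cannot read current directory");
    match manifest_or_message(&cwd) {
        Ok(path) => println!("using manifest {}", path.display()),
        Err(msg) => eprintln!("{}", msg),
    }
}
```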

fails to compile on macos

~/IdeaProjects/benchmarks/untitled  $ cargo profiler callgrind -n 10                           

Compiling untitled in debug mode...

Profiling untitled with callgrind...
thread 'main' panicked at 'failed to execute process: No such file or directory (os error 2)', /Users/me/.cargo/registry/src/github.com-1ecc6299db9ec823/cargo-profiler-0.1.6/src/parse/callgrind.rs:28:33
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Will there (or can there) also be profile-guided optimisation?

The TODO mentions some "binary profiles" (I assume that's data saved after profiling). Can they also be used at compile time for optimisation?

Can you add a License?

(U) I want to be sure that I can use this in my projects. Thank you!

Am I missing the obvious? (cargo build/run integration)

Do you still have to manually build the binary you wish to profile with e.g. cargo build?

I mean, if a Cargo.toml exists I'd expect the following to work:

cargo profiler callgrind or cargo --release profiler callgrind

Failure info not passed to profiler

$ cargo profiler callgrind --bin ./prime++

Profiling prime with callgrind...

Total Instructions...0

Two issues again ;)

We're inside a rust project named prime but profiling a C++ prime++ binary - what happened to the two pluses in the displayed name? (probably integration related)
We're given no info about the failure whereas the first/last lines from valgrind output reveal the problem:

==6072==     Valgrind's memory management: out of memory:
==6072==     Whatever the reason, Valgrind cannot continue.  Sorry.

Compilation fails while normal cargo build succeeds

When I try to run cargo profiler callgrind, it reports that compilation failed, followed by the normal cargo build output saying compilation succeeded. When I just run cargo build, it builds correctly.
Any output about why it failed would be really helpful: a verbose flag, or at least logging via RUST_LOG.

Any way to restrict to a certain function and its children?

I'm working on an application using SDL and I believe callgrind is exiting shortly after it starts because of all the SDL C functions being called. Is there a way to have it profile only a certain function and that function's children? Thanks!

Add support for valgrind 3.16

I recently tried cargo-profiler again since I am doing some more Rust work.
But it just kept on failing. Through some miracle of thinking I came to the conclusion that maybe valgrind got an update making it incompatible.
I downgraded to 3.15, and voila, it actually worked! Back to 3.16? It failed again.
Looking at the callgrind.out, the output file looks just a bit different.
Here are the patch notes:
Note specifically "callgrind_annotate's --auto and --show-percs options now default to 'yes', because they are usually wanted.".
I don't know how to test this, but that popped out to me at least as something that could bring incompatibilities.

The results still seem a bit off, and it takes AGES to run, while the benchmark is only 20 seconds long, but I am not sure what is up with that.

Better integration with cargo

cargo-profiler should run the default crate binary (main.rs), plus you could add the semantics of cargo run --bin

Help for making it work

Hello,
I can't figure out how to profile my binary.
I usually start it like that target/debug/mybin -r my/path -w something.png.

I tried calling cargo-profiler like this: cargo profiler callgrind -n 15 --release -- -r scenes/suzanne_low.json -- -w test.png

I also tried to specify the binary by hand: cargo profiler callgrind -n 10 --bin target/release/render_engine -- -r scenes/suzanne_low.json -- -w prof.png

The problem is that in both cases the program doesn't parse any arguments at all: it silently stops at roughly 600k instructions.
Can you explain how I should pass arguments to the binary?
Thanks a lot !

implement cargo run example

introduce timeout for long-running applications

re #28

handle custom output directories

This is not supported in the cargo integration. Currently we look for the binary in a target/ directory, and if the target/ directory doesn't exist, we return a NoTargetDirectory error.

arg passing to the program being benchmarked not working

I have this program:

use std::thread;
use std::time::Duration;
use std::env;
fn main() {
    env::args().nth(1).unwrap();
    thread::sleep(Duration::new(100000, 0));
}

It should hang for a long time when it takes at least one command line argument. But when I run it like this it terminates in an instant:

$ cargo profiler callgrind --bin ./target/release/flow -- arg1 arg2 arg3 arg4
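For reference, the usual convention is that everything after the first -- is handed to the child process untouched. The sketch below shows only that convention. The `split_forwarded` helper is hypothetical, and nothing here is a claim about how cargo-profiler parses its arguments internally.

```rust
/// Split an argument list at the first `--`.
/// Everything before it belongs to the profiler; everything after it is
/// forwarded untouched, including any later `--`.
fn split_forwarded<'a, 'b>(args: &'a [&'b str]) -> (&'a [&'b str], &'a [&'b str]) {
    match args.iter().position(|a| *a == "--") {
        Some(i) => (&args[..i], &args[i + 1..]),
        // No separator: nothing is forwarded.
        None => (args, &args[args.len()..]),
    }
}

fn main() {
    // The invocation from the report above, minus the leading `cargo profiler`.
    let args = [
        "callgrind", "--bin", "./target/release/flow", "--", "arg1", "arg2", "arg3", "arg4",
    ];
    let (own, forwarded) = split_forwarded(&args);
    println!("profiler args:  {:?}", own);
    println!("forwarded args: {:?}", forwarded);
}
```

The forwarded slice can then be passed to `std::process::Command::args` when the binary (or valgrind wrapping it) is spawned.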

Expose data structs in a lib

Expose the profiling data structs in a lib so other tools can use the profiling data we gather here.

Regex error

cargo profiler crashes for me:

Profiling imag-counter with cachegrind...
error: regex error -- please file a bug.
third party subcommand `cargo-profiler` exited unsuccessfully

How to investigate?

Pass STDIN to the application

It would be nice to be able to pass STDIN through to the application being profiled.

e.g.

cat test.txt | cargo profiler callgrind
cargo profiler callgrind < test.txt
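As a sketch of what this could look like with the standard library: `std::process::Command` lets the parent choose what the child's standard input is connected to. The example pipes a byte buffer into a child, and `Stdio::inherit()` would instead hand over the parent's own STDIN, which is what the two invocations above would need. The `run_with_input` helper is hypothetical, and `cat` stands in for the profiled binary.

```rust
use std::io::{self, Write};
use std::process::{Command, Stdio};

/// Run `program`, feed `input` to its standard input, and return its standard output.
/// Suitable for small inputs only: a large input should be written from a separate
/// thread so the child cannot block on a full output pipe.
fn run_with_input(program: &str, args: &[&str], input: &[u8]) -> io::Result<Vec<u8>> {
    let mut child = Command::new(program)
        .args(args)
        .stdin(Stdio::piped())
        .stdout(Stdio::piped())
        .spawn()?;
    {
        // Write the input, then drop the handle so the child sees end-of-file.
        let mut stdin = child.stdin.take().expect("stdin was requested as piped");
        stdin.write_all(input)?;
    }
    Ok(child.wait_with_output()?.stdout)
}

fn main() -> io::Result<()> {
    let out = run_with_input("cat", &[], b"contents of test.txt\n")?;
    print!("{}", String::from_utf8_lossy(&out));
    Ok(())
}
```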

Recent cargo integration broken?

Continuing from #20, it seems running the same binary via --bin works as expected, whereas cargo profiler callgrind is able to compile but fails in debug mode and doesn't even build in release.

I'm going to use an older cargo binary to see if that helps.

Panic with "failed to execute process"

Here's what I've got:

$ RUST_BACKTRACE=1 cargo profiler callgrind --release

Compiling imag-store in release mode...

Profiling imag-store with callgrind...
thread 'main' panicked at 'failed to execute process: No such file or directory (os error 2)', /home/m/.cargo/registry/src/github.com-1ecc6299db9ec823/cargo-profiler-0.1.6/src/parse/callgrind.rs:28
stack backtrace:
   1:     0x7f3af547f85f - std::sys::backtrace::tracing::imp::write::h46e546df6e4e4fe6
   2:     0x7f3af548437b - std::panicking::default_hook::_$u7b$$u7b$closure$u7d$$u7d$::h077deeda8b799591
   3:     0x7f3af5484007 - std::panicking::default_hook::heb8b6fd640571a4f
   4:     0x7f3af5473b0e - std::panicking::rust_panic_with_hook::hd7b83626099d3416
   5:     0x7f3af54845c1 - std::panicking::begin_panic::h941ea76fc945d925
   6:     0x7f3af54746ca - std::panicking::begin_panic_fmt::h30280d4dd3f149f5
   7:     0x7f3af539879f - cargo_profiler::real_main::hfc1db50b3d8f39f2
   8:     0x7f3af5395f00 - cargo_profiler::main::hbb5817797bfb8430
   9:     0x7f3af5483be8 - std::panicking::try::call::hca715a47aa047c49
  10:     0x7f3af5489f1b - __rust_try
  11:     0x7f3af5489ebe - __rust_maybe_catch_panic
  12:     0x7f3af54837a3 - std::rt::lang_start::h162055cb2e4b9fe7
  13:     0x7f3af47ae2df - __libc_start_main
  14:     0x7f3af5395de9 - _start
                        at ../sysdeps/x86_64/start.S:120
  15:                0x0 - <unknown>

Rewrite this

Hey everyone,

don't worry, this isn't unmaintained. I'm currently in the process of rewriting this due to some design issues I found with this project which I took over (no disrespect to the original creator). I think we can shave off a lot of code and also make everything quite a bit nicer to work with.

It's ongoing, please bear with me.

Is this project maintained?

The last commit was more than a year ago. There is an open and unresponsive PR. There are issues that have been filed and not responded to. Is this project still maintained, or should I find another tool to use?

Regex error with callgrind for paths containing numbers

The simplest possible project fails for me:

% cargo init --bin hello
     Created binary (application) project
% cd hello
% cargo profiler callgrind --release

Compiling hello in release mode...

Profiling hello with callgrind...
error: Regex error -- please file a bug. In bug report, please include the original output file from profiler, e.g. from valgrind --tool=cachegrind --cachegrind-out-file=cachegrind.txt

% valgrind --tool=cachegrind --cachegrind-out-file=cachegrind.txt target/release/hello
==19861== Cachegrind, a cache and branch-prediction profiler
==19861== Copyright (C) 2002-2015, and GNU GPL'd, by Nicholas Nethercote et al.
==19861== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info
==19861== Command: target/release/hello
==19861== 
--19861-- warning: L3 cache found, using its data for the LL simulation.
Hello, world!
==19861== 
==19861== I   refs:      621,681
==19861== I1  misses:      2,045
==19861== LLi misses:      1,860
==19861== I1  miss rate:    0.33%
==19861== LLi miss rate:    0.30%
==19861== 
==19861== D   refs:      199,643  (141,498 rd   + 58,145 wr)
==19861== D1  misses:      4,892  (  3,346 rd   +  1,546 wr)
==19861== LLd misses:      3,590  (  2,166 rd   +  1,424 wr)
==19861== D1  miss rate:     2.5% (    2.4%     +    2.7%  )
==19861== LLd miss rate:     1.8% (    1.5%     +    2.4%  )
==19861== 
==19861== LL refs:         6,937  (  5,391 rd   +  1,546 wr)
==19861== LL misses:       5,450  (  4,026 rd   +  1,424 wr)
==19861== LL miss rate:      0.7% (    0.5%     +    2.4%  )
%

Here's the output cachegrind.txt.

Versions:

cargo 0.16.0-nightly (6e0c18c 2017-01-27)
% uname -rm
4.10.0-11-generic x86_64
valgrind-3.12.0

Better error message if a user doesn't specify any arguments

ssokolow@monolith rdbackup-excludes-indexer [master] %% cargo profiler     
error: Invalid profiler. cargo profiler currently supports callgrind and cachegrind.

Referring to "no profiler" as an invalid profiler is one of those things that only makes sense to a computer.

The error message should be one of the following in this case:

error: You must specify a profiler.
USAGE: cargo profiler <SUBCOMMAND>
The output of cargo profiler --help

Also, I noticed that your actual USAGE: doesn't use < and > to specify that the subcommand is mandatory.

I can't double-check, since I get an error in rustc-serialize when I attempt to build your git HEAD, but I think you can just add .required(true) to the subcommand argument to get a more helpful error message automatically, plus the appropriate <SUBCOMMAND> in your --help rather than [SUBCOMMAND].

(I'm not 100% certain because I use subcommands infrequently enough in my projects that the last time I defined one was back when clap didn't exist and I was working with argparse and Python.)

better error message if valgrind is not installed

As mentioned in #26, we need a better error message if the user has not installed valgrind.
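A possible shape for this, sketched with the standard library only: inspect the `io::ErrorKind` returned when spawning fails and translate `NotFound` into a readable hint instead of panicking with "No such file or directory". The function names and the message wording are made up for illustration.

```rust
use std::io;
use std::process::Command;

/// Translate a spawn failure into a message a user can act on.
fn describe_spawn_error(program: &str, err: &io::Error) -> String {
    match err.kind() {
        io::ErrorKind::NotFound => format!(
            "error: `{}` was not found. Is it installed and on your PATH?",
            program
        ),
        io::ErrorKind::PermissionDenied => {
            format!("error: no permission to execute `{}`.", program)
        }
        _ => format!("error: failed to run `{}`: {}", program, err),
    }
}

/// Check up front that a tool can be started at all by asking for its version.
fn check_tool(program: &str) -> Result<(), String> {
    Command::new(program)
        .arg("--version")
        .output()
        .map(|_| ())
        .map_err(|e| describe_spawn_error(program, &e))
}

fn main() {
    match check_tool("valgrind") {
        Ok(()) => println!("valgrind is available"),
        Err(msg) => eprintln!("{}", msg),
    }
}
```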

profile forever-running target

This is a feature request.

For example, I have an HTTP server and want cargo-profiler to terminate the target after 1 minute, or in response to a special keystroke.
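The timeout half of this request could be sketched as a polling loop around `Child::try_wait`. This is only an illustration: `run_with_timeout` is a hypothetical helper, `sleep` stands in for the profiled target, and whether the profiler's output file survives a hard kill is a separate question not addressed here.

```rust
use std::io;
use std::process::{Command, ExitStatus};
use std::thread;
use std::time::{Duration, Instant};

/// Run `cmd`, killing it if it is still running once `timeout` has elapsed.
/// Returns Ok(Some(status)) if it exited by itself and Ok(None) if it was killed.
fn run_with_timeout(cmd: &mut Command, timeout: Duration) -> io::Result<Option<ExitStatus>> {
    let mut child = cmd.spawn()?;
    let deadline = Instant::now() + timeout;
    loop {
        if let Some(status) = child.try_wait()? {
            return Ok(Some(status));
        }
        if Instant::now() >= deadline {
            child.kill()?;
            // Reap the process so it does not linger as a zombie.
            child.wait()?;
            return Ok(None);
        }
        thread::sleep(Duration::from_millis(20));
    }
}

fn main() -> io::Result<()> {
    // `sleep 5` plays the role of a server that never exits by itself.
    match run_with_timeout(Command::new("sleep").arg("5"), Duration::from_millis(200))? {
        Some(status) => println!("target exited: {}", status),
        None => println!("target was still running at the deadline and was killed"),
    }
    Ok(())
}
```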

doesn't understand where bins live

 $ find src -type f
src/lib.rs
src/bin/prof_bin.rs
 $ cat src/lib.rs

pub fn libfunc() {
    println!("This is a library function");
}
 $ cat src/bin/prof_bin.rs

use cargo_profiler_example::*;

fn main() {
    libfunc();
}
 $ cargo run --bin prof_bin
   Compiling cargo-profiler-example v0.1.0 (/home/novadenizen/prog/cargo-profiler-example)
    Finished dev [unoptimized + debuginfo] target(s) in 0.30s
     Running `target/debug/prof_bin`
This is a library function
 $ cargo profiler cachegrind --bin prof_bin
error: Invalid binary. make sure binary exists.

I did an strace of the cargo profiler command, and found that the current directory is unchanged from the project directory, and it's trying to open the prof_bin binary in the current directory.

I created an empty prof_bin file in the project directory and you can see the error changes.

 $ touch prof_bin
 $ cargo profiler cachegrind --bin prof_bin

Profiling prof_bin with cachegrind...
error: Misaligned data arrays due to regex error -- please file a bug.
 $
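One way the lookup could behave, sketched under the assumption that the default target/ layout is in use: treat a bare name as a Cargo binary target and resolve it under target/debug or target/release, while leaving anything that looks like a path alone. `resolve_bin` is a hypothetical helper, and custom target directories (for example via CARGO_TARGET_DIR) are not handled.

```rust
use std::path::{Path, PathBuf};

/// Resolve the value given to --bin.
/// A bare name such as "prof_bin" is looked up under target/<profile>/;
/// anything with more than one path component is used as given.
fn resolve_bin(project_root: &Path, bin: &str, release: bool) -> PathBuf {
    let given = Path::new(bin);
    if given.is_absolute() || given.components().count() > 1 {
        return given.to_path_buf();
    }
    let profile = if release { "release" } else { "debug" };
    project_root.join("target").join(profile).join(bin)
}

fn main() {
    let root = Path::new("/home/novadenizen/prog/cargo-profiler-example");
    println!("{}", resolve_bin(root, "prof_bin", false).display());
    println!("{}", resolve_bin(root, "./target/debug/prof_bin", false).display());
}
```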

How do I provide arguments to the binary?

I need to provide arguments to the binary or it won't work. How do I do that?

I tried using -- (like 'cargo run') but that didn't work:

RUST_BACKTRACE=1 cargo profiler cachegrind --bin ./test -- --arg1 a --arg2 b