Comments (3)
`inferno-collapse-perf-origin` is vanilla inferno; `inferno-collapse-perf` is built with jemalloc.
Using jemalloc is faster most of the time, but not significantly.
Benchmark #1: target/release/inferno-collapse-perf-origin --all ./flamegraph/test/perf-iperf-stacks-pidtid-01.txt
Time (mean ± σ): 4.4 ms ± 1.7 ms [User: 3.6 ms, System: 0.8 ms]
Range (min … max): 2.3 ms … 12.0 ms 314 runs
Warning: Command took less than 5 ms to complete. Results might be inaccurate.
Benchmark #2: target/release/inferno-collapse-perf --all ./flamegraph/test/perf-iperf-stacks-pidtid-01.txt
Time (mean ± σ): 4.4 ms ± 1.7 ms [User: 3.4 ms, System: 1.0 ms]
Range (min … max): 2.5 ms … 16.0 ms 592 runs
Warning: Command took less than 5 ms to complete. Results might be inaccurate.
Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet PC without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
Summary
'target/release/inferno-collapse-perf --all ./flamegraph/test/perf-iperf-stacks-pidtid-01.txt' ran
1.01 ± 0.55 times faster than 'target/release/inferno-collapse-perf-origin --all ./flamegraph/test/perf-iperf-stacks-pidtid-01.txt'
==> ./flamegraph/test/perf-java-stacks-01.txt <==
Benchmark #1: target/release/inferno-collapse-perf-origin --all ./flamegraph/test/perf-java-stacks-01.txt
Time (mean ± σ): 2.5 ms ± 0.9 ms [User: 2.0 ms, System: 0.5 ms]
Range (min … max): 1.5 ms … 9.3 ms 925 runs
Warning: Command took less than 5 ms to complete. Results might be inaccurate.
Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet PC without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
Benchmark #2: target/release/inferno-collapse-perf --all ./flamegraph/test/perf-java-stacks-01.txt
Time (mean ± σ): 2.3 ms ± 0.9 ms [User: 1.8 ms, System: 0.5 ms]
Range (min … max): 1.5 ms … 8.8 ms 857 runs
Warning: Command took less than 5 ms to complete. Results might be inaccurate.
Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet PC without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
Summary
'target/release/inferno-collapse-perf --all ./flamegraph/test/perf-java-stacks-01.txt' ran
1.11 ± 0.59 times faster than 'target/release/inferno-collapse-perf-origin --all ./flamegraph/test/perf-java-stacks-01.txt'
==> ./flamegraph/test/perf-numa-stacks-01.txt <==
Benchmark #1: target/release/inferno-collapse-perf-origin --all ./flamegraph/test/perf-numa-stacks-01.txt
Time (mean ± σ): 2.4 ms ± 0.8 ms [User: 2.1 ms, System: 0.4 ms]
Range (min … max): 1.8 ms … 19.2 ms 736 runs
Warning: Command took less than 5 ms to complete. Results might be inaccurate.
Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet PC without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
Benchmark #2: target/release/inferno-collapse-perf --all ./flamegraph/test/perf-numa-stacks-01.txt
Time (mean ± σ): 2.3 ms ± 0.4 ms [User: 2.0 ms, System: 0.4 ms]
Range (min … max): 2.0 ms … 7.1 ms 808 runs
Warning: Command took less than 5 ms to complete. Results might be inaccurate.
Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet PC without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
Summary
'target/release/inferno-collapse-perf --all ./flamegraph/test/perf-numa-stacks-01.txt' ran
1.04 ± 0.41 times faster than 'target/release/inferno-collapse-perf-origin --all ./flamegraph/test/perf-numa-stacks-01.txt'
==> ./flamegraph/test/perf-rust-Yamakaky-dcpu.txt <==
Benchmark #1: target/release/inferno-collapse-perf-origin --all ./flamegraph/test/perf-rust-Yamakaky-dcpu.txt
Time (mean ± σ): 1.7 ms ± 0.6 ms [User: 1.4 ms, System: 0.3 ms]
Range (min … max): 1.2 ms … 5.5 ms 896 runs
Warning: Command took less than 5 ms to complete. Results might be inaccurate.
Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet PC without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
Benchmark #2: target/release/inferno-collapse-perf --all ./flamegraph/test/perf-rust-Yamakaky-dcpu.txt
Time (mean ± σ): 1.8 ms ± 0.5 ms [User: 1.5 ms, System: 0.3 ms]
Range (min … max): 1.3 ms … 7.3 ms 1339 runs
Warning: Command took less than 5 ms to complete. Results might be inaccurate.
Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet PC without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
Summary
'target/release/inferno-collapse-perf-origin --all ./flamegraph/test/perf-rust-Yamakaky-dcpu.txt' ran
1.06 ± 0.48 times faster than 'target/release/inferno-collapse-perf --all ./flamegraph/test/perf-rust-Yamakaky-dcpu.txt'
==> ./flamegraph/test/perf-vertx-stacks-01.txt <==
Benchmark #1: target/release/inferno-collapse-perf-origin --all ./flamegraph/test/perf-vertx-stacks-01.txt
Time (mean ± σ): 9.0 ms ± 1.7 ms [User: 8.2 ms, System: 0.8 ms]
Range (min … max): 7.9 ms … 18.7 ms 249 runs
Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet PC without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
Benchmark #2: target/release/inferno-collapse-perf --all ./flamegraph/test/perf-vertx-stacks-01.txt
Time (mean ± σ): 8.5 ms ± 1.4 ms [User: 7.7 ms, System: 0.8 ms]
Range (min … max): 7.6 ms … 17.6 ms 341 runs
Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet PC without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
Summary
'target/release/inferno-collapse-perf --all ./flamegraph/test/perf-vertx-stacks-01.txt' ran
1.06 ± 0.26 times faster than 'target/release/inferno-collapse-perf-origin --all ./flamegraph/test/perf-vertx-stacks-01.txt'
from inferno.
Thanks for checking, @meven. The "gains", when they appear at all, seem mostly insignificant, which makes the original suggestion moot.
I won't close the issue as @jonhoo renamed it to something more useful, but either way the jemalloc track can safely be ignored.
I wish we were as careful with allocations in `collapse-perf` as we are in flamegraph following #37, but I think it's actually a fair bit trickier there since we really do need to allocate for each distinct function name. We should be able to avoid allocations for repeated function names though, probably by keeping a single string that we write the "current" function name into, and then only copying that to its own string if we discover that we need to stick it in the stack count map.
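To make the idea concrete, here is a minimal sketch of the scratch-buffer approach, not inferno's actual code. The names `count_names`, `current`, and `counts` are hypothetical, and the input is just an iterator of function names standing in for parsed `perf script` lines. The buffer is reused for every name, and an owned `String` is only created the first time a name has to go into the map.

```rust
use std::collections::HashMap;

/// Counts how often each name occurs. Returns the counts plus the number of
/// owned `String` keys that had to be allocated (one per distinct name).
/// Hypothetical helper for illustration only.
fn count_names<'a, I>(names: I) -> (HashMap<String, usize>, usize)
where
    I: IntoIterator<Item = &'a str>,
{
    let mut counts: HashMap<String, usize> = HashMap::new();
    let mut current = String::new(); // scratch buffer, reused across iterations
    let mut copies = 0;

    for raw in names {
        // Write the "current" name into the scratch buffer. `clear` keeps the
        // existing capacity, so this usually does not allocate.
        current.clear();
        current.push_str(raw.trim());

        if let Some(n) = counts.get_mut(current.as_str()) {
            // Repeated name: look up by &str, no new String needed.
            *n += 1;
        } else {
            // First occurrence: only now copy the name into its own String.
            counts.insert(current.clone(), 1);
            copies += 1;
        }
    }
    (counts, copies)
}

fn main() {
    let (counts, copies) = count_names(["main", "foo", "main", "bar", "foo", "main"]);
    println!("{} distinct names, {} owned copies", counts.len(), copies);
}
```

The lookup by `&str` works because `String` implements `Borrow<str>`, so the map never needs an owned key just to check whether a name is already present.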
In any case, yes, `jemalloc` might help a decent amount here. That said, I'd prefer to just get rid of those allocations altogether. And given that `collapse-perf` is now faster than `perf script`, it's not clear to me that we're in any rush to squeeze out that performance. :p Seeing jemalloc results would certainly be interesting and tell us how important getting rid of those allocations is though!
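One rough way to gauge how much the allocations matter, independent of which allocator is used, is to count them. The sketch below is an assumption-laden illustration and not part of inferno: `CountingAlloc`, `ALLOCATIONS`, and `allocations_during` are hypothetical names, and the wrapper simply forwards to the standard `System` allocator. It uses the same `#[global_allocator]` attribute through which an alternative allocator such as jemalloc would be plugged in.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical wrapper that counts calls to `alloc` and forwards to `System`.
struct CountingAlloc;

static ALLOCATIONS: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATIONS.fetch_add(1, Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

// Replaces the process-wide allocator with the counting wrapper.
#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Returns how many allocations happened while `f` ran (process-wide count,
/// so other threads can inflate it).
fn allocations_during<F: FnOnce()>(f: F) -> usize {
    let before = ALLOCATIONS.load(Ordering::Relaxed);
    f();
    ALLOCATIONS.load(Ordering::Relaxed) - before
}

fn main() {
    let n = allocations_during(|| {
        // One owned String per name, as a stand-in for per-frame allocations.
        let names: Vec<String> = (0..100).map(|i| format!("func_{}", i)).collect();
        std::hint::black_box(names);
    });
    println!("allocations: {}", n);
}
```

If the count drops sharply after a change such as reusing a buffer for repeated names, that suggests the allocations were worth removing; if it barely moves, swapping the allocator is unlikely to matter much either.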