Comments (5)
I'm assuming you're talking about something to help you write your own benchmarks, not just the benchmarking tool distributed with Squash. That does give you CPU time + wall clock time (although it's presented as compression/decompression rate in the HTML output, and only CPU time is used).
I've been thinking about benchmarks lately, and looking into how to do them right. Long story short, it's hard. I suggest you take a look at Aleksey Shipilev's JVMLS 2013 talk, "JVM Benchmarking". It's about Java, but the first half is pretty generic, and quite interesting. So, my current thinking is that Squash (and everyone else) should use a benchmarking library which does all that stuff for us. The only thing I've found so far is cbenchsuite, which I haven't yet had time to look at closely, so I don't know whether or not it would be suitable.
Measuring memory usage is tricky. Most libraries don't let us supply a custom malloc/free function, so AFAIK the only way to do that would be to override malloc using LD_PRELOAD. Even with overriding malloc the numbers may not be accurate. I know mcm is planning to use mmap + MAP_ANONYMOUS instead of malloc (perhaps it already does, haven't looked at the code since Mathieu Chartier mentioned that was his plan), and SHARC will emit an error if memory isn't aligned to something (IIRC 8-byte, but I'm not positive), so I would be surprised if its internal buffers didn't use posix_memalign, and I don't know if overriding malloc would catch that or not. Just asking the OS to tell you how much memory is in use before and after isn't reliable, either, because most mallocs just use sbrk, so they can't necessarily return all the memory they've allocated.
from squash.
I am interested in being able to request (by way of options) to receive basic timing information from an individual compressor invocation, even if it's an in-memory request. I'd find value in getting that information on an individual invocation of the compression, not just as part of a separately run benchmark. It seems that in the benchmarking plugin, clock_gettime () is preferred, but more generally other approaches can be used on other platforms, so this is open to further tweaking. The request is also a placeholder to express interest in measuring information about other performance bottlenecks, for example deducing how much memory latency or processor cache size got in the way. Compiler choice, the processor type, and available resources would all affect the resultant performance, so maybe there is a tool which tries to deduce what the bottlenecks are. Thanks for the reference video on benchmarking. It seems cbenchsuite is pretty comprehensive, so a subset of it could be used.
As for memory usage, it may not be easy or possible to get information reliably for all plugins, but a best effort can be of use even if it's a subset. Alternative measures can be suggested in time, too. There is an option to do a conditional preprocessor define that overrides the malloc () instances in the code to invoke one that keeps track of statistics on allocated memory. And for the other approaches like mmap or the aligned memory requirement, a different override can be used, or some plugins just might have to resort to less reliable methods of memory usage estimates.
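A minimal sketch of the preprocessor-override idea, assuming a plugin is compiled with a hypothetical `TRACK_MEMORY` define. None of these names (`TRACK_MEMORY`, `TrackStats`, `track_malloc`, `track_free`) exist in Squash; they are made up for illustration. The sketch is not thread-safe and does not cover `calloc`, `realloc`, `posix_memalign`, or `mmap`, which would each need their own override.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical bookkeeping; not part of Squash. */
typedef struct {
  size_t current; /* bytes currently allocated through track_malloc */
  size_t peak;    /* highest value `current` has reached */
  size_t calls;   /* number of successful allocations */
} TrackStats;

static TrackStats track_stats = { 0, 0, 0 };

/* Every block is prefixed with a header recording its size. The union
 * with max_align_t keeps the payload suitably aligned for any type. */
typedef union {
  size_t      size;
  max_align_t align;
} TrackHeader;

static void* track_malloc (size_t size) {
  TrackHeader* header;

  if (size > SIZE_MAX - sizeof (TrackHeader))
    return NULL;

  header = malloc (sizeof (TrackHeader) + size);
  if (header == NULL)
    return NULL;

  header->size = size;
  track_stats.current += size;
  track_stats.calls++;
  if (track_stats.current > track_stats.peak)
    track_stats.peak = track_stats.current;

  return header + 1; /* payload starts right after the header */
}

static void track_free (void* ptr) {
  TrackHeader* header;

  if (ptr == NULL)
    return;

  header = ((TrackHeader*) ptr) - 1;
  assert (track_stats.current >= header->size);
  track_stats.current -= header->size;
  free (header);
}

/* Only code that appears after this point is redirected, so the two
 * functions above still call the real allocator. */
#ifdef TRACK_MEMORY
#  define malloc(size) track_malloc (size)
#  define free(ptr)    track_free (ptr)
#endif
```

After a compression call, `track_stats.peak` would hold the largest amount of memory requested through the overridden `malloc` at any one time. Memory obtained any other way is invisible to it, which is the "best effort" limitation mentioned above.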
from squash.
> It seems that in the benchmarking plugin, clock_gettime () is preferred, but more generally other approaches can be used on other platforms, so this is open to further tweaking.
Indeed. I had to tweak it recently for FreeBSD, and I'm guessing Windows does something completely different. I'm willing to try to add support for other systems as people need it.
> The request is also a placeholder to express interest in measuring information about other performance bottlenecks - for example, deducing how much memory latency or processor cache size got in the way.
That would definitely be great information to have, but I have no idea how to detect it—I suspect that if there is a way it would be pretty OS-specific (and/or hardware-specific), which means a big and/or complex code base would probably be required to support it, which (at least to me) implies a separate library.
> I am interested in being able to request (by way of options) to receive basic timing information from an individual compressor invocation, even if it's an in-memory request. I'd find value in getting that information on an individual invocation of the compression, not just as part of a separately run benchmark.
How do you envision that working? Options are just key/value strings and are intended to be shareable across multiple, possibly parallel, compression/decompression operations. With that in mind, I don't see how they can be leveraged to provide this functionality, and I don't think this is a sufficient reason to reconsider that. Think about using Squash to compress/decompress rows in a database… being able to create a single SquashOptions at initialization and simply reuse that whenever you need to compress or decompress a chunk of data is a big deal.
The only way I see to be helpful here (other than creating another version of every API call) would be to create something like GTimer (only I'd want to support CPU time in addition to wall clock) and include it in Squash. I'd probably be okay with something like that if we committed to stopping there, but if we don't then where does it end? Do we add a way to handle warm-up elegantly? Restrict turbo boost? Scheduling? Memory profiling?
Like I said, I think that stuff belongs in a separate library. I think something similar to a unit testing API (I like glib's, but I'm open to alternatives) would be ideal. I don't really know how cbenchsuite works, but if it fits the bill I'd love to use it instead of reinventing the wheel.
> As for memory usage, it may not be easy or possible to get information reliably for all plugins, but a best effort can be of use even if it's a subset. Alternative measures can be suggested in time, too. There is an option to do a conditional preprocessor define that overrides the malloc () instances in the code to invoke one that keeps track of statistics on allocated memory. And for the other approaches like mmap or the aligned memory requirement, a different override can be used, or some plugins just might have to resort to less reliable methods of memory usage estimates.
This seems to be making my point about a separate library. This is all very cool stuff but it's a lot of bloat to include in Squash, and it's not really compression-specific, so why not a separate library?
from squash.
I've split the timer code out of benchmark/benchmark.c into benchmark/timer.c and benchmark/timer.h. This would be the GTimer-like API I was talking about.
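For readers who haven't looked at the code, here is a rough sketch of what a GTimer-like timer that tracks both wall clock and CPU time could look like on a POSIX system with clock_gettime. This is not the actual contents of benchmark/timer.h; the `ExampleTimer` type and the `example_timer_*` functions are hypothetical names used only for illustration.

```c
#ifndef _POSIX_C_SOURCE
#  define _POSIX_C_SOURCE 200809L
#endif

#include <assert.h>
#include <time.h>

/* Hypothetical timer type; not the API in benchmark/timer.h. */
typedef struct {
  struct timespec wall_start;
  struct timespec cpu_start;
  double          wall_elapsed; /* accumulated wall clock seconds */
  double          cpu_elapsed;  /* accumulated CPU seconds */
} ExampleTimer;

/* Difference between two timespecs, in seconds. */
static double example_timespec_diff (const struct timespec* start,
                                     const struct timespec* end) {
  return (double) (end->tv_sec - start->tv_sec) +
         (double) (end->tv_nsec - start->tv_nsec) / 1e9;
}

/* Clear the accumulated totals. */
static void example_timer_reset (ExampleTimer* timer) {
  assert (timer != NULL);
  timer->wall_elapsed = 0.0;
  timer->cpu_elapsed = 0.0;
}

/* Record the starting point on both clocks. */
static void example_timer_start (ExampleTimer* timer) {
  assert (timer != NULL);
  clock_gettime (CLOCK_MONOTONIC, &timer->wall_start);
  clock_gettime (CLOCK_PROCESS_CPUTIME_ID, &timer->cpu_start);
}

/* Add the time since the last start to the accumulated totals, so
 * several start/stop pairs can be summed into one measurement. */
static void example_timer_stop (ExampleTimer* timer) {
  struct timespec wall_end;
  struct timespec cpu_end;

  assert (timer != NULL);
  clock_gettime (CLOCK_MONOTONIC, &wall_end);
  clock_gettime (CLOCK_PROCESS_CPUTIME_ID, &cpu_end);
  timer->wall_elapsed += example_timespec_diff (&timer->wall_start, &wall_end);
  timer->cpu_elapsed += example_timespec_diff (&timer->cpu_start, &cpu_end);
}
```

The caller would reset the timer once, wrap each compression call in a start/stop pair, and read `wall_elapsed` and `cpu_elapsed` afterwards. Platforms without clock_gettime (Windows, for instance) would need a different backend behind the same interface.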
It's still not part of Squash itself, but it might be at some point.
from squash.
I just pushed a commit to the benchmark program which could make memory reporting in the benchmarks feasible. Before each compress/decompress attempt, we now fork(). The main purpose of this is so we can recover (I hope) from the OOM killer terminating the process which is taking up all the memory, since that was happening on my Raspberry Pi with LZMA at higher levels. However, it may be possible to grab the maximum memory usage before we do anything and then after compression/decompression, and use the difference as the memory usage. We would have to add another fork() so we can separate compression and decompression, but I think it could work.
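To make the idea concrete, here is a minimal sketch of the before/after approach, assuming a POSIX system with fork() and getrusage(). It is not the code in the benchmark program; `measure_peak_rss_growth`, `ExampleWorkFunc`, and `example_touch_memory` are hypothetical names. The child samples `ru_maxrss` before and after the work and reports the difference through a pipe. Note that `ru_maxrss` is in kilobytes on Linux but bytes on macOS.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical callback type standing in for one compress or
 * decompress attempt. */
typedef void (*ExampleWorkFunc) (void* user_data);

/* Runs `work` in a forked child and stores the growth of the child's
 * peak resident set size in *delta (in ru_maxrss units). Returns 0 on
 * success, -1 if the child could not be created or died (for example,
 * because the OOM killer terminated it). */
static int measure_peak_rss_growth (ExampleWorkFunc work, void* user_data,
                                    long* delta) {
  int fds[2];
  pid_t pid;
  long value = 0;
  ssize_t bytes_read;
  int status;

  assert (work != NULL && delta != NULL);

  if (pipe (fds) != 0)
    return -1;

  pid = fork ();
  if (pid < 0) {
    close (fds[0]);
    close (fds[1]);
    return -1;
  }

  if (pid == 0) {
    /* Child: sample peak RSS around the work and report the change. */
    struct rusage before, after;
    long growth;
    ssize_t written;

    close (fds[0]);
    getrusage (RUSAGE_SELF, &before);
    work (user_data);
    getrusage (RUSAGE_SELF, &after);
    growth = after.ru_maxrss - before.ru_maxrss;
    written = write (fds[1], &growth, sizeof (growth));
    _exit (written == (ssize_t) sizeof (growth) ? 0 : 1);
  }

  /* Parent: a short read means the child died before reporting. */
  close (fds[1]);
  bytes_read = read (fds[0], &value, sizeof (value));
  close (fds[0]);

  if (waitpid (pid, &status, 0) != pid)
    return -1;
  if (bytes_read != (ssize_t) sizeof (value) ||
      !WIFEXITED (status) || WEXITSTATUS (status) != 0)
    return -1;

  *delta = value;
  return 0;
}

/* Example workload: allocate and touch *user_data bytes. */
static void example_touch_memory (void* user_data) {
  size_t size = *(size_t*) user_data;
  char* buffer = malloc (size);
  if (buffer != NULL) {
    memset (buffer, 1, size);
    free (buffer);
  }
}
```

Calling `measure_peak_rss_growth (example_touch_memory, &size, &delta)` once per codec and direction would give one number per attempt. The result only reflects pages that were actually resident, so it is an estimate rather than an exact allocation count.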
If we do this, it may be wise to revisit the squash_compress_file* functions. They currently prefer to mmap the input/output and use the buffer-to-buffer API whenever a plugin implements the buffer-to-buffer functions, rather than only when the stream functions aren't implemented. This was done because it could provide a performance boost for some codecs (since having to memcpy is less likely), but if the mapped files get counted in the memory usage it wouldn't really be fair.
from squash.