Comments (14)
Hey @FrancescAlted
Thanks for your issue!
A lot of things look strange to me at first sight in these results:
- the compression levels you are referring to are Blosc compression levels, right? Otherwise, are you only using density's Chameleon?
- the higher the compression ratio, the faster the speed... that's really odd
- even the lz4 results look odd, because lz4 is heavily asymmetric and usually 10x faster at decompressing than compressing.
Do you have any idea about these?
Otherwise, thanks for the links; I'll give c-blosc a try with static libraries to check whether anything's wrong.
BTW, I just released the final 0.12.5 beta seconds ago (it's the current master branch); it might already fix some problems.
from density.
This is what I get on OS X with the latest dev version:
Blosc version: 1.7.0.dev ($Date:: 2015-05-27 #$)
List of supported compressors in this build: blosclz,lz4,lz4hc,snappy,zlib,density
Supported compression libraries:
BloscLZ: 1.0.5
LZ4: 1.7.0
Snappy: unknown
Zlib: 1.2.5
DENSITY: 0.12.6
Using compressor: density
Running suite: single
--> 4, 2097152, 8, 19, density
********************** Run info ******************************
Blosc version: 1.7.0.dev ($Date:: 2015-05-27 #$)
Using synthetic data with 19 significant bits (out of 32)
Dataset size: 2097152 bytes Type size: 8 bytes
Working set: 256.0 MB Number of threads: 4
********************** Running benchmarks *********************
memcpy(write): 595.1 us, 3360.7 MB/s
memcpy(read): 218.6 us, 9149.1 MB/s
Compression level: 0
comp(write): 331.4 us, 6034.9 MB/s Final bytes: 2097168 Ratio: 1.00
decomp(read): 214.7 us, 9313.6 MB/s OK
Compression level: 1
comp(write): 2216.0 us, 902.5 MB/s Final bytes: 1204240 Ratio: 1.74
decomp(read): 537.3 us, -0.0 MB/s FAILED. Error code: -1
OK
Compression level: 2
comp(write): 2206.0 us, 906.6 MB/s Final bytes: 1204240 Ratio: 1.74
decomp(read): 699.4 us, -0.0 MB/s FAILED. Error code: -1
OK
Compression level: 3
comp(write): 2218.4 us, 901.5 MB/s Final bytes: 1204240 Ratio: 1.74
decomp(read): 737.3 us, -0.0 MB/s FAILED. Error code: -1
OK
Compression level: 4
comp(write): 1621.4 us, 1233.5 MB/s Final bytes: 1159184 Ratio: 1.81
decomp(read): 1165.2 us, -0.0 MB/s FAILED. Error code: -1
OK
Compression level: 5
comp(write): 1390.6 us, 1438.2 MB/s Final bytes: 1159184 Ratio: 1.81
decomp(read): 1189.5 us, -0.0 MB/s FAILED. Error code: -1
OK
Compression level: 6
comp(write): 949.2 us, 2106.9 MB/s Final bytes: 1136656 Ratio: 1.85
decomp(read): 1355.1 us, -0.0 MB/s FAILED. Error code: -1
OK
Compression level: 7
comp(write): 743.6 us, 2689.6 MB/s Final bytes: 1125520 Ratio: 1.86
decomp(read): 1497.2 us, -0.0 MB/s FAILED. Error code: -1
OK
Compression level: 8
comp(write): 761.6 us, 2626.1 MB/s Final bytes: 1125520 Ratio: 1.86
decomp(read): 1562.7 us, -0.0 MB/s FAILED. Error code: -1
OK
Compression level: 9
comp(write): 785.8 us, 2545.2 MB/s Final bytes: 1119824 Ratio: 1.87
decomp(read): 1980.3 us, -0.0 MB/s FAILED. Error code: -1
OK
Round-trip compr/decompr on 7.5 GB
Elapsed time: 9.6 s, 1751.7 MB/s
So it's very similar to your results. I'll need to check what's going on.
Can you give me a quick heads-up on the way Blosc operates (the logic)? Just to avoid digging into your source code, and to verify there's nothing incompatible with how density functions.
Thanks for the speedy response. Yes, what Blosc does is basically split the data to be compressed into small blocks (in order to use the L1 cache as efficiently as possible, but also to leverage multi-threading). It then applies a shuffle filter (which does not compress as such, but helps compressors achieve better compression ratios in many binary-data scenarios) and then passes the shuffled data to the compressor. There is more info about how it works in the first 10 minutes of this presentation: https://www.youtube.com/watch?v=E9q33wbPCGU
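The split-shuffle-compress pipeline described above can be sketched in a few lines (an illustrative Python model using numpy, not Blosc's optimized SIMD implementation): the shuffle step simply regroups the i-th byte of every element, so fixed-width data with small values yields long runs of zero bytes.

```python
import numpy as np

def shuffle(buf: bytes, typesize: int) -> bytes:
    """Group the i-th byte of every element together (byte transposition)."""
    return np.frombuffer(buf, dtype=np.uint8).reshape(-1, typesize).T.tobytes()

def unshuffle(buf: bytes, typesize: int) -> bytes:
    """Inverse of shuffle(): restore the original element-wise byte order."""
    return np.frombuffer(buf, dtype=np.uint8).reshape(typesize, -1).T.tobytes()

# Small integers stored as int64: the six high bytes of every element are
# zero, so shuffling produces long zero runs that any LZ codec compresses well.
data = np.arange(10_000, dtype=np.int64).tobytes()
assert unshuffle(shuffle(data, 8), 8) == data
```

Feeding the shuffled buffer to a generic codec (e.g. zlib) typically yields a noticeably better ratio than compressing the raw bytes, which is the whole point of the filter.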
Regarding the size of the blocks (I suppose this is important for density), they are typically between 8 KB and around 1 MB, depending on the compression level, the data type size and the compressor that is going to be used. See the algorithm that computes block sizes here: https://github.com/FrancescAlted/c-blosc/blob/density/blosc/blosc.c#L918
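Just to illustrate the policy in that paragraph (a simplified guess for intuition, not the actual algorithm behind the link), a block-size chooser along those lines could look like:

```python
KB = 1024

def compute_blocksize(clevel: int, typesize: int, nbytes: int) -> int:
    """Illustrative sketch only: grow the block with the compression level,
    clamp it to the 8 KB .. 1 MB range mentioned above, and keep it a
    multiple of the type size so no element is split across blocks."""
    blocksize = 8 * KB * (1 << min(clevel, 7))      # 8 KB at level 0, 1 MB at 7+
    blocksize = min(blocksize, 1024 * KB, nbytes)   # never exceed 1 MB or the input
    blocksize = max(blocksize, 8 * KB)
    return blocksize - blocksize % typesize if typesize > 1 else blocksize
```

Under the clamping assumed here, level 0 yields the 8 KB minimum while level 9 saturates at 1 MB.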
Please tell me if you need more clarification. I am eager to use DENSITY inside Blosc because I think it is a good fit, but I am trying to understand it first (then I will need to figure out how to use C89 and C99 code in the same project ;)
Oh, and regarding the question of using just Chameleon: that is because I am trying it first. If everything goes well, the idea is to use Chameleon for low compression levels and Cheetah for higher ones. Then, depending on how slow compression is, I might decide to use Lion for the highest compression level. I suppose I can use density_buffer_decompress() for decompressing any of these, right?
OK, I got everything to work properly using the following patch applied to your density tree: https://gist.github.com/gpnuma/e159fb6b505ef9b11e00
Here is a test run:
Blosc version: 1.7.0.dev ($Date:: 2015-05-27 #$)
List of supported compressors in this build: blosclz,lz4,lz4hc,snappy,zlib,density
Supported compression libraries:
BloscLZ: 1.0.5
LZ4: 1.7.0
Snappy: unknown
Zlib: 1.2.5
DENSITY: 0.12.6
Using compressor: density
Running suite: single
--> 4, 8388608, 8, 32, density
********************** Run info ******************************
Blosc version: 1.7.0.dev ($Date:: 2015-05-27 #$)
Using synthetic data with 32 significant bits (out of 32)
Dataset size: 8388608 bytes Type size: 8 bytes
Working set: 256.0 MB Number of threads: 4
********************** Running benchmarks *********************
memcpy(write): 2526.0 us, 3167.0 MB/s
memcpy(read): 1291.0 us, 6196.7 MB/s
Compression level: 0
comp(write): 1101.3 us, 7264.3 MB/s Final bytes: 8388624 Ratio: 1.00
decomp(read): 1313.1 us, 6092.6 MB/s OK
Compression level: 1
comp(write): 2871.6 us, 2785.9 MB/s Final bytes: 4566672 Ratio: 1.84
decomp(read): 2388.5 us, 3349.3 MB/s OK
Compression level: 2
comp(write): 2750.1 us, 2909.0 MB/s Final bytes: 4566672 Ratio: 1.84
decomp(read): 2395.7 us, 3339.3 MB/s OK
Compression level: 3
comp(write): 2749.2 us, 2910.0 MB/s Final bytes: 4566672 Ratio: 1.84
decomp(read): 2407.5 us, 3323.0 MB/s OK
Compression level: 4
comp(write): 2977.3 us, 2687.0 MB/s Final bytes: 4511568 Ratio: 1.86
decomp(read): 2269.7 us, 3524.7 MB/s OK
Compression level: 5
comp(write): 3043.9 us, 2628.2 MB/s Final bytes: 4511568 Ratio: 1.86
decomp(read): 2270.0 us, 3524.2 MB/s OK
Compression level: 6
comp(write): 4438.5 us, 1802.4 MB/s Final bytes: 3622608 Ratio: 2.32
decomp(read): 4439.0 us, 1802.2 MB/s OK
Compression level: 7
comp(write): 4256.3 us, 1879.6 MB/s Final bytes: 3601120 Ratio: 2.33
decomp(read): 4279.2 us, 1869.5 MB/s OK
Compression level: 8
comp(write): 4248.0 us, 1883.2 MB/s Final bytes: 3601120 Ratio: 2.33
decomp(read): 4408.4 us, 1814.7 MB/s OK
Compression level: 9
comp(write): 11095.0 us, 721.0 MB/s Final bytes: 1887328 Ratio: 4.44
decomp(read): 12044.7 us, 664.2 MB/s OK
Round-trip compr/decompr on 7.5 GB
Elapsed time: 7.9 s, 2141.1 MB/s
I set the significant bits to 32, because otherwise the data to compress isn't very interesting (it's like processing a file full of zeroes).
Compression ratios are more contained than in the lz4 run (they never go below 1.84); I saw you're using the accel parameter for lz4_fast, which can lead to near-zero compression but much greater speed.
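To see why the significant-bits setting changes the picture so much, here is a hedged sketch (my reconstruction of what such synthetic data looks like, not bench.c's exact generator): when values only occupy the low 19 of 32 bits, a large share of the raw bytes are zero before any compressor even runs.

```python
import numpy as np

rng = np.random.default_rng(42)
bits = 19  # "significant bits" as reported in the benchmark output
values = rng.integers(0, 1 << bits, size=1 << 16, dtype=np.uint32)

raw = values.tobytes()
zero_fraction = raw.count(0) / len(raw)
# The top byte of every 32-bit value is always zero, and the third byte
# only carries 3 bits, so over a quarter of the input is zero bytes.
print(f"{zero_fraction:.0%} of the bytes are zero")
```

With 32 significant bits none of those free zero bytes exist, so the compressor has to earn its ratio on genuinely random content.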
Here is a sample run with snappy, which exhibits a similar, although lower (1.60), floor in compression ratio:
Blosc version: 1.7.0.dev ($Date:: 2015-05-27 #$)
List of supported compressors in this build: blosclz,lz4,lz4hc,snappy,zlib,density
Supported compression libraries:
BloscLZ: 1.0.5
LZ4: 1.7.0
Snappy: unknown
Zlib: 1.2.5
DENSITY: 0.12.6
Using compressor: snappy
Running suite: single
--> 4, 8388608, 8, 32, snappy
********************** Run info ******************************
Blosc version: 1.7.0.dev ($Date:: 2015-05-27 #$)
Using synthetic data with 32 significant bits (out of 32)
Dataset size: 8388608 bytes Type size: 8 bytes
Working set: 256.0 MB Number of threads: 4
********************** Running benchmarks *********************
memcpy(write): 2402.9 us, 3329.3 MB/s
memcpy(read): 1203.4 us, 6648.0 MB/s
Compression level: 0
comp(write): 1345.3 us, 5946.4 MB/s Final bytes: 8388624 Ratio: 1.00
decomp(read): 1285.3 us, 6224.3 MB/s OK
Compression level: 1
comp(write): 6389.5 us, 1252.1 MB/s Final bytes: 5232684 Ratio: 1.60
decomp(read): 2433.4 us, 3287.5 MB/s OK
Compression level: 2
comp(write): 4867.7 us, 1643.5 MB/s Final bytes: 5232684 Ratio: 1.60
decomp(read): 2394.4 us, 3341.1 MB/s OK
Compression level: 3
comp(write): 4901.1 us, 1632.3 MB/s Final bytes: 5232684 Ratio: 1.60
decomp(read): 2389.7 us, 3347.6 MB/s OK
Compression level: 4
comp(write): 5716.6 us, 1399.4 MB/s Final bytes: 3990010 Ratio: 2.10
decomp(read): 2806.1 us, 2850.9 MB/s OK
Compression level: 5
comp(write): 5746.6 us, 1392.1 MB/s Final bytes: 3990010 Ratio: 2.10
decomp(read): 2786.3 us, 2871.2 MB/s OK
Compression level: 6
comp(write): 6050.9 us, 1322.1 MB/s Final bytes: 3339270 Ratio: 2.51
decomp(read): 2944.6 us, 2716.8 MB/s OK
Compression level: 7
comp(write): 6181.5 us, 1294.2 MB/s Final bytes: 3012514 Ratio: 2.78
decomp(read): 3119.4 us, 2564.6 MB/s OK
Compression level: 8
comp(write): 6235.0 us, 1283.1 MB/s Final bytes: 3012514 Ratio: 2.78
decomp(read): 3143.5 us, 2544.9 MB/s OK
Compression level: 9
comp(write): 5757.8 us, 1389.4 MB/s Final bytes: 2558737 Ratio: 3.28
decomp(read): 3115.5 us, 2567.8 MB/s OK
Round-trip compr/decompr on 7.5 GB
Elapsed time: 8.1 s, 2097.7 MB/s
The workaround for the output buffer size that I used in the aforementioned patch will no longer be necessary in 0.12.6, as a set of functions that precisely define the minimum output buffer size for compression/decompression will appear.
Oh yeah, I forgot to mention: this was compiled and run against the latest dev branch version.
Overall, if I may add, I think you should test Blosc against a real file instead of synthetic data. Your current method has the advantage of creating very precise entropy levels, but its drawback is that it does not represent anything real.
Hmm, something is going wrong on my machine (Ubuntu 14.10 / clang 3.5):
$ bench/bench density single 4 8388608 8 32
Blosc version: 1.7.0.dev ($Date:: 2015-05-27 #$)
List of supported compressors in this build: blosclz,lz4,lz4hc,snappy,zlib,density
Supported compression libraries:
BloscLZ: 1.0.5
LZ4: 1.7.0
Snappy: 1.1.1
Zlib: 1.2.8
DENSITY: 0.12.6
Using compressor: density
Running suite: single
--> 4, 8388608, 8, 32, density
********************** Run info ******************************
Blosc version: 1.7.0.dev ($Date:: 2015-05-27 #$)
Using synthetic data with 32 significant bits (out of 32)
Dataset size: 8388608 bytes Type size: 8 bytes
Working set: 256.0 MB Number of threads: 4
********************** Running benchmarks *********************
memcpy(write): 1875.6 us, 4265.2 MB/s
memcpy(read): 1351.2 us, 5920.8 MB/s
Compression level: 0
comp(write): 1312.5 us, 6095.1 MB/s Final bytes: 8388624 Ratio: 1.00
decomp(read): 1438.6 us, 5561.0 MB/s OK
Compression level: 1
comp(write): 55510.6 us, 144.1 MB/s Final bytes: 5334032 Ratio: 1.57
decomp(read): 177.3 us, -0.0 MB/s FAILED. Error code: -1
OK
Compression level: 2
comp(write): 40168.2 us, 199.2 MB/s Final bytes: 5334032 Ratio: 1.57
decomp(read): 170.1 us, -0.0 MB/s FAILED. Error code: -1
OK
Compression level: 3
comp(write): 39445.4 us, 202.8 MB/s Final bytes: 5334032 Ratio: 1.57
decomp(read): 167.2 us, -0.0 MB/s FAILED. Error code: -1
OK
Compression level: 4
comp(write): 28557.8 us, 280.1 MB/s Final bytes: 4895248 Ratio: 1.71
decomp(read): 157.2 us, -0.0 MB/s FAILED. Error code: -1
OK
Compression level: 5
comp(write): 21233.2 us, 376.8 MB/s Final bytes: 4895248 Ratio: 1.71
decomp(read): 173.6 us, -0.0 MB/s FAILED. Error code: -1
OK
Compression level: 6
comp(write): 12465.3 us, 641.8 MB/s Final bytes: 4675856 Ratio: 1.79
decomp(read): 177.4 us, -0.0 MB/s FAILED. Error code: -1
OK
Compression level: 7
comp(write): 8179.7 us, 978.0 MB/s Final bytes: 4566672 Ratio: 1.84
decomp(read): 191.6 us, -0.0 MB/s FAILED. Error code: -1
OK
Compression level: 8
comp(write): 8064.7 us, 992.0 MB/s Final bytes: 4566672 Ratio: 1.84
decomp(read): 166.2 us, -0.0 MB/s FAILED. Error code: -1
OK
Compression level: 9
comp(write): 6451.6 us, 1240.0 MB/s Final bytes: 4511568 Ratio: 1.86
decomp(read): 205.7 us, -0.0 MB/s FAILED. Error code: -1
OK
Round-trip compr/decompr on 7.5 GB
Elapsed time: 21.9 s, 772.1 MB/s
faltet@francesc-Latitude-E6430:~/blosc/c-blosc-francesc/build$ ldd bench/bench
linux-vdso.so.1 => (0x00007ffe2ea55000)
librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f0dfefe6000)
libblosc.so.1 => /home/faltet/blosc/c-blosc-francesc/build/blosc/libblosc.so.1 (0x00007f0dfedc2000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f0dfeba4000)
libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f0dfe98b000)
libdensity.so => /usr/local/lib/libdensity.so (0x00007f0dfe46b000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f0dfe0a7000)
libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f0dfdd98000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f0dfda91000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f0dfd87a000)
/lib64/ld-linux-x86-64.so.2 (0x00007f0dff21b000)
libspookyhash.so => /usr/local/lib/libspookyhash.so (0x00007f0dfd676000)
The above is with the dev branch. With master:
$ bench/bench density single 4 8388608 8 32
Blosc version: 1.7.0.dev ($Date:: 2015-05-27 #$)
List of supported compressors in this build: blosclz,lz4,lz4hc,snappy,zlib,density
Supported compression libraries:
BloscLZ: 1.0.5
LZ4: 1.7.0
Snappy: 1.1.1
Zlib: 1.2.8
DENSITY: 0.12.5
Using compressor: density
Running suite: single
--> 4, 8388608, 8, 32, density
********************** Run info ******************************
Blosc version: 1.7.0.dev ($Date:: 2015-05-27 #$)
Using synthetic data with 32 significant bits (out of 32)
Dataset size: 8388608 bytes Type size: 8 bytes
Working set: 256.0 MB Number of threads: 4
********************** Running benchmarks *********************
memcpy(write): 1786.6 us, 4477.7 MB/s
memcpy(read): 1331.7 us, 6007.2 MB/s
Compression level: 0
comp(write): 1306.5 us, 6123.3 MB/s Final bytes: 8388624 Ratio: 1.00
decomp(read): 1400.4 us, 5712.8 MB/s OK
Compression level: 1
comp(write): 54855.8 us, 145.8 MB/s Final bytes: 5334032 Ratio: 1.57
decomp(read): 180.6 us, -0.0 MB/s FAILED. Error code: -1
OK
Compression level: 2
comp(write): 39616.5 us, 201.9 MB/s Final bytes: 5334032 Ratio: 1.57
decomp(read): 150.2 us, -0.0 MB/s FAILED. Error code: -1
OK
Compression level: 3
comp(write): 41280.2 us, 193.8 MB/s Final bytes: 5334032 Ratio: 1.57
decomp(read): 146.5 us, -0.0 MB/s FAILED. Error code: -1
OK
Compression level: 4
comp(write): 28674.1 us, 279.0 MB/s Final bytes: 4895248 Ratio: 1.71
decomp(read): 160.3 us, -0.0 MB/s FAILED. Error code: -1
OK
Compression level: 5
comp(write): 21312.7 us, 375.4 MB/s Final bytes: 4895248 Ratio: 1.71
decomp(read): 163.8 us, -0.0 MB/s FAILED. Error code: -1
OK
Compression level: 6
comp(write): 12716.5 us, 629.1 MB/s Final bytes: 4675856 Ratio: 1.79
decomp(read): 183.4 us, -0.0 MB/s FAILED. Error code: -1
OK
Compression level: 7
comp(write): 8138.4 us, 983.0 MB/s Final bytes: 4566672 Ratio: 1.84
decomp(read): 187.6 us, -0.0 MB/s FAILED. Error code: -1
OK
Compression level: 8
comp(write): 8028.2 us, 996.5 MB/s Final bytes: 4566672 Ratio: 1.84
decomp(read): 188.3 us, -0.0 MB/s FAILED. Error code: -1
OK
Compression level: 9
comp(write): 6376.3 us, 1254.7 MB/s Final bytes: 4511568 Ratio: 1.86
decomp(read): 183.5 us, -0.0 MB/s FAILED. Error code: -1
OK
Round-trip compr/decompr on 7.5 GB
Elapsed time: 22.0 s, 769.6 MB/s
So that's not any better.
Did you try to apply the patch I provided to c-blosc?
I multiplied the block size by 8, added a bogus (deliberately large) size for the output buffer, and added a switch/case to select the various algorithms.
Ah, nope. I applied (part of) it here: FrancescAlted/c-blosc@f505fd8 . With this, I am not getting segfaults anymore:
$ bench/bench density single 4
Blosc version: 1.7.0.dev ($Date:: 2015-05-27 #$)
List of supported compressors in this build: blosclz,lz4,lz4hc,snappy,zlib,density
Supported compression libraries:
BloscLZ: 1.0.5
LZ4: 1.7.0
Snappy: 1.1.1
Zlib: 1.2.8
DENSITY: 0.12.5
Using compressor: density
Running suite: single
--> 4, 2097152, 8, 19, density
********************** Run info ******************************
Blosc version: 1.7.0.dev ($Date:: 2015-05-27 #$)
Using synthetic data with 19 significant bits (out of 32)
Dataset size: 2097152 bytes Type size: 8 bytes
Working set: 256.0 MB Number of threads: 4
********************** Running benchmarks *********************
memcpy(write): 504.7 us, 3962.4 MB/s
memcpy(read): 242.4 us, 8250.7 MB/s
Compression level: 0
comp(write): 277.7 us, 7203.2 MB/s Final bytes: 2097168 Ratio: 1.00
decomp(read): 217.5 us, 9194.7 MB/s OK
Compression level: 1
comp(write): 2857.1 us, 700.0 MB/s Final bytes: 1125520 Ratio: 1.86
decomp(read): 2587.3 us, 773.0 MB/s OK
Compression level: 2
comp(write): 2846.0 us, 702.7 MB/s Final bytes: 1125520 Ratio: 1.86
decomp(read): 2657.4 us, 752.6 MB/s OK
Compression level: 3
comp(write): 2844.4 us, 703.1 MB/s Final bytes: 1125520 Ratio: 1.86
decomp(read): 2668.1 us, 749.6 MB/s OK
Compression level: 4
comp(write): 2073.3 us, 964.6 MB/s Final bytes: 1119824 Ratio: 1.87
decomp(read): 1901.5 us, 1051.8 MB/s OK
Compression level: 5
comp(write): 2081.0 us, 961.1 MB/s Final bytes: 1119824 Ratio: 1.87
decomp(read): 1905.1 us, 1049.8 MB/s OK
Compression level: 6
comp(write): 3007.8 us, 664.9 MB/s Final bytes: 508336 Ratio: 4.13
decomp(read): 3583.5 us, 558.1 MB/s OK
Compression level: 7
comp(write): 2442.5 us, 818.8 MB/s Final bytes: 506016 Ratio: 4.14
decomp(read): 2812.5 us, 711.1 MB/s OK
Compression level: 8
comp(write): 2366.7 us, 845.0 MB/s Final bytes: 506016 Ratio: 4.14
decomp(read): 2819.0 us, 709.5 MB/s OK
Compression level: 9
comp(write): 4928.5 us, 405.8 MB/s Final bytes: 207086 Ratio: 10.13
decomp(read): 5828.6 us, 343.1 MB/s OK
Round-trip compr/decompr on 7.5 GB
Elapsed time: 20.5 s, 822.2 MB/s
BTW, I am not changing the block size in the benchmark because the current one (2 MB) is already a bit large for chunked datasets (for a hint on why small data chunks are important to us, see http://bcolz.blosc.org/).
Curiously enough, density works best without threading:
$ bench/bench density single 1 # use a single thread
Blosc version: 1.7.0.dev ($Date:: 2015-05-27 #$)
List of supported compressors in this build: blosclz,lz4,lz4hc,snappy,zlib,density
Supported compression libraries:
BloscLZ: 1.0.5
LZ4: 1.7.0
Snappy: 1.1.1
Zlib: 1.2.8
DENSITY: 0.12.5
Using compressor: density
Running suite: single
--> 1, 2097152, 8, 19, density
********************** Run info ******************************
Blosc version: 1.7.0.dev ($Date:: 2015-05-27 #$)
Using synthetic data with 19 significant bits (out of 32)
Dataset size: 2097152 bytes Type size: 8 bytes
Working set: 256.0 MB Number of threads: 1
********************** Running benchmarks *********************
memcpy(write): 513.8 us, 3892.3 MB/s
memcpy(read): 251.5 us, 7953.0 MB/s
Compression level: 0
comp(write): 292.3 us, 6841.5 MB/s Final bytes: 2097168 Ratio: 1.00
decomp(read): 267.0 us, 7491.4 MB/s OK
Compression level: 1
comp(write): 1974.5 us, 1012.9 MB/s Final bytes: 1125520 Ratio: 1.86
decomp(read): 1492.9 us, 1339.7 MB/s OK
Compression level: 2
comp(write): 1902.8 us, 1051.1 MB/s Final bytes: 1125520 Ratio: 1.86
decomp(read): 1507.6 us, 1326.6 MB/s OK
Compression level: 3
comp(write): 1918.5 us, 1042.5 MB/s Final bytes: 1125520 Ratio: 1.86
decomp(read): 1483.9 us, 1347.8 MB/s OK
Compression level: 4
comp(write): 1709.0 us, 1170.2 MB/s Final bytes: 1119824 Ratio: 1.87
decomp(read): 1265.1 us, 1580.9 MB/s OK
Compression level: 5
comp(write): 1706.0 us, 1172.3 MB/s Final bytes: 1119824 Ratio: 1.87
decomp(read): 1271.0 us, 1573.6 MB/s OK
Compression level: 6
comp(write): 2314.7 us, 864.0 MB/s Final bytes: 508336 Ratio: 4.13
decomp(read): 2700.3 us, 740.7 MB/s OK
Compression level: 7
comp(write): 2402.9 us, 832.3 MB/s Final bytes: 506016 Ratio: 4.14
decomp(read): 2859.0 us, 699.5 MB/s OK
Compression level: 8
comp(write): 2443.8 us, 818.4 MB/s Final bytes: 506016 Ratio: 4.14
decomp(read): 2844.4 us, 703.1 MB/s OK
Compression level: 9
comp(write): 4945.3 us, 404.4 MB/s Final bytes: 207086 Ratio: 10.13
decomp(read): 5818.2 us, 343.8 MB/s OK
Round-trip compr/decompr on 7.5 GB
Elapsed time: 16.9 s, 1001.4 MB/s
Not sure exactly why.
Regarding your suggestion of testing Blosc on actual data: well, the gist of it is to work as a compressor for binary data, where zero bytes are by far the most common. Also, the whole point of using the shuffle filter is to increase the probability of finding runs of zeroed bytes in the buffers.
The fact is that Blosc works pretty well in practice, as you can see for example in: https://www.youtube.com/watch?v=TZdqeEd7iTM or https://www.youtube.com/watch?v=kLP83HZvbfQ
That is very strange with regard to threading. On my test platform (Core i7, OS X), here is what I get:
1 thread
$ bench/bench density single 1
Blosc version: 1.7.0.dev ($Date:: 2015-05-27 #$)
List of supported compressors in this build: blosclz,lz4,lz4hc,snappy,zlib,density
Supported compression libraries:
BloscLZ: 1.0.5
LZ4: 1.7.0
Snappy: unknown
Zlib: 1.2.5
DENSITY: 0.12.6
Using compressor: density
Running suite: single
--> 1, 8388608, 8, 32, density
********************** Run info ******************************
Blosc version: 1.7.0.dev ($Date:: 2015-05-27 #$)
Using synthetic data with 32 significant bits (out of 32)
Dataset size: 8388608 bytes Type size: 8 bytes
Working set: 256.0 MB Number of threads: 1
********************** Running benchmarks *********************
memcpy(write): 2366.5 us, 3380.5 MB/s
memcpy(read): 1228.9 us, 6509.6 MB/s
Compression level: 0
comp(write): 1268.7 us, 6305.8 MB/s Final bytes: 8388624 Ratio: 1.00
decomp(read): 1374.2 us, 5821.7 MB/s OK
Compression level: 1
comp(write): 8289.4 us, 965.1 MB/s Final bytes: 4566672 Ratio: 1.84
decomp(read): 6334.8 us, 1262.9 MB/s OK
Compression level: 2
comp(write): 8155.4 us, 980.9 MB/s Final bytes: 4566672 Ratio: 1.84
decomp(read): 6509.8 us, 1228.9 MB/s OK
Compression level: 3
comp(write): 8433.1 us, 948.6 MB/s Final bytes: 4566672 Ratio: 1.84
decomp(read): 6459.7 us, 1238.4 MB/s OK
Compression level: 4
comp(write): 6900.0 us, 1159.4 MB/s Final bytes: 4511568 Ratio: 1.86
decomp(read): 4903.2 us, 1631.6 MB/s OK
Compression level: 5
comp(write): 6945.7 us, 1151.8 MB/s Final bytes: 4511568 Ratio: 1.86
decomp(read): 4941.9 us, 1618.8 MB/s OK
Compression level: 6
comp(write): 8646.8 us, 925.2 MB/s Final bytes: 3622608 Ratio: 2.32
decomp(read): 9722.9 us, 822.8 MB/s OK
Compression level: 7
comp(write): 7820.2 us, 1023.0 MB/s Final bytes: 3601120 Ratio: 2.33
decomp(read): 8835.1 us, 905.5 MB/s OK
Compression level: 8
comp(write): 7845.3 us, 1019.7 MB/s Final bytes: 3601120 Ratio: 2.33
decomp(read): 8817.7 us, 907.3 MB/s OK
Compression level: 9
comp(write): 21697.2 us, 368.7 MB/s Final bytes: 1887328 Ratio: 4.44
decomp(read): 23950.2 us, 334.0 MB/s OK
Round-trip compr/decompr on 7.5 GB
Elapsed time: 16.5 s, 1022.6 MB/s
2 threads
$ bench/bench density single 2
Blosc version: 1.7.0.dev ($Date:: 2015-05-27 #$)
List of supported compressors in this build: blosclz,lz4,lz4hc,snappy,zlib,density
Supported compression libraries:
BloscLZ: 1.0.5
LZ4: 1.7.0
Snappy: unknown
Zlib: 1.2.5
DENSITY: 0.12.6
Using compressor: density
Running suite: single
--> 2, 8388608, 8, 32, density
********************** Run info ******************************
Blosc version: 1.7.0.dev ($Date:: 2015-05-27 #$)
Using synthetic data with 32 significant bits (out of 32)
Dataset size: 8388608 bytes Type size: 8 bytes
Working set: 256.0 MB Number of threads: 2
********************** Running benchmarks *********************
memcpy(write): 2292.8 us, 3489.3 MB/s
memcpy(read): 1232.9 us, 6488.8 MB/s
Compression level: 0
comp(write): 1088.8 us, 7347.3 MB/s Final bytes: 8388624 Ratio: 1.00
decomp(read): 1307.0 us, 6120.7 MB/s OK
Compression level: 1
comp(write): 4619.7 us, 1731.7 MB/s Final bytes: 4566672 Ratio: 1.84
decomp(read): 3784.3 us, 2114.0 MB/s OK
Compression level: 2
comp(write): 4642.2 us, 1723.3 MB/s Final bytes: 4566672 Ratio: 1.84
decomp(read): 3688.3 us, 2169.0 MB/s OK
Compression level: 3
comp(write): 4585.2 us, 1744.7 MB/s Final bytes: 4566672 Ratio: 1.84
decomp(read): 3743.4 us, 2137.1 MB/s OK
Compression level: 4
comp(write): 3968.9 us, 2015.7 MB/s Final bytes: 4511568 Ratio: 1.86
decomp(read): 2929.8 us, 2730.5 MB/s OK
Compression level: 5
comp(write): 3946.0 us, 2027.4 MB/s Final bytes: 4511568 Ratio: 1.86
decomp(read): 2964.6 us, 2698.5 MB/s OK
Compression level: 6
comp(write): 5236.9 us, 1527.6 MB/s Final bytes: 3622608 Ratio: 2.32
decomp(read): 5659.9 us, 1413.5 MB/s OK
Compression level: 7
comp(write): 6199.0 us, 1290.5 MB/s Final bytes: 3601120 Ratio: 2.33
decomp(read): 6393.8 us, 1251.2 MB/s OK
Compression level: 8
comp(write): 6170.7 us, 1296.4 MB/s Final bytes: 3601120 Ratio: 2.33
decomp(read): 6286.6 us, 1272.5 MB/s OK
Compression level: 9
comp(write): 10581.0 us, 756.1 MB/s Final bytes: 1887328 Ratio: 4.44
decomp(read): 11585.6 us, 690.5 MB/s OK
Round-trip compr/decompr on 7.5 GB
Elapsed time: 9.9 s, 1699.6 MB/s
4 threads
$ bench/bench density single 4
Blosc version: 1.7.0.dev ($Date:: 2015-05-27 #$)
List of supported compressors in this build: blosclz,lz4,lz4hc,snappy,zlib,density
Supported compression libraries:
BloscLZ: 1.0.5
LZ4: 1.7.0
Snappy: unknown
Zlib: 1.2.5
DENSITY: 0.12.6
Using compressor: density
Running suite: single
--> 4, 8388608, 8, 32, density
********************** Run info ******************************
Blosc version: 1.7.0.dev ($Date:: 2015-05-27 #$)
Using synthetic data with 32 significant bits (out of 32)
Dataset size: 8388608 bytes Type size: 8 bytes
Working set: 256.0 MB Number of threads: 4
********************** Running benchmarks *********************
memcpy(write): 2379.6 us, 3362.0 MB/s
memcpy(read): 1199.0 us, 6672.4 MB/s
Compression level: 0
comp(write): 1090.6 us, 7335.2 MB/s Final bytes: 8388624 Ratio: 1.00
decomp(read): 1305.6 us, 6127.5 MB/s OK
Compression level: 1
comp(write): 2906.1 us, 2752.9 MB/s Final bytes: 4566672 Ratio: 1.84
decomp(read): 2453.8 us, 3260.3 MB/s OK
Compression level: 2
comp(write): 2772.4 us, 2885.6 MB/s Final bytes: 4566672 Ratio: 1.84
decomp(read): 2427.6 us, 3295.4 MB/s OK
Compression level: 3
comp(write): 2786.6 us, 2870.9 MB/s Final bytes: 4566672 Ratio: 1.84
decomp(read): 2404.4 us, 3327.3 MB/s OK
Compression level: 4
comp(write): 2714.1 us, 2947.5 MB/s Final bytes: 4511568 Ratio: 1.86
decomp(read): 2168.6 us, 3689.0 MB/s OK
Compression level: 5
comp(write): 2717.3 us, 2944.1 MB/s Final bytes: 4511568 Ratio: 1.86
decomp(read): 2152.0 us, 3717.5 MB/s OK
Compression level: 6
comp(write): 4490.2 us, 1781.7 MB/s Final bytes: 3622608 Ratio: 2.32
decomp(read): 4443.0 us, 1800.6 MB/s OK
Compression level: 7
comp(write): 4247.7 us, 1883.4 MB/s Final bytes: 3601120 Ratio: 2.33
decomp(read): 4253.4 us, 1880.9 MB/s OK
Compression level: 8
comp(write): 4250.4 us, 1882.2 MB/s Final bytes: 3601120 Ratio: 2.33
decomp(read): 4271.5 us, 1872.9 MB/s OK
Compression level: 9
comp(write): 11015.6 us, 726.2 MB/s Final bytes: 1887328 Ratio: 4.44
decomp(read): 12085.9 us, 661.9 MB/s OK
Round-trip compr/decompr on 7.5 GB
Elapsed time: 7.8 s, 2166.7 MB/s
So threading is visibly improving things, apart maybe from the 4-thread vs 2-thread Lion runs.
But after further comparisons, yes, you're right: it seems that snappy, for example, scales better with multithreading (it goes from 25.5 s with 1 thread to 8.2 s with 4 threads, which is 3 times faster).
BTW, there is a slight overhead in setting up a buffer in density, as buffer initialization involves some mallocs; that's why I had increased the block size. Maybe that's also the reason heavy multithreading is not helping a lot with small block sizes (the small overhead in setting up compression is probably what actually limits scalability).
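A toy Amdahl-style model makes that intuition concrete (the numbers are assumptions for illustration, not measurements): if a fixed fraction of each compress call is serial setup cost (e.g. a malloc), the achievable speedup flattens quickly as threads are added.

```python
def speedup(threads: int, setup: float, work: float) -> float:
    # Amdahl-style model: 'setup' is the fixed serial cost per call,
    # 'work' is the portion that parallelizes across threads.
    return (setup + work) / (setup + work / threads)

assert speedup(4, 0.0, 1.0) == 4.0  # no setup cost: ideal 4x scaling
assert speedup(4, 0.2, 1.0) < 2.7   # 20% setup: well short of 4x
```

Smaller blocks mean more calls and hence a larger setup fraction per byte, which matches the weaker scaling observed with small block sizes.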
I'll look into this further when I get more time; there might be a way to avoid all that overhead by slightly modifying the API, and I think it could be worth it in use cases like yours, so thanks for pointing it out 😄
In regards to Blosc and binary data: yes, I understand what you are trying to do! The only problem with random data is that you deny any obvious "patterns" in the non-zero data, patterns which inevitably appear when manipulating "human" data.
Since the function you're using is perfectly random, on one hand you'll get a predictable number of zeroes in an unpredictable order, but on the other hand all non-zero data will essentially be pattern-free, which is not very realistic.
What I mean is that Blosc could be very good with this synthetic data - I'm sure that's the case - but perform less well on real data, as it might "break" some non-zero patterns by "splitting" them, which could lead to a compression ratio downgrade.
For example, let's say you want to compress:
ABCDEABCDEABCDEABCDEABCD (24 symbols)
A good compression algorithm will spot the pattern and go:
ABCDEABCDEABCDEABCDEABCD => that's 4 x ABCDE plus 1 x ABCD => easy and efficient compression.
However, if you split it into 3 blocks of 8 (Blosc processing), you get: ABCDEABC DEABCDEA BCDEABCD
Now each individual block doesn't exhibit any obvious pattern, and the same compression algorithm will generate very poor results.
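The effect is easy to reproduce with a real codec (here zlib, with deliberately tiny 8-byte blocks to exaggerate it; actual Blosc blocks are far larger, so the penalty in practice is much smaller):

```python
import zlib

data = b"ABCDE" * 1000  # a long, obvious pattern

whole = len(zlib.compress(data))
# Compress each 8-byte block independently, as a block-splitting scheme would.
blocks = [data[i:i + 8] for i in range(0, len(data), 8)]
split = sum(len(zlib.compress(b)) for b in blocks)

# The whole buffer compresses to a handful of bytes; the per-block version
# cannot see the pattern and even ends up larger than the input.
print(whole, split)
```

The per-block total is inflated both by the lost pattern and by the fixed per-stream overhead of each compressed block.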
Yes, the malloc call inside density could be the root of the poor threading scalability. Thanks for being willing to tackle this.
Blosc does not shuffle using 8-byte blocks by default, but rather uses the size of the datatype being compressed (2 for short int, 4 for int and float32, 8 for long int and float64, and other sizes for structs too). Using the datatype size is critical, for the reasons explained in the talks above.
Regarding real data, you may want to have a look at this notebook:
http://nbviewer.ipython.org/github/Blosc/movielens-bench/blob/master/querying-ep14.ipynb
where real data is being used and where you can see that the compression ratio can reach 20x in this case. Also, you can see that some operations take less time (on a decent modern computer) on compressed datasets than on uncompressed ones.
Needs retesting with 0.14.0