Hmm, meanwhile I have the suspicion that there is some overhead per compress call and that it gets a bit inefficient for small block sizes. Is it due to thread startup overhead? Is it starting/stopping threads per compress call?
from python-blosc.
No, there is a pool of threads, so this should not add too much overhead. The performance problem is a bit hairy to describe because caches are strange beasts. My experience playing with buffers is pretty much summarized in the compute_blocksize() function (https://github.com/Blosc/c-blosc/blob/master/blosc/blosc.c#L819): 16 KB is the minimum per thread, so if you have, say, 4 cores, that would mean chunksizes of 64 KB.
Caches being complex creatures also means that it is difficult to document recommendations for users, other than to test with different chunksizes and numbers of threads. Sorry about that.
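As a back-of-envelope sketch of the figure above (this is not the real compute_blocksize() logic, which does considerably more):

```python
# Rough rule of thumb from the discussion above: ~16 KB per thread,
# so the chunksize should be at least 16 KB times the number of threads.
MIN_PER_THREAD = 16 * 1024

def min_useful_chunksize(nthreads):
    """Smallest chunksize that still gives every thread a ~16 KB block."""
    return MIN_PER_THREAD * nthreads

print(min_useful_chunksize(4))  # 65536, i.e. 64 KB for 4 cores
```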
In addition, it is worth mentioning that an LZ77 compressor works by looking at previously seen data. If the blocks are small, there are boundaries that the compressor cannot traverse, meaning many small blocks will likely compress worse overall than larger ones. Also, there is always a fixed overhead per block in the form of a header, and fewer blocks mean fewer headers. So overall, choosing a good blocksize is an art, hence the 'expert only' documentation.
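Both effects are easy to reproduce outside of blosc. Here is a small stand-in experiment with the stdlib zlib (also an LZ77-family compressor), compressing the same repetitive buffer as one block versus many small blocks; the numbers are illustrative only:

```python
import zlib

# Highly repetitive payload: an LZ77 coder thrives on repeats it can reference.
data = b"0123456789abcdef" * 1024  # 16 KiB

# One big block: the whole history is visible to the compressor.
one_block = zlib.compress(data)

# Many 64-byte blocks: repeats across block boundaries are invisible to the
# compressor, and every block pays its own stream header/checksum overhead.
small_blocks = [zlib.compress(data[i:i + 64]) for i in range(0, len(data), 64)]
total_small = sum(len(b) for b in small_blocks)

print(len(one_block), total_small)  # the per-block total is much larger
```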
Having said that, feel free to open a pull-request updating the docstring if you like.
Regarding the new release, it is being prepped already and should be out soon.
Thanks for working on the new release and for the explanations.
Yes, I see that the small chunks are increasing overhead; I'll see if it makes sense to increase the chunksize in attic. It would decrease overhead in other places too, but an increased chunksize might mean less deduplication, because larger chunks are less likely to be duplicates, so it's tricky...
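As a toy illustration of that trade-off (fixed-size chunking only; attic's actual chunker is more sophisticated), smaller chunks find more duplicates on repetitive data:

```python
import hashlib

data = b"0123456789" * 100  # 1000 bytes with a 10-byte period

def chunk_stats(chunksize):
    """Return (total chunks, unique chunks) for fixed-size chunking."""
    chunks = [data[i:i + chunksize] for i in range(0, len(data), chunksize)]
    unique = {hashlib.sha256(c).digest() for c in chunks}
    return len(chunks), len(unique)

total_small, unique_small = chunk_stats(10)   # chunk == one period: full dedup
total_large, unique_large = chunk_stats(333)  # chunks out of phase: no dedup
print(total_small, unique_small, total_large, unique_large)  # 100 1 4 4
```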
BTW, I did some benchmarks, and on my test data lz4 was about as fast as no compression. A bit strange: the lz4 compression level didn't change the output size, and level 9 (65s) even seemed slightly faster than level 1 (69s), but both produced 3.79 GB of compressed data.
I was somehow wondering how much overhead constructing the bytes object it wants adds (I have a memoryview as "data"):

    def compress(self, data):
        # bytes(data) copies the memoryview's buffer; the second argument is the typesize
        return blosc.compress(bytes(data), 1,
                              cname=self.CNAME, clevel=self._get_level())

I didn't find another way, especially not how to give it a pointer and a length.
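For what it's worth, bytes(memoryview) does make a full, independent copy of the underlying buffer, as a quick check shows:

```python
buf = bytearray(b"attic chunk data")
view = memoryview(buf)

copy = bytes(view)    # allocates and copies len(buf) bytes
buf[0:5] = b"XXXXX"   # mutate the original buffer afterwards

print(copy)  # b'attic chunk data' -- the copy is unaffected
```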
@ThomasWaldmann would blosc.compress_ptr() not help?
http://python-blosc.blosc.org/tutorial.html#compressing-from-a-data-pointer
The example is for a numpy array, but with ctypes you may be able to make it work with strings as well.
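For reference, here is one way to get a raw address out of a writable Python buffer with ctypes; the commented-out compress_ptr call is only sketched from the tutorial linked above, so treat its exact arguments as an assumption:

```python
import ctypes

buf = bytearray(b"some chunk of data")

# Map the bytearray into a ctypes array without copying, then take its address.
c_buf = (ctypes.c_char * len(buf)).from_buffer(buf)
address = ctypes.addressof(c_buf)

# Sanity check: the address really points at the original bytes.
assert ctypes.string_at(address, len(buf)) == bytes(buf)

# With python-blosc one could then try something like (untested sketch):
#   blosc.compress_ptr(address, len(buf), typesize=1,
#                      clevel=5, cname='lz4')
```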
I had a look at compress_ptr and also at memoryview's docs, but there is no (pure) Python way to get the address of the data in a memoryview. I don't use numpy, but thanks for the tip about using ctypes.
But: I think there should be an easier way; most Python devs won't invoke ctypes just to get at some pointer. See also the ticket I opened, maybe memoryviews could be supported better.
Agreed, supporting memoryviews would be cool. If you feel like you can contribute a PR for this, that would be fantastic.
Proposed fix for this in #81
Closing because this has been open for too long.
Feel free to reopen if you think the issue persists.
Here is the patch if you want to resurrect it:
commit 25cd5871d5732d8c29c13d92a6381cf2ef4d515f
Author: Valentin Haenel <[email protected]>
Date: Sat Mar 28 22:52:05 2015 +0100
update docs for set_blocksize, fixes #76
diff --git a/blosc/toplevel.py b/blosc/toplevel.py
index 82932d0e36..8f51afd81c 100644
--- a/blosc/toplevel.py
+++ b/blosc/toplevel.py
@@ -98,13 +98,25 @@ def set_nthreads(nthreads):
 def set_blocksize(blocksize):
     """set_blocksize(blocksize)

-    Force the use of a specific blocksize. If 0, an automatic
+    Force the use of a specific blocksize in bytes. If 0, an automatic
     blocksize will be used (the default).

     Notes
     -----
-    This is a low-level function and is recommened for expert users only.
+    This is a low-level function and is recommended for expert users only.
+
+    Changing the blocksize can have a profound effect on the performance of
+    blosc. If the blocksize is too large, each block may no longer fit into
+    the CPU caches, rendering the blocking technique ineffective. For example,
+    a block may have to travel to and from memory twice: once when applying
+    the shuffle filter and a second time for the actual compression. Also,
+    for a large blocksize, blosc may not be able to split the input,
+    depending on its size, which in turn means no multithreading. If the
+    blocksize is too small, the amount of constant overhead is increased,
+    since each block must store a header with information about its
+    compressed size. Additionally, LZ77-style compressors may not reach the
+    same compression ratio as with larger blocks, since their internal
+    dictionary cannot be reused across block boundaries.

     Examples
     --------