bitshuffle's People

Contributors

anthchirp, dependabot[bot], dimitripapadopoulos, dstndstn, fleon-psi, guyuqi, hacktoday, james-s-willis, jrs65, ketiltrout, kif, kiyo-masui, maropu, millak, nritsche, odaira, satarsa, sileht, simongregorebner, t20100, toddlipcon, uellue, vasole, weninc, yayahjb

bitshuffle's Issues

bitshuffle filter plugin should not be linked with a particular version of HDF5

Hello,

According to section 4.3 of the documentation at:

https://www.hdfgroup.org/HDF5/doc/Advanced/DynamicallyLoadedFilters/HDF5DynamicallyLoadedFilters.pdf

a filter plugin should not need to link against the HDF5 library. Indeed, after removing the bshuf_register_h5filter function (or replacing its body with its error return value, to show that it is not called or needed) and redefining PUSH_ERR to avoid a call into HDF5, one can use the filter for decompression. I have not tested whether the HDF5 call in bshuf_h5_set_local is actually needed, and I do not know its purpose.

It would be desirable for the bitshuffle HDF5 filter plugin to behave like other filters (e.g. LZ4), which do not bind to a particular version of HDF5.

Thanks for your time,

Armando

[question] New release

Hi everyone!

Is there any plan for a new release? It seems the last release was on Nov 5, 2018.

Provide an option allowing bitshuffle with compression enabled to abort when the output would exceed the input size in bytes

Allow callers to request that bitshuffle, when embedded compression (e.g. LZ4) is enabled, abort when the size of the compressed output exceeds the size of the raw input data. This would let the bitshuffle filter with LZ4 compression enabled conform to the description of the H5Z_FLAG_OPTIONAL flag in the [Defining and Querying the Filter Pipeline](https://docs.hdfgroup.org/hdf5/develop/_f_i_l_t_e_r.html) section of the libhdf5 manual.


Values for flags     Description

H5Z_FLAG_OPTIONAL    If this bit is set then the filter is optional. If the filter fails (see below) during an
                     H5Dwrite() operation then the filter is just excluded from the pipeline for the chunk for
                     which it failed; the filter will not participate in the pipeline during an H5Dread() of
                     the chunk. This is commonly used for compression filters: if the compression result would
                     be larger than the input then the compression filter returns failure and the uncompressed
                     data is stored in the file. If this bit is clear and a filter fails then the H5Dwrite()
                     or H5Dread() also fails.

At least to me, it would be more natural, and thus expected, for data to be stored compressed in an HDF5 file only when compression actually reduces its size. Further, I do not consider it the application's task to decide whether or not to compress a dataset. At the application level it would always be a wild guess whether the data is compressible and will therefore need fewer bytes in compressed form than in its uncompressed representation. This decision can only be made while actually compressing the data, by checking whether the extra bytes needed for the header, housekeeping, code tables and other overhead leave the resulting chunk smaller than the input or cause it to exceed the input size.

An example: an array a = np.array([2, 3], dtype=np.float32) occupies exactly 8 bytes in uncompressed form, excluding metadata. When compressed with bitshuffle + LZ4, the 8 bytes of data end up in the HDF5 file with the following storage layout, as reported by h5dump:

      STORAGE_LAYOUT {
         CHUNKED ( 2 )
         SIZE 20 (0.400:1 COMPRESSION)
      }

If I read that correctly, the compressed array has expanded by a factor of 2.5 to 20 bytes, so each raw byte occupies 2.5 bytes in the compressed output. I admit this example is very artificial, but poorly compressible data may, even when preprocessed by the bitshuffle filter, take up more space in the HDF5 file (excluding metadata) after the compression filter is applied than when stored in its original form.
Take the gzip filter as an example. The H5Z_filter_deflate function, which implements the actual filter in H5Zdeflate.c from line 155 down, allocates an output buffer with the same nbytes size as the input data. When zlib's compress2 returns Z_BUF_ERROR, indicating that the whole output buffer has been used while some bytes remain to be processed and stored, compression is aborted because nbytes would be exceeded.

I guess there are special situations where compression has to be used, or where its use makes sense regardless of whether the bytes stored in a dataset are compressible. It would therefore be good to have a choice: whether the bitshuffle filter with embedded compression enabled (LZ4 or any other supported codec) should abort when the compressed output, including the header and housekeeping bytes required by the filter, exceeds the input size, or should emit compressed output in any case.
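The requested behaviour can be sketched in a few lines. This is only an illustration under stated assumptions, not bitshuffle's filter code: zlib stands in for bitshuffle + LZ4, and compress_or_fail and store_chunk are hypothetical names. Returning None plays the role of an optional HDF5 filter reporting failure, after which the raw chunk is stored.

```python
import zlib
from typing import Optional


def compress_or_fail(raw: bytes) -> Optional[bytes]:
    """Return compressed bytes, or None when compression does not pay off.

    zlib is a stand-in compressor; None mimics the 'filter failed'
    result of a filter registered with H5Z_FLAG_OPTIONAL.
    """
    packed = zlib.compress(raw)
    if len(packed) >= len(raw):
        return None  # the caller stores the chunk uncompressed
    return packed


def store_chunk(raw: bytes) -> bytes:
    """Keep whichever representation is smaller: raw or compressed."""
    packed = compress_or_fail(raw)
    return raw if packed is None else packed


# The 8 bytes of np.array([2, 3], dtype=np.float32) in little-endian order.
tiny = bytes([0, 0, 0, 64, 0, 0, 64, 64])
# A highly compressible chunk of zeros.
large = bytes(4096)
```

With this policy the 8-byte chunk is stored as-is, while the 4096-byte chunk of zeros is stored compressed.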

error while running bitshuffle dynamic plugin

Hello,
I get the same error (h5py/h5py#923 (comment)) when running the sample code from 'h5ex_d_bshuf.c' in the following environment:

HDF5: 1.12.0 (built with cmake)
BitShuffle:
a. master built with above HDF5 or
b. release h5pl-1.12.0-win64
Visual Studio 2019, x64
Windows 10

I followed all the instructions given at https://portal.hdfgroup.org/display/support/HDF5+Filter+Plugins
Other plugin filters such as lzf and lz4 work as expected.

Here is the error stack for reference...

BitShuffle filter is available for encoding and decoding.
....Create dataset ................
HDF5-DIAG: Error detected in HDF5 (1.12.0) thread 0:
#000: C:\Projects\HDF5\CMake-hdf5-1.12.0\hdf5-1.12.0\src\H5Pocpl.c line 1027 in H5Pget_filter_by_id2(): can't find object for ID
major: Object atom
minor: Unable to find atom information (already closed?)
#1: C:\Projects\HDF5\CMake-hdf5-1.12.0\hdf5-1.12.0\src\H5Pint.c line 4015 in H5P_object_verify(): property list is not a member of the class
major: Property lists
minor: Unable to register new atom
#2: C:\Projects\HDF5\CMake-hdf5-1.12.0\hdf5-1.12.0\src\H5Pint.c line 3965 in H5P_isa_class(): not a property list
major: Invalid arguments to routine
minor: Inappropriate type
HDF5-DIAG: Error detected in HDF5 (1.12.0) thread 0:
#000: C:\Projects\HDF5\CMake-hdf5-1.12.0\hdf5-1.12.0\src\H5D.c line 151 in H5Dcreate2(): unable to create dataset
major: Dataset
minor: Unable to initialize object
#1: C:\Projects\HDF5\CMake-hdf5-1.12.0\hdf5-1.12.0\src\H5VLcallback.c line 1869 in H5VL_dataset_create(): dataset create failed
major: Virtual Object Layer
minor: Unable to create file
#2: C:\Projects\HDF5\CMake-hdf5-1.12.0\hdf5-1.12.0\src\H5VLcallback.c line 1835 in H5VL__dataset_create(): dataset create failed
major: Virtual Object Layer
minor: Unable to create file
#3: C:\Projects\HDF5\CMake-hdf5-1.12.0\hdf5-1.12.0\src\H5VLnative_dataset.c line 75 in H5VL__native_dataset_create(): unable to create dataset
major: Dataset
minor: Unable to initialize object
#4: C:\Projects\HDF5\CMake-hdf5-1.12.0\hdf5-1.12.0\src\H5Dint.c line 411 in H5D__create_named(): unable to create and link to dataset
major: Dataset
minor: Unable to initialize object
#5: C:\Projects\HDF5\CMake-hdf5-1.12.0\hdf5-1.12.0\src\H5L.c line 1804 in H5L_link_object(): unable to create new link to object
major: Links
minor: Unable to initialize object
#6: C:\Projects\HDF5\CMake-hdf5-1.12.0\hdf5-1.12.0\src\H5L.c line 2045 in H5L__create_real(): can't insert link
major: Links
minor: Unable to insert object
#7: C:\Projects\HDF5\CMake-hdf5-1.12.0\hdf5-1.12.0\src\H5Gtraverse.c line 855 in H5G_traverse(): internal path traversal failed
major: Symbol table
minor: Object not found
#8: C:\Projects\HDF5\CMake-hdf5-1.12.0\hdf5-1.12.0\src\H5Gtraverse.c line 630 in H5G__traverse_real(): traversal operator failed
major: Symbol table
minor: Callback failed
#9: C:\Projects\HDF5\CMake-hdf5-1.12.0\hdf5-1.12.0\src\H5L.c line 1851 in H5L__link_cb(): unable to create object
major: Links
minor: Unable to initialize object
#10: C:\Projects\HDF5\CMake-hdf5-1.12.0\hdf5-1.12.0\src\H5Oint.c line 2522 in H5O_obj_create(): unable to open object
major: Object header
minor: Can't open object
#11: C:\Projects\HDF5\CMake-hdf5-1.12.0\hdf5-1.12.0\src\H5Doh.c line 301 in H5O__dset_create(): unable to create dataset
major: Dataset
minor: Unable to initialize object
#12: C:\Projects\HDF5\CMake-hdf5-1.12.0\hdf5-1.12.0\src\H5Dint.c line 1305 in H5D__create(): unable to set local filter parameters
major: Dataset
minor: Unable to initialize object
#13: C:\Projects\HDF5\CMake-hdf5-1.12.0\hdf5-1.12.0\src\H5Z.c line 935 in H5Z_set_local(): local filter parameters not set
major: Data filters
minor: Error from filter 'set local' callback
#14: C:\Projects\HDF5\CMake-hdf5-1.12.0\hdf5-1.12.0\src\H5Z.c line 864 in H5Z_prepare_prelude_callback_dcpl(): unable to apply filter
major: Data filters
minor: Error from filter 'can apply' callback
#15: C:\Projects\HDF5\CMake-hdf5-1.12.0\hdf5-1.12.0\src\H5Z.c line 779 in H5Z_prelude_callback(): error during user callback
major: Data filters
minor: Error from filter 'set local' callback
failed to create dataset.

Kindly suggest a solution to make the bitshuffle algorithm work in my environment.
Thanks.

error: conflicting types for ‘uint64_t’

Hi,

When trying to install imagecodecs using python3 -m pip install --user --global-option=build_ext --global-option="-I/work/fawx493/.software/include" --global-option="-L/work/fawx493/.software/lib64" imagecodecs, I receive an error: conflicting types for ‘uint64_t’, which I believe might be related to bitshuffle.

More precisely, I obtain

gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -O3 -march=native -O3 -march=native -I/sw/env/gcc-8.3.0/openmpi/3.1.4/include -I/sw/compiler/gcc-8.3.0/include -I/usr/include -I/sw/env/gcc-8.3.0_openmpi-3.1.4/pkgsrc/2019Q1/include -I/sw/env/gcc-8.3.0_openmpi-3.1.4/pkgsrc/2019Q1/include/ncurses -I/sw/env/gcc-8.3.0_openmpi-3.1.4/pkgsrc/2019Q1/include/db4 -O3 -march=native -O3 -march=native -I/sw/env/gcc-8.3.0/openmpi/3.1.4/include -I/sw/compiler/gcc-8.3.0/include -I/usr/include -I/sw/env/gcc-8.3.0_openmpi-3.1.4/pkgsrc/2019Q1/include -I/sw/env/gcc-8.3.0_openmpi-3.1.4/pkgsrc/2019Q1/include/ncurses -I/sw/env/gcc-8.3.0_openmpi-3.1.4/pkgsrc/2019Q1/include/db4 -O3 -march=native -O3 -march=native -I/sw/env/gcc-8.3.0/openmpi/3.1.4/include -I/sw/compiler/gcc-8.3.0/include -I/usr/include -I/sw/env/gcc-8.3.0_openmpi-3.1.4/pkgsrc/2019Q1/include -I/sw/env/gcc-8.3.0_openmpi-3.1.4/pkgsrc/2019Q1/include/ncurses -I/sw/env/gcc-8.3.0_openmpi-3.1.4/pkgsrc/2019Q1/include/db4 -fPIC -Iimagecodecs -Ibitshuffle-0.3.5 -I/work/fawx493/.software/include -I/sw/env/gcc-8.3.0_openmpi-3.1.4/pkgsrc/2019Q1/include/python3.6 -I/sw/env/gcc-8.3.0_openmpi-3.1.4/pkgsrc/2019Q1/lib/python3.6/site-packages/numpy/core/include -c imagecodecs/_bitshuffle.c -o build/temp.linux-x86_64-3.6/imagecodecs/_bitshuffle.o
    In file included from /sw/env/gcc-8.3.0_openmpi-3.1.4/pkgsrc/2019Q1/lib/python3.6/site-packages/numpy/core/include/numpy/ndarraytypes.h:1822:0,
                     from /sw/env/gcc-8.3.0_openmpi-3.1.4/pkgsrc/2019Q1/lib/python3.6/site-packages/numpy/core/include/numpy/ndarrayobject.h:12,
                     from /sw/env/gcc-8.3.0_openmpi-3.1.4/pkgsrc/2019Q1/lib/python3.6/site-packages/numpy/core/include/numpy/arrayobject.h:4,
                     from imagecodecs/_bitshuffle.c:598:
    /sw/env/gcc-8.3.0_openmpi-3.1.4/pkgsrc/2019Q1/lib/python3.6/site-packages/numpy/core/include/numpy/npy_1_7_deprecated_api.h:17:2: warning: #warning "Using deprecated NumPy API, disable it with " "#define NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-Wcpp]
     #warning "Using deprecated NumPy API, disable it with " \
      ^
    In file included from bitshuffle-0.3.5/bitshuffle.h:32:0,
                     from imagecodecs/_bitshuffle.c:602:
    bitshuffle-0.3.5/bitshuffle_core.h:40:31: error: conflicting types for ‘uint64_t’
       typedef unsigned long long  uint64_t;
                                   ^
    In file included from /usr/lib/gcc/x86_64-redhat-linux/4.8.5/include/stdint.h:9:0,
                     from /usr/include/inttypes.h:27,
                     from /sw/env/gcc-8.3.0_openmpi-3.1.4/pkgsrc/2019Q1/include/python3.6/pyport.h:6,
                     from /sw/env/gcc-8.3.0_openmpi-3.1.4/pkgsrc/2019Q1/include/python3.6/Python.h:53,
                     from imagecodecs/_bitshuffle.c:4:
    /usr/include/stdint.h:55:27: note: previous declaration of ‘uint64_t’ was here
     typedef unsigned long int uint64_t;
                               ^
    In file included from bitshuffle-0.3.5/bitshuffle.h:32:0,
                     from imagecodecs/_bitshuffle.c:602:
    bitshuffle-0.3.5/bitshuffle_core.h:41:31: error: conflicting types for ‘int64_t’
       typedef long long           int64_t;
                                   ^
    In file included from /usr/include/stdlib.h:314:0,
                     from /sw/env/gcc-8.3.0_openmpi-3.1.4/pkgsrc/2019Q1/include/python3.6/Python.h:34,
                     from imagecodecs/_bitshuffle.c:4:
    /usr/include/sys/types.h:197:1: note: previous declaration of ‘int64_t’ was here
     __intN_t (64, __DI__);
     ^
    error: command 'gcc' failed with exit status 1
    ----------------------------------------
ERROR: Command errored out with exit status 1: /sw/link/python/3.6.8/bin/python3 -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/work/tmp/node001.2020-05-18-202016.fawx493.15709/pip-install-aae6jh56/imagecodecs/setup.py'"'"'; __file__='"'"'/work/tmp/node001.2020-05-18-202016.fawx493.15709/pip-install-aae6jh56/imagecodecs/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' build_ext -I/work/fawx493/.software/include -L/work/fawx493/.software/lib64 install --record /work/tmp/node001.2020-05-18-202016.fawx493.15709/pip-record-w5odwqxj/install-record.txt --single-version-externally-managed --user --prefix= --compile --install-headers /home/fawx493/.local/include/python3.6/imagecodecs Check the logs for full command output.

Any advice on how to avoid this conflicting type error?
Thanks!

sdist is incomplete

When running python setup.py sdist, the produced tarball is missing the header files.

Drop hdf5 requirement for core library

Some people might want to use Bitshuffle without installing HDF5 or h5py. It should be pretty straightforward to make the HDF5-related bits an "extra".

OpenMP parallelization for top-level routines.

This shouldn't be very hard to implement. Bitshuffle is already much faster than disk speed, but this will be useful for in-memory applications and for cases where the underlying file is cached.

ABI compatibility of the C library and versioned soname

Hello folks! We are attempting to package the Bitshuffle C library under the src/ directory for Gentoo because it is an indirect dependency of Log4j 2.17 (gentoo/gentoo#23438). As distribution maintainers, we would like to have a versioned soname for the C library that indicates its ABI compatibility (e.g. libbitshuffle.so.0). We are not the only herd who would like a versioned soname; on Fedora, it is mandatory (https://docs.fedoraproject.org/en-US/packaging-guidelines/#_downstream_so_name_versioning).

To help us package the C library for various GNU/Linux distributions, so that more users will be able to install and use it more easily, would you please start assigning versioned sonames to this library's releases? On behalf of our users and fellow maintainers, I appreciate your effort to focus on the C library's ABI compatibility. Thank you very much!

Bitshuffle with Anaconda import problem - Library not loaded

I have the following problem when using bitshuffle with Anaconda.
The import from bitshuffle import h5 fails with the following error.

.../anaconda/lib/python2.7/site-packages/bitshuffle/h5.so, 2): Library not loaded: libhdf5.8.dylib
  Referenced from: .../anaconda/lib/python2.7/site-packages/bitshuffle/h5.so
  Reason: image not found

bitshuffle was correctly installed from git source.

My system:

  • Mac OS X 10.9
  • Python 2.7.9
  • Anaconda 2.1.0 (x86_64)

Do you have a quick fix for it?

Build issue on OS X

I get a build issue when trying to get this going on OS X.

$ python setup.py install
...
ld: warning: directory not found for option '-L/opt/local/lib'
duplicate symbol _bshuf_H5Filter in:
    build/temp.macosx-10.9-x86_64-2.7/bitshuffle/h5.o
    build/temp.macosx-10.9-x86_64-2.7/src/bshuf_h5filter.o
ld: 1 duplicate symbol for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
error: command 'gcc' failed with exit status 1

Any ideas what's going on?

-Ofast option prevents compilation

Hello,

We are using gcc 4.4.7 under Centos 6.7 64-bit, and while I was trying to install the latest code (pulled this morning 2016-03-23) with "pip install bitshuffle", I got the following:

gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DBSHUF_VERSION_MAJOR=0 -DBSHUF_VERSION_MINOR=2 -DBSHUF_VERSION_POINT=1 -I/home/aishimaj/autodataset/lib/python2.7/site-packages/numpy/core/include -Isrc/ -Ilz4/ -I/beamline/python/include/python2.7 -c bitshuffle/ext.c -o build/temp.linux-x86_64-2.7/bitshuffle/ext.o -Ofast -march=native -std=c99 -fopenmp -fno-strict-aliasing
cc1: error: invalid option argument ‘-Ofast’
error: command 'gcc' failed with exit status 1

Another GitHub issue shows that an alternative optimization argument can be used: macs3-project/MACS#91

I followed their directions for the change at the very bottom and it worked for me.

Basically, any configuration that does not require us to poke around with the setup file would be appreciated!
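One way the build script could avoid manual edits is to choose the optimization flag from the compiler version. The sketch below is an assumption-laden illustration, not code from bitshuffle's setup.py: optimization_flag is a hypothetical helper, and it relies on -Ofast having been introduced in GCC 4.6, falling back to -O3 for older compilers.

```python
def optimization_flag(gcc_version: str) -> str:
    """Pick an optimization flag that the given GCC version accepts.

    gcc_version is a string such as "4.4.7" or "10" (for example the
    output of `gcc -dumpversion`). -Ofast exists from GCC 4.6 onwards;
    older compilers get -O3 instead.
    """
    # Pad with "0" so that a bare major version such as "10" still parses.
    major, minor = (int(part) for part in (gcc_version.split(".") + ["0"])[:2])
    return "-Ofast" if (major, minor) >= (4, 6) else "-O3"
```

For the gcc 4.4.7 in this report the helper would select -O3.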

Jun Aishima
Australian Synchrotron/Monash University

License issues

I am the author of the Blosc compression library (http://blosc.org), and I have just seen your work on bitshuffle, which I find pretty nice. I might be interested in including your code in Blosc, but bitshuffle's choice of the GPL conflicts with the MIT license used in Blosc.

Would you be willing to change bitshuffle's license so that parts of your code could be easily included in other MIT/BSD projects?

Thanks!

Single byte dtypes

Apologies for the naive question. I have large arrays with int8 dtype in which most values are 0, 1 or 2. Can bitshuffle improve compression of arrays with a single-byte dtype?
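A small pure-Python sketch may help frame the question. It is not bitshuffle's implementation and makes no claim about its exact output layout; bit_transpose_u8 is a hypothetical helper that gathers bit 0 of every one-byte element, then bit 1, and so on. For values limited to 0, 1 and 2, only the two lowest bit planes carry information, so the remaining six planes become runs of zero bytes that a compressor handles easily.

```python
def bit_transpose_u8(data: bytes) -> bytes:
    """Gather bit 0 of every element, then bit 1, ... (1-byte elements).

    len(data) must be a multiple of 8 so each bit plane fills whole bytes.
    """
    if len(data) % 8:
        raise ValueError("length must be a multiple of 8")
    out = bytearray()
    for bit in range(8):
        for start in range(0, len(data), 8):
            packed = 0
            # Pack the chosen bit of 8 consecutive elements into one byte.
            for offset, value in enumerate(data[start:start + 8]):
                packed |= ((value >> bit) & 1) << offset
            out.append(packed)
    return bytes(out)


values = bytes([0, 1, 2, 1, 0, 2, 2, 1] * 4)  # int8-like data in {0, 1, 2}
planes = bit_transpose_u8(values)
# planes[:8] holds bit planes 0 and 1; planes[8:] holds planes 2..7, all zero.
```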

Select AVX, SSE, or non-optimized code at runtime

For cases where users are not compiling bitshuffle themselves, but rather using it in a shipped product, it's difficult to know ahead of time whether the target architecture will support these instruction sets.

To get optimal runtime performance, bitshuffle should auto-detect the runtime architecture and pick the appropriate algorithm, either by an indirect branch, or by using gcc's 'ifunc' feature.
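The dispatch idea can be illustrated with a short sketch. Everything here is an assumption made for illustration: the real change would live in bitshuffle's C code, the kernel table holds placeholders rather than real shuffle routines, and cpu_flags only probes /proc/cpuinfo on Linux (returning an empty set elsewhere). The point is that the implementation is chosen once, at startup, and called indirectly afterwards.

```python
from typing import Callable, Dict, Set


def cpu_flags() -> Set[str]:
    """Best-effort feature probe: parse /proc/cpuinfo on Linux, else empty."""
    try:
        with open("/proc/cpuinfo") as handle:
            for line in handle:
                if line.startswith("flags"):
                    return set(line.split(":", 1)[1].split())
    except OSError:
        pass
    return set()


def select_kernel(flags: Set[str], kernels: Dict[str, Callable]) -> Callable:
    """Return the best available kernel: AVX2, then SSE2, then scalar."""
    for feature in ("avx2", "sse2"):
        if feature in flags and feature in kernels:
            return kernels[feature]
    return kernels["scalar"]


# Placeholder kernels that only report their own name.
KERNELS = {
    "avx2": lambda: "avx2",
    "sse2": lambda: "sse2",
    "scalar": lambda: "scalar",
}

# Chosen once; later calls go through this single indirect reference.
shuffle_impl = select_kernel(cpu_flags(), KERNELS)
```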

Ubuntu 16.04 install issue

I am having trouble getting bitshuffle to work on a new 16.04 machine. When I run a test, I get the following error:

In [11]: h5.create_dataset(f, "range", shape, dtype, chunks,
    ...:                   filter_pipeline=(32008, 32000),
    ...:                   filter_flags=(h5z.FLAG_MANDATORY, h5z.FLAG_MANDATORY),
    ...:                   filter_opts=None)
/usr/local/bin/ipython:4: DeprecationWarning: numpy boolean negative, the `-` operator, is deprecated, use the `~` operator or the logical_not function instead.
  import re
HDF5-DIAG: Error detected in HDF5 (1.10.0-patch1) thread 0:
  #000: H5Pocpl.c line 1039 in H5Pget_filter_by_id2(): can't find object for ID
    major: Object atom
    minor: Unable to find atom information (already closed?)
  #001: H5Pint.c line 3817 in H5P_object_verify(): property list is not a member of the class
    major: Property lists
    minor: Unable to register new atom
  #002: H5Pint.c line 3767 in H5P_isa_class(): not a property list
    major: Invalid arguments to routine
    minor: Inappropriate type

I am using Python 2.7.12, h5py v2.6.0 and HDF5 v1.10.0-patch1. I received the same error message when using the stock libhdf5 from apt (apt-get install libhdf5-serial-dev). Bitshuffle was installed both with and without the --h5plugin flag.

Any idea what is causing this, or a plan of attack?

Compound bitshuffle/LZF compression function.

Would be great if we could make a function that both bitshuffles and LZF compresses an array, along with the reverse operation.

It also makes sense to add a .npy like header to the buffer so that it can be transparently decompressed.
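A minimal sketch of such a self-describing buffer follows. It is an illustration only: zlib stands in for bitshuffle + LZF, and the magic bytes, header layout and the names pack and unpack are all hypothetical rather than an existing bitshuffle format.

```python
import json
import struct
import zlib
from typing import Tuple

MAGIC = b"BSHF"  # hypothetical magic bytes, not a real on-disk format


def pack(raw: bytes, dtype: str, shape: Tuple[int, ...]) -> bytes:
    """Prefix the compressed payload with a small dtype/shape header."""
    header = json.dumps({"dtype": dtype, "shape": list(shape)}).encode("ascii")
    return MAGIC + struct.pack("<I", len(header)) + header + zlib.compress(raw)


def unpack(blob: bytes) -> Tuple[bytes, str, Tuple[int, ...]]:
    """Reverse pack(): read the header, then decompress the payload."""
    if blob[:4] != MAGIC:
        raise ValueError("not a packed buffer")
    (header_len,) = struct.unpack_from("<I", blob, 4)
    header = json.loads(blob[8:8 + header_len].decode("ascii"))
    raw = zlib.decompress(blob[8 + header_len:])
    return raw, header["dtype"], tuple(header["shape"])
```

Because the header carries the dtype and shape, a reader needs nothing but the buffer itself to rebuild the array.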

Consolidate SSE optimized shuffle with Blosc.

Both Bitshuffle and Blosc/c-blosc@b37ca0b implement optimized (SSE2) versions of shuffle for 16, 32, and 64 bit element sizes. In Bitshuffle these routines are bshuf_trans_byte_elem_*. The operation counts for the two implementations appear to be the same, but we should check which versions are fastest and consolidate them.

Bitshuffle also has optimized code for when the element size is a multiple of 32 or 64 bits, which is useful for compound data types and could benefit Blosc.

In HDF5 filter, copy leftover bytes.

Right now the filter is not invertible if nbytes % elem_size != 0, which could happen if someone mistakenly put filters in the wrong order. Fix this with a trailing memcpy.
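The proposed fix can be sketched as follows. This is an illustration, not the filter's C code: a plain byte shuffle stands in for the real transform, and shuffle_with_tail and unshuffle_with_tail are hypothetical names. Only whole elements are transformed; the leftover bytes are copied through unchanged, which keeps the operation invertible for any input length.

```python
def shuffle_with_tail(data: bytes, elem_size: int) -> bytes:
    """Byte-shuffle the whole elements and copy leftover bytes verbatim."""
    count = len(data) // elem_size
    body, tail = data[:count * elem_size], data[count * elem_size:]
    # Gather byte 0 of every element, then byte 1, and so on.
    shuffled = b"".join(body[i::elem_size] for i in range(elem_size))
    return shuffled + tail  # the "trailing memcpy"


def unshuffle_with_tail(data: bytes, elem_size: int) -> bytes:
    """Invert shuffle_with_tail for the same elem_size."""
    count = len(data) // elem_size
    body, tail = data[:count * elem_size], data[count * elem_size:]
    out = bytearray(count * elem_size)
    for i in range(elem_size):
        out[i::elem_size] = body[i * count:(i + 1) * count]
    return bytes(out) + tail
```

For a 10-byte input with elem_size 4, the first 8 bytes are shuffled and the last 2 are passed through.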

Can't build bitshuffle on macOS 10.15 with clang 11.0.0

When the GitHub macOS runner bumped clang to 11.0.0, we were no longer able to build bitshuffle in https://github.com/kotekan/kotekan/pull/870/checks?check_run_id=1334216419

The build seems to fail because of -Werror, but I can't see where bitshuffle is set to build with that option. If possible, -Werror should only be used for development and CI, since breaking the build here is only annoying, not helpful.

Run git clone https://github.com/kiyo-masui/bitshuffle.git bitshuffle
  git clone https://github.com/kiyo-masui/bitshuffle.git bitshuffle
  cd bitshuffle && git pull
  python3 setup.py install --h5plugin --h5plugin-dir=/usr/local/opt/[email protected]/lib/plugin
  shell: /bin/bash -e {0}
  env:
    IMG_CORE: docker.pkg.github.com/kotekan/kotekan/kotekan-core
    IMG_IWYU: docker.pkg.github.com/kotekan/kotekan/kotekan-iwyu
    PYTEST_TIMEOUT: 60
    HDF5_DIR: /usr/local/opt/[email protected]/
##[debug]/bin/bash -e /Users/runner/work/_temp/3ce6be04-01dd-4a18-a75f-3bf4f540bdd6.sh
Cloning into 'bitshuffle'...
warning: Pulling without specifying how to reconcile divergent branches is
discouraged. You can squelch this message by running one of the following
commands sometime before your next pull:

  git config pull.rebase false  # merge (the default strategy)
  git config pull.rebase true   # rebase
  git config pull.ff only       # fast-forward only

You can replace "git config" with "git config --global" to set a default
preference for all repositories. You can also pass --rebase, --no-rebase,
or --ff-only on the command line to override the configured default per
invocation.

Already up to date.
Can't find hdf5 with pkg-config fallback to static config
Can't find hdf5 with pkg-config fallback to static config
Can't find hdf5 with pkg-config fallback to static config
Can't find hdf5 with pkg-config fallback to static config
running install
running build
running build_py
creating build
creating build/lib.macosx-10.15-x86_64-3.8
creating build/lib.macosx-10.15-x86_64-3.8/bitshuffle
copying bitshuffle/__init__.py -> build/lib.macosx-10.15-x86_64-3.8/bitshuffle
creating build/lib.macosx-10.15-x86_64-3.8/bitshuffle/tests
copying bitshuffle/tests/test_h5plugin.py -> build/lib.macosx-10.15-x86_64-3.8/bitshuffle/tests
copying bitshuffle/tests/test_h5filter.py -> build/lib.macosx-10.15-x86_64-3.8/bitshuffle/tests
copying bitshuffle/tests/make_regression_tdata.py -> build/lib.macosx-10.15-x86_64-3.8/bitshuffle/tests
copying bitshuffle/tests/__init__.py -> build/lib.macosx-10.15-x86_64-3.8/bitshuffle/tests
copying bitshuffle/tests/test_ext.py -> build/lib.macosx-10.15-x86_64-3.8/bitshuffle/tests
copying bitshuffle/tests/test_regression.py -> build/lib.macosx-10.15-x86_64-3.8/bitshuffle/tests
creating build/lib.macosx-10.15-x86_64-3.8/bitshuffle/tests/data
copying bitshuffle/tests/data/regression_0.1.3.h5 -> build/lib.macosx-10.15-x86_64-3.8/bitshuffle/tests/data
running build_ext
Compiling bitshuffle/ext.pyx because it changed.
Compiling bitshuffle/h5.pyx because it changed.
[1/2] Cythonizing bitshuffle/ext.pyx
[2/2] Cythonizing bitshuffle/h5.pyx
building 'bitshuffle.ext' extension
creating build/temp.macosx-10.15-x86_64-3.8
creating build/temp.macosx-10.15-x86_64-3.8/bitshuffle
creating build/temp.macosx-10.15-x86_64-3.8/src
creating build/temp.macosx-10.15-x86_64-3.8/lz4
clang -Wno-unused-result -Wsign-compare -Wunreachable-code -fno-common -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk -I/Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include -I/Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/System/Library/Frameworks/Tk.framework/Versions/8.5/Headers -DBSHUF_VERSION_MAJOR=0 -DBSHUF_VERSION_MINOR=3 -DBSHUF_VERSION_POINT=6 -Isrc/ -Ilz4/ -I/Users/runner/work/kotekan/kotekan/bitshuffle/.eggs/numpy-1.19.3-py3.8-macosx-10.15-x86_64.egg/numpy/core/include -I/usr/local/include -I/usr/local/opt/[email protected]/include -I/usr/local/opt/sqlite/include -I/usr/local/Cellar/[email protected]/3.8.6/Frameworks/Python.framework/Versions/3.8/include/python3.8 -c bitshuffle/ext.c -o build/temp.macosx-10.15-x86_64-3.8/bitshuffle/ext.o
In file included from bitshuffle/ext.c:653:
In file included from /Users/runner/work/kotekan/kotekan/bitshuffle/.eggs/numpy-1.19.3-py3.8-macosx-10.15-x86_64.egg/numpy/core/include/numpy/arrayobject.h:4:
In file included from /Users/runner/work/kotekan/kotekan/bitshuffle/.eggs/numpy-1.19.3-py3.8-macosx-10.15-x86_64.egg/numpy/core/include/numpy/ndarrayobject.h:12:
In file included from /Users/runner/work/kotekan/kotekan/bitshuffle/.eggs/numpy-1.19.3-py3.8-macosx-10.15-x86_64.egg/numpy/core/include/numpy/ndarraytypes.h:1822:
/Users/runner/work/kotekan/kotekan/bitshuffle/.eggs/numpy-1.19.3-py3.8-macosx-10.15-x86_64.egg/numpy/core/include/numpy/npy_1_7_deprecated_api.h:17:2: warning: "Using deprecated NumPy API, disable it with "          "#define NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-W#warnings]
#warning "Using deprecated NumPy API, disable it with " \
 ^
bitshuffle/ext.c:2260:16: error: implicit declaration of function 'bshuf_using_NEON' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
  __pyx_t_1 = (bshuf_using_NEON() != 0);
               ^
bitshuffle/ext.c:8557:3: warning: 'tp_print' is deprecated [-Wdeprecated-declarations]
  0, /*tp_print*/
  ^
/usr/local/Cellar/[email protected]/3.8.6/Frameworks/Python.framework/Versions/3.8/include/python3.8/cpython/object.h:260:5: note: 'tp_print' has been explicitly marked deprecated here
    Py_DEPRECATED(3.8) int (*tp_print)(PyObject *, FILE *, int);
    ^
/usr/local/Cellar/[email protected]/3.8.6/Frameworks/Python.framework/Versions/3.8/include/python3.8/pyport.h:515:54: note: expanded from macro 'Py_DEPRECATED'
#define Py_DEPRECATED(VERSION_UNUSED) __attribute__((__deprecated__))
                                                     ^
bitshuffle/ext.c:8663:3: warning: 'tp_print' is deprecated [-Wdeprecated-declarations]
  0, /*tp_print*/
  ^
/usr/local/Cellar/[email protected]/3.8.6/Frameworks/Python.framework/Versions/3.8/include/python3.8/cpython/object.h:260:5: note: 'tp_print' has been explicitly marked deprecated here
    Py_DEPRECATED(3.8) int (*tp_print)(PyObject *, FILE *, int);
    ^
/usr/local/Cellar/[email protected]/3.8.6/Frameworks/Python.framework/Versions/3.8/include/python3.8/pyport.h:515:54: note: expanded from macro 'Py_DEPRECATED'
#define Py_DEPRECATED(VERSION_UNUSED) __attribute__((__deprecated__))
                                                     ^
bitshuffle/ext.c:14480:5: warning: 'tp_print' is deprecated [-Wdeprecated-declarations]
    0,
    ^
/usr/local/Cellar/[email protected]/3.8.6/Frameworks/Python.framework/Versions/3.8/include/python3.8/cpython/object.h:260:5: note: 'tp_print' has been explicitly marked deprecated here
    Py_DEPRECATED(3.8) int (*tp_print)(PyObject *, FILE *, int);
    ^
/usr/local/Cellar/[email protected]/3.8.6/Frameworks/Python.framework/Versions/3.8/include/python3.8/pyport.h:515:54: note: expanded from macro 'Py_DEPRECATED'
#define Py_DEPRECATED(VERSION_UNUSED) __attribute__((__deprecated__))
                                                     ^
4 warnings and 1 error generated.
/Users/runner/work/kotekan/kotekan/bitshuffle/.eggs/Cython-0.29.21-py3.8-macosx-10.15-x86_64.egg/Cython/Compiler/Main.py:369: FutureWarning: Cython directive 'language_level' not set, using 2 for now (Py2). This will change in a later release! File: /Users/runner/work/kotekan/kotekan/bitshuffle/bitshuffle/ext.pyx
  tree = Parsing.p_module(s, pxd, full_module_name)
/Users/runner/work/kotekan/kotekan/bitshuffle/.eggs/Cython-0.29.21-py3.8-macosx-10.15-x86_64.egg/Cython/Compiler/Main.py:369: FutureWarning: Cython directive 'language_level' not set, using 2 for now (Py2). This will change in a later release! File: /Users/runner/work/kotekan/kotekan/bitshuffle/bitshuffle/h5.pyx
  tree = Parsing.p_module(s, pxd, full_module_name)
error: command 'clang' failed with exit status 1
Error: Process completed with exit code 1.

Unresolved external symbols when building with MSVC on Windows 10

OS: 64-bit Windows 10.0.19045 Build 19045
Python: Python 3.10.11 [MSC v.1929 64 bit (AMD64)] on win32
Compiler: MSVC 14.38

I'm attempting to build bitshuffle from source on Windows 10. The full output from executing python setup.py install in the Developer Command Prompt for VS 2022 is in the attached log.txt, but I believe the relevant section to be:

   Creating library build\temp.win32-cpython-310\Release\bitshuffle\ext.cp310-win_amd64.lib and object build\temp.win32-cpython-310\Release\bitshuffle\ext.cp310-win_amd64.exp
ext.obj : error LNK2001: unresolved external symbol __imp__PyBaseObject_Type
ext.obj : error LNK2001: unresolved external symbol __imp__PyGC_Enable
...
build\lib.win32-cpython-310\bitshuffle\ext.cp310-win_amd64.pyd : fatal error LNK1120: 160 unresolved externals
error: command 'C:\\Program Files (x86)\\Microsoft Visual Studio\\2022\\BuildTools\\VC\\Tools\\MSVC\\14.38.33130\\bin\\HostX86\\x86\\link.exe' failed with exit code 1120

From some searching, this appears to be an issue with the linker not being able to locate the Python library at link time. I've added the path to my Python installation's libs directory to the LIBPATH system environment variable, and added the path to Python's include directory to the INCLUDE environment variable, but the issue persists. I've checked from the command prompt that the paths in these system environment variables are appended to the environment variables set up by the developer command prompt, so I'm not sure why the library isn't found during linking.

log.txt
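
One detail in the log may be worth checking: the build directory is named temp.win32-cpython-310 and the linker runs from HostX86\x86, while the interpreter reports 64 bit (AMD64). The sketch below is a diagnostic only, not a confirmed fix. It prints the interpreter's pointer width next to the VSCMD_ARG_TGT_ARCH variable. It assumes that the Visual Studio developer prompt sets that variable and that setuptools uses it to choose the build platform. The helper names interpreter_bits and build_env_summary are made up for illustration.

```python
import os
import struct
import sysconfig


def interpreter_bits():
    """Return the pointer width of the running interpreter (32 or 64)."""
    # struct.calcsize("P") is the size in bytes of a C void* for this build.
    return struct.calcsize("P") * 8


def build_env_summary(environ=None):
    """Collect values that should agree on 32-bit vs 64-bit before building."""
    if environ is None:
        environ = os.environ
    return {
        "interpreter_bits": interpreter_bits(),
        "sysconfig_platform": sysconfig.get_platform(),
        # Assumption: set by the VS developer prompt (e.g. "x86" or "x64").
        "vs_target_arch": environ.get("VSCMD_ARG_TGT_ARCH"),
    }


if __name__ == "__main__":
    for key, value in build_env_summary().items():
        print(key, "=", value)
```

If the interpreter is 64-bit but the developer prompt targets x86, opening an x64 developer prompt before building would be the first thing to try.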

Build on OSX Mavericks

Hi -- checked out the 0.2.1 code on my Mac, and it failed to install with the error

clang: warning: argument unused during compilation: '-fopenmp'
clang -bundle -undefined dynamic_lookup -L/usr/local/lib -L/usr/local/opt/sqlite/lib build/temp.macosx-10.10-x86_64-2.7/bitshuffle/ext.o build/temp.macosx-10.10-x86_64-2.7/src/bitshuffle.o build/temp.macosx-10.10-x86_64-2.7/lz4/lz4.o -L/opt/local/lib -L/usr/local/lib -lgomp -o build/lib.macosx-10.10-x86_64-2.7/bitshuffle/ext.so
ld: warning: directory not found for option '-L/opt/local/lib'
ld: library not found for -lgomp
clang: error: linker command failed with exit code 1 (use -v to see invocation)
error: command 'clang' failed with exit status 1

According to this StackOverflow post, this is because gcc is now a symlink to clang, which does not support OpenMP. This will likely be fixed in the future (there is work toward it), but the easiest path for now is probably to use gcc-4.9, which you can install with Homebrew.

To get it to compile, I needed to add the flag -Wa,-q and set the CC/CXX environment variables, which I did by editing setup.py:

COMPILE_FLAGS = ['-Ofast', '-march=native', '-std=c99', '-fopenmp', '-Wa,-q']
# Cython breaks strict aliasing rules.
COMPILE_FLAGS += ["-fno-strict-aliasing"]
#COMPILE_FLAGS = ['-Ofast', '-march=core2', '-std=c99', '-fopenmp']
os.environ["CC"] = "gcc-4.9" 
os.environ["CXX"] = "g++-4.9"

Not sure what adding this flag would do to other OS environments. I guess there are two options: make mac users use gcc-4.9, or check for macs with clang and disable OpenMP.
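
The second option (detect clang on a Mac and disable OpenMP) could look roughly like the sketch below. It is only an illustration of the idea, not code from setup.py: the function names and the policy of returning empty flag lists are assumptions, and the check simply looks for "clang" in the output of the compiler's --version.

```python
import os
import subprocess
import sys


def is_clang(version_text):
    """True if `cc --version` output identifies the compiler as clang."""
    return "clang" in version_text.lower()


def openmp_flags(platform, version_text):
    """Return (compile_flags, link_libs) for the OpenMP decision.

    Hypothetical policy: disable OpenMP only for clang on macOS.
    """
    if platform == "darwin" and is_clang(version_text):
        return [], []
    return ["-fopenmp"], ["gomp"]


def compiler_version(cc="cc"):
    """Run `<cc> --version` and return its output, or '' if it cannot run."""
    try:
        result = subprocess.run(
            [cc, "--version"], capture_output=True, text=True, check=False
        )
    except OSError:
        return ""
    return result.stdout + result.stderr


if __name__ == "__main__":
    cc = os.environ.get("CC", "cc")
    flags, libs = openmp_flags(sys.platform, compiler_version(cc))
    print("extra compile flags:", flags, "extra libraries:", libs)
```

With this approach, users who set CC=gcc-4.9 would still get OpenMP, while the default clang toolchain would build a single-threaded extension instead of failing at link time.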

Add Anaconda Recipe

It would be good to have an Anaconda recipe available so that an Anaconda package can be created for bitshuffle.

Document on-disk representation of bitshuffled data

I got some way reverse-engineering the format so that I can do the bitshuffle independently of lz4 in my application but kept stubbing my toes - some clear documentation on how it is used would be very useful for non-canonical implementations.

For example: it would appear that the on disk representation takes the form of

BE uint32_t compressed_block_size <compressed block> BE uint32_t compressed_block_size <compressed block> BE uint32_t compressed_block_size <compressed block> ...

where <compressed block> is the result of compressing 8192 bytes of input; then there is a partial block which is smaller, and finally what looks like a verbatim uncompressed remainder at the end. I could try compressing and then unpacking arbitrary bit patterns to resolve this, but it feels like a canonical definition of the on-disk format (beyond, of course, reading the source code) would be a useful addition to this library.
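
To make the layout I am guessing at concrete, here is a minimal sketch that splits a buffer into length-prefixed blocks. It encodes only the description above (a big-endian uint32 byte count followed by that many bytes, repeated, then a trailing remainder) and is not taken from a specification. The function name split_blocks and the need to know the number of blocks in advance are assumptions.

```python
import struct


def split_blocks(buf, n_blocks):
    """Split `buf` into `n_blocks` length-prefixed blocks plus a remainder.

    Assumed layout (from the description above, not from a specification):
    each block is a big-endian uint32 byte count followed by that many bytes.
    """
    blocks = []
    offset = 0
    for _ in range(n_blocks):
        if offset + 4 > len(buf):
            raise ValueError("truncated block header")
        (size,) = struct.unpack_from(">I", buf, offset)
        offset += 4
        if offset + size > len(buf):
            raise ValueError("truncated block payload")
        blocks.append(bytes(buf[offset:offset + size]))
        offset += size
    # Whatever is left is returned untouched (the apparent verbatim tail).
    return blocks, bytes(buf[offset:])


if __name__ == "__main__":
    # Synthetic stream: two prefixed blocks and a one-byte tail.
    stream = struct.pack(">I", 3) + b"abc" + struct.pack(">I", 2) + b"de" + b"\x01"
    blocks, tail = split_blocks(stream, 2)
    print(len(blocks), "blocks,", len(tail), "trailing byte(s)")
```

Each returned block would still need to be LZ4-decompressed and un-shuffled; that part is exactly what I would like to see documented.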

Install C libraries and headers.

Could add a 'cinstall' command to the setup.py that does not install the python module but instead installs the C library and headers.

Issues with importing bitshuffle's h5 module since PR #81

Since 6899f5d I have had issues with running caput and dias builds on github actions.

pip install --no-binary=h5py h5py results in a segmentation fault. If I change that to pip install h5py, there is no segmentation fault, but I still have issues loading modules that import h5. Regrettably, I opted to pin an older bitshuffle version in the meantime instead of putting more time into this.

This is limited information, but I am available to help with reproducing. In particular, any new PR with caput and dias that installs bitshuffle from the master branch will reproduce the issue.

CUDA kernels ...

I started working on some code to call nvcomp for LZ4 decompression and then do the bitshuffle on a GPU. So far I have only looked at decompression and at the data types we use (8, 16 and 32 bit), and the code is far from optimal. But it might be interesting for some of you?

Many thanks for sharing your work here - it has had a really big impact for our synchrotron X-ray experiments.

Wheel on PyPi doesn't work on systems without AVX512

After installing the binary wheel from PyPI on my laptop, the decompress_lz4 function crashes at a VPBROADCASTD instruction somewhere in the lz4 code. That instruction was added with AVX512, and my laptop supports only up to AVX2. Would it be possible either to ship a generic binary, or to build several and dynamically choose the Python extension that is compatible with the current CPU?

Related to conda-forge/bitshuffle-feedstock#7

Installing from source via pip install --no-binary "bitshuffle" bitshuffle results in a working bitshuffle install.
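
As a rough illustration of the "dynamically choose" idea, the sketch below reads the CPU feature flags on Linux and picks the widest level the CPU reports. It is a sketch under assumptions: it only works where /proc/cpuinfo exists, the function names are made up, and the list of variants ("avx512f", "avx2", "sse2", "generic") is hypothetical rather than something bitshuffle ships.

```python
def parse_cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags listed in /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def best_simd_level(flags):
    """Pick the widest SIMD level from a hypothetical list of build variants."""
    for level in ("avx512f", "avx2", "sse2"):
        if level in flags:
            return level
    return "generic"


def read_cpuinfo(path="/proc/cpuinfo"):
    """Return the contents of /proc/cpuinfo, or '' where it is unavailable."""
    try:
        with open(path) as f:
            return f.read()
    except OSError:
        return ""


if __name__ == "__main__":
    print("selected variant:", best_simd_level(parse_cpu_flags(read_cpuinfo())))
```

A package could then import the extension module built for the selected level and fall back to the generic build everywhere else.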

`pip install bitshuffle` fails ... on python3.7

This is caused by the pre-cythonized files shipped with the package: an internal Python struct changed in recent versions, which prevents the C files from building.

Declaring a build dependency on Cython and using it to translate the .pyx files to C at build time is a safe option.

Do you think it is possible to provide a manylinux wheel for bitshuffle?

Drop the LZ4 dependency from bitshuffle.c

Currently, bitshuffle.c depends on LZ4; that is, it includes lz4.h.
The bit-shuffling technique is also very useful with other LZ-variant libraries such as gzip and snappy, so developers working with those libraries may want to use the bit-shuffling functionality alone. However, the unnecessary lz4 dependency can get in their way.
I wrote some code that removes this dependency; the code is here.
Could you give me any comments on this?

dtype argument issue

I'm trying to decompress a numpy array, and if I pass in the dtype from the array itself I get TypeError: an integer is required. If I pass dtype as an integer representing the size in bytes of my original data type, it fails with TypeError: data type not understood.

So far I was able to fix it by creating a wrapper around the dtype like so:

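from functools import reduce

import bitshuffle
import numpy as np


class DTypeWrapper(object):
    """Wrap a dtype so that `itemsize` always resolves to an integer."""

    def __init__(self, dtype):
        self.dtype = dtype

    def __getattr__(self, item):
        if item == 'itemsize':
            return np.zeros(1, dtype=self.dtype).itemsize
        # Delegate every other attribute to the wrapped dtype.
        return getattr(self.dtype, item)


def bslz4_decompress(data, shape, dtype):
    # decompress_lz4 is given a flat shape; restore the real shape afterwards.
    nelems = reduce((lambda x, y: x * y), shape)
    dec_data = bitshuffle.decompress_lz4(data, (nelems,), DTypeWrapper(dtype))
    return dec_data.reshape(shape)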

ValueError: Unknown compression filter number: 32008

Hello

I get "ValueError: Unknown compression filter number: 32008" when my program runs the following code block:

dataset = f.create_dataset(
    "data",
    (100, 100, 100),
    compression=bitshuffle.h5.H5FILTER,
    compression_opts=(block_size, bitshuffle.h5.H5_COMPRESS_LZ4),
    dtype='float32',
    )

I know that "bitshuffle.h5.H5FILTER" equals 32008, but is there a quick fix for this problem?
Thanks for your time.

Ossic

Current version incompatible with poetry

The content of pyproject.toml is incompatible with poetry and makes the installation fall back on using setup.py. This also fails as Cython is not installed in the environment by poetry before running the setup.

For example, running poetry add /path/to/v0.4.2/bitshuffle in another module causes the error

  PackageInfoError

  Unable to determine package info for path: /tmp/pypoetry-git-bitshuffleo19yfitc

  Fallback egg_info generation failed.

  Command ['/tmp/tmp0azgu769/.venv/bin/python', 'setup.py', 'egg_info'] errored with the following return code 1, and output:
  Traceback (most recent call last):
    File "setup.py", line 13, in <module>
      from Cython.Compiler.Main import default_options
  ModuleNotFoundError: No module named 'Cython'

  at /usr/local/lib/python3.7/site-packages/poetry/inspection/info.py:503 in _pep517_metadata
      499│                     venv.run("python", "setup.py", "egg_info")
      500│                     return cls.from_metadata(path)
      501│                 except EnvCommandError as fbe:
      502│                     raise PackageInfoError(
    → 503│                         path, "Fallback egg_info generation failed.", fbe
      504│                     )
      505│                 finally:
      506│                     os.chdir(cwd.as_posix())
      507│

Can't install via pip: No such file or directory: 'requirements.txt'

I tried to install the latest v0.3.3 on Ubuntu 16.04 and got the following error:

$ sudo pip3 install bitshuffle
Collecting bitshuffle
  Downloading bitshuffle-0.3.3.tar.gz (232kB)
    100% |████████████████████████████████| 235kB 5.3MB/s
    Complete output from command python setup.py egg_info:
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/tmp/pip-build-_fovqf52/bitshuffle/setup.py", line 272, in <module>
        with open('requirements.txt') as f:
    FileNotFoundError: [Errno 2] No such file or directory: 'requirements.txt'

Zstd + bitshuffle

For the purpose of writing data from high performance X-ray detectors, I have recently tested replacing LZ4 with Zstd in the Bitshuffle code. At least for my test data, compression factors were 20% better, but performance suffers. Interestingly, Zstd "likes" larger block sizes. Results are available here:

https://aca.scitation.org/doi/full/10.1063/1.5143480
https://aca.scitation.org/doi/suppl/10.1063/1.5143480/suppl_file/20200126_supporting+material.pdf --> see Table S2 for block size comparison

The changes were simple: adding functions that call Zstd routines instead of LZ4, plus one constant for the HDF5 plugin. See:
https://github.com/fleon-psi/bitshuffle

If you think this is worth including in mainstream Bitshuffle code, I'd be happy to make a pull request.

Debugging corrupted bitshuffle data

Hi @kiyo-masui, we have some SETI data stored with bitshuffle compression, and a small number of files appear to have become corrupted. (Here is one, FYI: https://bldata.berkeley.edu/blpd30_datax2/blc03_guppi_59132_36704_HIP111595_0078.rawspec.0002.h5)

h5py is happy to open the file, but barfs if you try and access the bitshuffled dataset:

In [3]: a = h5py.File('blc03_guppi_59132_36704_HIP111595_0078.rawspec.0002.h5', 'r')
In [4]: a['data']
Out[4]: <HDF5 dataset "data": shape (279, 1, 65536), type "<f4">

In [5]: d = a['data'][:]
---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
<ipython-input-5-fee15ce54759> in <module>
----> 1 d = a['data'][:]

h5py/_objects.pyx in h5py._objects.with_phil.wrapper()

h5py/_objects.pyx in h5py._objects.with_phil.wrapper()

~/opt/anaconda3/lib/python3.8/site-packages/h5py/_hl/dataset.py in __getitem__(self, args)
    571         mspace = h5s.create_simple(mshape)
    572         fspace = selection.id
--> 573         self.id.read(mspace, fspace, arr, mtype, dxpl=self._dxpl)
    574
    575         # Patch up the output for NumPy

h5py/_objects.pyx in h5py._objects.with_phil.wrapper()

h5py/_objects.pyx in h5py._objects.with_phil.wrapper()

h5py/h5d.pyx in h5py.h5d.DatasetID.read()

h5py/_proxy.pyx in h5py._proxy.dset_rw()

h5py/_proxy.pyx in h5py._proxy.H5PY_H5Dread()

OSError: Can't read data (filter returned failure during read)

Do you think this file is recoverable (or partly recoverable)? Is there any way to turn on extra debug info in bitshuffle to help diagnose why it fails, and/or can bitshuffle skip over 'bad' chunks?
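
In the meantime, a crude way to find out which parts of the dataset still read cleanly is to read it one row at a time and catch the OSError shown in the traceback. The sketch below assumes that the failure is limited to some chunks and that indexing along the first axis is fine-grained enough to isolate them; I have not confirmed either. The function name is made up, and it works on any object that supports integer indexing, such as the a['data'] dataset above.

```python
def read_rows_skipping_failures(dataset, n_rows):
    """Read `dataset[i]` for each row, collecting rows that fail to read.

    Returns (good, bad): `good` maps row index -> data, `bad` lists the
    indices whose read raised OSError.
    """
    good = {}
    bad = []
    for i in range(n_rows):
        try:
            good[i] = dataset[i]
        except OSError:
            # h5py reports a filter failure as OSError (see traceback above).
            bad.append(i)
    return good, bad


# Intended use with the file above (not run here):
#   dset = a['data']
#   good, bad = read_rows_skipping_failures(dset, dset.shape[0])
#   print("unreadable rows:", bad)
```

Rows that share a damaged chunk would all fail together, so this only narrows down where the corruption is; it does not recover anything inside a bad chunk.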

Build issue on Ubuntu 16.04

Trying to install on Ubuntu 16.04.03:

$ apt-cache install libhdf5-dev libhdf5-serial-dev
...
$ pip install bitshuffle
Collecting bitshuffle
  Using cached bitshuffle-0.3.2.tar.gz
Requirement already satisfied: numpy in /home/davor/ch_util_venv/lib/python2.7/site-packages (from bitshuffle)
Requirement already satisfied: h5py in /home/davor/ch_util_venv/lib/python2.7/site-packages (from bitshuffle)
Requirement already satisfied: Cython in /home/davor/ch_util_venv/lib/python2.7/site-packages (from bitshuffle)
Requirement already satisfied: setuptools>=0.7 in /home/davor/ch_util_venv/lib/python2.7/site-packages (from bitshuffle)
Requirement already satisfied: six in /home/davor/ch_util_venv/lib/python2.7/site-packages (from h5py->bitshuffle)
Building wheels for collected packages: bitshuffle
  Running setup.py bdist_wheel for bitshuffle ... error
  Complete output from command /home/davor/ch_util_venv/bin/python2 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-QVwGSd/bitshuffle/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d /tmp/tmpGNehbMpip-wheel- --python-tag cp27:
  running bdist_wheel
  running build
  running build_py
  creating build
  creating build/lib.linux-x86_64-2.7
  creating build/lib.linux-x86_64-2.7/bitshuffle
  copying bitshuffle/__init__.py -> build/lib.linux-x86_64-2.7/bitshuffle
  creating build/lib.linux-x86_64-2.7/bitshuffle/tests
  copying bitshuffle/tests/test_h5plugin.py -> build/lib.linux-x86_64-2.7/bitshuffle/tests
  copying bitshuffle/tests/test_regression.py -> build/lib.linux-x86_64-2.7/bitshuffle/tests
  copying bitshuffle/tests/test_h5filter.py -> build/lib.linux-x86_64-2.7/bitshuffle/tests
  copying bitshuffle/tests/make_regression_tdata.py -> build/lib.linux-x86_64-2.7/bitshuffle/tests
  copying bitshuffle/tests/test_ext.py -> build/lib.linux-x86_64-2.7/bitshuffle/tests
  copying bitshuffle/tests/__init__.py -> build/lib.linux-x86_64-2.7/bitshuffle/tests
  creating build/lib.linux-x86_64-2.7/bitshuffle/tests/data
  copying bitshuffle/tests/data/regression_0.2.0.h5 -> build/lib.linux-x86_64-2.7/bitshuffle/tests/data
  copying bitshuffle/tests/data/regression_0.1.4.h5 -> build/lib.linux-x86_64-2.7/bitshuffle/tests/data
  copying bitshuffle/tests/data/regression_0.2.1.h5 -> build/lib.linux-x86_64-2.7/bitshuffle/tests/data
  copying bitshuffle/tests/data/regression_0.1.3.h5 -> build/lib.linux-x86_64-2.7/bitshuffle/tests/data
  running build_ext

  #################################
  # Compiling with OpenMP support #
  #################################

  building 'bitshuffle.ext' extension
  creating build/temp.linux-x86_64-2.7
  creating build/temp.linux-x86_64-2.7/bitshuffle
  creating build/temp.linux-x86_64-2.7/src
  creating build/temp.linux-x86_64-2.7/lz4
  x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fno-strict-aliasing -Wdate-time -D_FORTIFY_SOURCE=2 -g -fstack-protector-strong -Wformat -Werror=format-security -fPIC -DBSHUF_VERSION_MAJOR=0 -DBSHUF_VERSION_MINOR=3 -DBSHUF_VERSION_POINT=2 -I/home/davor/ch_util_venv/local/lib/python2.7/site-packages/numpy/core/include -Isrc/ -Ilz4/ -I/usr/include/python2.7 -c bitshuffle/ext.c -o build/temp.linux-x86_64-2.7/bitshuffle/ext.o -O3 -ffast-math -march=native -std=c99 -fno-strict-aliasing -fopenmp
  In file included from /home/davor/ch_util_venv/local/lib/python2.7/site-packages/numpy/core/include/numpy/ndarraytypes.h:1809:0,
                   from /home/davor/ch_util_venv/local/lib/python2.7/site-packages/numpy/core/include/numpy/ndarrayobject.h:18,
                   from /home/davor/ch_util_venv/local/lib/python2.7/site-packages/numpy/core/include/numpy/arrayobject.h:4,
                   from bitshuffle/ext.c:535:
  /home/davor/ch_util_venv/local/lib/python2.7/site-packages/numpy/core/include/numpy/npy_1_7_deprecated_api.h:15:2: warning: #warning "Using deprecated NumPy API, disable it by " "#defining NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-Wcpp]
   #warning "Using deprecated NumPy API, disable it by " \
    ^
  x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fno-strict-aliasing -Wdate-time -D_FORTIFY_SOURCE=2 -g -fstack-protector-strong -Wformat -Werror=format-security -fPIC -DBSHUF_VERSION_MAJOR=0 -DBSHUF_VERSION_MINOR=3 -DBSHUF_VERSION_POINT=2 -I/home/davor/ch_util_venv/local/lib/python2.7/site-packages/numpy/core/include -Isrc/ -Ilz4/ -I/usr/include/python2.7 -c src/bitshuffle.c -o build/temp.linux-x86_64-2.7/src/bitshuffle.o -O3 -ffast-math -march=native -std=c99 -fno-strict-aliasing -fopenmp
  x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fno-strict-aliasing -Wdate-time -D_FORTIFY_SOURCE=2 -g -fstack-protector-strong -Wformat -Werror=format-security -fPIC -DBSHUF_VERSION_MAJOR=0 -DBSHUF_VERSION_MINOR=3 -DBSHUF_VERSION_POINT=2 -I/home/davor/ch_util_venv/local/lib/python2.7/site-packages/numpy/core/include -Isrc/ -Ilz4/ -I/usr/include/python2.7 -c src/bitshuffle_core.c -o build/temp.linux-x86_64-2.7/src/bitshuffle_core.o -O3 -ffast-math -march=native -std=c99 -fno-strict-aliasing -fopenmp
  x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fno-strict-aliasing -Wdate-time -D_FORTIFY_SOURCE=2 -g -fstack-protector-strong -Wformat -Werror=format-security -fPIC -DBSHUF_VERSION_MAJOR=0 -DBSHUF_VERSION_MINOR=3 -DBSHUF_VERSION_POINT=2 -I/home/davor/ch_util_venv/local/lib/python2.7/site-packages/numpy/core/include -Isrc/ -Ilz4/ -I/usr/include/python2.7 -c src/iochain.c -o build/temp.linux-x86_64-2.7/src/iochain.o -O3 -ffast-math -march=native -std=c99 -fno-strict-aliasing -fopenmp
  x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fno-strict-aliasing -Wdate-time -D_FORTIFY_SOURCE=2 -g -fstack-protector-strong -Wformat -Werror=format-security -fPIC -DBSHUF_VERSION_MAJOR=0 -DBSHUF_VERSION_MINOR=3 -DBSHUF_VERSION_POINT=2 -I/home/davor/ch_util_venv/local/lib/python2.7/site-packages/numpy/core/include -Isrc/ -Ilz4/ -I/usr/include/python2.7 -c lz4/lz4.c -o build/temp.linux-x86_64-2.7/lz4/lz4.o -O3 -ffast-math -march=native -std=c99 -fno-strict-aliasing -fopenmp
  x86_64-linux-gnu-gcc -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -Wl,-z,relro -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -Wdate-time -D_FORTIFY_SOURCE=2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wl,-Bsymbolic-functions -Wl,-z,relro -Wdate-time -D_FORTIFY_SOURCE=2 -g -fstack-protector-strong -Wformat -Werror=format-security build/temp.linux-x86_64-2.7/bitshuffle/ext.o build/temp.linux-x86_64-2.7/src/bitshuffle.o build/temp.linux-x86_64-2.7/src/bitshuffle_core.o build/temp.linux-x86_64-2.7/src/iochain.o build/temp.linux-x86_64-2.7/lz4/lz4.o -lgomp -o build/lib.linux-x86_64-2.7/bitshuffle/ext.so
  building 'bitshuffle.h5' extension
  x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fno-strict-aliasing -Wdate-time -D_FORTIFY_SOURCE=2 -g -fstack-protector-strong -Wformat -Werror=format-security -fPIC -DBSHUF_VERSION_MAJOR=0 -DBSHUF_VERSION_MINOR=3 -DBSHUF_VERSION_POINT=2 -Isrc/ -Ilz4/ -I/usr/include/python2.7 -c bitshuffle/h5.c -o build/temp.linux-x86_64-2.7/bitshuffle/h5.o -O3 -ffast-math -march=native -std=c99 -fno-strict-aliasing -fopenmp
  In file included from bitshuffle/h5.c:247:0:
  src/bshuf_h5filter.h:36:18: fatal error: hdf5.h: No such file or directory
  compilation terminated.
  error: command 'x86_64-linux-gnu-gcc' failed with exit status 1

  ----------------------------------------
  Failed building wheel for bitshuffle
  Running setup.py clean for bitshuffle
Failed to build bitshuffle

I do have hdf5.h in /usr/include/hdf5/serial/hdf5.h. I was able to work around this by installing with CFLAGS=-I/usr/include/hdf5/serial pip install bitshuffle, but then I ran into a linker error: it could not find -lhdf5. That's because I don't have libhdf5.so, I have libhdf5_serial.so, and I don't have libhdf5_hl.so but libhdf5_serial_hl.so (both in /usr/lib/x86_64-linux-gnu). I worked around that by simply symlinking ln -s libhdf5_serial.so libhdf5.so, etc.

Decompression slow downs for "too many" threads

It seems the OpenMP locks and the (dynamic,1) scheduling overhead can become significant on machines with large numbers of cores. For decompression, I could see some improvements using static scheduling:

[Image: 20230206_bench]

Perhaps there is a better way to overcome this problem? Anyway, I will try to send you a pull request for discussion.

About LZ4 and AVX, bitshuffle

Hi, I am carrying out research on block data transmission using LZ4 compression, and in the process I came across bitshuffle. If I bit-shuffle the data and then compress it with LZ4, is the compression faster? Also, I saw that bitshuffle includes SSE and AVX instructions. Do you use AVX instructions to speed up the bit shuffling? Thanks for reading.
