
python-isal's People

Contributors

cielavenir, jan666, marcelm, rhpvorderman


python-isal's Issues

Release 1.0.0

Release checklist

  • Check outstanding issues on JIRA and GitHub.
  • Check that the latest documentation looks fine.
  • Create a release branch.
    • Set the version to a stable number.
    • Change the current development version in CHANGELOG.rst to the stable version.
    • Change the version in __init__.py.
  • Merge the release branch into main.
  • Create an annotated tag with the stable version number. Include the changes
    from CHANGELOG.rst.
  • Push the tag to the remote. This triggers the wheel/sdist build on GitHub CI.
  • Merge the main branch back into develop.
  • Add the updated version number to develop (setup.py and src/isal/__init__.py).
  • Build the new tag on readthedocs. Only build the last patch version of
    each minor version, so 1.1.1 and 1.2.0 but not 1.1.0.
  • Create a new release on GitHub.
  • Update the package on conda-forge.

Use blocks_output_buffer

@animalize made a more intelligent output buffer for CPython. It uses a Python list internally.

The old buffer simply grows via a PyMem_Resize call, which reallocates the memory every time it is called. When the buffer is grown this way, the data at the beginning is copied over and over, so larger buffer sizes run into a limit at some point.

The blocks_output_buffer instead collects the buffer blocks in a Python list. Each block is allocated only once. At the end, all blocks are joined together, requiring only one memcpy call per block. This is not only theoretically faster; it also turns out to be faster in practice.
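The idea can be sketched in pure Python (a simplified illustration only; the real blocks_output_buffer is C code, and the class name here is hypothetical):

```python
class BlocksOutputBuffer:
    """Simplified sketch: collect output in blocks, join once at the end."""

    def __init__(self):
        self.blocks = []  # each produced chunk becomes its own block

    def append(self, data: bytes) -> None:
        # No reallocation of earlier output: bytes already written are
        # never copied again while the buffer grows.
        self.blocks.append(data)

    def finish(self) -> bytes:
        # A single join copies every block exactly once (one memcpy per
        # block in the C implementation).
        return b"".join(self.blocks)
```

Compare this with growing a single buffer by reallocation, where each resize may copy everything written so far.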

More tests are needed.

The stdlib gzip and zlib tests need to be replicated to be sure this library works correctly in all use cases.
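Such tests could follow the stdlib's round-trip pattern. This sketch uses the stdlib zlib itself as a stand-in; in the real suite, isal.isal_zlib would take its place (note that isal compression levels run 0-3 rather than zlib's 0-9):

```python
import zlib

def roundtrip(data: bytes, level: int) -> bytes:
    # Stand-in for isal.isal_zlib: compress then decompress, expecting
    # the original data back bit for bit.
    return zlib.decompress(zlib.compress(data, level))

def test_roundtrip():
    data = b"some highly repetitive test data " * 100
    for level in (0, 1, 2, 3):
        assert roundtrip(data, level) == data
```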

Release 0.10.0

Release checklist

  • Check outstanding issues on JIRA and GitHub.
  • Check that the latest documentation looks fine.
  • Create a release branch.
    • Set the version to a stable number.
    • Change the current development version in CHANGELOG.rst to the stable version.
    • Change the version in __init__.py.
  • Merge the release branch into main.
  • Create an annotated tag with the stable version number. Include the changes
    from CHANGELOG.rst.
  • Push the tag to the remote. This triggers the wheel/sdist build on GitHub CI.
  • Merge the main branch back into develop.
  • Add the updated version number to develop (setup.py and src/isal/__init__.py).
  • Build the new tag on readthedocs. Only build the last patch version of
    each minor version, so 1.1.1 and 1.2.0 but not 1.1.0.
  • Create a new release on GitHub.
  • Update the package on conda-forge.

Release 1.1.0

Release checklist

  • Check outstanding issues on GitHub.
  • Check that the latest documentation looks fine.
  • Create a release branch.
    • Set the version to a stable number.
    • Change the current development version in CHANGELOG.rst to the stable version.
    • Change the version in __init__.py.
  • Merge the release branch into main.
  • Create an annotated tag with the stable version number. Include the changes
    from CHANGELOG.rst.
  • Push the tag to the remote. This triggers the wheel/sdist build on GitHub CI.
  • Merge the main branch back into develop.
  • Add the updated version number to develop (setup.py and src/isal/__init__.py).
  • Build the new tag on readthedocs. Only build the last patch version of
    each minor version, so 1.1.1 and 1.2.0 but not 1.1.0.
  • Create a new release on GitHub.
  • Update the package on conda-forge.

Possible regression: reading gzip files generates a CRC check failed error in version 0.11.0

Hello @rhpvorderman,
Some months ago I reported a bug in the decompression of gzip files (#60), and today, while using cutadapt on a different computer, it happened again. I remembered the previous time and checked the compressed files, and found that "zcat" and "gzip -t" did not give any errors, so I suspected isal.

On my personal computer I have version 0.8.1 installed, which works fine with the files, so without changing anything else I tried installing the subsequent isal versions one by one. I found that the files are decompressed fine with isal versions 0.9.0 and 0.10.0, but break on the latest version, 0.11.0:

fossandon@ubuntu:~/Documents/download$ pip3 install isal==0.11.0
Defaulting to user installation because normal site-packages is not writeable
Collecting isal==0.11.0
  Using cached isal-0.11.0-cp36-cp36m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.0 MB)
Installing collected packages: isal
  Attempting uninstall: isal
    Found existing installation: isal 0.10.0
    Uninstalling isal-0.10.0:
      Successfully uninstalled isal-0.10.0
Successfully installed isal-0.11.0

fossandon@ubuntu:~/Documents/download$ cutadapt -a "AACTTTYARCAAYGGATCTC;max_error_rate=0.1;min_overlap=20" -A "TGATCCYTCCGCAGGT;max_error_rate=0.5;min_overlap=16" --pair-adapters --pair-filter any --cores 2 --output 136727_R1.fastq --paired-output 136727_R2.fastq 136727_S159_L001_R1_001.fastq.gz 136727_S159_L001_R2_001.fastq.gz
This is cutadapt 3.4 with Python 3.6.9
Command line parameters: -a AACTTTYARCAAYGGATCTC;max_error_rate=0.1;min_overlap=20 -A TGATCCYTCCGCAGGT;max_error_rate=0.5;min_overlap=16 --pair-adapters --pair-filter any --cores 2 --output 136727_R1.fastq --paired-output 136727_R2.fastq 136727_S159_L001_R1_001.fastq.gz 136727_S159_L001_R2_001.fastq.gz
Processing reads on 2 cores in paired-end mode ...
ERROR: Traceback (most recent call last):
  File "/home/fossandon/.local/lib/python3.6/site-packages/cutadapt/pipeline.py", line 556, in run
    dnaio.read_paired_chunks(f, f2, self.buffer_size)):
  File "/home/fossandon/Documents/Github_repos/dnaio/src/dnaio/chunks.py", line 118, in read_paired_chunks
    bufend1 = f.readinto(memoryview(buf1)[start1:]) + start1  # type: ignore
  File "/usr/lib/python3.6/gzip.py", line 276, in read
    return self._buffer.read(size)
  File "/usr/lib/python3.6/_compression.py", line 68, in readinto
    data = self.read(len(byte_view))
  File "/home/fossandon/.local/lib/python3.6/site-packages/isal/igzip.py", line 265, in read
    self._read_eof()
  File "/usr/lib/python3.6/gzip.py", line 501, in _read_eof
    hex(self._crc)))
OSError: CRC check failed 0x8b1f001a != 0xd2f5dc20

ERROR: Traceback (most recent call last):
  File "/home/fossandon/.local/lib/python3.6/site-packages/cutadapt/pipeline.py", line 626, in run
    raise e
OSError: CRC check failed 0x8b1f001a != 0xd2f5dc20

Traceback (most recent call last):
  File "/home/fossandon/.local/bin/cutadapt", line 8, in <module>
    sys.exit(main_cli())
  File "/home/fossandon/.local/lib/python3.6/site-packages/cutadapt/__main__.py", line 848, in main_cli
    main(sys.argv[1:])
  File "/home/fossandon/.local/lib/python3.6/site-packages/cutadapt/__main__.py", line 913, in main
    stats = r.run()
  File "/home/fossandon/.local/lib/python3.6/site-packages/cutadapt/pipeline.py", line 825, in run
    raise e
OSError: CRC check failed 0x8b1f001a != 0xd2f5dc20

Inspecting the changes in the last release, I found that a couple of lines added in the 0.8.1 fix were modified.

Could it be that this modification caused a regression?

I shared the pair of files that caused the error in this folder, so you can reproduce it on your end:
https://drive.google.com/drive/folders/1iOqvXbDQQd8NDtnZhzutmOxx4wUONO-k?usp=sharing
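A gzip member's trailer stores the CRC32 and size of the uncompressed data, which is what the failing check compares. That check can be reproduced independently with the stdlib alone (a hedged sketch of what gzip.GzipFile._read_eof verifies; gzip.decompress itself already raises on a bad CRC, so this helper only re-checks the trailer explicitly on files it can decompress):

```python
import gzip
import struct
import zlib

def trailer_matches(raw: bytes) -> bool:
    # The last 8 bytes of a single-member gzip file are the CRC32 and
    # the uncompressed size modulo 2**32, both little-endian.
    data = gzip.decompress(raw)
    crc, isize = struct.unpack("<II", raw[-8:])
    return crc == zlib.crc32(data) and isize == len(data) % 2**32
```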

Best regards,

Building wheels for isal error

python=3.8.3 (tags/v3.8.3:6f8c832, May 13 2020, 22:20:19) [MSC v.1925 32 bit (Intel)]
os=Windows-10-10.0.19041-SP0
numpy=1.21.0

Processor: Intel(R) Core(TM) i7-6820HQ CPU @ 2.70GHz, 2712 Mhz, 4 Core(s), 8 Logical Processor(s)

Traceback

Building wheels for collected packages: isal
Building wheel for isal (pyproject.toml) ... error
error: subprocess-exited-with-error

× Building wheel for isal (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> [7 lines of output]
running bdist_wheel
running build
running build_py
running build_ext
cythoning src/isal/isal_zlib.pyx to src/isal\isal_zlib.c
cythoning src/isal/igzip_lib.pyx to src/isal\igzip_lib.c
error: [WinError 2] The system cannot find the file specified
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for isal
Failed to build isal
ERROR: Could not build wheels for isal, which is required to install pyproject.toml-based projects

Reading gzip files generates a CRC check failed error (version >= 0.7.0)

Hello @rhpvorderman,
Yesterday, it happened to me and other bioinformaticians that the program we were using (cutadapt) crashed unexpectedly when trying to open some gzipped files; this was the first time something like this happened: marcelm/cutadapt#520

fossandon@ubuntu:~/Documents/download$ cutadapt -a 'AACTTTYARCAAYGGATCTC;max_error_rate=0.1;min_overlap=20' -A 'TGATCCYTCCGCAGGT;max_error_rate=0.5;min_overlap=16' --pair-adapters --pair-filter any --cores 2 --output 94477_R1.fastq --paired-output 94477_R2.fastq 94477_S175_L001_R1_001.fastq.gz 94477_S175_L001_R2_001.fastq.gz
This is cutadapt 3.3 with Python 3.6.9
Command line parameters: -a AACTTTYARCAAYGGATCTC;max_error_rate=0.1;min_overlap=20 -A TGATCCYTCCGCAGGT;max_error_rate=0.5;min_overlap=16 --pair-adapters --pair-filter any --cores 2 --output 94477_R1.fastq --paired-output 94477_R2.fastq 94477_S175_L001_R1_001.fastq.gz 94477_S175_L001_R2_001.fastq.gz
Processing reads on 2 cores in paired-end mode ...
[ 8<---------] 00:00:03        88,831 reads  @     26.0 µs/read;   2.31 M reads/minuteERROR: Traceback (most recent call last):
  File "/home/fossandon/.local/lib/python3.6/site-packages/cutadapt/pipeline.py", line 556, in run
    dnaio.read_paired_chunks(f, f2, self.buffer_size)):
  File "/home/fossandon/.local/lib/python3.6/site-packages/dnaio/chunks.py", line 118, in read_paired_chunks
    bufend1 = f.readinto(memoryview(buf1)[start1:]) + start1  # type: ignore
  File "/usr/lib/python3.6/gzip.py", line 276, in read
    return self._buffer.read(size)
  File "/usr/lib/python3.6/_compression.py", line 68, in readinto
    data = self.read(len(byte_view))
  File "/usr/lib/python3.6/gzip.py", line 454, in read
    self._read_eof()
  File "/usr/lib/python3.6/gzip.py", line 501, in _read_eof
    hex(self._crc)))
OSError: CRC check failed 0x88b1f != 0x6fe5d9e4

ERROR: Traceback (most recent call last):
  File "/home/fossandon/.local/lib/python3.6/site-packages/cutadapt/pipeline.py", line 626, in run
    raise e
OSError: CRC check failed 0x88b1f != 0x6fe5d9e4

Traceback (most recent call last):
  File "/home/fossandon/.local/bin/cutadapt", line 8, in <module>
    sys.exit(main_cli())
  File "/home/fossandon/.local/lib/python3.6/site-packages/cutadapt/__main__.py", line 848, in main_cli
    main(sys.argv[1:])
  File "/home/fossandon/.local/lib/python3.6/site-packages/cutadapt/__main__.py", line 913, in main
    stats = r.run()
  File "/home/fossandon/.local/lib/python3.6/site-packages/cutadapt/pipeline.py", line 825, in run
    raise e
OSError: CRC check failed 0x88b1f != 0x6fe5d9e4

But using zcat and "gzip -t" on the files does not return any error, and they can be decompressed fine with "gzip -d". Even running the same cutadapt command in different environments (Python 3.6 and 3.8 were tested too) with the same cutadapt version crashed in some environments and not in others. It took a long search and many tests with a colleague until we figured out that the key difference between crashing and not crashing was the installed version of the isal dependency (the latest version is used when creating a docker image). Versions 0.8.0 and 0.7.0 generate the CRC error, but 0.6.1 and 0.5.0 do not, so it seems the bug was introduced in 0.7.0. Keeping the intermediate dependencies the same but reverting isal to 0.6.1 allows it to work:

299	3047	0.0	8	2963 57 11 6 5 2 3
300	8	0.0	8	0 0 0 0 0 3 4 0 1
301	15028	0.0	8	0 14646 270 64 24 15 8 0 1


WARNING:
    One or more of your adapter sequences may be incomplete.
    Please see the detailed output above.
fossandon@ubuntu:~/Documents/temp$ pip3 list | egrep "cutadapt|dnaio|isal|xopen"
cutadapt              3.3
dnaio                 0.5.0               /home/fossandon/.local/lib/python3.6/site-packages
isal                  0.6.1
xopen                 1.1.0

In my case, I was processing a folder where all the gzipped files came from a source where they were created at the same time, yet only a portion consistently crashed while the others did not. To give you a test case, I uploaded the pair of files I was using with the cutadapt example above, so you can reproduce it on your own; I couldn't find smaller files that reproduce this error.
https://drive.google.com/drive/folders/1eTmLbd9WINctLb48pzn57_Ohp1amwZah?usp=sharing

Best regards,

Release 0.7.0

Release checklist

  • Check outstanding issues on JIRA and GitHub.
  • Check that the latest documentation looks fine.
  • Create a release branch.
    • Set the version to a stable number.
    • Change the current development version in CHANGELOG.rst to the stable version.
    • Change the version in __init__.py.
  • Merge the release branch into main.
  • Create an annotated tag with the stable version number. Include the changes
    from CHANGELOG.rst.
  • Push the tag to the remote. This triggers the wheel/sdist build on GitHub CI.
  • Merge the main branch back into develop.
  • Add the updated version number to develop (setup.py and src/isal/__init__.py).
  • Build the new tag on readthedocs. Only build the last patch version of
    each minor version, so 1.1.1 and 1.2.0 but not 1.1.0.
  • Create a new release on GitHub.
  • Update the package on conda-forge.

Release 0.1.0

Release checklist

  • Check outstanding issues on JIRA and GitHub.
  • Check that the latest documentation looks fine.
  • Create a release branch.
    • Set the version to a stable number.
    • Change the current development version in CHANGELOG.rst to the stable version.
  • Merge the release branch into main.
  • Create a test PyPI package from the main branch. (Instructions.)
  • Install the packages from the test PyPI repository to see if they work.
  • Create an annotated tag with the stable version number. Include the changes
    from CHANGELOG.rst.
  • Push the tag to the remote.
  • Push the tested packages to PyPI.
  • Merge the main branch back into develop.
  • Add the updated version number to develop.
  • Build the new tag on readthedocs. Only build the last patch version of
    each minor version, so 1.1.1 and 1.2.0 but not 1.1.0.
  • Create a new release on GitHub.
  • Update the package on conda-forge.

Release 0.4.0

Release checklist

  • Check outstanding issues on JIRA and GitHub.
  • Check that the latest documentation looks fine.
  • Create a release branch.
    • Set the version to a stable number.
    • Change the current development version in CHANGELOG.rst to the stable version.
    • Change the version in __init__.py.
  • Merge the release branch into main.
  • Create a test PyPI package from the main branch. (Instructions.)
  • Install the packages from the test PyPI repository to see if they work.
  • Create an annotated tag with the stable version number. Include the changes
    from CHANGELOG.rst.
  • Push the tag to the remote.
  • Push the tested packages to PyPI.
  • Merge the main branch back into develop.
  • Add the updated version number to develop.
  • Build the new tag on readthedocs. Only build the last patch version of
    each minor version, so 1.1.1 and 1.2.0 but not 1.1.0.
  • Create a new release on GitHub.
  • Update the package on conda-forge.

Release 1.0.1

Release checklist

  • Check outstanding issues on JIRA and GitHub.
  • Check that the latest documentation looks fine.
  • Create a release branch.
    • Set the version to a stable number.
    • Change the current development version in CHANGELOG.rst to the stable version.
    • Change the version in __init__.py.
  • Merge the release branch into main.
  • Create an annotated tag with the stable version number. Include the changes
    from CHANGELOG.rst.
  • Push the tag to the remote. This triggers the wheel/sdist build on GitHub CI.
  • Merge the main branch back into develop.
  • Add the updated version number to develop (setup.py and src/isal/__init__.py).
  • Build the new tag on readthedocs. Only build the last patch version of
    each minor version, so 1.1.1 and 1.2.0 but not 1.1.0.
  • Create a new release on GitHub.
  • Update the package on conda-forge.

Installation on Apple M1 not possible

I'm trying to install isal with pip on my MacBook Pro 2021 (M1 Pro Chip on macOS Monterey 12.1)

The installation fails because of multiple errors. The first one being:
In file included from erasure_code/aarch64/ec_aarch64_dispatcher.c:29:
/var/folders/yh/q11mmnrx4bj7m21b0l255mg40000gn/T/tmpmnogoka1/include/aarch64_multibinary.h:34:10: fatal error: 'asm/hwcap.h' file not found
#include <asm/hwcap.h>
^~~~~~~~~~~~~
1 error generated.

After that there are a bunch of other errors. I've added the full log as an attachment.

Is there anything I can do about that?

Cheers
Marco

pip_install_isal_apple_m1.txt

Increase coverage to 100%

Yes, that is more than CPython's gzip module. No, that does not say anything about quality. But the number should be as high as possible. It would be a shame if 1.0.0 only lived for a few days until 1.0.1 had to come along.

Cannot build when TMPDIR is mounted as noexec

We copy the isal source to TMPDIR and run autogen.sh inside TMPDIR, but if TMPDIR is mounted as noexec, autogen.sh raises Error 13 Permission Denied. (In our specific environment) as a workaround we needed to run sudo mount -o remount,exec /tmp /tmp.

If possible please fix setup.py to avoid this problem.

Release 0.8.1

Release checklist

  • Check outstanding issues on JIRA and GitHub.
  • Check that the latest documentation looks fine.
  • Create a release branch.
    • Set the version to a stable number.
    • Change the current development version in CHANGELOG.rst to the stable version.
    • Change the version in __init__.py.
  • Merge the release branch into main.
  • Create an annotated tag with the stable version number. Include the changes
    from CHANGELOG.rst.
  • Push the tag to the remote. This triggers the wheel/sdist build on GitHub CI.
  • Merge the main branch back into develop.
  • Add the updated version number to develop (setup.py and src/isal/__init__.py).
  • Build the new tag on readthedocs. Only build the last patch version of
    each minor version, so 1.1.1 and 1.2.0 but not 1.1.0.
  • Create a new release on GitHub.
  • Update the package on conda-forge.

Arm64 Support

ISA-L can run on ARM64, and Python can run on ARM64. Python-isal should therefore be able to run on ARM64 as well. Currently there is no equipment and no CI environment available to properly test this.

If someone were to provide a self-hosted GitHub runner to run ARM tests on, this issue could be tackled.

Release 0.9.0

Release checklist

  • Check outstanding issues on JIRA and GitHub.
  • Check that the latest documentation looks fine.
  • Create a release branch.
    • Set the version to a stable number.
    • Change the current development version in CHANGELOG.rst to the stable version.
    • Change the version in __init__.py.
  • Merge the release branch into main.
  • Create an annotated tag with the stable version number. Include the changes
    from CHANGELOG.rst.
  • Push the tag to the remote. This triggers the wheel/sdist build on GitHub CI.
  • Merge the main branch back into develop.
  • Add the updated version number to develop (setup.py and src/isal/__init__.py).
  • Build the new tag on readthedocs. Only build the last patch version of
    each minor version, so 1.1.1 and 1.2.0 but not 1.1.0.
  • Create a new release on GitHub.
  • Update the package on conda-forge.

Improve output method

As of now it is distinct from zlibmodule.c. It does not use the buffer very efficiently; zlibmodule.c's method, however, is slower. This is as implemented now on the buffer branch.

Look for opportunities to streamline this process and share it across the functions that need it.

Release 0.11.1

Release checklist

  • Check outstanding issues on JIRA and GitHub.
  • Check that the latest documentation looks fine.
  • Create a release branch.
    • Set the version to a stable number.
    • Change the current development version in CHANGELOG.rst to the stable version.
    • Change the version in __init__.py.
  • Merge the release branch into main.
  • Create an annotated tag with the stable version number. Include the changes
    from CHANGELOG.rst.
  • Push the tag to the remote. This triggers the wheel/sdist build on GitHub CI.
  • Merge the main branch back into develop.
  • Add the updated version number to develop (setup.py and src/isal/__init__.py).
  • Build the new tag on readthedocs. Only build the last patch version of
    each minor version, so 1.1.1 and 1.2.0 but not 1.1.0.
  • Create a new release on GitHub.
  • Update the package on conda-forge.

Windows support

It should be possible to support windows. This is a work in progress.
Current difficulties:

  • There is no conda package for isa-l yet that runs on Windows.
    • igzip.exe will not build on Windows due to POSIX dependencies (utime.h, getopt.h).
  • isa-l.h is not built when compiling isa-l with nmake on Windows. This file is used by python-isal.

Release GIL inside library

The goal of this issue is to start a discussion about GIL manipulation in the library.

Currently, all the code inside the library runs in a single thread, since Python requires the GIL to be held to work with Python objects. This, for example, blocks using the isal library with multiple concurrent threads each working on a compressed file.

Have you considered releasing the lock before executing ISA-L library functions that work with internal state (like isal_deflate)? I haven't fully checked all the details of the ISA-L library and may be missing some potential problems.

This is also one of the differences from the built-in gzip implementation, since that releases the lock before the deflate call and the CRC calculation.
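The benefit asked for here can be illustrated with the stdlib zlib, which already releases the GIL around its C calls for large buffers; the same pattern would apply around isal_deflate (an illustrative sketch, not python-isal code):

```python
import threading
import zlib

def compress_stream(chunks, out):
    # Each thread owns its own compressor state. With the GIL released
    # around the C compression call, these threads can use separate cores.
    comp = zlib.compressobj()
    for chunk in chunks:
        out.append(comp.compress(chunk))
    out.append(comp.flush())

payload = [b"some compressible data " * 4096] * 8
results = [[] for _ in range(4)]
threads = [threading.Thread(target=compress_stream, args=(payload, out))
           for out in results]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Every thread produced a valid, independent deflate stream.
assert all(zlib.decompress(b"".join(out)) == b"".join(payload)
           for out in results)
```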

Release 0.3.0

Release checklist

  • Check outstanding issues on JIRA and GitHub.
  • Check that the latest documentation looks fine.
  • Create a release branch.
    • Set the version to a stable number.
    • Change the current development version in CHANGELOG.rst to the stable version.
    • Change the version in __init__.py.
  • Merge the release branch into main.
  • Create a test PyPI package from the main branch. (Instructions.)
  • Install the packages from the test PyPI repository to see if they work.
  • Create an annotated tag with the stable version number. Include the changes
    from CHANGELOG.rst.
  • Push the tag to the remote.
  • Push the tested packages to PyPI.
  • Merge the main branch back into develop.
  • Add the updated version number to develop.
  • Build the new tag on readthedocs. Only build the last patch version of
    each minor version, so 1.1.1 and 1.2.0 but not 1.1.0.
  • Create a new release on GitHub.
  • Update the package on conda-forge.

Release 0.2.0

Release checklist

  • Check outstanding issues on JIRA and GitHub.
  • Check that the latest documentation looks fine.
  • Create a release branch.
    • Set the version to a stable number.
    • Change the current development version in CHANGELOG.rst to the stable version.
  • Merge the release branch into main.
  • Create a test PyPI package from the main branch. (Instructions.)
  • Install the packages from the test PyPI repository to see if they work.
  • Create an annotated tag with the stable version number. Include the changes
    from CHANGELOG.rst.
  • Push the tag to the remote.
  • Push the tested packages to PyPI.
  • Merge the main branch back into develop.
  • Add the updated version number to develop.
  • Build the new tag on readthedocs. Only build the last patch version of
    each minor version, so 1.1.1 and 1.2.0 but not 1.1.0.
  • Create a new release on GitHub.
  • Update the package on conda-forge.

Release 0.5.0

Release checklist

  • Check outstanding issues on JIRA and GitHub.
  • Check that the latest documentation looks fine.
  • Create a release branch.
    • Set the version to a stable number.
    • Change the current development version in CHANGELOG.rst to the stable version.
    • Change the version in __init__.py.
  • Merge the release branch into main.
  • Create a test PyPI package from the main branch. (Instructions.)
  • Install the packages from the test PyPI repository to see if they work.
  • Create an annotated tag with the stable version number. Include the changes
    from CHANGELOG.rst.
  • Push the tag to the remote.
  • Push the tested packages to PyPI.
  • Merge the main branch back into develop.
  • Add the updated version number to develop.
  • Build the new tag on readthedocs. Only build the last patch version of
    each minor version, so 1.1.1 and 1.2.0 but not 1.1.0.
  • Create a new release on GitHub.
  • Update the package on conda-forge.

Temporary prefix is not removed when installing libisal from source

The build prefix is removed, but the temporary prefix is not. This is a result of the setup.py implementation, where libisal is only built once; otherwise it would have been trivial to remove the prefix. The temporary install prefix is 1.4 MB in size. This is only an issue for installations from the source distribution.

Release 0.11.0

Release checklist

  • Check outstanding issues on JIRA and GitHub.
  • Check that the latest documentation looks fine.
  • Create a release branch.
    • Set the version to a stable number.
    • Change the current development version in CHANGELOG.rst to the stable version.
    • Change the version in __init__.py.
  • Merge the release branch into main.
  • Create an annotated tag with the stable version number. Include the changes
    from CHANGELOG.rst.
  • Push the tag to the remote. This triggers the wheel/sdist build on GitHub CI.
  • Merge the main branch back into develop.
  • Add the updated version number to develop (setup.py and src/isal/__init__.py).
  • Build the new tag on readthedocs. Only build the last patch version of
    each minor version, so 1.1.1 and 1.2.0 but not 1.1.0.
  • Create a new release on GitHub.
  • Update the package on conda-forge.

Simpler source build

Instead of using autoconf and automake, simply running make -f Makefile.unx also creates the necessary files for static linking. This requires a change to the script.

Release 0.6.0

Release checklist

  • Check outstanding issues on JIRA and GitHub.
  • Check that the latest documentation looks fine.
  • Create a release branch.
    • Set the version to a stable number.
    • Change the current development version in CHANGELOG.rst to the stable version.
    • Change the version in __init__.py.
  • Merge the release branch into main.
  • Create a test PyPI package from the main branch. (Instructions.)
  • Install the packages from the test PyPI repository to see if they work.
  • Create an annotated tag with the stable version number. Include the changes
    from CHANGELOG.rst.
  • Push the tag to the remote.
  • Push the tested packages to PyPI.
  • Merge the main branch back into develop.
  • Add the updated version number to develop.
  • Build the new tag on readthedocs. Only build the last patch version of
    each minor version, so 1.1.1 and 1.2.0 but not 1.1.0.
  • Create a new release on GitHub.
  • Update the package on conda-forge.

Some questions about Arm64 and isa-l_crypto

  • Do you plan to add an isa-l_crypto binding? It is the crypto counterpart of isa-l.
    ISA-L_crypto includes multi-buffer optimizations for crypto algorithms and some special hash algorithms.

  • Do you plan to test this project on the Arm64 platform?
    ISA-L for Arm64 is done now; I just want to know whether you plan to test on Arm64.
    Travis CI supports the Arm64 platform. You can check isa-l/.travis.yml to see how to add Arm64 support.

_GzipFileReader is slow

See the following benchmarks comparing igzip and python -m isal.igzip. Both are tested in the same conda environment, so both use the same compile settings and the same version of isa-l.

$ hyperfine -w 3 -r 10 'python -m isal.igzip -c ~/test/big2.fastq > /dev/null'
Benchmark #1: python -m isal.igzip -c ~/test/big2.fastq > /dev/null
  Time (mean ± σ):      4.894 s ±  0.028 s    [User: 4.758 s, System: 0.134 s]
  Range (min … max):    4.856 s …  4.949 s    10 runs
 
$ hyperfine -w 3 -r 10 'igzip -c ~/test/big2.fastq > /dev/null'
Benchmark #1: igzip -c ~/test/big2.fastq > /dev/null
  Time (mean ± σ):      4.732 s ±  0.020 s    [User: 4.520 s, System: 0.211 s]
  Range (min … max):    4.699 s …  4.756 s    10 runs
 
$ hyperfine -w 3 -r 10 'python -m isal.igzip -cd ~/test/big2.fastq.gz > /dev/null'
Benchmark #1: python -m isal.igzip -cd ~/test/big2.fastq.gz > /dev/null
  Time (mean ± σ):      3.479 s ±  0.025 s    [User: 3.398 s, System: 0.080 s]
  Range (min … max):    3.432 s …  3.510 s    10 runs
 
$ hyperfine -w 3 -r 10 'igzip -cd ~/test/big2.fastq.gz > /dev/null'
Benchmark #1: igzip -cd ~/test/big2.fastq.gz > /dev/null
  Time (mean ± σ):      2.872 s ±  0.029 s    [User: 2.808 s, System: 0.063 s]
  Range (min … max):    2.811 s …  2.914 s    10 runs

Compression: 4.894 / 4.732 = 1.034, i.e. 3.4% overhead when using Python instead of a pure C implementation. That is quite good, especially considering the portability of python-isal (it works on Windows, where igzip does not).
Decompression: 3.479 / 2.872 = 1.211, i.e. 21.1% overhead when using Python instead of the pure C implementation. Very bad! This is probably due to the overhead of juggling the unconsumed_tail, which means that a lot of bytes are converted from Python to C and back again repeatedly without result.
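The unconsumed_tail juggling can be seen with the stdlib zlib decompressobj, whose interface isal_zlib mirrors (an illustrative sketch): each loop iteration hands leftover input back from C to Python and then into C again.

```python
import zlib

original = b"x" * 100_000
compressed = zlib.compress(original)

d = zlib.decompressobj()
out = b""
buf = compressed
while buf:
    # Bounded output: input that was not consumed comes back as
    # unconsumed_tail and must be re-fed on the next iteration,
    # crossing the Python/C boundary every time.
    out += d.decompress(buf, 8192)
    buf = d.unconsumed_tail
out += d.flush()
assert out == original
```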

Release 0.8.0

Release checklist

  • Check outstanding issues on JIRA and GitHub.
  • Check that the latest documentation looks fine.
  • Create a release branch.
    • Set the version to a stable number.
    • Change the current development version in CHANGELOG.rst to the stable version.
    • Change the version in __init__.py.
  • Merge the release branch into main.
  • Create an annotated tag with the stable version number. Include the changes
    from CHANGELOG.rst.
  • Push the tag to the remote. This triggers the wheel/sdist build on GitHub CI.
  • Merge the main branch back into develop.
  • Add the updated version number to develop (setup.py and src/isal/__init__.py).
  • Build the new tag on readthedocs. Only build the last patch version of
    each minor version, so 1.1.1 and 1.2.0 but not 1.1.0.
  • Create a new release on GitHub.
  • Update the package on conda-forge.

Add igzip_lib.pyx with access to the igzip_lib api

This will have:

  • compress
  • decompress
  • compress_stateless
  • decompress_stateless
  • IgzipCompressor
  • IgzipDecompressor

The compressor and decompressor objects resemble those from the bz2 and lzma modules. These can be used in igzip to reduce the overhead caused by the unconsumed_tail.
igzip_lib also has many more fine-grained settings for headers and trailers, which can be used to great effect (i.e. write no header, but do write a trailer). This can be used in igzip.compress and igzip.decompress.
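The decompressor object would follow the existing stdlib pattern. Sketched here with bz2.BZ2Decompressor, the model the proposed IgzipDecompressor would resemble:

```python
import bz2

original = b"example data " * 50
compressed = bz2.compress(original)

# Incremental decompression, as the proposed IgzipDecompressor would
# allow for igzip: feed input in pieces, with no unconsumed_tail
# juggling on the caller's side.
d = bz2.BZ2Decompressor()
out = d.decompress(compressed[:20])   # feed a partial chunk first
out += d.decompress(compressed[20:])  # then the remainder
assert out == original
assert d.eof  # end of stream was reached
```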

Release 0.6.1

Release checklist

  • Check outstanding issues on JIRA and GitHub.
  • Check that the latest documentation looks fine.
  • Create a release branch.
    • Set the version to a stable number.
    • Change the current development version in CHANGELOG.rst to the stable version.
    • Change the version in __init__.py.
  • Merge the release branch into main.
  • Create an annotated tag with the stable version number. Include the changes
    from CHANGELOG.rst.
  • Push the tag to the remote. This triggers the wheel/sdist build on GitHub CI.
  • Merge the main branch back into develop.
  • Add the updated version number to develop (setup.py and src/isal/__init__.py).
  • Build the new tag on readthedocs. Only build the last patch version of
    each minor version, so 1.1.1 and 1.2.0 but not 1.1.0.
  • Create a new release on GitHub.
  • Update the package on conda-forge.
