python-xxhash's Introduction

python-xxhash

xxhash is a Python binding for the xxHash library by Yann Collet.

Installation

$ pip install xxhash

You can also install using conda:

$ conda install -c conda-forge python-xxhash

Installing From Source

$ pip install --no-binary xxhash xxhash

Prerequisites

On Debian/Ubuntu:

$ apt-get install python-dev gcc

On CentOS/Fedora:

$ yum install python-devel gcc redhat-rpm-config

Linking to libxxhash.so

By default, python-xxhash uses the bundled xxHash. To link against the system libxxhash.so instead, set the environment variable XXHASH_LINK_SO:

$ XXHASH_LINK_SO=1 pip install --no-binary xxhash xxhash

Usage

The module version and the version of its backend xxHash library can be retrieved from the module attributes VERSION and XXHASH_VERSION respectively.

>>> import xxhash
>>> xxhash.VERSION
'2.0.0'
>>> xxhash.XXHASH_VERSION
'0.8.0'

This module is hashlib-compliant, which means you can use it in the same way as hashlib.md5.

update() -- update the current digest with an additional string
digest() -- return the current digest value
hexdigest() -- return the current digest as a string of hexadecimal digits
intdigest() -- return the current digest as an integer
copy() -- return a copy of the current xxhash object
reset() -- reset state

hashlib.md5's digest() returns bytes, but the original xxh32 and xxh64 C APIs return integers. This module follows hashlib, and also provides intdigest() to get the integer digest.

Constructors for hash algorithms provided by this module are xxh32() and xxh64().

For example, to obtain the digest of the byte string b'Nobody inspects the spammish repetition':

>>> import xxhash
>>> x = xxhash.xxh32()
>>> x.update(b'Nobody inspects')
>>> x.update(b' the spammish repetition')
>>> x.digest()
b'\xe2);/'
>>> x.digest_size
4
>>> x.block_size
16

More condensed:

>>> xxhash.xxh32(b'Nobody inspects the spammish repetition').hexdigest()
'e2293b2f'
>>> xxhash.xxh32(b'Nobody inspects the spammish repetition').digest() == x.digest()
True
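
The copy() method listed above is handy when many inputs share a common prefix: feed the prefix once, then copy the state for each suffix. The sketch below relies only on the hashlib-style interface. `digests_with_prefix` is a hypothetical helper, and hashlib.md5 is a stand-in so the snippet runs without python-xxhash installed; passing xxhash.xxh64 as the constructor should work the same way, given the hashlib compatibility described above.

```python
import hashlib


def digests_with_prefix(constructor, prefix, suffixes):
    """Hash prefix + suffix for every suffix, feeding the prefix only once.

    `constructor` is any hashlib-style factory (hypothetical parameter name),
    e.g. hashlib.md5 or, with python-xxhash installed, xxhash.xxh64.
    """
    base = constructor()
    base.update(prefix)
    results = []
    for suffix in suffixes:
        h = base.copy()      # independent copy of the prefix state
        h.update(suffix)     # does not affect `base`
        results.append(h.hexdigest())
    return results


# hashlib.md5 is only a stand-in here; swap in xxhash.xxh64 for real use.
prefix_digests = digests_with_prefix(
    hashlib.md5,
    b'Nobody inspects',
    [b' the spammish repetition', b' anything'],
)
```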

An optional seed (default is 0) can be used to alter the result predictably:

>>> import xxhash
>>> xxhash.xxh64('xxhash').hexdigest()
'32dd38952c4bc720'
>>> xxhash.xxh64('xxhash', seed=20141025).hexdigest()
'b559b98d844e0635'
>>> x = xxhash.xxh64(seed=20141025)
>>> x.update('xxhash')
>>> x.hexdigest()
'b559b98d844e0635'
>>> x.intdigest()
13067679811253438005

Note that xxh32 takes an unsigned 32-bit integer as seed, while xxh64 takes an unsigned 64-bit integer. A seed outside that range wraps around (unsigned integer overflow is defined behavior), so it is better to avoid it:

>>> xxhash.xxh32('I want an unsigned 32-bit seed!', seed=0).hexdigest()
'f7a35af8'
>>> xxhash.xxh32('I want an unsigned 32-bit seed!', seed=2**32).hexdigest()
'f7a35af8'
>>> xxhash.xxh32('I want an unsigned 32-bit seed!', seed=1).hexdigest()
'd8d4b4ba'
>>> xxhash.xxh32('I want an unsigned 32-bit seed!', seed=2**32+1).hexdigest()
'd8d4b4ba'
>>>
>>> xxhash.xxh64('I want an unsigned 64-bit seed!', seed=0).hexdigest()
'd4cb0a70a2b8c7c1'
>>> xxhash.xxh64('I want an unsigned 64-bit seed!', seed=2**64).hexdigest()
'd4cb0a70a2b8c7c1'
>>> xxhash.xxh64('I want an unsigned 64-bit seed!', seed=1).hexdigest()
'ce5087f12470d961'
>>> xxhash.xxh64('I want an unsigned 64-bit seed!', seed=2**64+1).hexdigest()
'ce5087f12470d961'
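
Since an out-of-range seed wraps around silently, as the session above shows (seed=2**32 gives the same xxh32 digest as seed=0), it may be worth validating seeds before passing them in. A minimal sketch; `checked_seed` is a hypothetical helper and not part of python-xxhash:

```python
def checked_seed(seed, bits):
    """Return `seed` unchanged if it fits in an unsigned integer of `bits` bits.

    Use bits=32 for xxh32 and bits=64 for xxh64. Raises ValueError otherwise.
    """
    if not isinstance(seed, int) or isinstance(seed, bool):
        raise TypeError('seed must be an int')
    if not 0 <= seed < 2 ** bits:
        raise ValueError(
            'seed %r does not fit in an unsigned %d-bit integer' % (seed, bits)
        )
    return seed


# Intended use (requires python-xxhash):
#   xxhash.xxh32(data, seed=checked_seed(seed, 32))
#   xxhash.xxh64(data, seed=checked_seed(seed, 64))
```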

digest() returns bytes of the big-endian representation of the integer digest:

>>> import xxhash
>>> h = xxhash.xxh64()
>>> h.digest()
b'\xefF\xdb7Q\xd8\xe9\x99'
>>> h.intdigest().to_bytes(8, 'big')
b'\xefF\xdb7Q\xd8\xe9\x99'
>>> h.hexdigest()
'ef46db3751d8e999'
>>> format(h.intdigest(), '016x')
'ef46db3751d8e999'
>>> h.intdigest()
17241709254077376921
>>> int(h.hexdigest(), 16)
17241709254077376921

Besides xxh32/xxh64 mentioned above, oneshot functions are also provided, so we can avoid allocating XXH32/64 state on heap:

xxh32_digest(bytes, seed=0)
xxh32_intdigest(bytes, seed=0)
xxh32_hexdigest(bytes, seed=0)
xxh64_digest(bytes, seed=0)
xxh64_intdigest(bytes, seed=0)
xxh64_hexdigest(bytes, seed=0)

>>> import xxhash
>>> xxhash.xxh64('a').digest() == xxhash.xxh64_digest('a')
True
>>> xxhash.xxh64('a').intdigest() == xxhash.xxh64_intdigest('a')
True
>>> xxhash.xxh64('a').hexdigest() == xxhash.xxh64_hexdigest('a')
True
>>> xxhash.xxh64_hexdigest('xxhash', seed=20141025)
'b559b98d844e0635'
>>> xxhash.xxh64_intdigest('xxhash', seed=20141025)
13067679811253438005
>>> xxhash.xxh64_digest('xxhash', seed=20141025)
b'\xb5Y\xb9\x8d\x84N\x065'

In [1]: import xxhash

In [2]: %timeit xxhash.xxh64_hexdigest('xxhash')
268 ns ± 24.1 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [3]: %timeit xxhash.xxh64('xxhash').hexdigest()
416 ns ± 17.3 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

XXH3 hashes are available since v2.0.0 (xxHash v0.8.0). They are:

Streaming classes:

xxh3_64
xxh3_128

Oneshot functions:

xxh3_64_digest(bytes, seed=0)
xxh3_64_intdigest(bytes, seed=0)
xxh3_64_hexdigest(bytes, seed=0)
xxh3_128_digest(bytes, seed=0)
xxh3_128_intdigest(bytes, seed=0)
xxh3_128_hexdigest(bytes, seed=0)

And aliases:

xxh128 = xxh3_128
xxh128_digest = xxh3_128_digest
xxh128_intdigest = xxh3_128_intdigest
xxh128_hexdigest = xxh3_128_hexdigest
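
Because the XXH3 names only exist from v2.0.0 on, code that must also run against older installations can probe for them and fall back. A minimal sketch: `first_available` is a hypothetical helper, and hashlib serves as the stand-in module so the snippet runs without python-xxhash.

```python
import hashlib


def first_available(module, names):
    """Return the first attribute of `module` found among `names`.

    Raises AttributeError if none of the names exists.
    """
    for name in names:
        candidate = getattr(module, name, None)
        if candidate is not None:
            return candidate
    raise AttributeError(
        'none of %r found in %s' % (list(names), module.__name__)
    )


# Stand-in demonstration with hashlib; 'no_such_hash' is deliberately missing.
constructor = first_available(hashlib, ['no_such_hash', 'sha256'])

# With python-xxhash installed, the same call could prefer XXH3:
#   import xxhash
#   constructor = first_available(xxhash, ['xxh3_128', 'xxh64'])
```

Keep in mind that the candidates may differ in digest size, so a fallback like this only suits uses where the digest width does not matter.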

Caveats

SEED OVERFLOW

xxh32 takes an unsigned 32-bit integer as seed, and xxh64 takes an unsigned 64-bit integer as seed. Make sure that the seed is greater than or equal to 0.

ENDIANNESS

As of python-xxhash 0.3.0, digest() returns bytes of the big-endian representation of the integer digest. It used to be little-endian.
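
If digests stored by a pre-0.3.0 release need to be compared with current ones, the byte order can be flipped, or the integer form compared instead. The sketch below uses only the standard library and the xxh64 value shown in the Usage section; `to_little_endian` is a hypothetical helper name. That an old digest is exactly the byte-reversed new one is an inference from the note above, not something verified against an old release.

```python
def to_little_endian(digest):
    """Reverse a big-endian digest into little-endian byte order."""
    return digest[::-1]


# Big-endian xxh64 digest of the empty input, taken from the Usage section.
big = bytes.fromhex('ef46db3751d8e999')
little = to_little_endian(big)

# Both byte orders decode to the same integer when read with the matching order.
as_int = int.from_bytes(big, 'big')
same_int = int.from_bytes(little, 'little')
```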

DON'T USE XXHASH IN HMAC

Although you can use xxhash as an HMAC hash function, it is strongly recommended not to.

xxhash is NOT a cryptographic hash function. It is a non-cryptographic hash algorithm aimed at speed and quality. Do not put xxhash in any position where a cryptographic hash function is required.
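
Where a keyed message authentication code is needed, a cryptographic hash from the standard library is the safer choice. The following sketch uses hmac with hashlib.sha256; the key and message are placeholder values, and `sign` and `verify` are hypothetical helper names.

```python
import hashlib
import hmac


def sign(key, message):
    """Return a hex HMAC-SHA256 tag for `message` under `key`."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(key, message, tag):
    """Compare tags in constant time to avoid timing side channels."""
    return hmac.compare_digest(sign(key, message), tag)


key = b'placeholder-secret-key'        # illustrative value only
message = b'Nobody inspects the spammish repetition'
tag = sign(key, message)
```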

Copyright (c) 2014-2024 Yue Du - https://github.com/ifduyue

Licensed under BSD 2-Clause License

python-xxhash's People

Contributors

blakev, cgohlke, charmoniumq, hmaarrfk, ifduyue, methane, mgorny, pitrou, regaldude, sfgagnon, xyb

python-xxhash's Issues

Python 3.3.1 ImportError: undefined symbol: Py_InitModule

After doing pip install I got ImportError: /usr/local/lib/python3.3/dist-packages/xxhash.cpython-33m.so: undefined symbol: Py_InitModule

Here's all the info, copied from terminal in Ubuntu 13.04 --

jh@jh-5253G ~ $ python3 -m pip install xxhash
Downloading/unpacking xxhash
Downloading xxhash-0.0.1.tar.bz2
Running setup.py egg_info for package xxhash

Installing collected packages: xxhash
Running setup.py install for xxhash
building 'xxhash' extension
x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security -D_FORTIFY_SOURCE=2 -fPIC -I/usr/include/python3.3m -c python-xxhash.c -o build/temp.linux-x86_64-3.3/python-xxhash.o -std=c99 -O3 -Wall -W -Wundef -DVERSION="0.0.1" -DXXHASH_VERSION="r35"
python-xxhash.c: In function ‘xxh32’:
python-xxhash.c:35:34: warning: unused parameter ‘self’ [-Wunused-parameter]
python-xxhash.c: In function ‘xxh64’:
python-xxhash.c:48:34: warning: unused parameter ‘self’ [-Wunused-parameter]
python-xxhash.c: In function ‘initxxhash’:
python-xxhash.c:68:5: warning: implicit declaration of function ‘Py_InitModule’ [-Wimplicit-function-declaration]
python-xxhash.c:68:24: warning: initialization makes pointer from integer without a cast [enabled by default]
python-xxhash.c:72:1: warning: control reaches end of non-void function [-Wreturn-type]
x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security -D_FORTIFY_SOURCE=2 -fPIC -I/usr/include/python3.3m -c xxhash/xxhash.c -o build/temp.linux-x86_64-3.3/xxhash/xxhash.o -std=c99 -O3 -Wall -W -Wundef -DVERSION="0.0.1" -DXXHASH_VERSION="r35"
x86_64-linux-gnu-gcc -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -Wl,-z,relro -Wno-unused-result -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security -Wl,-Bsymbolic-functions -Wl,-z,relro -g -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security -D_FORTIFY_SOURCE=2 build/temp.linux-x86_64-3.3/python-xxhash.o build/temp.linux-x86_64-3.3/xxhash/xxhash.o -o build/lib.linux-x86_64-3.3/xxhash.cpython-33m.so

Successfully installed xxhash
Cleaning up...

jh@jh-5253G ~ $ python3
Python 3.3.1 (default, Sep 25 2013, 19:29:01) [GCC 4.7.3] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import xxhash
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: /usr/local/lib/python3.3/dist-packages/xxhash.cpython-33m.so: undefined symbol: Py_InitModule
>>>

[end]

Picklable interface to save state

Hello, is it possible to make xxhash picklable? How?

It would be really great if it were picklable, so that we could save the hash state and resume hashing later on.

I spent a few hours on this but had no luck.

Musllinux wheels

Hi! I work with Home Assistant OS and am now using your library. Home Assistant OS is built on Alpine, which is a musllinux system, and on this OS I cannot build your library without wheels.
Could you explain why your 'pyproject.toml' skips musllinux?
Is it possible to delete this rule and build wheels for musllinux?

How to use xxhash to generate hash of a video file

I have a video file, sample.mp4, and I want to generate a hash for it using xxhash in Python 3. How can I do that? From your examples I can't work out how to use this to encrypt the files.
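
One possible approach, as a sketch: because the hash objects are hashlib-compliant, a file of any size can be fed in chunks with update(). Note that this produces a checksum; xxhash does not encrypt anything. `hash_file` is a hypothetical helper, and the fallback to hashlib.md5 is there only so the snippet still runs where python-xxhash is not installed.

```python
import hashlib
import os
import tempfile

try:
    import xxhash                      # pip install xxhash
    new_hasher = xxhash.xxh64
except ImportError:                    # stand-in so the sketch still runs
    new_hasher = hashlib.md5


def hash_file(path, constructor=new_hasher, chunk_size=1024 * 1024):
    """Hash a file in fixed-size chunks so it never has to fit in memory."""
    h = constructor()
    with open(path, 'rb') as f:
        # iter() with a sentinel keeps reading until read() returns b''.
        for chunk in iter(lambda: f.read(chunk_size), b''):
            h.update(chunk)
    return h.hexdigest()


# Demonstration on a small temporary file; use 'sample.mp4' in practice.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b'x' * 3000)
demo_path = tmp.name
chunked = hash_file(demo_path, chunk_size=1024)
whole = new_hasher(b'x' * 3000).hexdigest()
os.remove(demo_path)
```

Hashing in chunks gives the same result as hashing the whole content at once, so the chunk size only affects memory use and speed.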

Hash a large file in small blocks

@ifduyue
Hi.

This is a feature request.

For a couple of weeks I have been playing with ewencp's pyhashxx,
with which I used code like this:

import pyhashxx

chunk_size = 4096 * 1024

def hash_by_chunk( path ):

    hasher = pyhashxx.Hashxx()

    with open( path, 'rb' ) as fil1 :
        while True :
            buf = fil1.read( chunk_size )
            if not buf : break
            hasher.update( buf )

    return hasher.digest()

For files too big to fit in cache, the code above might be essential.
Even with small files, the above code runs faster for me than this:
hash = pyhashxx.hashxx( open( path, 'rb' ).read() )

Now I'm stuck, because python-xxhash has only the latter kind of function,
but ewencp's wrapper is older and does not have the 64-bit capability of xxHash r35.
(Ewen has not committed to pyhashxx for more than a year.)

I'm starting to build a backup app that uses xxHash to
(a) check for duplicate files (b) check that the copy is good,
and I must allow for files that are bigger than available RAM.

So I wonder if you can add a 'hasher' class to python-xxhash?
(Is one present in Yann's xxHash, or did Ewen build it entirely in Python?)

-- jon 77

Install fails on Mac 10.11.6 with Python 2.7.12

pip install xxhash                                                                                                                                                                                      
Collecting xxhash
  Using cached xxhash-0.6.1.tar.gz
Building wheels for collected packages: xxhash
  Running setup.py bdist_wheel for xxhash ... error
  Complete output from command /usr/local/opt/python/bin/python2.7 -u -c "import setuptools, tokenize;__file__='/private/var/folders/qq/sw11557n15v08vlf_tyy15zh0000gn/T/pip-build-kpCdEy/xxhash/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" bdist_wheel -d /var/folders/qq/sw11557n15v08vlf_tyy15zh0000gn/T/tmpE_HmQppip-wheel- --python-tag cp27:
  running bdist_wheel
  running build
  running build_ext
  building 'xxhash' extension
  creating build
  creating build/temp.macosx-10.11-x86_64-2.7
  creating build/temp.macosx-10.11-x86_64-2.7/xxhash
  clang -fno-strict-aliasing -fno-common -dynamic -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -DVERSION=0.6.1 -I/usr/local/include -I/usr/local/opt/openssl/include -I/usr/local/opt/sqlite/include -I/usr/local/Cellar/python/2.7.12/Frameworks/Python.framework/Versions/2.7/include/python2.7 -c python-xxhash.c -o build/temp.macosx-10.11-x86_64-2.7/python-xxhash.o -std=c99 -O3 -Wall -W -Wundef -Wno-error=declaration-after-statement
  python-xxhash.c:102:44: warning: unused parameter 'type' [-Wunused-parameter]
  static PyObject *PYXXH32_new(PyTypeObject *type, PyObject *args, PyObject *kwargs)
                                             ^
  python-xxhash.c:102:60: warning: unused parameter 'args' [-Wunused-parameter]
  static PyObject *PYXXH32_new(PyTypeObject *type, PyObject *args, PyObject *kwargs)
                                                             ^
  python-xxhash.c:102:76: warning: unused parameter 'kwargs' [-Wunused-parameter]
  static PyObject *PYXXH32_new(PyTypeObject *type, PyObject *args, PyObject *kwargs)
                                                                             ^
  python-xxhash.c:309:56: warning: unused parameter 'self' [-Wunused-parameter]
  static PyObject *PYXXH32_get_block_size(PYXXH32Object *self, void *closure)
                                                         ^
  python-xxhash.c:309:68: warning: unused parameter 'closure' [-Wunused-parameter]
  static PyObject *PYXXH32_get_block_size(PYXXH32Object *self, void *closure)
                                                                     ^
  python-xxhash.c:315:40: warning: unused parameter 'self' [-Wunused-parameter]
  PYXXH32_get_digest_size(PYXXH32Object *self, void *closure)
                                         ^
  python-xxhash.c:315:52: warning: unused parameter 'closure' [-Wunused-parameter]
  PYXXH32_get_digest_size(PYXXH32Object *self, void *closure)
                                                     ^
  python-xxhash.c:321:33: warning: unused parameter 'self' [-Wunused-parameter]
  PYXXH32_get_name(PYXXH32Object *self, void *closure)
                                  ^
  python-xxhash.c:321:45: warning: unused parameter 'closure' [-Wunused-parameter]
  PYXXH32_get_name(PYXXH32Object *self, void *closure)
                                              ^
  python-xxhash.c:331:45: warning: unused parameter 'closure' [-Wunused-parameter]
  PYXXH32_get_seed(PYXXH32Object *self, void *closure)
                                              ^
  python-xxhash.c:367:10: warning: missing field 'get' initializer [-Wmissing-field-initializers]
      {NULL}  /* Sentinel */
           ^
  python-xxhash.c:427:1: warning: missing field 'tp_free' initializer [-Wmissing-field-initializers]
  };
  ^
  python-xxhash.c:447:44: warning: unused parameter 'type' [-Wunused-parameter]
  static PyObject *PYXXH64_new(PyTypeObject *type, PyObject *args, PyObject *kwargs)
                                             ^
  python-xxhash.c:447:60: warning: unused parameter 'args' [-Wunused-parameter]
  static PyObject *PYXXH64_new(PyTypeObject *type, PyObject *args, PyObject *kwargs)
                                                             ^
  python-xxhash.c:447:76: warning: unused parameter 'kwargs' [-Wunused-parameter]
  static PyObject *PYXXH64_new(PyTypeObject *type, PyObject *args, PyObject *kwargs)
                                                                             ^
  python-xxhash.c:654:56: warning: unused parameter 'self' [-Wunused-parameter]
  static PyObject *PYXXH64_get_block_size(PYXXH64Object *self, void *closure)
                                                         ^
  python-xxhash.c:654:68: warning: unused parameter 'closure' [-Wunused-parameter]
  static PyObject *PYXXH64_get_block_size(PYXXH64Object *self, void *closure)
                                                                     ^
  python-xxhash.c:660:40: warning: unused parameter 'self' [-Wunused-parameter]
  PYXXH64_get_digest_size(PYXXH64Object *self, void *closure)
                                         ^
  python-xxhash.c:660:52: warning: unused parameter 'closure' [-Wunused-parameter]
  PYXXH64_get_digest_size(PYXXH64Object *self, void *closure)
                                                     ^
  python-xxhash.c:666:33: warning: unused parameter 'self' [-Wunused-parameter]
  PYXXH64_get_name(PYXXH64Object *self, void *closure)
                                  ^
  python-xxhash.c:666:45: warning: unused parameter 'closure' [-Wunused-parameter]
  PYXXH64_get_name(PYXXH64Object *self, void *closure)
                                              ^
  python-xxhash.c:676:45: warning: unused parameter 'closure' [-Wunused-parameter]
  PYXXH64_get_seed(PYXXH64Object *self, void *closure)
                                              ^
  python-xxhash.c:712:10: warning: missing field 'get' initializer [-Wmissing-field-initializers]
      {NULL}  /* Sentinel */
           ^
  python-xxhash.c:772:1: warning: missing field 'tp_free' initializer [-Wmissing-field-initializers]
  };
  ^
  clang: error: unable to execute command: Segmentation fault: 11
  clang: error: clang frontend command failed due to signal (use -v to see invocation)
  Apple LLVM version 7.3.0 (clang-703.0.31)
  Target: x86_64-apple-darwin15.6.0
  Thread model: posix
  InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
  clang: note: diagnostic msg: PLEASE submit a bug report to http://developer.apple.com/bugreporter/ and include the crash backtrace, preprocessed source, and associated run script.
  clang: note: diagnostic msg:
  ********************

  PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
  Preprocessed source(s) and associated run script(s) are located at:
  clang: note: diagnostic msg: /var/folders/qq/sw11557n15v08vlf_tyy15zh0000gn/T/python-xxhash-2633e3.c
  clang: note: diagnostic msg: /var/folders/qq/sw11557n15v08vlf_tyy15zh0000gn/T/python-xxhash-2633e3.sh
  clang: note: diagnostic msg:

  ********************
  error: command 'clang' failed with exit status 254

  ----------------------------------------
  Failed building wheel for xxhash
  Running setup.py clean for xxhash
Failed to build xxhash
Installing collected packages: xxhash
  Running setup.py install for xxhash ... error
    Complete output from command /usr/local/opt/python/bin/python2.7 -u -c "import setuptools, tokenize;__file__='/private/var/folders/qq/sw11557n15v08vlf_tyy15zh0000gn/T/pip-build-kpCdEy/xxhash/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /var/folders/qq/sw11557n15v08vlf_tyy15zh0000gn/T/pip-LXei0a-record/install-record.txt --single-version-externally-managed --compile:
    running install
    running build
    running build_ext
    building 'xxhash' extension
    creating build
    creating build/temp.macosx-10.11-x86_64-2.7
    creating build/temp.macosx-10.11-x86_64-2.7/xxhash
    clang -fno-strict-aliasing -fno-common -dynamic -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -DVERSION=0.6.1 -I/usr/local/include -I/usr/local/opt/openssl/include -I/usr/local/opt/sqlite/include -I/usr/local/Cellar/python/2.7.12/Frameworks/Python.framework/Versions/2.7/include/python2.7 -c python-xxhash.c -o build/temp.macosx-10.11-x86_64-2.7/python-xxhash.o -std=c99 -O3 -Wall -W -Wundef -Wno-error=declaration-after-statement
    python-xxhash.c:102:44: warning: unused parameter 'type' [-Wunused-parameter]
    static PyObject *PYXXH32_new(PyTypeObject *type, PyObject *args, PyObject *kwargs)
                                               ^
    python-xxhash.c:102:60: warning: unused parameter 'args' [-Wunused-parameter]
    static PyObject *PYXXH32_new(PyTypeObject *type, PyObject *args, PyObject *kwargs)
                                                               ^
    python-xxhash.c:102:76: warning: unused parameter 'kwargs' [-Wunused-parameter]
    static PyObject *PYXXH32_new(PyTypeObject *type, PyObject *args, PyObject *kwargs)
                                                                               ^
    python-xxhash.c:309:56: warning: unused parameter 'self' [-Wunused-parameter]
    static PyObject *PYXXH32_get_block_size(PYXXH32Object *self, void *closure)
                                                           ^
    python-xxhash.c:309:68: warning: unused parameter 'closure' [-Wunused-parameter]
    static PyObject *PYXXH32_get_block_size(PYXXH32Object *self, void *closure)
                                                                       ^
    python-xxhash.c:315:40: warning: unused parameter 'self' [-Wunused-parameter]
    PYXXH32_get_digest_size(PYXXH32Object *self, void *closure)
                                           ^
    python-xxhash.c:315:52: warning: unused parameter 'closure' [-Wunused-parameter]
    PYXXH32_get_digest_size(PYXXH32Object *self, void *closure)
                                                       ^
    python-xxhash.c:321:33: warning: unused parameter 'self' [-Wunused-parameter]
    PYXXH32_get_name(PYXXH32Object *self, void *closure)
                                    ^
    python-xxhash.c:321:45: warning: unused parameter 'closure' [-Wunused-parameter]
    PYXXH32_get_name(PYXXH32Object *self, void *closure)
                                                ^
    python-xxhash.c:331:45: warning: unused parameter 'closure' [-Wunused-parameter]
    PYXXH32_get_seed(PYXXH32Object *self, void *closure)
                                                ^
    python-xxhash.c:367:10: warning: missing field 'get' initializer [-Wmissing-field-initializers]
        {NULL}  /* Sentinel */
             ^
    python-xxhash.c:427:1: warning: missing field 'tp_free' initializer [-Wmissing-field-initializers]
    };
    ^
    python-xxhash.c:447:44: warning: unused parameter 'type' [-Wunused-parameter]
    static PyObject *PYXXH64_new(PyTypeObject *type, PyObject *args, PyObject *kwargs)
                                               ^
    python-xxhash.c:447:60: warning: unused parameter 'args' [-Wunused-parameter]
    static PyObject *PYXXH64_new(PyTypeObject *type, PyObject *args, PyObject *kwargs)
                                                               ^
    python-xxhash.c:447:76: warning: unused parameter 'kwargs' [-Wunused-parameter]
    static PyObject *PYXXH64_new(PyTypeObject *type, PyObject *args, PyObject *kwargs)
                                                                               ^
    python-xxhash.c:654:56: warning: unused parameter 'self' [-Wunused-parameter]
    static PyObject *PYXXH64_get_block_size(PYXXH64Object *self, void *closure)
                                                           ^
    python-xxhash.c:654:68: warning: unused parameter 'closure' [-Wunused-parameter]
    static PyObject *PYXXH64_get_block_size(PYXXH64Object *self, void *closure)
                                                                       ^
    python-xxhash.c:660:40: warning: unused parameter 'self' [-Wunused-parameter]
    PYXXH64_get_digest_size(PYXXH64Object *self, void *closure)
                                           ^
    python-xxhash.c:660:52: warning: unused parameter 'closure' [-Wunused-parameter]
    PYXXH64_get_digest_size(PYXXH64Object *self, void *closure)
                                                       ^
    python-xxhash.c:666:33: warning: unused parameter 'self' [-Wunused-parameter]
    PYXXH64_get_name(PYXXH64Object *self, void *closure)
                                    ^
    python-xxhash.c:666:45: warning: unused parameter 'closure' [-Wunused-parameter]
    PYXXH64_get_name(PYXXH64Object *self, void *closure)
                                                ^
    python-xxhash.c:676:45: warning: unused parameter 'closure' [-Wunused-parameter]
    PYXXH64_get_seed(PYXXH64Object *self, void *closure)
                                                ^
    python-xxhash.c:712:10: warning: missing field 'get' initializer [-Wmissing-field-initializers]
        {NULL}  /* Sentinel */
             ^
    python-xxhash.c:772:1: warning: missing field 'tp_free' initializer [-Wmissing-field-initializers]
    };
    ^
    clang: error: unable to execute command: Segmentation fault: 11
    clang: error: clang frontend command failed due to signal (use -v to see invocation)
    Apple LLVM version 7.3.0 (clang-703.0.31)
    Target: x86_64-apple-darwin15.6.0
    Thread model: posix
    InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
    clang: note: diagnostic msg: PLEASE submit a bug report to http://developer.apple.com/bugreporter/ and include the crash backtrace, preprocessed source, and associated run script.
    clang: note: diagnostic msg:
    ********************

    PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
    Preprocessed source(s) and associated run script(s) are located at:
    clang: note: diagnostic msg: /var/folders/qq/sw11557n15v08vlf_tyy15zh0000gn/T/python-xxhash-bdca50.c
    clang: note: diagnostic msg: /var/folders/qq/sw11557n15v08vlf_tyy15zh0000gn/T/python-xxhash-bdca50.sh
    clang: note: diagnostic msg:

    ********************
    error: command 'clang' failed with exit status 254

    ----------------------------------------
Command "/usr/local/opt/python/bin/python2.7 -u -c "import setuptools, tokenize;__file__='/private/var/folders/qq/sw11557n15v08vlf_tyy15zh0000gn/T/pip-build-kpCdEy/xxhash/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /var/folders/qq/sw11557n15v08vlf_tyy15zh0000gn/T/pip-LXei0a-record/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /private/var/folders/qq/sw11557n15v08vlf_tyy15zh0000gn/T/pip-build-kpCdEy/xxhash/

Wheel for new MacOS M1 arch

Is it possible to compile the wheels for the new MacOS M1 ARM chip? (E.g. xxhash-2.0.2-cp3.9-cp3.9-macosx_11_0_arm64.whl)

The current binaries are incompatible unfortunately.

Failed to create from source on MacOSX Big Sur (11.1)

When trying to compile from source on MacOSX 11.1 I get the following error:

Running setup.py install for xxhash ... error
    ERROR: Command errored out with exit status 1:
     command: /Users/arjan/Development/py-substrate-interface/venv/bin/python3.9 -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/private/var/folders/2h/jl7dn8js7pbcqbm644pg0q700000gn/T/pip-install-5cdfodud/xxhash_d64eeb944ea843cb8f9ade53b9481153/setup.py'"'"'; __file__='"'"'/private/var/folders/2h/jl7dn8js7pbcqbm644pg0q700000gn/T/pip-install-5cdfodud/xxhash_d64eeb944ea843cb8f9ade53b9481153/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /private/var/folders/2h/jl7dn8js7pbcqbm644pg0q700000gn/T/pip-record-x3l8gs60/install-record.txt --single-version-externally-managed --compile --install-headers /Users/arjan/Development/py-substrate-interface/venv/include/site/python3.9/xxhash
         cwd: /private/var/folders/2h/jl7dn8js7pbcqbm644pg0q700000gn/T/pip-install-5cdfodud/xxhash_d64eeb944ea843cb8f9ade53b9481153/
    Complete output (16 lines):
    running install
    running build
    running build_py
    creating build
    creating build/lib.macosx-11-x86_64-3.9
    creating build/lib.macosx-11-x86_64-3.9/xxhash
    copying xxhash/__init__.py -> build/lib.macosx-11-x86_64-3.9/xxhash
    running build_ext
    building '_xxhash' extension
    creating build/temp.macosx-11-x86_64-3.9
    creating build/temp.macosx-11-x86_64-3.9/deps
    creating build/temp.macosx-11-x86_64-3.9/deps/xxhash
    creating build/temp.macosx-11-x86_64-3.9/src
    clang -Wno-unused-result -Wsign-compare -Wunreachable-code -fno-common -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -I/usr/local/include -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk -I/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include -I/usr/local/opt/bzip2/include -I/usr/local/opt/zlib/include -Ideps/xxhash -I/usr/local/include -I/usr/local/opt/[email protected]/include -I/usr/local/opt/sqlite/include -I/usr/local/opt/tcl-tk/include -I/Users/arjan/Development/py-substrate-interface/venv/include -I/usr/local/Cellar/python@3.9/3.9.1_6/Frameworks/Python.framework/Versions/3.9/include/python3.9 -c deps/xxhash/xxhash.c -o build/temp.macosx-11-x86_64-3.9/deps/xxhash/xxhash.o
    clang-10: error: invalid version number in 'MACOSX_DEPLOYMENT_TARGET=11'
    error: command '/usr/local/opt/llvm/bin/clang' failed with exit code 1

128 bits ?

I am trying the https://www.shawntabrizi.com/substrate/querying-substrate-storage-via-rpc/#querying-runtime-storage manual in Python.

Look what they have decided to use in polkadot/substrate:

Note: Note that we specified to use the 128 bit version of XXHash.

I am trying to replicate all their JavaScript stuff into Python:

import xxhash # pip install xxhash

U8a = bytes("Sudo Key", encoding='utf8')
print (list(U8a))

print (xxhash.xxh32(U8a).hexdigest()) 
print (xxhash.xxh64(U8a).hexdigest())
print (xxhash.xxh128(U8a).hexdigest())

but ... what I get is:

[83, 117, 100, 111, 32, 75, 101, 121]
ae5549fb
2ed2ce1a873aa650
AttributeError: module 'xxhash' has no attribute 'xxh128'

Not sure I understand this correctly, but it looks as if
xxhash( _ , 128) == xxh128( _ )
right?

Could you please implement that for us? Thank you very much!

Versions:

xxhash.VERSION 1.3.0 
xxhash.XXHASH_VERSION 0.6.5

Python 3.10 wheel

Feature request: build Python 3.10 wheel for all supported platforms (manylinux2010 / manylinux1 / Windows / macOSX 10; PyPy v3.10 isn't released, and macOS 11 and musllinux aren't yet supported). This will allow installation in environments without a C compiler

Current workaround for Debian-based Linux: apt install --no-install-recommends gcc libc-dev

Edit: as far as I can tell, cibuildwheel will automatically build a Python 3.10 wheel on next release

CFFI can't hash unicode strings

This works in the normal version:

xxh64(u'test')
<xxhash.xxh64 object at 0x80073e880>

and fails in the CFFI one:

xxh64(u'test')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/pypy-5.6/site-packages/xxhash/cffi.py", line 59, in __init__
    self.update(input)
  File "/usr/local/pypy-5.6/site-packages/xxhash/cffi.py", line 65, in update
    lib.XXH64_update(self.xxhash_state, input, len(input))
TypeError: initializer for ctype 'void *' must be a str or list or tuple, not unicode
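
A possible workaround, as a sketch: encode text to bytes before handing it to the hash. `as_bytes` is a hypothetical helper, and UTF-8 is an assumed encoding; whether the result matches what the C extension produces for unicode input depends on the encoding that extension applies internally. The code is written for Python 3; on Python 2, as in the traceback above, the check would be against `unicode` instead of `str`.

```python
def as_bytes(data, encoding='utf-8'):
    """Return `data` as bytes, encoding text and passing bytes-like input through."""
    if isinstance(data, str):
        return data.encode(encoding)
    return bytes(data)


payload = as_bytes(u'test')

# Intended use (requires python-xxhash):
#   xxhash.xxh64(as_bytes(u'test')).hexdigest()
```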

Version 1.4.0 doesn't build with Docker python:3.6.8-alpine (xxh3.h missing)

The docker file below produces an error with latest 1.4.0. If version is 1.3.0, the build succeeds.

FROM python:3.6.8-alpine

RUN apk update
RUN apk add --no-cache python-dev g++
RUN pip3 install --upgrade pip
RUN pip3 install xxhash
#RUN pip3 install xxhash==1.3.0

ENTRYPOINT ["/bin/sh", "-c", "echo Hello"]

A subset of the logging:

 Running setup.py install for xxhash: started
    Running setup.py install for xxhash: finished with status 'error'
    ERROR: Command errored out with exit status 1:
     command: /usr/local/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-zihw2dts/xxhash/setup.py'"'"'; __file__='"'"'/tmp/pip-install-zihw2dts/xxhash/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-zbzdmx6s/install-record.txt --single-version-externally-managed --compile
         cwd: /tmp/pip-install-zihw2dts/xxhash/
    Complete output (20 lines):
    running install
    running build
    running build_py
    creating build
    creating build/lib.linux-x86_64-3.6
    creating build/lib.linux-x86_64-3.6/xxhash
    copying xxhash/__init__.py -> build/lib.linux-x86_64-3.6/xxhash
    running build_ext
    building '_xxhash' extension
    creating build/temp.linux-x86_64-3.6
    creating build/temp.linux-x86_64-3.6/src
    creating build/temp.linux-x86_64-3.6/deps
    creating build/temp.linux-x86_64-3.6/deps/xxhash
    gcc -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -DTHREAD_STACK_SIZE=0x100000 -fPIC -Ideps/xxhash -I/usr/local/include/python3.6m -c src/_xxhash.c -o build/temp.linux-x86_64-3.6/src/_xxhash.o
    gcc -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -DTHREAD_STACK_SIZE=0x100000 -fPIC -Ideps/xxhash -I/usr/local/include/python3.6m -c deps/xxhash/xxhash.c -o build/temp.linux-x86_64-3.6/deps/xxhash/xxhash.o
    deps/xxhash/xxhash.c:1115:10: fatal error: xxh3.h: No such file or directory
     #include "xxh3.h"
              ^~~~~~~~
    compilation terminated.
    error: command 'gcc' failed with exit status 1
    ----------------------------------------
ERROR: Command errored out with exit status 1: /usr/local/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-zihw2dts/xxhash/setup.py'"'"'; __file__='"'"'/tmp/pip-install-zihw2dts/xxhash/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-zbzdmx6s/install-record.txt --single-version-externally-managed --compile Check the logs for full command output.

   deps/xxhash/xxhash.c:1115:10: fatal error: xxh3.h: No such file or directory
     #include "xxh3.h"
              ^~~~~~~~
    compilation terminated.

Unexpectedly Poor Performance on AMD Ryzen 9 7950X?

I benchmarked a variety of Python operations between an i7 5960x (released late 2014) and an AMD Ryzen 9 7950X (released late 2022). In general, the single threaded performance of the 7950X is typically much higher than the 5960X given the improvements that have been made during that timeframe. However, I noticed that xxhash doesn't see the same speedup. Here's a comparison of a variety of Python functions and the time they took to run on both machines (Python 3.8, xxhash==1.4.3):

[Image: table comparing per-operation timings on the two machines]

The 3rd and 4th column are the time in seconds for each machine to run each operation a given number of times, either measured with time.time() or time.perf_counter().

The code for measuring the performance of xxhash was as follows:

import xxhash
import time

hard = 10
datadict = {}

def timer(num_attempts = 10):
  def decor(func):
    def wrapper(*args):
      for timefunc, string in [(time.time, "time()"), (time.perf_counter, "perf_counter()")]:
        times = []
        for attempt_num in range(num_attempts):
          start = timefunc() # start counting time using the given timing function
          result = func(*args) # the code we're benchmarking
          end = timefunc()
          duration = end - start
          times.append(duration)
        average_time = sum(times)/len(times) # take the average of the times over num_attempts
        datadict[func.__name__ + "~" + string] = average_time
      return result
    return wrapper
  return decor

@timer(hard)
def xxhash_4kimg_5ktimes(filebits):
  j = 0
  for i in range(5000):
    j = xxhash.xxh64(filebits).hexdigest()
  return j
  
print("Timing hashing")
with open('2.jpg', 'rb') as afile:
  filebits = afile.read()
print("\txxhash:")
j = xxhash_4kimg_5ktimes(filebits)

for key, val in datadict.items():
  print(key, val)

('2.jpg' is a 2.94 MB jpeg file, 3840x2160 resolution). I'm on Windows 10.

Any idea why the speedup isn't higher, as it was for many of the other functions I tested? On average I was getting at least a 2x speedup for most Python functions, but only got 1.08x for xxhash. It was the only one that performed so poorly.
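
One way to make such a comparison easier to interpret is to report throughput in MB/s and take the best of several runs instead of the average, which reduces the influence of background noise. The following is only a sketch: `throughput_mb_s` is a hypothetical helper, and `hashlib.md5` stands in as the hasher so the code runs without xxhash installed. Passing `xxhash.xxh64` instead is the intended use.

```python
import hashlib
import time


def throughput_mb_s(make_hasher, data, loops=20, repeats=5):
    """Best-of-`repeats` hashing throughput in MB/s for one buffer."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for _ in range(loops):
            make_hasher(data).hexdigest()
        best = min(best, time.perf_counter() - start)
    return len(data) * loops / best / 1e6


data = bytes(1_000_000)  # stand-in buffer; the issue used a 2.94 MB JPEG
print(round(throughput_mb_s(hashlib.md5, data), 1), "MB/s")
```

Comparing the MB/s figure across machines and across input sizes may help show whether the hash itself or something else, such as memory bandwidth, limits the speedup.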

Fails to install via dependency on linux

Hi,

I'm having an issue installing latest 0.5.0 release on Debian Linux 8.3.

Installation fails, mentioning the following macosx package (that was added on downloads page and was not present in previous version)
xxhash-0.5.0.macosx-10.10-x86_64.tar.gz (md5)
built for Darwin-14.5.0 "dumb" binary

pip -V => pip 1.5.6; upgrading pip doesn't help either.

Reproduction steps:

#!/bin/bash
virtualenv venv
mkdir -p dep
echo "from setuptools import setup" > dep/setup.py
echo "setup(name='dep', setup_requires=['xxhash'])" >> dep/setup.py
echo './dep' > requirements.txt
venv/bin/pip install -U --exists-action=s -r requirements.txt

Stack trace

Running setup.py (path:/tmp/pip-p45yo0-build/setup.py) egg_info for package from file:///vagrant/xxhash-repro/dep
    Traceback (most recent call last):
      File "<string>", line 17, in <module>
      File "/tmp/pip-p45yo0-build/setup.py", line 2, in <module>
        setup(name='dep', setup_requires=['xxhash'])
      File "/usr/lib/python2.7/distutils/core.py", line 111, in setup
        _setup_distribution = dist = klass(attrs)
      File "/vagrant/xxhash-repro/venv/local/lib/python2.7/site-packages/setuptools/dist.py", line 262, in __init__
        self.fetch_build_eggs(attrs['setup_requires'])
      File "/vagrant/xxhash-repro/venv/local/lib/python2.7/site-packages/setuptools/dist.py", line 287, in fetch_build_eggs
        replace_conflicting=True,
      File "/vagrant/xxhash-repro/venv/local/lib/python2.7/site-packages/pkg_resources.py", line 631, in resolve
        dist = best[req.key] = env.best_match(req, ws, installer)
      File "/vagrant/xxhash-repro/venv/local/lib/python2.7/site-packages/pkg_resources.py", line 874, in best_match
        return self.obtain(req, installer)
      File "/vagrant/xxhash-repro/venv/local/lib/python2.7/site-packages/pkg_resources.py", line 886, in obtain
        return installer(requirement)
      File "/vagrant/xxhash-repro/venv/local/lib/python2.7/site-packages/setuptools/dist.py", line 338, in fetch_build_egg
        return cmd.easy_install(req)
      File "/vagrant/xxhash-repro/venv/local/lib/python2.7/site-packages/setuptools/command/easy_install.py", line 636, in easy_install
        return self.install_item(spec, dist.location, tmpdir, deps)
      File "/vagrant/xxhash-repro/venv/local/lib/python2.7/site-packages/setuptools/command/easy_install.py", line 666, in install_item
        dists = self.install_eggs(spec, download, tmpdir)
      File "/vagrant/xxhash-repro/venv/local/lib/python2.7/site-packages/setuptools/command/easy_install.py", line 842, in install_eggs
        os.path.abspath(dist_filename)
    distutils.errors.DistutilsError: Couldn't find a setup script in /tmp/easy_install-gjY5iT/xxhash-0.5.0.macosx-10.10-x86_64.tar.gz
    Complete output from command python setup.py egg_info:

no new updates?

more than 6 months have passed without any updates... is there nothing else to do?

Request: add a reset() method?

Hi,

I think it would be great if there's a reset() method for xxh32/64 objects, which clears the slate so the next update() call begins anew. This shouldn't hurt hashlib compatibility I think.

What do you think?

--cong.

Cannot import xxhash: `ModuleNotFoundError: No module named 'xxhash._xxhash'`

After the latest release 2.0.2, importing xxhash from Python 3.10.0 on Windows 10 x64 produces the following error:

Python 3.10.0 (tags/v3.10.0:b494f59, Oct  4 2021, 19:00:18) [MSC v.1929 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import xxhash
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Extra\Python3\lib\site-packages\xxhash\__init__.py", line 1, in <module>
    from ._xxhash import (
ModuleNotFoundError: No module named 'xxhash._xxhash'

The 2.0.1 release still works fine. There must have been some change in the packaging that's causing this.

Memory leak in the CFFI version?

Running this program:

from xxhash import xxh32, xxh64
import time

for a in xrange(100):
  st=time.time()
  for i in xrange(10**6):
    xxh64(str(i))
  print (a, time.time()-st)
time.sleep(60)

It uses 4.8 GiB (RES) of memory at the end.

The non-CFFI version with cpython remains at 7712K, while with pypy it tops at 68668K.

Build fails on Python 3.4 on Windows 7 x64

Trying to build on Windows 7 x64 with Python 3 I get the following error:

python-xxhash.c
python-xxhash.c(363) : error C2099: initializer is not a constant
python-xxhash.c(687) : error C2099: initializer is not a constant
error: command 'C:\\Program Files (x86)\\Microsoft Visual Studio 10.0\\VC\\Bin\\amd64\\cl.exe' failed with exit status 2

Feature request: add xxhash to Pyodide

This would allow libraries that depend on xxhash to run in JupyterLite, py-script, etc.

Packages are added to Pyodide by explicitly compiling them in; there's a large directory of those packages, and they're each governed by a meta.yaml and a small test. Here is an example of a meta.yaml file: it has a URL to the source tarball of the latest version of the package and provides environment variables (or other helpers) for the build process. Whenever a new source tarball is pushed to PyPI, a bot writes its own PR to update this meta.yaml.

Missing Python 3.9 wheel

Hi, I saw that the Python 3.9 wheel is missing on PyPI. Will it be released any time soon?

Thanks!

Release the GIL

As far as I know the standard hashlib algorithms release the GIL before computing the hash.
Is it worthwhile to release the GIL for xxhash as well or is it too fast and the cost of acquiring & releasing the GIL too great for benefiting from parallelism?
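
Whether releasing the GIL pays off can be measured rather than guessed. The sketch below times hashing a batch of buffers with one thread and with several. `hash_all` is a hypothetical helper, and `hashlib.sha256` is used as the hasher because hashlib releases the GIL for larger inputs; substituting `xxhash.xxh64` would show whether threads help for a given build.

```python
import hashlib
import time
from concurrent.futures import ThreadPoolExecutor


def hash_all(buffers, make_hasher, workers=1):
    """Hash every buffer, using a thread pool when workers > 1."""
    def one(buf):
        return make_hasher(buf).hexdigest()

    if workers == 1:
        return [one(b) for b in buffers]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(one, buffers))


buffers = [bytes([i]) * 2_000_000 for i in range(8)]  # eight 2 MB buffers
for workers in (1, 4):
    start = time.perf_counter()
    digests = hash_all(buffers, hashlib.sha256, workers)
    print(workers, "worker(s):", round(time.perf_counter() - start, 3), "s")
```

If the multi-threaded time is no better than the single-threaded one, the hasher either holds the GIL or is fast enough that thread overhead dominates.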

setup.py fails trying to read README.rst

| DEBUG: Executing shell function do_compile
| Traceback (most recent call last):
|   File "setup.py", line 35, in <module>
|     long_description=open('README.rst', 'r').read(),
|   File "/home/workdir/build/tmp/sysroots/x86_64-linux/usr/lib/python3.5/encodings/ascii.py", line 26, in decode
|     return codecs.ascii_decode(input, self.errors)[0]
| UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 6178: ordinal not in range(128)
| ERROR: python3 setup.py build_ext execution failed.
| WARNING: /home/workdir/build/tmp/work/cortexa9hf-vfp-neon-poky-linux-gnueabi/python3-xxhash/1.4.1-r1/temp/run.do_compile.1237:1 exit 1 from
|   exit 1

It's niche, but the setup fails in a build system where the default encoding is ASCII. Specifically, the ± character in README.rst (first byte 0xc2 in UTF-8) cannot be decoded as ASCII. The fix is to explicitly set encoding='utf-8', which is the sane default on all modern systems.

setup.py

    long_description=open('README.rst', 'r').read(),
# -->
    long_description=open('README.rst', 'r', encoding='utf-8').read(),

Blender 3.4.1: ModuleNotFoundError: No module named 'xxhash._xxhash'

When using xxhash 3.2.0 from https://pypi.org/project/xxhash/ in Blender 3.4.1, errors are thrown when trying to enable some modules.

Example:

Traceback (most recent call last):
  File "...\AppData\Roaming\Blender Foundation\Blender\3.4\scripts\addons\cad_mesh_dimensions.py", line 47, in <module>
    import xxhash
  File "E:\Programme\Blender\3.4\python\lib\xxhash\__init__.py", line 1, in <module>
    from ._xxhash import (
ModuleNotFoundError: No module named 'xxhash._xxhash'

The modules can't be used.

From what I found by searching, xxhash 3.0.0 must be used, but it can't be found on https://pypi.org

Having trouble installing the package in conda environment with python 3.7.4 (AssertionError due to word in capital letters)

In a fresh python 3.7.4 conda environment:

conda create -n new python=3.7.4
Collecting package metadata (current_repodata.json): done
Solving environment: done


==> WARNING: A newer version of conda exists. <==
  current version: 4.7.5
  latest version: 4.7.11

Please update conda by running

    $ conda update -n base -c defaults conda



## Package Plan ##

  environment location: /home/acruz/miniconda3/envs/new

  added / updated specs:
    - python=3.7.4


The following NEW packages will be INSTALLED:

  _libgcc_mutex      pkgs/main/linux-64::_libgcc_mutex-0.1-main
  ca-certificates    pkgs/main/linux-64::ca-certificates-2019.5.15-1
  certifi            pkgs/main/linux-64::certifi-2019.6.16-py37_1
  libedit            pkgs/main/linux-64::libedit-3.1.20181209-hc058e9b_0
  libffi             pkgs/main/linux-64::libffi-3.2.1-hd88cf55_4
  libgcc-ng          pkgs/main/linux-64::libgcc-ng-9.1.0-hdf63c60_0
  libstdcxx-ng       pkgs/main/linux-64::libstdcxx-ng-9.1.0-hdf63c60_0
  ncurses            pkgs/main/linux-64::ncurses-6.1-he6710b0_1
  openssl            pkgs/main/linux-64::openssl-1.1.1c-h7b6447c_1
  pip                pkgs/main/linux-64::pip-19.1.1-py37_0
  python             pkgs/main/linux-64::python-3.7.4-h265db76_0
  readline           pkgs/main/linux-64::readline-7.0-h7b6447c_5
  setuptools         pkgs/main/linux-64::setuptools-41.0.1-py37_0
  sqlite             pkgs/main/linux-64::sqlite-3.29.0-h7b6447c_0
  tk                 pkgs/main/linux-64::tk-8.6.8-hbc83047_0
  wheel              pkgs/main/linux-64::wheel-0.33.4-py37_0
  xz                 pkgs/main/linux-64::xz-5.2.4-h14c3975_4
  zlib               pkgs/main/linux-64::zlib-1.2.11-h7b6447c_3


Proceed ([y]/n)? y

Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use
#
#     $ conda activate new
#
# To deactivate an active environment, use
#
#     $ conda deactivate

Then try to install xxhash

conda activate new
pip install xxhash

7.4.0
Collecting xxhash
  Using cached https://files.pythonhosted.org/packages/4a/f9/f83b3ab3bd1bf50ae7b6c21f9fa28107df38c9283c721ce40688ea443eb9/xxhash-1.3.0.tar.gz
Building wheels for collected packages: xxhash
  Building wheel for xxhash (setup.py) ... error
  ERROR: Complete output from command /home/acruz/miniconda3/envs/new/bin/python -u -c 'import setuptools, tokenize;__file__='"'"'/tmp/pip-install-g77w7dxw/xxhash/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d /tmp/pip-wheel-yy4xo0ku --python-tag cp37:
  ERROR: 7.4.0
  running bdist_wheel
  running build
  running build_py
  creating build
  creating build/lib.linux-x86_64-3.7
  creating build/lib.linux-x86_64-3.7/xxhash
  copying xxhash/__init__.py -> build/lib.linux-x86_64-3.7/xxhash
  running build_ext
  building 'cpython' extension
  creating build/temp.linux-x86_64-3.7
  creating build/temp.linux-x86_64-3.7/xxhash
  creating build/temp.linux-x86_64-3.7/deps
  creating build/temp.linux-x86_64-3.7/deps/xxhash
  gcc -pthread -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -Ideps/xxhash -I/home/acruz/miniconda3/envs/new/include/python3.7m -c xxhash/cpython.c -o build/temp.linux-x86_64-3.7/xxhash/cpython.o
  gcc -pthread -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -Ideps/xxhash -I/home/acruz/miniconda3/envs/new/include/python3.7m -c deps/xxhash/xxhash.c -o build/temp.linux-x86_64-3.7/deps/xxhash/xxhash.o
  gcc -pthread -shared -L/home/acruz/miniconda3/envs/new/lib -Wl,-rpath=/home/acruz/miniconda3/envs/new/lib -Wl,--no-as-needed -Wl,--sysroot=/ build/temp.linux-x86_64-3.7/xxhash/cpython.o build/temp.linux-x86_64-3.7/deps/xxhash/xxhash.o -o build/lib.linux-x86_64-3.7/xxhash/cpython.cpython-@PYVERNODOTS@m-x86_64-linux-gnu.so
  installing to build/bdist.linux-x86_64/wheel
  running install
  running install_lib
  creating build/bdist.linux-x86_64
  creating build/bdist.linux-x86_64/wheel
  creating build/bdist.linux-x86_64/wheel/xxhash
  copying build/lib.linux-x86_64-3.7/xxhash/cpython.cpython-@PYVERNODOTS@m-x86_64-linux-gnu.so -> build/bdist.linux-x86_64/wheel/xxhash
  copying build/lib.linux-x86_64-3.7/xxhash/__init__.py -> build/bdist.linux-x86_64/wheel/xxhash
  running install_egg_info
  running egg_info
  writing xxhash.egg-info/PKG-INFO
  writing dependency_links to xxhash.egg-info/dependency_links.txt
  writing top-level names to xxhash.egg-info/top_level.txt
  reading manifest file 'xxhash.egg-info/SOURCES.txt'
  reading manifest template 'MANIFEST.in'
  warning: no previously-included files matching '__pycache__' found anywhere in distribution
  warning: no previously-included files matching '*.py[co]' found anywhere in distribution
  writing manifest file 'xxhash.egg-info/SOURCES.txt'
  Copying xxhash.egg-info to build/bdist.linux-x86_64/wheel/xxhash-1.3.0-py3.7.egg-info
  running install_scripts
  Traceback (most recent call last):
    File "<string>", line 1, in <module>
    File "/tmp/pip-install-g77w7dxw/xxhash/setup.py", line 51, in <module>
      ext_modules=ext_modules,
    File "/home/acruz/miniconda3/envs/new/lib/python3.7/site-packages/setuptools/__init__.py", line 145, in setup
      return distutils.core.setup(**attrs)
    File "/home/acruz/miniconda3/envs/new/lib/python3.7/distutils/core.py", line 148, in setup
      dist.run_commands()
    File "/home/acruz/miniconda3/envs/new/lib/python3.7/distutils/dist.py", line 966, in run_commands
      self.run_command(cmd)
    File "/home/acruz/miniconda3/envs/new/lib/python3.7/distutils/dist.py", line 985, in run_command
      cmd_obj.run()
    File "/home/acruz/miniconda3/envs/new/lib/python3.7/site-packages/wheel/bdist_wheel.py", line 230, in run
      impl_tag, abi_tag, plat_tag = self.get_tag()
    File "/home/acruz/miniconda3/envs/new/lib/python3.7/site-packages/wheel/bdist_wheel.py", line 179, in get_tag
      assert tag == supported_tags[0], "%s != %s" % (tag, supported_tags[0])
    AssertionError: ('cp37', 'cp@pyvernodots@m', 'linux_x86_64') != ('cp37', 'cp@PYVERNODOTS@m', 'linux_x86_64')
  ----------------------------------------
  ERROR: Failed building wheel for xxhash
  Running setup.py clean for xxhash
Failed to build xxhash
Installing collected packages: xxhash
  Running setup.py install for xxhash ... done
Successfully installed xxhash-1.3.0

CFFI advantages

It's a question rather than an issue.
Is the CFFI version faster than the CPython version? If not, what are the advantages of using the CFFI version?

Cannot hash data larger than 2GB

>>> xxhash.xxh64(b'x' * 2**30)
<xxhash.xxh64 at 0x7f8e528d14f0>
>>> xxhash.xxh64(b'x' * 2**31)
Traceback (most recent call last):
  File "<ipython-input-31-0560f5d338db>", line 1, in <module>
    xxhash.xxh64(b'x' * 2**31)
OverflowError: size does not fit in an int
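
The error message suggests the input length is being converted to a C int. Until that is fixed, one possible workaround is to feed the data through update() in slices that stay below the limit. This sketch assumes that update() accepts any buffer object and that incremental updates give the same digest as a single call, as with hashlib. `update_in_chunks` is a hypothetical helper, and `hashlib.md5` stands in for `xxhash.xxh64` so the example runs anywhere.

```python
import hashlib


def update_in_chunks(hasher, data, chunk_size=1 << 30):
    """Feed a large buffer to hasher.update() in slices of chunk_size bytes."""
    view = memoryview(data)  # slicing a memoryview does not copy the data
    for start in range(0, len(view), chunk_size):
        hasher.update(view[start:start + chunk_size])
    return hasher


# Small demo: chunked and one-shot hashing agree.
h = update_in_chunks(hashlib.md5(), b"x" * 10, chunk_size=4)
print(h.hexdigest() == hashlib.md5(b"x" * 10).hexdigest())  # True
```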

Hashes not matching other implementations

Hi, I'm trying to use your library but am not having any luck matching the results to any other implementations.
Below is a sample output from a hash list.

<hash>
       <file>NC2775__49_513_____t1__PN.WAV</file>
       <size>197162746</size>
       <lastmodificationdate>2021-06-20T13:50:06Z</lastmodificationdate>
       <xxhash64be>09ec3f21e70327c4</xxhash64be>
       <hashdate>2021-06-20T13:51:42Z</hashdate>
 </hash>

When I use xxhsum from terminal I get a matching hash

$ xxhsum -V
xxhsum 0.8.0 by Yann Collet
compiled as 64-bit x86_64 + SSE2 little endian with Apple Clang 11.0.0 (clang-1100.0.33.17)
$ xxhsum /Users/me/tmp/xxhash_tests/NC2775__49_513_____t1__PN.WAV
09ec3f21e70327c4  /Users/me/tmp/xxhash_tests/NC2775__49_513_____t1__PN.WAV

But when I try the same using this library I get

>>> xxhash.XXHASH_VERSION
'0.8.0'
>>> x = xxhash.xxh64('/Users/me/tmp/xxhash_tests/NC2775__49_513_____t1__PN.WAV')
>>> x.hexdigest()
'8c928708393d033a'
>>> x.digest()
b'\x8c\x92\x87\x089=\x03:'

Am I doing something wrong, or is there a bug somewhere? I've also tested xxh32 against xxh32sum and got different results. Any help would be much appreciated. Thanks.
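
In the session above, the argument passed to xxh64 is the path string itself, so the digest is that of the path text and not of the file contents, whereas xxhsum hashes the contents. A minimal sketch of hashing the contents in chunks follows. `hash_file` is a hypothetical helper, and `hashlib.md5` stands in for `xxhash.xxh64` so the example runs without xxhash installed.

```python
import hashlib
import os
import tempfile


def hash_file(path, hasher, chunk_size=1 << 20):
    """Hash the contents of the file at `path`, not the path string."""
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


# Demo with a temporary file standing in for the WAV file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"audio bytes")
content_digest = hash_file(tmp.name, hashlib.md5())
path_digest = hashlib.md5(tmp.name.encode()).hexdigest()
os.remove(tmp.name)
print(content_digest != path_digest)  # True
```

With xxhash installed, `hash_file(path, xxhash.xxh64())` would be the corresponding call.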

New release on conda-forge

The conda-forge release of xxhash is almost 2 years old; would be great to update it to the latest v1.3.0.

Ascii representation of the hash

Could you please provide a way to directly show the ASCII representation of the hash, like the output of the xxhash binary?

Or at least show example code that displays it the way the binary does?

Here is an example to show what I mean:

import xxhash
import binascii

fname = "sample.txt"
with open(fname, 'rb') as f:
    h1 = xxhash.xxh3_128_digest(f.read(), seed=0)
print(binascii.b2a_hex(h1).decode(), fname)
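
The hexdigest() method described in the README already returns this text form. For functions that return raw bytes, bytes.hex() gives the same result as the binascii call above. The sketch below uses the xxh32 digest bytes from the README example and a hypothetical file name, and prints them in the two-space layout that xxhsum uses.

```python
import binascii

digest = b"\xe2);/"  # xxh32 digest bytes shown in the README example
name = "sample.txt"  # hypothetical file name

hex_text = digest.hex()  # same text as binascii.b2a_hex(digest).decode()
assert hex_text == binascii.b2a_hex(digest).decode()
print(f"{hex_text}  {name}")  # e2293b2f  sample.txt
```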

xxhash 3.0.0 fails to build on msys/mingw pip

I'm using my python install from msys/mingw, specifically the package mingw-w64-ucrt-x86_64-python.

I was able to build and install every xxhash version up to 2.0.2, but 3.0.0 fails now, with the output:

Collecting xxhash
  Downloading xxhash-3.0.0.tar.gz (74 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 74.3/74.3 KB 1.4 MB/s eta 0:00:00
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Building wheels for collected packages: xxhash
  Building wheel for xxhash (pyproject.toml) ... error
  error: subprocess-exited-with-error

  × Building wheel for xxhash (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [11 lines of output]
      running bdist_wheel
      running build
      running build_py
      creating build
      creating build\lib.mingw_x86_64_ucrt-3.9
      creating build\lib.mingw_x86_64_ucrt-3.9\xxhash
      copying xxhash\version.py -> build\lib.mingw_x86_64_ucrt-3.9\xxhash
      copying xxhash\__init__.py -> build\lib.mingw_x86_64_ucrt-3.9\xxhash
      running build_ext
      building '_xxhash' extension
      error: --plat-name must be one of ('win32', 'win-amd64', 'win-arm32', 'win-arm64')
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for xxhash
Failed to build xxhash
ERROR: Could not build wheels for xxhash, which is required to install pyproject.toml-based projects

My python version, as per output from python:
Python 3.9.10 (main, Jan 25 2022, 17:58:24) [GCC UCRT 11.2.0 64 bit (AMD64)] on win32
pip version:
22.0.3

Compatible pip tags as per output from "pip debug --verbose":
Compatible tags: 33
  cp39-cp39-mingw_x86_64_ucrt
  cp39-abi3-mingw_x86_64_ucrt
  cp39-none-mingw_x86_64_ucrt
  cp38-abi3-mingw_x86_64_ucrt
  cp37-abi3-mingw_x86_64_ucrt
  cp36-abi3-mingw_x86_64_ucrt
  cp35-abi3-mingw_x86_64_ucrt
  cp34-abi3-mingw_x86_64_ucrt
  cp33-abi3-mingw_x86_64_ucrt
  cp32-abi3-mingw_x86_64_ucrt
  py39-none-mingw_x86_64_ucrt
  py3-none-mingw_x86_64_ucrt
  py38-none-mingw_x86_64_ucrt
  py37-none-mingw_x86_64_ucrt
  py36-none-mingw_x86_64_ucrt
  py35-none-mingw_x86_64_ucrt
  py34-none-mingw_x86_64_ucrt
  py33-none-mingw_x86_64_ucrt
  py32-none-mingw_x86_64_ucrt
  py31-none-mingw_x86_64_ucrt
  py30-none-mingw_x86_64_ucrt
  cp39-none-any
  py39-none-any
  py3-none-any
  py38-none-any
  py37-none-any
  py36-none-any
  py35-none-any
  py34-none-any
  py33-none-any
  py32-none-any
  py31-none-any
  py30-none-any

Need to deal with PyLong overflow

>>> xxhash.xxh64('a', 2**80).hexdigest()
'd24ec4f1a98c6e5b'
>>> xxhash.xxh64('a', 2**256).hexdigest()
'd24ec4f1a98c6e5b'
>>> xxhash.xxh64('a', 0).hexdigest()
'd24ec4f1a98c6e5b'
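
The identical digests are consistent with the seed being reduced to its low 64 bits: both 2**80 and 2**256 are multiples of 2**64, so they reduce to 0. That the binding truncates in exactly this way is an assumption. The sketch below shows the arithmetic and one possible stricter behavior, using a hypothetical `checked_seed` helper that rejects out-of-range seeds instead of silently wrapping them.

```python
MASK64 = (1 << 64) - 1


def checked_seed(seed, bits=64):
    """Reject seeds that do not fit in an unsigned integer of `bits` bits."""
    if not 0 <= seed < (1 << bits):
        raise OverflowError(f"seed must fit in {bits} unsigned bits")
    return seed


print((2**80) & MASK64)   # 0
print((2**256) & MASK64)  # 0
```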

byte order of hex- and byte-strings

Until last week i was using python-xxhash 0.1.1
and i used these functions in python3:

n_hash = hasher.digest()
hexhash = format( n_hash, "016X" )
bs_hash = n_hash.to_bytes( 8, 'big' )

Today, with version 0.2.0, i did this:

n_hash = hasher.intdigest()
hexhash1 = format( n_hash, "016X" )
hexhash2 = hasher.hexdigest().upper()
print( hexhash1 )
print( hexhash2 )

and found that the bytes of the 2 hex strings are in opposite order.

Checking with calculator on ubuntu 14.04, in programmer mode,
i entered the hash as decimal, and noted the hex value shown
below the entry box; it agrees with the byte order that python3 got,
the reverse of what xxhash got.

Bytestrings are also in opposite order:
if i do bs_hash = n_hash.to_bytes( 8, 'little' )
python's result agrees with xxhash, but for python
to get the bytes in the same order as it got the hex,
it needs n_hash.to_bytes( 8, 'big' )

Using big-endian order might be right,
as although our CPUs are little-endian,
humans read numbers in big-endian order.

Many other hash libraries produce hex --
which way around do they do it?

I've made a lot of backups using big-endian hex,
so i'll have to keep on using big-endian.

-- jh
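
The relationship between an integer digest, its hex text, and the two byte orders can be checked with plain Python. The value below is illustrative and not a real digest. It shows that format() and to_bytes(8, 'big') agree with each other, while 'little' reverses the bytes.

```python
n_hash = 0x0123456789ABCDEF  # illustrative 64-bit value, not a real digest

hex_from_int = format(n_hash, "016X")
big = n_hash.to_bytes(8, "big")
little = n_hash.to_bytes(8, "little")

print(hex_from_int)          # 0123456789ABCDEF
print(big.hex().upper())     # 0123456789ABCDEF
print(little.hex().upper())  # EFCDAB8967452301
print(int.from_bytes(big, "big") == n_hash)  # True
```

Hex text that matches the integer value therefore corresponds to big-endian digest bytes.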

cffi

Hi,

Using this module from pypy is very slow compared to cpython.
Would you mind rewriting it with cffi?

Thanks,

Bug: hashlib parity

For parity with hashlib in Python 3.8/9, xxh32/xxh64/etc need to accept a keyword argument string, and a boolean keyword argument usedforsecurity.

https://github.com/python/typeshed/blob/04c74640f049a658f0099f4785452f5f46ec518b/stdlib/3/hashlib.pyi#L7

In addition, the .name attributes on the streaming hashes all return Python strings with an extra null terminator, which I believe is a bug stemming from using sizeof("XXH32"), etc.

I can write a PR for the latter bug later this week, along with a PR for type stubs that I'm working on.

'BufferedReader' does not have the buffer interface

Hello!

     Would it be possible to xxhash a file opened as "read-only binary"? This is my current code:

with open("/home/shadowrylander/include.md", "rb") as file:
    xxhash.xxh32(file).hexdigest()

but it gives:

      1 with open("/home/shadowrylander/include.md", "rb") as file:
----> 2     xxhash.xxh32(file).hexdigest()

TypeError: 'BufferedReader' does not have the buffer interface

     How would I go about fixing this?

Thank you kindly for the help!
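
The constructor expects bytes-like data, not a file object, so the file has to be read first. For small files, passing file.read() is enough. For larger ones, a reusable buffer avoids loading everything at once. The sketch below uses a hypothetical `hash_stream` helper, with `io.BytesIO` standing in for the opened file and `hashlib.md5` standing in for `xxhash.xxh32`. That xxhash's update() accepts a memoryview is an assumption.

```python
import hashlib
import io


def hash_stream(fileobj, hasher, buf_size=1 << 16):
    """Hash a binary file object by reading into a reusable buffer."""
    buf = bytearray(buf_size)
    view = memoryview(buf)
    while True:
        n = fileobj.readinto(buf)  # number of bytes read; 0 at end of file
        if not n:
            break
        hasher.update(view[:n])
    return hasher.hexdigest()


print(hash_stream(io.BytesIO(b"hello"), hashlib.md5())
      == hashlib.md5(b"hello").hexdigest())  # True
```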

PyPy CFFI ffi.sizeof(bytearray) returns -1 in _get_buffer

# docker run -ti --rm pypy:2
Python 2.7.13 (ab0b9caf307d, Apr 24 2018, 18:04:42)
[PyPy 6.0.0 with GCC 6.2.0 20160901] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>> import cffi
>>>> ffi = cffi.FFI()
>>>> ba = bytearray(b'abc')
>>>> cdata = ffi.from_buffer(ba)
>>>> cdata
<cdata 'char[]' buffer len 3 from 'bytearray' object>
>>>> ffi.sizeof(cdata)
-1
