google / farmhash
Automatically exported from code.google.com/p/farmhash
License: MIT License
FarmHash, a family of hash functions.
Version 1.1

Introduction
============

A general overview of hash functions and their use is available in the file Understanding_Hash_Functions in this directory. It may be helpful to read it before using FarmHash.

FarmHash provides hash functions for strings and other data. The functions mix the input bits thoroughly but are not suitable for cryptography. See "Hash Quality," below, for details on how FarmHash was tested and so on.

We provide reference implementations in C++, with a friendly MIT license.

All members of the FarmHash family were designed with heavy reliance on previous work by Jyrki Alakuijala, Austin Appleby, Bob Jenkins, and others.

Recommended Usage
=================

Our belief is that the typical hash function is mostly used for in-memory hash tables and similar. That use case allows hash functions that differ on different platforms, and that change from time to time. For this, I recommend using wrapper functions in a .h file with comments such as, "may change from time to time, may differ on different platforms, and may change depending on NDEBUG."

Some projects may also require a forever-fixed, portable hash function. Again we recommend using wrapper functions in a .h, but in this case the comments on them would be very different.

We have provided a sample of these wrapper functions in src/farmhash.h. Our hope is that most people will need nothing more than src/farmhash.h and src/farmhash.cc. Those two files are a usable and relatively portable library. (One portability snag: if your compiler doesn't have __builtin_expect then you may need to define FARMHASH_NO_BUILTIN_EXPECT.) For those that prefer using a configure script (perhaps because they want to "make install" later), FarmHash has one, but for many people it's best to ignore it.

Note that the wrapper functions such as Hash() in src/farmhash.h can select one of several hash functions. The selection is done at compile time, based on your machine architecture (e.g., sizeof(size_t)) and the availability of vector instructions (e.g., SSE4.1).

To get the best performance from FarmHash, one will need to think a bit about when to use compiler flags that allow vector instructions and such: -maes, -msse4.2, -mavx, etc., or their equivalents for other compilers. Those are the g++ flags that make g++ emit more types of machine instructions than it otherwise would. For example, if you are confident that you will only be using FarmHash on systems with SSE4.2 and/or AES, you may communicate that to the compiler as explained in src/farmhash.cc. If not, use -maes, -mavx, etc., when you can, and the appropriate choices will be made via conditional compilation in src/farmhash.cc.

It may be beneficial to try -O3 or other compiler flags as well. I also have found feedback-directed optimization (FDO) to improve the speed of FarmHash.

The "configure" script: creating config.h
=========================================

We provide reference implementations of several FarmHash functions, written in C++. The build system is based on autoconf. It defaults the C++ compiler flags to "-g -O2", which may or may not be best.

If you are planning to use the configure script, I generally recommend trying this first, unless you know that your system lacks AVX and/or AESNI:

  ./configure CXXFLAGS="-g -mavx -maes -O3"
  make all check

If that fails, you can retry with -mavx and/or -maes removed, or with -mavx replaced by -msse4.1 or -msse4.2.

Please see below for thoughts on cross-platform testing, if that is a concern.

Finally, if you want to install a library, you may use

  make install

Some useful flags for configure include:

  --enable-optional-builtin-expect: This causes __builtin_expect to be optional. If you don't use this flag, the assumption is that FarmHash will be compiled with compilers that provide __builtin_expect. In practice, some FarmHash variants may be slightly faster if __builtin_expect is available, but it isn't very important and affects speed only.

Further Details
===============

The above instructions will produce a single source-level library that includes multiple hash functions. It will use conditional compilation, and perhaps GCC's multiversioning, to select among the functions. In addition, "make all check" will create an object file using your chosen compiler, and test it. The object file won't necessarily contain all the code that would be used if you were to compile the code on other platforms. The downside of this is obvious: the paths not tested may not actually work if and when you try them. The FarmHash developers try hard to prevent such problems; please let us know if you find bugs.

To aid your cross-platform testing, for each relevant platform you may compile your program that uses farmhash.cc with the preprocessor flag FARMHASHSELFTEST equal to 1. This causes a FarmHash self test to run at program startup; the self test writes output to stdout and then calls std::exit(). You can see this in action by running "make check": see src/farm-test.cc for details.

There's also a trivial workaround to force particular functions to be used: modify the wrapper functions in hash.h. You can prevent choices being made via conditional compilation or multiversioning by choosing FarmHash variants with names like farmhashaa::Hash32, farmhashab::Hash64, etc.: those compute the same hash function regardless of conditional compilation, multiversioning, or endianness. Consult their comments and ifdefs to learn their requirements: for example, they are not all guaranteed to work on all platforms.

Known Issues
============

1) FarmHash was developed with little-endian architectures in mind. It should work on big-endian too, but less work has gone into optimizing for those platforms. To make FarmHash work properly on big-endian platforms you may need to modify the wrapper .h file and/or your compiler flags to arrange for FARMHASH_BIG_ENDIAN to be defined, though there is logic that tries to figure it out automatically.

2) FarmHash's implementation is fairly complex.

3) The techniques described in dev/INSTRUCTIONS to let hash function developers regenerate src/*.cc from dev/* are hacky and not so portable.

Hash Quality
============

We like to test hash functions with SMHasher, among other things. SMHasher isn't perfect, but it seems to find almost any significant flaw. SMHasher is available at http://code.google.com/p/smhasher/

SMHasher is designed to pass a 32-bit seed to the hash functions it tests. For our functions that accept a seed, we use the given seed directly (padded with zeroes as needed); for our functions that don't accept a seed, we hash the concatenation of the given seed and the input string.

Some minor flaws in 32-bit and 64-bit functions are harmless, as we expect the primary use of these functions will be in hash tables. We may have gone slightly overboard in trying to please SMHasher and other similar tests, but we don't want anyone to choose a different hash function because of some minor issue reported by a quality test.

If your setup is similar enough to mine, it's easy to use SMHasher and other tools yourself via the "builder" in the dev directory. See dev/INSTRUCTIONS. (Improvements to that directory are a relatively low priority, and code there is never going to be as portable as the other parts of FarmHash.)

For more information
====================

http://code.google.com/p/farmhash/

[email protected]

Please feel free to send us comments, questions, bug reports, or patches.
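
The wrapper-function advice under "Recommended Usage" can be sketched roughly as follows. This is a minimal illustration, not the contents of src/farmhash.h: the namespace myproject and the names TableHash, StableFingerprint64, and StandInHash64 are hypothetical, and the stand-in body is 64-bit FNV-1a (not FarmHash) so that the sketch compiles on its own. In a real project the wrapper bodies would forward to functions from src/farmhash.h such as Hash().

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

namespace myproject {  // hypothetical project namespace

namespace internal {
// Stand-in so this sketch compiles on its own: 64-bit FNV-1a, NOT FarmHash.
// A real wrapper would forward to a function from src/farmhash.h instead.
inline uint64_t StandInHash64(const char* s, size_t len) {
  uint64_t h = 0xcbf29ce484222325ULL;  // FNV offset basis
  for (size_t i = 0; i < len; ++i) {
    h ^= static_cast<unsigned char>(s[i]);
    h *= 0x100000001b3ULL;  // FNV prime
  }
  return h;
}
}  // namespace internal

// For in-memory hash tables only. May change from time to time, may differ
// on different platforms, and may change depending on NDEBUG. Never store
// or transmit the result.
inline size_t TableHash(const std::string& key) {
  return static_cast<size_t>(internal::StandInHash64(key.data(), key.size()));
}

// Forever-fixed and portable: the value for a given input must never change,
// so it is safe to persist. Swapping the function behind this wrapper is a
// breaking change.
inline uint64_t StableFingerprint64(const std::string& key) {
  return internal::StandInHash64(key.data(), key.size());
}

}  // namespace myproject
```

Keeping the two contracts in separate wrappers means the table hash can later be switched to a faster variant without touching any code that persists fingerprints.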
so that it can be called on demand as part of testsuite
does not have <immintrin.h>
see my fixes at rurban/smhasher@0d0a40e
We are building TensorFlow on a big-endian system, and farmhash is downloaded as an external dependency.
A few tests fail on our system (s390x platform) because farmhash does not detect it as big endian. We tried defining the macro FARMHASH_BIG_ENDIAN for s390x in farmhash.cc, and with this change the tests pass.
The farmhash/README mentions a known issue for big endian:
to make FarmHash work properly on big-endian platforms you may need to modify the wrapper .h file.
Is farmhash.cc the right place to add a macro for our platform?
(We have seen macros used in config.h for big endian, but they don't seem to work for s390x.)
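
One way to check whether a platform such as s390x is actually being seen as big endian is to compare the compiler's compile-time answer with a run-time probe. The sketch below is an illustration only and is not FarmHash's own detection logic. It assumes a GCC- or Clang-compatible compiler that predefines __BYTE_ORDER__, and the names kCompileTimeBigEndian, IsBigEndianAtRuntime, and EndianDetectionMismatch are hypothetical.

```cpp
#include <cstdint>
#include <cstring>

// Compile-time check using macros that GCC and Clang predefine. Other
// compilers may not define them, in which case this reports "unknown".
#if defined(__BYTE_ORDER__) && defined(__ORDER_BIG_ENDIAN__) && \
    __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
constexpr int kCompileTimeBigEndian = 1;
#elif defined(__BYTE_ORDER__) && defined(__ORDER_LITTLE_ENDIAN__) && \
    __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
constexpr int kCompileTimeBigEndian = 0;
#else
constexpr int kCompileTimeBigEndian = -1;  // unknown
#endif

// Run-time check: look at which byte of a known 32-bit value comes first
// in memory.
inline bool IsBigEndianAtRuntime() {
  const uint32_t probe = 0x01020304u;
  unsigned char first;
  std::memcpy(&first, &probe, 1);
  return first == 0x01;
}

// True when the compile-time answer is known and disagrees with the
// machine the program is actually running on.
inline bool EndianDetectionMismatch() {
  return kCompileTimeBigEndian != -1 &&
         (kCompileTimeBigEndian == 1) != IsBigEndianAtRuntime();
}
```

If the compile-time value comes out as 1 on the target, it could be used to decide whether FARMHASH_BIG_ENDIAN needs to be defined through compiler flags rather than by editing farmhash.cc.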
The configure script generates an incorrect Makefile, which tries to run aclocal-1.14 (which I don't have, my system has aclocal-1.15). Fixed it by running autoreconf.
Debian Sid now ships automake-1.15. ./configure works for me, however the subsequent make all failed.
$ make
cd . && automake-1.14 --foreign
/bin/bash: line 4: automake-1.14: command not found
Makefile:378: recipe for target 'Makefile.in' failed
make: *** [Makefile.in] Error 1
$ apt list automake\* -a
Listing... Done
automake/unstable,now 1:1.15-5 all [installed,automatic]
automake1.11/unstable,now 1:1.11.6-4 all [installed]
We raised an issue against google/farmhash master earlier for big endian. However, because the latest commits restructured the code, support for s390x needs to be added explicitly.
filename: farmhash/src/farmhash.h
#endif
- #elif defined(__OpenBSD__) || defined(__NetBSD__) || defined(__FreeBSD__) || defined(__DragonFly__)
+ #elif defined(__OpenBSD__) || defined(__NetBSD__) || defined(__FreeBSD__) || defined(__DragonFly__) || defined(__s390x__)
#include <sys/endian.h>
Please see this question on StackOverflow.
I'm getting the following error on installation.
./node_modules/farmhash/index.js
Module not found: Can't resolve './build/Release/farmhash' in '/path/to/my_project/node_modules/farmhash'
Steps to recreate.
At the command line, I did
yarn add farmhash
Then imported using
import farmhash from 'farmhash';
Then tried to use with
const hash = farmhash.hash32('test');
I checked my node_modules directory to verify it installed to the correct location: node_modules/farmhash/index.js
What am I doing wrong and how can I correct this error?
All buildlogs can be found here:
https://buildd.debian.org/status/package.php?p=farmhash&suite=experimental
Farmhash failed to build on s390x and mips etc.
It looks like the farm-test rule runs twice, concurrently. "make all check", "make -j4 all", and "make -j4 check" all succeed. Only "make -j4 all check" fails.
Below is a log:
make all-recursive
Making check in src
make[1]: Entering directory `/home/saito/src_master/__build__/third_party/farmhash-1.1.0/__tmp__/farmhash-1.1.0'
Making all in src
make[1]: Entering directory `/home/saito/src_master/__build__/third_party/farmhash-1.1.0/__tmp__/farmhash-1.1.0/src'
/bin/bash ../libtool --tag=CXX --mode=compile g++ -DHAVE_CONFIG_H -I. -I.. -g -msse4.2 -maes -O3 -MT farmhash.lo -MD -MP -MF .deps/farmhash.Tpo -c -o farmhash.lo farmhash.cc
g++ -DHAVE_CONFIG_H -I. -I.. -g -msse4.2 -maes -O3 -MT farm-test.o -MD -MP -MF .deps/farm-test.Tpo -c -o farm-test.o farm-test.cc
make[2]: Entering directory `/home/saito/src_master/__build__/third_party/farmhash-1.1.0/__tmp__/farmhash-1.1.0/src'
/bin/bash ../libtool --tag=CXX --mode=compile g++ -DHAVE_CONFIG_H -I. -I.. -g -msse4.2 -maes -O3 -MT farmhash.lo -MD -MP -MF .deps/farmhash.Tpo -c -o farmhash.lo farmhash.cc
g++ -DHAVE_CONFIG_H -I. -I.. -g -msse4.2 -maes -O3 -MT farm-test.o -MD -MP -MF .deps/farm-test.Tpo -c -o farm-test.o farm-test.cc
libtool: compile: g++ -DHAVE_CONFIG_H -I. -I.. -g -msse4.2 -maes -O3 -MT farmhash.lo -MD -MP -MF .deps/farmhash.Tpo -c farmhash.cc -fPIC -DPIC -o .libs/farmhash.o
libtool: compile: g++ -DHAVE_CONFIG_H -I. -I.. -g -msse4.2 -maes -O3 -MT farmhash.lo -MD -MP -MF .deps/farmhash.Tpo -c farmhash.cc -fPIC -DPIC -o .libs/farmhash.o
libtool: compile: g++ -DHAVE_CONFIG_H -I. -I.. -g -msse4.2 -maes -O3 -MT farmhash.lo -MD -MP -MF .deps/farmhash.Tpo -c farmhash.cc -o farmhash.o >/dev/null 2>&1
libtool: compile: g++ -DHAVE_CONFIG_H -I. -I.. -g -msse4.2 -maes -O3 -MT farmhash.lo -MD -MP -MF .deps/farmhash.Tpo -c farmhash.cc -o farmhash.o >/dev/null 2>&1
mv -f .deps/farm-test.Tpo .deps/farm-test.Po
mv -f .deps/farm-test.Tpo .deps/farm-test.Po
mv: cannot stat ‘.deps/farm-test.Tpo’: No such file or directory
make[2]: *** [farm-test.o] Error 1
make[2]: *** Waiting for unfinished jobs....
mv -f .deps/farmhash.Tpo .deps/farmhash.Plo
/bin/bash ../libtool --tag=CXX --mode=link g++ -g -msse4.2 -maes -O3 -o libfarmhash.la -rpath /home/saito/src_master/__build__/third_party/farmhash-1.1.0/lib farmhash.lo
mv -f .deps/farmhash.Tpo .deps/farmhash.Plo
mv: cannot stat ‘.deps/farmhash.Tpo’: No such file or directory
make[2]: *** [farmhash.lo] Error 1
make[2]: Leaving directory `/home/saito/src_master/__build__/third_party/farmhash-1.1.0/__tmp__/farmhash-1.1.0/src'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/home/saito/src_master/__build__/third_party/farmhash-1.1.0/__tmp__/farmhash-1.1.0'
make: *** [all] Error 2
make: *** Waiting for unfinished jobs....
In my smhasher I just added a FreeBSD 12.1 smoker, and noticed that FarmHash (64 and 128 bit) fails the hash verification tests. I.e. FarmHash is not really portable on FreeBSD.
Hello,
I have a crash which seems to be related to farmhash here: flutter/flutter#35201
Do you have any information on how this could be addressed?
The comments say that the version is "1.1", but it isn't tagged.
What steps will reproduce the problem?
1. Try compiling farmhash on a 64-bit Mac using Xcode 6.x
What is the expected output? What do you see instead?
I expect it to compile clean. Instead I get a ton of warnings:
farmhash.cc:995:26: Implicit conversion loses integer precision: 'size_t' (aka
'unsigned long') to 'uint32_t' (aka 'unsigned int')
What version of the product are you using? On what operating system?
farmhash-1.1.0
Please provide any additional information below.
Not that anybody would try to hash 4+ gigabytes of data, but if they do, will
it still work?
If you feel that losing precision is intentional, then a bunch of
static_cast<uint32_t>(len) would do the trick.
Original issue reported on code.google.com by [email protected]
on 30 Jun 2015 at 5:56
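
To make the reporter's suggestion concrete, here is a small sketch of what static_cast<uint32_t>(len) does to a length. It only illustrates the cast itself. It does not show what farmhash.cc does at the reported line, and it does not answer whether hashing more than 4 GiB works. The function names TruncateLength and LengthLosesBits are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>

// Explicit narrowing: keeps only the low 32 bits of len, which is what the
// implicit conversion already does, but without the compiler warning.
inline uint32_t TruncateLength(size_t len) {
  return static_cast<uint32_t>(len);
}

// True when len does not fit in 32 bits, i.e. the cast above drops bits.
// Always false on platforms where size_t is itself 32 bits wide.
inline bool LengthLosesBits(size_t len) {
  return len > std::numeric_limits<uint32_t>::max();
}
```

The cast silences the warning without changing behavior, so a length of 4 GiB plus 5 bytes would still be seen as 5 wherever the narrowed value is used.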
FYI, farmhash was submitted as a vcpkg port. You are welcome to review and comment here.
Since both Highway hash and Cityhash have 256-bit variants, would Farmhash ever get its own 256-bit hash?
macOS Mojave 10.14.2
Xcode Version 10.1 (10B61)
I ran npm i and got the following error:
$ npm i
> [email protected] install /Users/zhangliao/Dev_Code/jingGangShan/node_modules/farmhash
> node-gyp rebuild
CXX(target) Release/obj.target/farmhash-legacy/src/upstream/farmhash-legacy.o
In file included from ../src/upstream/farmhash-legacy.cc:23:
../src/upstream/farmhash.h:49:10: warning: non-portable path to file '<String.h>'; specified path differs in case from file name on disk
[-Wnonportable-include-path]
#include <string.h> // for memcpy and memset
^~~~~~~~~~
<String.h>
In file included from ../src/upstream/farmhash-legacy.cc:23:
In file included from ../src/upstream/farmhash.h:49:
In file included from /usr/local/include/string.h:26:
In file included from /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/include/c++/v1/string:477:
In file included from /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/include/c++/v1/string_view:176:
In file included from /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/include/c++/v1/__string:56:
In file included from /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/include/c++/v1/algorithm:641:
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/include/c++/v1/cstring:70:9: error: no member named 'memcpy' in the global
namespace
using ::memcpy;
~~^
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/include/c++/v1/cstring:71:9: error: no member named 'memmove' in the global
namespace
using ::memmove;
~~^
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/include/c++/v1/cstring:72:9: error: no member named 'strcpy' in the global
namespace
using ::strcpy;
https://github.com/google/farmhash/blob/master/config.sub
e.g.
#! /bin/sh
# Configuration validation subroutine script.
<<<<<<< HEAD
# Copyright 1992-2016 Free Software Foundation, Inc.
timestamp='2016-06-20'
=======
# Copyright 1992-2015 Free Software Foundation, Inc.
timestamp='2015-08-20'
>>>>>>> 96e46782298bdf50ac214ab1920e4eb49f7a0cda
"using namespace std;" in farmhash.cc causes the following error:
You should remove "using namespace std" from the source code.
lib/hash/farmhash.cc:2006:14: error: reference to 'data' is ambiguous
memcpy(data + i, &u, 1); // uint8_t -> char
^
lib/hash/farmhash.cc:1990:13: note: candidate found by name lookup is 'data'
static char data[kDataSize];
^
/scratch-nvme/saito/bazel/_bazel_ysaito/e1a7c710473372ee150f48ad758734b2/external/grail_toolchain/include/c++/v1/iterator:1876:22: note: candidate found by name lookup is 'std::__1::data'
constexpr const _Ep* data(initializer_list<_Ep> __il) noexcept { return __il.begin(); }
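
The clash above comes from a file-scope array named data meeting std::data, which C++17 standard libraries declare in <iterator>. The sketch below shows the same pattern compiling cleanly once the using-directive is gone. It is an illustration of the suggested fix, not a patch to farmhash.cc: kDataSize is given a hypothetical value and FillData is a hypothetical helper.

```cpp
#include <cstdint>
#include <cstring>
#include <iterator>  // declares std::data in C++17 and later

// No "using namespace std;" here, so the unqualified name `data` below can
// only refer to this file-scope array, never to std::data.
static const int kDataSize = 16;  // hypothetical size for the sketch
static char data[kDataSize];

// Fills `data` with the byte values 0, 1, 2, ...
static void FillData() {
  for (int i = 0; i < kDataSize; ++i) {
    uint8_t u = static_cast<uint8_t>(i);
    std::memcpy(data + i, &u, 1);  // uint8_t -> char
  }
}
```

If the using-directive has to stay for some reason, writing ::data at each use should also resolve the ambiguity, since qualified lookup finds the global array first.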
As far as I can see, there's no documentation on how the thing actually works.
The source code is rather complex. Are there any Google-internal docs? If so,
is there any chance of releasing them? Implementations in different languages
would be so much easier to write...
Original issue reported on code.google.com by [email protected]
on 26 Jan 2015 at 1:14
Hi,
When compiling the library for MinGW 64, the build reaches this line and fails:
Line 308 in 2f0e005
Error: "endian.h: No such file or directory"
I am using the latest MinGW 7.3.0 (64-bit).
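
For toolchains without endian.h, one header-free technique is to assemble little-endian values from individual bytes with shifts. The sketch below illustrates that technique only. It is not how farmhash.h fetches data, it may be slower than a native load plus byte swap, and the names LoadLittleEndian32 and LoadLittleEndian64 are hypothetical.

```cpp
#include <cstdint>

// Reads 4 bytes as a little-endian 32-bit value using shifts only. The
// result is the same on little- and big-endian hosts, and no <endian.h>
// or byte-swap intrinsic is needed.
inline uint32_t LoadLittleEndian32(const char* p) {
  const unsigned char* b = reinterpret_cast<const unsigned char*>(p);
  return static_cast<uint32_t>(b[0]) |
         (static_cast<uint32_t>(b[1]) << 8) |
         (static_cast<uint32_t>(b[2]) << 16) |
         (static_cast<uint32_t>(b[3]) << 24);
}

// Same idea for 8 bytes, built from two 32-bit halves.
inline uint64_t LoadLittleEndian64(const char* p) {
  return static_cast<uint64_t>(LoadLittleEndian32(p)) |
         (static_cast<uint64_t>(LoadLittleEndian32(p + 4)) << 32);
}
```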