avaneev / komihash
Very fast, high-quality hash function, discrete-incremental and streamed hashing-capable (non-cryptographic, inline C/C++) 26GB/s + PRNG
License: MIT License
Is it bytes? Bits? Also seeing odd numbers as length is weird.
I am trying to understand what happens when I hash an 8 byte message.
As far as I can tell, kh_lpu64ec_l3 accesses memory outside my message. Is that correct? If yes, then that is undefined behavior and can lead to an IllegalAccess exception.
If I have no need for cryptographic properties, what is the reason for the elaborate "final byte" processing? What harm would come from padding with all zeros or all ones or any other constant?
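One general reason a constant pad can hurt, even without cryptographic requirements: if the tail of a message is padded with zeros and the length is not mixed in elsewhere, a message and the same message followed by a zero byte load to the same word, so they collide trivially. The sketch below illustrates this with two hypothetical helpers, `load_tail_zero_pad` and `load_tail_marker`. They are generic illustrations and are not taken from komihash's source.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: pack up to 7 tail bytes into a 64-bit word
   (little-endian order), leaving the unused high bytes as zero. */
static uint64_t load_tail_zero_pad(const uint8_t *msg, size_t len)
{
    uint64_t v = 0;
    for (size_t i = 0; i < len; i++)
        v |= (uint64_t)msg[i] << (i * 8);
    return v;
}

/* Hypothetical helper: same load, plus a single 1 bit placed just above
   the last message byte, so the word also encodes where the data ends.
   Requires len <= 7 so the shift stays below 64. */
static uint64_t load_tail_marker(const uint8_t *msg, size_t len)
{
    return load_tail_zero_pad(msg, len) | ((uint64_t)1 << (len * 8));
}
```

With zero padding, the 1-byte message `{0x61}` and the 2-byte message `{0x61, 0x00}` both load as `0x61`. With the marker bit they load as `0x0161` and `0x010061`, so the hash rounds that follow see different inputs.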
I couldn't find any test vectors for komihash.
Could you please publish some so that I (and others) can verify implementations and ports are correct?
When doing discrete-incremental hashing, I feel that the first round of komihash can be skipped. Let's call the remaining part komihash_inner:
uint64_t komihash_inner(const void* const Msg0, size_t MsgLen,
uint64_t *Seed1, uint64_t *Seed5 ) {
//...
}
and komihash can become:
uint64_t komihash( const void* const Msg0, size_t MsgLen,
    const uint64_t UseSeed ) {
    const uint8_t* Msg = (const uint8_t*) Msg0;
    // The seeds are initialized to the first mantissa bits of PI.
    uint64_t Seed1 = 0x243F6A8885A308D3 ^ ( UseSeed & 0x5555555555555555 );
    uint64_t Seed5 = 0x452821E638D01377 ^ ( UseSeed & 0xAAAAAAAAAAAAAAAA );
    uint64_t r1h, r2h;
    KOMIHASH_PREFETCH( Msg );
    KOMIHASH_HASHROUND(); // Required for Perlin Noise.
    // Pass the local seeds (note the capitalized names) and return the result.
    return komihash_inner( Msg0, MsgLen, &Seed1, &Seed5 );
}
Then we can introduce struct DescreteKomihash and komihash_discrete_hash:
struct DescreteKomihash {
    uint64_t Seed1;
    uint64_t Seed5;
};
uint64_t komihash_discrete_hash(struct DescreteKomihash *d, const void* const Msg0, size_t MsgLen) {
    // The state carries over between calls, so no per-call seeding round is done.
    komihash_inner(Msg0, MsgLen, &d->Seed1, &d->Seed5);
    return d->Seed1;
}
where DescreteKomihash will be initialized with a seed, using steps similar to those in komihash.
Such a construct saves one multiply operation per call.
But I have no knowledge of how to design a good hash function, so I have no idea whether it is safe to do it this way. Any comments or advice?
Thank you for the project!
Would be nice to have something like this (like xxhash):
komi_state_t* komi_createState(void);
komi_errorcode komi_freeState(komi_state_t* statePtr);
komi_errorcode komi_reset(komi_state_t* statePtr, uint64_t seed);
komi_errorcode komi_update(komi_state_t* statePtr, const void* input, size_t length);
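To make the requested shape concrete, here is a minimal sketch of such a state API. Everything in it is an assumption for illustration: the `komi_errorcode` values, the `komi_state_t` layout, and the extra `komi_digest` function (modeled on xxhash's digest call) are hypothetical, and the mixing step is 64-bit FNV-1a used as a stand-in, not komihash itself. It only shows how create/reset/update/digest could fit together.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical error codes; not part of komihash. */
typedef enum { KOMI_OK = 0, KOMI_ERROR = 1 } komi_errorcode;

/* Hypothetical state layout: a single running 64-bit value. */
typedef struct { uint64_t h; } komi_state_t;

komi_state_t *komi_createState(void)
{
    return calloc(1, sizeof(komi_state_t)); /* NULL on allocation failure */
}

komi_errorcode komi_freeState(komi_state_t *statePtr)
{
    free(statePtr); /* free(NULL) is a no-op */
    return KOMI_OK;
}

komi_errorcode komi_reset(komi_state_t *statePtr, uint64_t seed)
{
    if (statePtr == NULL)
        return KOMI_ERROR;
    /* FNV-1a offset basis, perturbed by the seed (placeholder seeding). */
    statePtr->h = 0xcbf29ce484222325ULL ^ seed;
    return KOMI_OK;
}

komi_errorcode komi_update(komi_state_t *statePtr, const void *input, size_t length)
{
    if (statePtr == NULL || (input == NULL && length != 0))
        return KOMI_ERROR;
    const uint8_t *p = input;
    for (size_t i = 0; i < length; i++) {
        statePtr->h ^= p[i];              /* FNV-1a: xor the byte in ... */
        statePtr->h *= 0x100000001b3ULL;  /* ... then multiply by the prime */
    }
    return KOMI_OK;
}

/* Hypothetical addition: read out the current hash value. */
uint64_t komi_digest(const komi_state_t *statePtr)
{
    return statePtr->h;
}
```

Because the placeholder hash consumes one byte at a time, feeding "hello " and then "world" gives the same digest as feeding "hello world" in one call. A real block-based hash would also need an internal buffer for partial blocks between updates.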