avaneev / komihash

Very fast, high-quality hash function, discrete-incremental and streamed hashing-capable (non-cryptographic, inline C/C++) 26GB/s + PRNG

License: MIT License

C 100.00%

hash hashing hashing-algorithm hash-function hash-table prng prng-algorithms pseudo-random pseudorandom pseudo-random-generator

komihash's Issues

What's the unit of the horizontal axis in the README graph?

Is it bytes? Bits? Also, seeing odd numbers as lengths is weird.

Undefined behavior?

I am trying to understand what happens when I hash an 8 byte message.

As far as I can tell, kh_lpu64ec_l3 accesses memory outside my message. Is that correct? If yes, then that is undefined behavior and can lead to an IllegalAccess exception.

If I have no need for cryptographic properties, what is the reason for the elaborate "final byte" processing? What harm would come from padding with all zeros or all ones or any other constant?
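To make the concern above concrete, here is a minimal sketch of one portable way to read the final, partial 64-bit word of a message without touching any byte outside the buffer. It is not komihash's actual tail handling: `load_tail_le64` is a hypothetical helper, and the zero padding it applies is exactly the simplification the question asks about.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of komihash): assembles the `len`
   trailing bytes at `p` into a little-endian 64-bit value, padded
   with zero bits. Only p[0] .. p[len-1] are ever read. */
static uint64_t load_tail_le64(const uint8_t *p, size_t len)
{
    uint64_t v = 0;
    size_t i;

    assert(len <= 8); /* a tail never exceeds one 64-bit word */

    for (i = 0; i < len; i++) {
        v |= (uint64_t)p[i] << (8 * i);
    }
    return v;
}
```

One consequence of plain zero padding is that tails differing only in trailing zero bytes (for example "ab" and "ab\0") load as the same word, so the message length would have to be mixed into the hash somewhere else to tell them apart.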

No KATs published

I couldn't find any test vectors for komihash.

Could you please publish some so that I (and others) can verify implementations and ports are correct?
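As a rough illustration of how published vectors could be consumed, the sketch below runs a table of known-answer tests against any function with the same signature as `komihash`. Everything in it is an assumption made for the example: the `struct kat` layout, the `run_kats` helper, and the stand-in hash, which is 64-bit FNV-1a (seed ignored) used only because its test values are widely known. No komihash vectors are given here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same shape as komihash( Msg0, MsgLen, UseSeed ). */
typedef uint64_t (*hash_fn)(const void *msg, size_t len, uint64_t seed);

/* One known-answer test: input string, seed, expected hash value. */
struct kat {
    const char *msg;
    uint64_t seed;
    uint64_t expected;
};

/* Stand-in hash for the demo: 64-bit FNV-1a, seed ignored.
   Swap in the implementation or port under test. */
static uint64_t fnv1a64(const void *msg, size_t len, uint64_t seed)
{
    const uint8_t *p = (const uint8_t *)msg;
    uint64_t h = 0xcbf29ce484222325ULL; /* FNV offset basis */
    size_t i;

    (void)seed;
    for (i = 0; i < len; i++) {
        h ^= p[i];
        h *= 0x100000001b3ULL; /* FNV prime */
    }
    return h;
}

/* Returns how many vectors produced a hash other than the expected one. */
static size_t run_kats(hash_fn fn, const struct kat *v, size_t n)
{
    size_t i, failures = 0;

    assert(fn != NULL && v != NULL);
    for (i = 0; i < n; i++) {
        if (fn(v[i].msg, strlen(v[i].msg), v[i].seed) != v[i].expected) {
            failures++;
        }
    }
    return failures;
}

/* FNV-1a vectors for the stand-in only; these are not komihash values. */
static const struct kat fnv_vectors[] = {
    { "",  0, 0xcbf29ce484222325ULL },
    { "a", 0, 0xaf63dc4c8601ec8cULL },
};
```

A port would be checked by passing its own function and the author's published table to `run_kats` and expecting a return value of 0. Binary inputs containing zero bytes would need an explicit length field instead of `strlen`.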

Is it safe to skip the first round of komihash when performing discrete-incremental hashing?

When doing discrete-incremental hashing, I feel that the first round of komihash can be skipped. Let's call the remainder komihash_inner

uint64_t komihash_inner(const void* const Msg0, size_t MsgLen,
        uint64_t *Seed1, uint64_t *Seed5  ) {
   //...
}

and komihash can become

uint64_t komihash( const void* const Msg0, size_t MsgLen,
	const uint64_t UseSeed ) {
   	const uint8_t* Msg = (const uint8_t*) Msg0;

	// The seeds are initialized to the first mantissa bits of PI.

	uint64_t Seed1 = 0x243F6A8885A308D3 ^ ( UseSeed & 0x5555555555555555 );
	uint64_t Seed5 = 0x452821E638D01377 ^ ( UseSeed & 0xAAAAAAAAAAAAAAAA );
	uint64_t r1h, r2h;

	KOMIHASH_PREFETCH( Msg );

	KOMIHASH_HASHROUND(); // Required for Perlin Noise.
	// Hand the pre-mixed seeds to the shared inner routine.
	return komihash_inner(Msg0, MsgLen, &Seed1, &Seed5);
}

Then we can introduce struct DescreteKomihash and komihash_discrete_hash

struct DescreteKomihash {
        uint64_t Seed1;
        uint64_t Seed5;
};

uint64_t komihash_discrete_hash(struct DescreteKomihash *d, const void* const Msg0, size_t MsgLen) {
        komihash_inner(Msg0, MsgLen, &d->Seed1, &d->Seed5);
        return d->Seed1;
}

where DescreteKomihash is initialized from a seed using the same steps as in komihash.

Such a construct saves one multiply operation per call.

But I don't know how to design a good hash function, so I have no idea whether it's safe to do it this way. Any comments or advice?

Feature request: streaming API

Thank you for the project!

Would be nice to have something like this (like xxhash):

komi_state_t* komi_createState(void);
komi_errorcode komi_freeState(komi_state_t* statePtr);
komi_errorcode komi_reset(komi_state_t* statePtr, uint64_t seed);
komi_errorcode komi_update(komi_state_t* statePtr, const void* input, size_t length);
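To show what such an interface involves, here is a minimal sketch of a streaming state that buffers partial words between calls. It is not komihash and not the API requested above: the `demo_*` names, the stack-allocated state, the `void` returns, the added `demo_digest` call, and the placeholder mixing step are all assumptions chosen to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    uint64_t h;       /* running hash state */
    uint8_t  buf[8];  /* bytes that do not yet form a full word */
    size_t   buf_len; /* number of valid bytes in buf */
    uint64_t total;   /* total number of bytes seen */
} demo_state_t;

/* Placeholder mixing step, NOT komihash's round function. */
static uint64_t demo_mix(uint64_t h, uint64_t w)
{
    h = (h ^ w) * 0x9E3779B97F4A7C15ULL;
    return h ^ (h >> 32);
}

/* Little-endian load of n (0..8) bytes, zero-padded. */
static uint64_t demo_word(const uint8_t *p, size_t n)
{
    uint64_t v = 0;
    size_t i;
    for (i = 0; i < n; i++) v |= (uint64_t)p[i] << (8 * i);
    return v;
}

static void demo_reset(demo_state_t *s, uint64_t seed)
{
    assert(s != NULL);
    memset(s, 0, sizeof *s);
    s->h = seed;
}

static void demo_update(demo_state_t *s, const void *input, size_t length)
{
    const uint8_t *p = (const uint8_t *)input;

    assert(s != NULL && (p != NULL || length == 0));
    s->total += length;
    while (length > 0) {
        /* Top up the buffer, then consume it once a full word is ready. */
        size_t take = 8 - s->buf_len;
        if (take > length) take = length;
        memcpy(s->buf + s->buf_len, p, take);
        s->buf_len += take;
        p += take;
        length -= take;
        if (s->buf_len == 8) {
            s->h = demo_mix(s->h, demo_word(s->buf, 8));
            s->buf_len = 0;
        }
    }
}

/* Finalizes a copy of the state, so more updates may follow. */
static uint64_t demo_digest(const demo_state_t *s)
{
    uint64_t h;

    assert(s != NULL);
    h = demo_mix(s->h, demo_word(s->buf, s->buf_len));
    return demo_mix(h, s->total);
}
```

Because leftover bytes are carried in `buf`, feeding a message in arbitrary pieces gives the same digest as feeding it in one call, which is the property a streaming API has to guarantee. A real implementation would buffer whole blocks of whatever size the hash's main loop consumes.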
