Fast and good hash functions can be composed from fast permutations with lesser qualities, like
To yield a hashing function with superior qualities, like demonstrated with PCG for random number generation.
This is in fact also the recipe rrxmrrxmsx_0 and murmur hash are using, knowingly or unknowingly.
I personally found
uint64_t xorshift(const uint64_t& n,int i){
return n^(n>>i);
}
uint64_t hash(const uint64_t& n){
uint64_t p = 0x5555555555555555ull; // pattern of alternating 0 and 1
uint64_t c = 17316035218449499591ull;// random uneven integer constant;
return c*xorshift(p*xorshift(n,32),32);
}
to be good enough.
A good hash function should
Let's first look at the identity function. It satisfies 1. but not 2. :
Input bit n determines output bit n with a correlation of 100% (red) and no others, they are therefore blue, giving a perfect red line across.
A xorshift(n,32) is not much better, yielding one and half a line. Still satisfying 1., because it is invertible with a second application.
A multiplication with an unsigned integer ("Knuth's multiplicative method") is much better, cascading more strongly and flipping more output bits with a probability of 0.5, which is what you want, in green. It satisfies 1. as for each uneven integer there is a multiplicative inverse.
Combining the two gives the following output, still satisfying 1. as the composition of two bijective functions yields another bijective function.
A second application of multiplication and xorshift will yield the following:
Or you can use Galois field multiplications like GHash, they have become reasonably fast on modern CPUs and have superior qualities in one step.
uint64_t const inline gfmul(const uint64_t& i,const uint64_t& j){
__m128i I{};I[0]^=i;
__m128i J{};J[0]^=j;
__m128i M{};M[0]^=0xb000000000000000ull;
__m128i X = _mm_clmulepi64_si128(I,J,0);
__m128i A = _mm_clmulepi64_si128(X,M,0);
__m128i B = _mm_clmulepi64_si128(A,M,0);
return A[0]^A[1]^B[1]^X[0]^X[1];
}