[c] What integer hash function are good that accepts an integer hash key?

Fast and good hash functions can be composed from fast permutations with lesser qualities, like

  • multiplication with an uneven integer
  • binary rotations
  • xorshift

To yield a hashing function with superior qualities, like demonstrated with PCG for random number generation.

This is in fact also the recipe rrxmrrxmsx_0 and murmur hash are using, knowingly or unknowingly.

I personally found

uint64_t xorshift(const uint64_t& n,int i){
  return n^(n>>i);
}
uint64_t hash(const uint64_t& n){
  uint64_t p = 0x5555555555555555ull; // pattern of alternating 0 and 1
  uint64_t c = 17316035218449499591ull;// random uneven integer constant; 
  return c*xorshift(p*xorshift(n,32),32);
}

to be good enough.

A good hash function should

  1. be bijective to not loose information, if possible and have the least collisions
  2. cascade as much and as evenly as possible, i.e. each input bit should flip every output bit with probability 0.5.

Let's first look at the identity function. It satisfies 1. but not 2. :

identity function

Input bit n determines output bit n with a correlation of 100% (red) and no others, they are therefore blue, giving a perfect red line across.

A xorshift(n,32) is not much better, yielding one and half a line. Still satisfying 1., because it is invertible with a second application.

xorshift

A multiplication with an unsigned integer ("Knuth's multiplicative method") is much better, cascading more strongly and flipping more output bits with a probability of 0.5, which is what you want, in green. It satisfies 1. as for each uneven integer there is a multiplicative inverse.

knuth

Combining the two gives the following output, still satisfying 1. as the composition of two bijective functions yields another bijective function.

knuth•xorshift

A second application of multiplication and xorshift will yield the following:

proposed hash

Or you can use Galois field multiplications like GHash, they have become reasonably fast on modern CPUs and have superior qualities in one step.

   uint64_t const inline gfmul(const uint64_t& i,const uint64_t& j){           
     __m128i I{};I[0]^=i;                                                          
     __m128i J{};J[0]^=j;                                                          
     __m128i M{};M[0]^=0xb000000000000000ull;                                      
     __m128i X = _mm_clmulepi64_si128(I,J,0);                                      
     __m128i A = _mm_clmulepi64_si128(X,M,0);                                      
     __m128i B = _mm_clmulepi64_si128(A,M,0);                                      
     return A[0]^A[1]^B[1]^X[0]^X[1];                                              
   }

Examples related to c

conflicting types for 'outchar' Can't compile C program on a Mac after upgrade to Mojave Program to find largest and second largest number in array Prime numbers between 1 to 100 in C Programming Language In c, in bool, true == 1 and false == 0? How I can print to stderr in C? Visual Studio Code includePath "error: assignment to expression with array type error" when I assign a struct field (C) Compiling an application for use in highly radioactive environments How can you print multiple variables inside a string using printf?

Examples related to algorithm

How can I tell if an algorithm is efficient? Find the smallest positive integer that does not occur in a given sequence Efficiently getting all divisors of a given number Peak signal detection in realtime timeseries data What is the optimal algorithm for the game 2048? How can I sort a std::map first by value, then by key? Finding square root without using sqrt function? Fastest way to flatten / un-flatten nested JSON objects Mergesort with Python Find common substring between two strings

Examples related to hash

php mysqli_connect: authentication method unknown to the client [caching_sha2_password] What is Hash and Range Primary Key? How to create a laravel hashed password Hashing a file in Python PHP salt and hash SHA256 for login password Append key/value pair to hash with << in Ruby Are there any SHA-256 javascript implementations that are generally considered trustworthy? How do I generate a SALT in Java for Salted-Hash? What does hash do in python? Hashing with SHA1 Algorithm in C#