Though djb2
, as presented on stackoverflow by cnicutar, is almost certainly better, I think it's worth showing the K&R hashes too:
1) Apparently a terrible hash algorithm, as presented in K&R 1st edition (source)
unsigned long hash(unsigned char *str)
{
unsigned int hash = 0;
int c;
while (c = *str++)
hash += c;
return hash;
}
2) Probably a pretty decent hash algorithm, as presented in K&R version 2 (verified by me on pg. 144 of the book); NB: be sure to remove % HASHSIZE
from the return statement if you plan on doing the modulus sizing-to-your-array-length outside the hash algorithm. Also, I recommend you make the return and "hashval" type unsigned long
instead of the simple unsigned
(int).
unsigned hash(char *s)
{
unsigned hashval;
for (hashval = 0; *s != '\0'; s++)
hashval = *s + 31*hashval;
return hashval % HASHSIZE;
}
Note that it's clear from the two algorithms that one reason the 1st edition hash is so terrible is because it does NOT take into consideration string character order, so hash("ab")
would therefore return the same value as hash("ba")
. This is not so with the 2nd edition hash, however, which would (much better!) return two different values for those strings.
The GCC C++11 hashing functions used for unordered_map
(a hash table template) and unordered_set
(a hash set template) appear to be as follows.
Code:
// Implementation of Murmur hash for 32-bit size_t.
size_t _Hash_bytes(const void* ptr, size_t len, size_t seed)
{
const size_t m = 0x5bd1e995;
size_t hash = seed ^ len;
const char* buf = static_cast<const char*>(ptr);
// Mix 4 bytes at a time into the hash.
while (len >= 4)
{
size_t k = unaligned_load(buf);
k *= m;
k ^= k >> 24;
k *= m;
hash *= m;
hash ^= k;
buf += 4;
len -= 4;
}
// Handle the last few bytes of the input array.
switch (len)
{
case 3:
hash ^= static_cast<unsigned char>(buf[2]) << 16;
[[gnu::fallthrough]];
case 2:
hash ^= static_cast<unsigned char>(buf[1]) << 8;
[[gnu::fallthrough]];
case 1:
hash ^= static_cast<unsigned char>(buf[0]);
hash *= m;
};
// Do a few final mixes of the hash.
hash ^= hash >> 13;
hash *= m;
hash ^= hash >> 15;
return hash;
}