[c++] Hash function for a string

We are currently dealing with hash function in my class. Our instructor asked us to a hash function on the internet to compare to the two we have used in our code.

The first one:

int HashTable::hash (string word)   
// POST: the index of entry is returned
{       int sum = 0;
        for (int k = 0; k < word.length(); k++)
            sum = sum + int(word[k]);
        return  sum % SIZE; 
}

Second:

int HashTable::hash (string word)
{
   int seed = 131; 
   unsigned long hash = 0;
   for(int i = 0; i < word.length(); i++)
   {
      hash = (hash * seed) + word[i];
   }
   return hash % SIZE;
}

Where SIZE is 501 (The size of the hash table) and the input is coming from a text file of 20,000+ words.

I saw this question with a few code examples but wasn't exactly sure what to be looking for in a hash function. If I understand correctly, in my case, a hash takes an input (string) and does a math calculation to assign the string a number and inserts it in a table. This process is done to increase the speed of searching the list?

If my logic is sound, does anyone have a good example or a resource showing a different hash function that involves a string? Or even the process of writing my own efficient hash function.

This question is related to c++ string hash hashtable

The answer is


Hash functions for algorithmic use have usually 2 goals, first they have to be fast, second they have to evenly distibute the values across the possible numbers. The hash function also required to give the all same number for the same input value.

if your values are strings, here are some examples for bad hash functions:

  1. string[0] - the ASCII characters a-Z are way more often then others
  2. string.lengh() - the most probable value is 1

Good hash functions tries to use every bit of the input while keeping the calculation time minimal. If you only need some hash code, try to multiply the bytes with prime numbers, and sum them.


-- The way to go these days --

Use SipHash. For your own protection.

-- Old and Dangerous --

unsigned int RSHash(const std::string& str)
{
    unsigned int b    = 378551;
    unsigned int a    = 63689;
    unsigned int hash = 0;

    for(std::size_t i = 0; i < str.length(); i++)
    {
        hash = hash * a + str[i];
        a    = a * b;
    }

    return (hash & 0x7FFFFFFF);
 }

 unsigned int JSHash(const std::string& str)
 {
      unsigned int hash = 1315423911;

      for(std::size_t i = 0; i < str.length(); i++)
      {
          hash ^= ((hash << 5) + str[i] + (hash >> 2));
      }

      return (hash & 0x7FFFFFFF);
 }

Ask google for "general purpose hash function"


Use boost::hash

#include <boost\functional\hash.hpp>

...

std::string a = "ABCDE";
size_t b = boost::hash_value(a);

Java's String implements hashCode like this:

public int hashCode()

Returns a hash code for this string. The hash code for a String object is computed as

     s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]

using int arithmetic, where s[i] is the ith character of the string, n is the length of the string, and ^ indicates exponentiation. (The hash value of the empty string is zero.) 

So something like this:

int HashTable::hash (string word) {
    int result = 0;
    for(size_t i = 0; i < word.length(); ++i) {
        result += word[i] * pow(31, i);
    }
    return result;
}

Examples related to c++

Method Call Chaining; returning a pointer vs a reference? How can I tell if an algorithm is efficient? Difference between opening a file in binary vs text How can compare-and-swap be used for a wait-free mutual exclusion for any shared data structure? Install Qt on Ubuntu #include errors detected in vscode Cannot open include file: 'stdio.h' - Visual Studio Community 2017 - C++ Error How to fix the error "Windows SDK version 8.1" was not found? Visual Studio 2017 errors on standard headers How do I check if a Key is pressed on C++

Examples related to string

How to split a string in two and store it in a field String method cannot be found in a main class method Kotlin - How to correctly concatenate a String Replacing a character from a certain index Remove quotes from String in Python Detect whether a Python string is a number or a letter How does String substring work in Swift How does String.Index work in Swift swift 3.0 Data to String? How to parse JSON string in Typescript

Examples related to hash

php mysqli_connect: authentication method unknown to the client [caching_sha2_password] What is Hash and Range Primary Key? How to create a laravel hashed password Hashing a file in Python PHP salt and hash SHA256 for login password Append key/value pair to hash with << in Ruby Are there any SHA-256 javascript implementations that are generally considered trustworthy? How do I generate a SALT in Java for Salted-Hash? What does hash do in python? Hashing with SHA1 Algorithm in C#

Examples related to hashtable

How do I encode a JavaScript object as JSON? Hash table runtime complexity (insert, search and delete) Looping through a hash, or using an array in PowerShell Hash function for a string hash function for string iterating through Enumeration of hastable keys throws NoSuchElementException error How do HashTables deal with collisions? Good Hash Function for Strings Iterating over and deleting from Hashtable in Java What happens when a duplicate key is put into a HashMap?