Implementing a HashMap in C

Question

How would I go about creating a hashmap in C from scratch, like the one in the C++ STL? What parameters would need to be taken into consideration, and how would you test the hashmap? That is, what benchmark test cases would you run before you could say that your hashmap is complete?

User · Accepted Answer

Well, if you know the basics behind them, it shouldn't be too hard.

Generally, you create an array of "buckets", each of which contains a key and a value, with an optional pointer used to build a linked list.

When you access the hash table with a key, you process the key with a custom hash function, which returns an integer. You then take that result modulo the number of buckets, and that is your array index, or "bucket". Then you compare the unhashed key with the stored key, and if they match, you have found the right place.

Otherwise, you've had a "collision" and must crawl through the linked list, comparing keys until you find a match. (Note that some implementations use a binary tree instead of a linked list for collisions.)

Check out this fast hash table implementation:

https://attractivechaos.wordpress.com/2009/09/29/khash-h
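The steps described in this answer can be sketched roughly as follows. This is a minimal illustration, not the khash implementation linked above: the names (struct entry, chained_put, chained_get, chained_free), the fixed bucket count, the string keys with int values, and the choice of the djb2 string hash are all assumptions made for the example.

```c
#include <stdlib.h>
#include <string.h>

#define NUM_BUCKETS 16  /* fixed size for this sketch; a real map would grow */

/* One entry; `next` is the optional pointer that forms the collision chain. */
struct entry {
    char *key;
    int value;
    struct entry *next;
};

struct chained_map {
    struct entry *buckets[NUM_BUCKETS];  /* must start out all NULL */
};

/* djb2 string hash, picked only for illustration. */
static unsigned long hash_str(const char *s)
{
    unsigned long h = 5381;
    while (*s)
        h = h * 33 + (unsigned char)*s++;
    return h;
}

/* Insert or overwrite. Returns 0 on success, -1 if allocation fails. */
int chained_put(struct chained_map *m, const char *key, int value)
{
    size_t i = hash_str(key) % NUM_BUCKETS;   /* hash, then modulus */
    struct entry *e;

    /* Crawl the chain, comparing the unhashed key with each stored key. */
    for (e = m->buckets[i]; e != NULL; e = e->next) {
        if (strcmp(e->key, key) == 0) {
            e->value = value;
            return 0;
        }
    }

    /* Not found: push a new entry onto the front of the chain. */
    e = malloc(sizeof *e);
    if (e == NULL)
        return -1;
    e->key = malloc(strlen(key) + 1);
    if (e->key == NULL) {
        free(e);
        return -1;
    }
    strcpy(e->key, key);
    e->value = value;
    e->next = m->buckets[i];
    m->buckets[i] = e;
    return 0;
}

/* Returns a pointer to the stored value, or NULL if the key is absent. */
int *chained_get(struct chained_map *m, const char *key)
{
    size_t i = hash_str(key) % NUM_BUCKETS;
    for (struct entry *e = m->buckets[i]; e != NULL; e = e->next)
        if (strcmp(e->key, key) == 0)
            return &e->value;
    return NULL;
}

/* Release every entry and reset the buckets. */
void chained_free(struct chained_map *m)
{
    for (size_t i = 0; i < NUM_BUCKETS; i++) {
        struct entry *e = m->buckets[i];
        while (e != NULL) {
            struct entry *next = e->next;
            free(e->key);
            free(e);
            e = next;
        }
        m->buckets[i] = NULL;
    }
}
```

To use it, zero-initialize a struct chained_map (for example with = {0}), call chained_put to store values, and check the result of chained_get against NULL before dereferencing it. Because the bucket count never changes here, chains simply get longer as more keys are added.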

User · Answer

There are other mechanisms for handling overflow than the simple-minded linked list of overflow entries, which, for example, wastes a lot of memory.

Which mechanism to use depends, among other things, on whether you can choose the hash function and possibly pick more than one (to implement, e.g., double hashing to handle collisions), whether you expect to add items often or the map is static once filled, whether you intend to remove items or not, ...

The best way to implement this is to first think about all these parameters, and then not to code it yourself but to pick a mature existing implementation. Google has a few good implementations, e.g. http://code.google.com/p/google-sparsehash
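As an illustration of the double hashing mentioned above, here is a small sketch in which a second hash function chooses the step between probes, so keys that collide on their first slot usually follow different probe paths. It is not taken from sparsehash: the names (dh_map, dh_put, dh_get), the fixed capacity, the uint32_t keys, and both mixing functions are assumptions for the example. Removal is deliberately left out, since deleting from an open-addressed table needs extra bookkeeping such as tombstones.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DH_CAPACITY 16u  /* power of two, so any odd step visits every slot */

struct dh_slot {
    bool used;
    uint32_t key;
    int value;
};

struct dh_map {
    struct dh_slot slots[DH_CAPACITY];  /* must start out zeroed */
};

/* First hash: picks the starting slot. The constant is an arbitrary mixer. */
static uint32_t dh_hash1(uint32_t k)
{
    return (k * 2654435761u) >> 16;
}

/* Second hash: picks the probe step. Forcing it odd keeps it coprime with
 * the power-of-two capacity, so the probe sequence covers the whole table. */
static uint32_t dh_hash2(uint32_t k)
{
    return ((k ^ (k >> 7)) * 40503u) | 1u;
}

/* Walk the probe sequence start, start+step, start+2*step, ... and return
 * the slot holding `key` or the first empty slot on that path.
 * Returns NULL if the table is full and the key is not present. */
static struct dh_slot *dh_find(struct dh_map *m, uint32_t key)
{
    uint32_t start = dh_hash1(key);
    uint32_t step = dh_hash2(key);

    for (uint32_t i = 0; i < DH_CAPACITY; i++) {
        struct dh_slot *s = &m->slots[(start + i * step) % DH_CAPACITY];
        if (!s->used || s->key == key)
            return s;
    }
    return NULL;
}

/* Insert or overwrite. Returns false only when the table is full. */
bool dh_put(struct dh_map *m, uint32_t key, int value)
{
    struct dh_slot *s = dh_find(m, key);
    if (s == NULL)
        return false;
    s->used = true;
    s->key = key;
    s->value = value;
    return true;
}

/* Returns a pointer to the stored value, or NULL if the key is absent. */
int *dh_get(struct dh_map *m, uint32_t key)
{
    struct dh_slot *s = dh_find(m, key);
    return (s != NULL && s->used) ? &s->value : NULL;
}
```

A fixed-size table like this suits the "static once filled" case. A map that keeps growing would also need a resize step, and one that removes items would need a way to mark deleted slots without breaking the probe paths of other keys.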

User · Answer

The best approach depends on the expected key distribution and number of collisions. If relatively few collisions are expected, it really doesn't matter which method is used. If lots of collisions are expected, then which to use depends on the cost of rehashing or probing vs. manipulating the extensible bucket data structure.

But here is a source code example: "An Hashmap Implementation in C"

User · Answer

The primary goal of a hashmap is to store a data set and provide near constant time lookups on it using a unique key. There are two common styles of hashmap implementation:

- Separate chaining: one with an array of buckets (linked lists)
- Open addressing: a single array allocated with extra space so index collisions may be resolved by placing the entry in an adjacent slot

Separate chaining is preferable if the hashmap may have a poor hash function, it is not desirable to pre-allocate storage for potentially unused slots, or entries may have variable size. This type of hashmap may continue to function relatively efficiently even when the load factor exceeds 1.0. Obviously, there is extra memory required in each entry to store linked list pointers.

Hashmaps using open addressing have potential performance advantages when the load factor is kept below a certain threshold (generally about 0.7) and a reasonably good hash function is used. This is because they avoid potential cache misses and many small memory allocations associated with a linked list, and perform all operations in a contiguous, pre-allocated array. Iteration through all elements is also cheaper. The catch is that hashmaps using open addressing must be reallocated to a larger size and rehashed to maintain an ideal load factor, or they face a significant performance penalty. It is impossible for their load factor to exceed 1.0.

Some key performance metrics to evaluate when creating a hashmap would include:

- Maximum load factor
- Average collision count on insertion
- Distribution of collisions: uneven distribution (clustering) could indicate a poor hash function
- Relative time for various operations: put, get, remove of existing and non-existing entries

Here is a flexible hashmap implementation I made. I used open addressing and linear probing for collision resolution.

https://github.com/DavidLeeds/hashmap
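To make the open addressing trade-off concrete, here is a rough sketch of linear probing with a load factor cap of 0.7 and rehashing into a larger array. It is not the code from the linked repository: the names (lp_map, lp_init, lp_put, lp_get, lp_free), the uint32_t keys, the integer mixing function, and the doubling growth policy are all assumptions for the example, and removal is omitted.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct lp_slot {
    bool used;
    uint32_t key;
    int value;
};

struct lp_map {
    struct lp_slot *slots;  /* one contiguous, pre-allocated array */
    size_t capacity;        /* always a nonzero power of two */
    size_t count;           /* number of occupied slots */
};

/* Mix the key bits, then mask down to a slot index. */
static size_t lp_index(uint32_t key, size_t capacity)
{
    key ^= key >> 16;
    key *= 0x45d9f3bu;
    key ^= key >> 16;
    return (size_t)key & (capacity - 1);
}

/* `capacity` must be a nonzero power of two. Returns false on failure. */
bool lp_init(struct lp_map *m, size_t capacity)
{
    m->slots = calloc(capacity, sizeof *m->slots);
    m->capacity = capacity;
    m->count = 0;
    return m->slots != NULL;
}

/* Linear probing: start at the hashed index and step to the adjacent slot
 * until the key or an empty slot is found. The load factor cap guarantees
 * that an empty slot always exists, so the loop terminates. */
static struct lp_slot *lp_find(const struct lp_map *m, uint32_t key)
{
    size_t i = lp_index(key, m->capacity);
    while (m->slots[i].used && m->slots[i].key != key)
        i = (i + 1) & (m->capacity - 1);
    return &m->slots[i];
}

/* Reallocate to twice the size and rehash every existing entry. */
static bool lp_grow(struct lp_map *m)
{
    struct lp_map bigger;
    if (!lp_init(&bigger, m->capacity * 2))
        return false;
    for (size_t i = 0; i < m->capacity; i++) {
        if (m->slots[i].used) {
            *lp_find(&bigger, m->slots[i].key) = m->slots[i];
            bigger.count++;
        }
    }
    free(m->slots);
    *m = bigger;
    return true;
}

/* Insert or overwrite. Returns false only if growing the table fails. */
bool lp_put(struct lp_map *m, uint32_t key, int value)
{
    /* Grow first if one more entry would push the load factor past 0.7.
     * The check also runs for overwrites, which is slightly eager. */
    if ((m->count + 1) * 10 > m->capacity * 7 && !lp_grow(m))
        return false;

    struct lp_slot *s = lp_find(m, key);
    if (!s->used) {
        s->used = true;
        s->key = key;
        m->count++;
    }
    s->value = value;
    return true;
}

/* Returns a pointer to the stored value, or NULL if the key is absent. */
int *lp_get(const struct lp_map *m, uint32_t key)
{
    struct lp_slot *s = lp_find(m, key);
    return s->used ? &s->value : NULL;
}

void lp_free(struct lp_map *m)
{
    free(m->slots);
    m->slots = NULL;
    m->capacity = 0;
    m->count = 0;
}
```

A test harness for the metrics listed above could count the number of extra steps taken inside lp_find on each insertion and compare the average across different hash functions and load factor caps. Supporting remove would require tombstones or back-shifting of entries, which this sketch does not attempt.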
