Is there any advantage of using map over unordered map in case of trivial keys

Question

A recent talk about unordered map in C   made me realize that I should use unordered map for most cases where I used map before  because of the efficiency of lookup   amortized O 1  vs  O log n     Most times I use a map  I use either int or std  string as the key type  hence  I ve got no problems with the definition of the hash function  The more I thought about it  the more I came to realize that I can t find any reason of using a std  map over a std  unordered map in the case of keys with simple types -- I took a look at the interfaces  and didn t find any significant differences that would impact my code    Hence the question  is there any real reason to use std  map over std  unordered map in the case of simple types like int and std  string   I m asking from a strictly programming point of view -- I know that it s not fully considered standard  and that it may pose problems with porting   Also  I expect that one of the correct answers might be  it s more efficient for smaller sets of data  because of a smaller overhead  is that true   -- hence I d like to restrict the question to cases where the amount of keys is non-trivial   1 024    Edit  duh  I forgot the obvious  thanks GMan   -- yes  maps are ordered of course -- I know that  and am looking for other reasons

User · Answer

Small addition to all of above   Better use map  when you need to get elements by range  as they are sorted and you can just iterate over them from one boundary to another

User · Answer

Hash tables have higher constants than common map implementations  which become significant for small containers   Max size is 10  100  or maybe even 1 000 or more  Constants are the same as ever  but O log n  is close to O k     Remember logarithmic complexity is still really good    What makes a good hash function depends on your data s characteristics  so if I don t plan on looking at a custom hash function  but can certainly change my mind later  and easily since I typedef damn near everything  and even though defaults are chosen to perform decently for many data sources  I find the ordered nature of map to be enough of a help initially that I still default to map rather than a hash table in that case   Plus that way you don t have to even think about writing a hash function for other  usually UDT  types  and just write op lt   which you want anyway

User · Answer

Don t forget that map keeps its elements ordered  If you can t give that up  obviously you can t use unordered map   Something else to keep in mind is that unordered map generally uses more memory  map just has a few house-keeping pointers  and memory for each object  Contrarily  unordered map has a big array  these can get quite big in some implementations   and then additional memory for each object  If you need to be memory-aware  map should prove better  because it lacks the large array   So  if you need pure lookup-retrieval  I d say unordered map is the way to go  But there are always trade-offs  and if you can t afford them  then you can t use it   Just from personal experience  I found an enormous improvement in performance  measured  of course  when using unordered map instead of map in a main entity look-up table   On the other hand  I found it was much slower at repeatedly inserting and removing elements  It s great for a relatively static collection of elements  but if you re doing tons of insertions and deletions the hashing   bucketing seems to add up   Note  this was over many iterations

User · Answer

I would just point out that    there are many kind of unordered maps   Look up the Wikipedia Article on hash map  Depending on which implementation was used  the characteristics in term of look-up  insertion and deletion might vary quite significantly   And that s what worries me the most with the addition of unordered map to the STL  they will have to choose a particular implementation as I doubt they ll go down the Policy road  and so we will be stuck with an implementation for the average use and nothing for the other cases     For example some hash maps have linear rehashing  where instead of rehashing the whole hash map at once  a portion is rehash at each insertion  which helps amortizing the cost   Another example  some hash maps use a simple list of nodes for a bucket  others use a map  others don t use nodes but find the nearest slot and lastly some will use a list of nodes but reorder it so that the last accessed element is at the front  like a caching thing    So at the moment I tend to prefer the std  map or perhaps a loki  AssocVector  for frozen data sets    Don t get me wrong  I d like to use the std  unordered map and I may in the future  but it s difficult to  trust  the portability of such a container when you think of all the ways of implementing it and the various performances that result of this

User · Answer

Reasons have been given in other answers  here is another   std  map  balanced binary tree  operations are amortized O log n  and worst case O log n   std  unordered map  hash table  operations are amortized O 1  and worst case O n    How this plays out in practice is that the hash table  hiccups  every once in a while with an O n  operation  which may or may not be something your application can tolerate   If it can t tolerate it  you d prefer std  map over std  unordered map

User · Answer

I was intrigued by the answer from  Jerry Coffin  which suggested that the ordered map would exhibit performance increases on long strings  after some experimentation  which can be downloaded from pastebin   I ve found that this seems to hold true only for collections of random strings  when the map is initialised with a sorted dictionary  which contain words with considerable amounts of prefix-overlap   this rule breaks down  presumably because of the increased tree depth necessary to retrieve value  The results are shown below  the 1st number column is insert time  2nd is fetch time  g   -g -O3 --std c  0x   -c -o stdtests o stdtests cpp g   -o stdtests stdtests o gmurphy interloper HashTests    stdtests   1st number column is insert time  2nd is fetch time     Integer Keys      unordered       137      15    ordered       168      81     Random String Keys      unordered        55      50    ordered        33      31     Real Words Keys      unordered       278      76    ordered       516     298

User · Answer

From  http   www cplusplus com reference map map    Internally  the elements in a map are always sorted by its key following a specific strict weak ordering criterion indicated by its internal comparison object  of type Compare    map containers are generally slower than unordered map containers to access individual elements by their key  but they allow the direct iteration on subsets based on their order

User · Answer

Summary  Assuming ordering is not important    If you are going to build large table once and do lots of queries  use std  unordered map If you are going to build small table  may be under 100 elements  and do lots of queries  use std  map  This is because reads on it are O log n   If you are going to change table a lot then may be std  map is good option   If you are in doubt  just use std  unordered map    Historical Context  In most languages  unordered map  aka hash based dictionaries  are the default map however in C   you get ordered map as default map  How did that happen  Some people erroneously assume that C   committee made this decision in their unique wisdom but the truth is unfortunately uglier than that    It is widely believed that C   ended up with ordered map as default because there are not too many parameters on how they can be implemented  On the other hand  hash based implementations has tons of things to talk about  So to avoid gridlocks in standardization they just got along with ordered map  Around 2005  many languages already had good implementations of hash based implementation and so it was more easier for the committee to accept new std  unordered map  In a perfect world  std  map would have been unordered and we would have std  ordered map as separate type   Performance  Below two graphs should speak for themselves  source

User · Answer

If you want to compare the speed of your std  map and std  unordered map implementations  you could use Google s sparsehash project which has a time hash map program to time them   For example  with gcc 4 4 2 on an x86 64 Linux system      time hash map TR1 UNORDERED MAP  4 byte objects  10000000 iterations   map grow              126 1 ns   27427396 hashes  40000000 copies   290 9 MB map predict grow       67 4 ns   10000000 hashes  40000000 copies   232 8 MB map replace            22 3 ns   37427396 hashes  40000000 copies  map fetch              16 3 ns   37427396 hashes  40000000 copies  map fetch empty         9 8 ns   10000000 hashes         0 copies  map remove             49 1 ns   37427396 hashes  40000000 copies  map toggle             86 1 ns   20000000 hashes  40000000 copies   STANDARD MAP  4 byte objects  10000000 iterations   map grow              225 3 ns          0 hashes  20000000 copies   462 4 MB map predict grow      225 1 ns          0 hashes  20000000 copies   462 6 MB map replace           151 2 ns          0 hashes  20000000 copies  map fetch             156 0 ns          0 hashes  20000000 copies  map fetch empty         1 4 ns          0 hashes         0 copies  map remove            141 0 ns          0 hashes  20000000 copies  map toggle             67 3 ns          0 hashes  20000000 copies

User · Answer

I d echo roughly the same point GMan made  depending on the type of use  std  map can be  and often is  faster than std  tr1  unordered map  using the implementation included in VS 2008 SP1    There are a few complicating factors to keep in mind  For example  in std  map  you re comparing keys  which means you only ever look at enough of the beginning of a key to distinguish between the right and left sub-branches of the tree  In my experience  nearly the only time you look at an entire key is if you re using something like int that you can compare in a single instruction  With a more typical key type like std  string  you often compare only a few characters or so   A decent hash function  by contrast  always looks at the entire key  IOW  even if the table lookup is constant complexity  the hash itself has roughly linear complexity  though on the length of the key  not the number of items   With long strings as keys  an std  map might finish a search before an unordered map would even start its search   Second  while there are several methods of resizing hash tables  most of them are pretty slow -- to the point that unless lookups are considerably more frequent than insertions and deletions  std  map will often be faster than std  unordered map   Of course  as I mentioned in the comment on your previous question  you can also use a table of trees  This has both advantages and disadvantages  On one hand  it limits the worst case to that of a tree  It also allows fast insertion and deletion  because  at least when I ve done it  I ve used a fixed-size of table  Eliminating all table resizing allows you to keep your hash table a lot simpler and typically faster   One other point  the requirements for hashing and tree-based maps are different  Hashing obviously requires a hash function  and an equality comparison  where ordered maps require a less-than comparison  Of course the hybrid I mentioned requires both  Of course  for the common case of using a string as the key  this isn t really a problem  but some types of keys suit ordering better than hashing  or vice versa

User · Answer

Significant differences that have not really been adequately mentioned here   map keeps iterators to all elements stable  in C  17 you can even move elements from one map to the other without invalidating iterators to them  and if properly implemented without any potential allocation   map timings for single operations are typically more consistent since they never need large allocations  unordered map using std  hash as implemented in the libstdc   is vulnerable to DoS if fed with untrusted input  it uses MurmurHash2 with a constant seed - not that seeding would really help  see https   emboss github io blog 2012 12 14 breaking-murmur-hash-flooding-dos-reloaded    Being ordered enables efficient range searches  e g  iterate over all elements with key   42

User · Answer

I ve made a test recently which makes 50000  merge amp sort  That means if the string keys are the same  merge the byte string  And the final output should be sorted  So this includes a look up for every insertion   For the map implementation  it takes 200 ms to finish the job  For the unordered map   map  it takes 70 ms for unordered map insertion and 80 ms for map insertion  So the hybrid implementation is 50 ms faster   We should think twice before we use the map  If you only need the data to be sorted in the final result of your program  a hybrid solution may be better

[c++] Is there any advantage of using map over unordered_map in case of trivial keys?

Examples related to c++

Examples related to performance

Examples related to dictionary

Examples related to unordered-map