HashMap get put complexity

Question

We are used to saying that HashMap get put operations are O 1   However it depends on the hash implementation  The default object hash is actually the internal address in the JVM heap  Are we sure it is good enough to claim that the get put are O 1     Available memory is another issue  As I understand from the javadocs  the HashMap load factor should be 0 75  What if we do not have enough memory in JVM and the load factor exceeds the limit    So  it looks like O 1  is not guaranteed  Does it make sense or am I missing something

User · Answer

I agree with   the general amortized complexity of O 1  a bad hashCode   implementation could result to multiple collisions  which means that in the worst case every object goes to the same bucket  thus O N  if each bucket is backed by a List  since Java 8  HashMap dynamically replaces the Nodes  linked list  used in each bucket with TreeNodes  red-black tree when a list gets bigger than 8 elements  resulting to a worst performance of O logN    But  this is not the full truth if we want to be 100  precise  The implementation of hashCode   and the type of key Object  immutable cached or being a Collection  might also affect real time complexity in strict terms  Let s assume the following three cases   HashMap lt Integer  V gt  HashMap lt String  V gt  HashMap lt List lt E gt   V gt   Do they have the same complexity  Well  the amortised complexity of the 1st one is  as expected  O 1   But  for the rest  we also need to compute hashCode   of the lookup element  which means we might have to traverse arrays and lists in our algorithm  Lets assume that the size of all of the above arrays lists is k  Then  HashMap lt String  V gt  and HashMap lt List lt E gt   V gt  will have O k  amortised complexity and similarly  O k   logN  worst case in Java8   Note that using a String key is a more complex case  because it is immutable and Java caches the result of hashCode   in a private variable hash  so it s only computed once      Cache the hash code for the string        private int hash     Default to 0  But  the above is also having its own worst case  because Java s String hashCode   implementation is checking if hash    0 before computing hashCode  But hey  there are non-empty Strings that output a hashcode of zero  such as  quot f5a5a608 quot   see here  in which case memoization might not be helpful

User · Answer

In practice  it is O 1   but this actually is a terrible and mathematically non-sense simplification  The O   notation says how the algorithm behaves when the size of the problem tends to infinity  Hashmap get put works like an O 1  algorithm for a limited size  The limit is fairly large from the computer memory and from the addressing point of view  but far from infinity   When one says that hashmap get put is O 1  it should really say that the time needed for the get put is more or less constant and does not depend on the number of elements in the hashmap so far as the hashmap can be presented on the actual computing system  If the problem goes beyond that size and we need larger hashmaps then  after a while  certainly the number of the bits describing one element will also increase as we run out of the possible describable different elements  For example  if we used a hashmap to store 32bit numbers and later we increase the problem size so that we will have more than 2 32 bit elements in the hashmap  then the individual elements will be described with more than 32bits   The number of the bits needed to describe the individual elements is log N   where N is the maximum number of elements  therefore get and put are really O log N    If you compare it with a tree set  which is O log n  then hash set is O long max n   and we simply feel that this is O 1   because on a certain implementation max n  is fixed  does not change  the size of the objects we store measured in bits  and the algorithm calculating the hash code is fast   Finally  if finding an element in any data structure were O 1  we would create information out of thin air  Having a data structure of n element I can select one element in n different way  With that  I can encode log n  bit information  If I can encode that in zero bit  that is what O 1  means  then I created an infinitely compressing ZIP algorithm

User · Answer

It depends on many things  It s usually O 1   with a decent hash which itself is constant time    but you could have a hash which takes a long time to compute  and if there are multiple items in the hash map which return the same hash code  get will have to iterate over them calling equals on each of them to find a match   In the worst case  a HashMap has an O n  lookup due to walking through all entries in the same hash bucket  e g  if they all have the same hash code   Fortunately  that worst case scenario doesn t come up very often in real life  in my experience  So no  O 1  certainly isn t guaranteed - but it s usually what you should assume when considering which algorithms and data structures to use   In JDK 8  HashMap has been tweaked so that if keys can be compared for ordering  then any densely-populated bucket is implemented as a tree  so that even if there are lots of entries with the same hash code  the complexity is O log n   That can cause issues if you have a key type where equality and ordering are different  of course   And yes  if you don t have enough memory for the hash map  you ll be in trouble    but that s going to be true whatever data structure you use

User · Answer

It has already been mentioned that hashmaps are O n m  in average  if n is the number of items and m is the size  It has also been mentioned that in principle the whole thing could collapse into a singly linked list with O n  query time   This all assumes that calculating the hash is constant time    However what isn t often mentioned is  that with probability at least 1-1 n  so for 1000 items that s a 99 9  chance  the largest bucket won t be filled more than O logn   Hence matching the average complexity of binary search trees   And the constant is good  a tighter bound is  log n   m n    O 1     All that s required for this theoretical bound is that you use a reasonably good hash function  see Wikipedia  Universal Hashing  It can be as simple as a x gt  gt m   And of course that the person giving you the values to hash doesn t know how you have chosen your random constants   TL DR  With Very High Probability the worst case get put complexity of a hashmap is O logn

User · Answer

HashMap operation is dependent factor of hashCode implementation  For the ideal scenario lets say the good hash implementation which provide unique hash code for every object  No hash collision  then the best  worst and average case scenario would be O 1   Let s consider a scenario where a bad implementation of hashCode always returns 1 or such hash which has hash collision  In this case the time complexity would be O n    Now coming to the second part of the question about memory  then yes memory constraint would be taken care by JVM

User · Answer

I m not sure the default hashcode is the address - I read the OpenJDK source for hashcode generation a while ago  and I remember it being something a bit more complicated  Still not something that guarantees a good distribution  perhaps  However  that is to some extent moot  as few classes you d use as keys in a hashmap use the default hashcode - they supply their own implementations  which ought to be good   On top of that  what you may not know  again  this is based in reading source - it s not guaranteed  is that HashMap stirs the hash before using it  to mix entropy from throughout the word into the bottom bits  which is where it s needed for all but the hugest hashmaps  That helps deal with hashes that specifically don t do that themselves  although i can t think of any common cases where you d see that   Finally  what happens when the table is overloaded is that it degenerates into a set of parallel linked lists - performance becomes O n   Specifically  the number of links traversed will on average be half the load factor

[java] HashMap get/put complexity

Examples related to java

Examples related to data-structures

Examples related to hashmap

Examples related to complexity-theory