Why is it important to override GetHashCode when Equals method is overridden

Question

Given the following class  public class Foo       public int FooId   get  set        public string FooName   get  set         public override bool Equals object obj                Foo fooItem   obj as Foo           if  fooItem    null                        return false                     return fooItem FooId    this FooId             public override int GetHashCode                    Which is preferred           return base GetHashCode               return this FooId GetHashCode              I have overridden the Equals method because Foo represent a row for the Foos table   Which is the preferred method for overriding the GetHashCode   Why is it important to override GetHashCode

User · Answer

By overriding Equals you re basically stating that you are the one who knows better how to compare two instances of a given type  so you re likely to be the best candidate to provide the best hash code   This is an example of how ReSharper writes a GetHashCode   function for you   public override int GetHashCode         unchecked               var result   0          result    result   397    m someVar1          result    result   397    m someVar2          result    result   397    m someVar3          result    result   397    m someVar4          return result            As you can see it just tries to guess a good hash code based on all the fields in the class  but since you know your object s domain or value ranges you could still provide a better one

User · Answer

You should always guarantee that if two objects are equal  as defined by Equals    they should return the same hash code  As some of the other comments state  in theory this is not mandatory if the object will never be used in a hash based container like HashSet or Dictionary  I would advice you to always follow this rule though  The reason is simply because it is way too easy for someone to change a collection from one type to another with the good intention of actually improving the performance or just conveying the code semantics in a better way   For example  suppose we keep some objects in a List  Sometime later someone actually realizes that a HashSet is a much better alternative because of the better search characteristics for example  This is when we can get into trouble  List would internally use the default equality comparer for the type which means Equals in your case while HashSet makes use of GetHashCode    If the two behave differently  so will your program  And bear in mind that such issues are not the easiest to troubleshoot   I ve summarized this behavior with some other GetHashCode   pitfalls in a blog post where you can find further examples and explanations

User · Answer

As of C  9  net 5 or  net core 3 1   you may want to use records as it does Value Based Equality

User · Answer

It s actually very hard to implement GetHashCode   correctly because  in addition to the rules Marc already mentioned  the hash code should not change during the lifetime of an object  Therefore the fields which are used to calculate the hash code must be immutable   I finally found a solution to this problem when I was working with NHibernate  My approach is to calculate the hash code from the ID of the object  The ID can only be set though the constructor so if you want to change the ID  which is very unlikely  you have to create a new object which has a new ID and therefore a new hash code  This approach works best with GUIDs because you can provide a parameterless constructor which randomly generates an ID

User · Answer

It is because the framework requires that two objects that are the same must have the same hashcode   If you override the equals method to do a special comparison of two objects and the two objects are considered the same by the method  then the hash code of the two objects must also be the same    Dictionaries and Hashtables rely on this principle

User · Answer

As of  NET 4 7 the preferred method of overriding GetHashCode   is shown below  If targeting older  NET versions  include the System ValueTuple nuget package      C  7 0  public override int GetHashCode     gt   FooId  FooName  GetHashCode      In terms of performance  this method will outperform most composite hash code implementations  The ValueTuple is a struct so there won t be any garbage  and the underlying algorithm is as fast as it gets

User · Answer

We have two problems to cope with    You cannot provide a sensible GetHashCode   if any field in the object can be changed    Also often a object will NEVER be used in a collection that depends on  GetHashCode      So the cost of implementing GetHashCode   is often not worth it  or it is not possible  If someone puts your object in a collection that calls GetHashCode   and you have overrided Equals   without also making GetHashCode   behave in a correct way  that person may spend days tracking down the problem    Therefore by default I do   public class Foo       public int FooId   get  set        public string FooName   get  set         public override bool Equals object obj                Foo fooItem   obj as Foo           if  fooItem    null                       return false                     return fooItem FooId    this FooId             public override int GetHashCode                    Some comment to explain if there is a real problem with providing GetHashCode               or if I just don t see a need for it for the given class         throw new Exception  Sorry I don t know what GetHashCode should do for this class

User · Answer

Please don  t forget to check the obj parameter against null when overriding Equals    And also compare the type   public override bool Equals object obj        Foo fooItem   obj as Foo       if  fooItem    null               return false             return fooItem FooId    this FooId      The reason for this is  Equals must return false on comparison to null  See also http   msdn microsoft com en-us library bsc2ak47 aspx

User · Answer

It s actually very hard to implement GetHashCode   correctly because  in addition to the rules Marc already mentioned  the hash code should not change during the lifetime of an object  Therefore the fields which are used to calculate the hash code must be immutable   I finally found a solution to this problem when I was working with NHibernate  My approach is to calculate the hash code from the ID of the object  The ID can only be set though the constructor so if you want to change the ID  which is very unlikely  you have to create a new object which has a new ID and therefore a new hash code  This approach works best with GUIDs because you can provide a parameterless constructor which randomly generates an ID

User · Answer

How about   public override int GetHashCode         return string Format   0   1   2    prop1  prop2  prop3  GetHashCode           Assuming performance is not an issue

User · Answer

Below using reflection seems to me a better option considering public properties as with this you don t have have to worry about addition   removal of properties  although not so common scenario   This I found to be performing better also  Compared time using Diagonistics stop watch        public int getHashCode                 PropertyInfo   theProperties   this GetType   GetProperties            int hash   31          foreach  PropertyInfo info in theProperties                        if  info    null                                var value   info GetValue this null                   if value    null                  unchecked                                       hash   29   hash   value GetHashCode                                                      return hash

User · Answer

As of  NET 4 7 the preferred method of overriding GetHashCode   is shown below  If targeting older  NET versions  include the System ValueTuple nuget package      C  7 0  public override int GetHashCode     gt   FooId  FooName  GetHashCode      In terms of performance  this method will outperform most composite hash code implementations  The ValueTuple is a struct so there won t be any garbage  and the underlying algorithm is as fast as it gets

User · Answer

Yes  it is important if your item will be used as a key in a dictionary  or HashSet lt T gt   etc - since this is used  in the absence of a custom IEqualityComparer lt T gt   to group items into buckets  If the hash-code for two items does not match  they may never be considered equal  Equals will simply never be called    The GetHashCode   method should reflect the Equals logic  the rules are    if two things are equal  Equals         true  then they must return the same value for GetHashCode   if the GetHashCode   is equal  it is not necessary for them to be the same  this is a collision  and Equals will be called to see if it is a real equality or not    In this case  it looks like  return FooId   is a suitable GetHashCode   implementation  If you are testing multiple properties  it is common to combine them using code like below  to reduce diagonal collisions  i e  so that new Foo 3 5  has a different hash-code to new Foo 5 3     unchecked    only needed if you re compiling with arithmetic checks enabled       the default compiler behaviour is  disabled   so most folks won t need this      int hash   13      hash    hash   7    field1 GetHashCode        hash    hash   7    field2 GetHashCode                return hash      Oh - for convenience  you might also consider providing    and    operators when overriding Equals and GetHashCode     A demonstration of what happens when you get this wrong is here

User · Answer

As of C  9  net 5 or  net core 3 1   you may want to use records as it does Value Based Equality

User · Answer

Yes  it is important if your item will be used as a key in a dictionary  or HashSet lt T gt   etc - since this is used  in the absence of a custom IEqualityComparer lt T gt   to group items into buckets  If the hash-code for two items does not match  they may never be considered equal  Equals will simply never be called    The GetHashCode   method should reflect the Equals logic  the rules are    if two things are equal  Equals         true  then they must return the same value for GetHashCode   if the GetHashCode   is equal  it is not necessary for them to be the same  this is a collision  and Equals will be called to see if it is a real equality or not    In this case  it looks like  return FooId   is a suitable GetHashCode   implementation  If you are testing multiple properties  it is common to combine them using code like below  to reduce diagonal collisions  i e  so that new Foo 3 5  has a different hash-code to new Foo 5 3     unchecked    only needed if you re compiling with arithmetic checks enabled       the default compiler behaviour is  disabled   so most folks won t need this      int hash   13      hash    hash   7    field1 GetHashCode        hash    hash   7    field2 GetHashCode                return hash      Oh - for convenience  you might also consider providing    and    operators when overriding Equals and GetHashCode     A demonstration of what happens when you get this wrong is here

User · Answer

It s my understanding that the original GetHashCode   returns the memory address of the object  so it s essential to override it if you wish to compare two different objects    EDITED  That was incorrect  the original GetHashCode   method cannot assure the equality of 2 values  Though objects that are equal return the same hash code

User · Answer

By overriding Equals you re basically stating that you are the one who knows better how to compare two instances of a given type  so you re likely to be the best candidate to provide the best hash code   This is an example of how ReSharper writes a GetHashCode   function for you   public override int GetHashCode         unchecked               var result   0          result    result   397    m someVar1          result    result   397    m someVar2          result    result   397    m someVar3          result    result   397    m someVar4          return result            As you can see it just tries to guess a good hash code based on all the fields in the class  but since you know your object s domain or value ranges you could still provide a better one

User · Answer

It s not necessarily important  it depends on the size of your collections and your performance requirements and whether your class will be used in a library where you may not know the performance requirements  I frequently know my collection sizes are not very large and my time is more valuable than a few microseconds of performance gained by creating a perfect hash code  so  to get rid of the annoying warning by the compiler  I simply use      public override int GetHashCode              return base GetHashCode            Of course I could use a  pragma to turn off the warning as well but I prefer this way    When you are in the position that you do need the performance than all of the issues mentioned by others here apply  of course  Most important - otherwise you will get wrong results when retrieving items from a hash set or dictionary  the hash code must not vary with the life time of an object  more accurately  during the time whenever the hash code is needed  such as while being a key in a dictionary   for example  the following is wrong as Value is public and so can be changed externally to the class during the life time of the instance  so you must not use it as the basis for the hash code       class A            public int Value         public override int GetHashCode                    return Value GetHashCode      WRONG  Value is not constant during the instance s life time                    On the other hand  if Value can t be changed it s ok to use       class A            public readonly int Value         public override int GetHashCode                    return Value GetHashCode      OK  Value is read-only and can t be changed during the instance s life time

User · Answer

Just to add on above answers   If you don t override Equals then the default behavior is that references of the objects are compared  The same applies to hashcode - the default implmentation is typically based on a memory address of the reference  Because you did override Equals it means the correct behavior is to compare whatever you implemented on Equals and not the references  so you should do the same for the hashcode    Clients of your class will expect the hashcode to have similar logic to the equals method  for example linq methods which use a IEqualityComparer first compare the hashcodes and only if they re equal they ll compare the Equals   method which might be more expensive to run  if we didn t implement hashcode  equal object will probably have different hashcodes  because they have different memory address  and will be determined wrongly as not equal  Equals   won t even hit    In addition  except the problem that you might not be able to find your object if you used it in a dictionary  because it was inserted by one hashcode and when you look for it the default hashcode will probably be different and again the Equals   won t even be called  like Marc Gravell explains in his answer  you also introduce a violation of the dictionary or hashset concept which should not allow identical keys -  you already declared that those objects are essentially the same when you overrode Equals so you don t want both of them as different keys on a data structure which suppose to have a unique key  But because they have a different hashcode the  same  key will be inserted as different one

User · Answer

It s my understanding that the original GetHashCode   returns the memory address of the object  so it s essential to override it if you wish to compare two different objects    EDITED  That was incorrect  the original GetHashCode   method cannot assure the equality of 2 values  Though objects that are equal return the same hash code

User · Answer

It s not necessarily important  it depends on the size of your collections and your performance requirements and whether your class will be used in a library where you may not know the performance requirements  I frequently know my collection sizes are not very large and my time is more valuable than a few microseconds of performance gained by creating a perfect hash code  so  to get rid of the annoying warning by the compiler  I simply use      public override int GetHashCode              return base GetHashCode            Of course I could use a  pragma to turn off the warning as well but I prefer this way    When you are in the position that you do need the performance than all of the issues mentioned by others here apply  of course  Most important - otherwise you will get wrong results when retrieving items from a hash set or dictionary  the hash code must not vary with the life time of an object  more accurately  during the time whenever the hash code is needed  such as while being a key in a dictionary   for example  the following is wrong as Value is public and so can be changed externally to the class during the life time of the instance  so you must not use it as the basis for the hash code       class A            public int Value         public override int GetHashCode                    return Value GetHashCode      WRONG  Value is not constant during the instance s life time                    On the other hand  if Value can t be changed it s ok to use       class A            public readonly int Value         public override int GetHashCode                    return Value GetHashCode      OK  Value is read-only and can t be changed during the instance s life time

User · Answer

By overriding Equals you re basically stating that you are the one who knows better how to compare two instances of a given type  so you re likely to be the best candidate to provide the best hash code   This is an example of how ReSharper writes a GetHashCode   function for you   public override int GetHashCode         unchecked               var result   0          result    result   397    m someVar1          result    result   397    m someVar2          result    result   397    m someVar3          result    result   397    m someVar4          return result            As you can see it just tries to guess a good hash code based on all the fields in the class  but since you know your object s domain or value ranges you could still provide a better one

User · Answer

It is because the framework requires that two objects that are the same must have the same hashcode   If you override the equals method to do a special comparison of two objects and the two objects are considered the same by the method  then the hash code of the two objects must also be the same    Dictionaries and Hashtables rely on this principle

User · Answer

We have two problems to cope with    You cannot provide a sensible GetHashCode   if any field in the object can be changed    Also often a object will NEVER be used in a collection that depends on  GetHashCode      So the cost of implementing GetHashCode   is often not worth it  or it is not possible  If someone puts your object in a collection that calls GetHashCode   and you have overrided Equals   without also making GetHashCode   behave in a correct way  that person may spend days tracking down the problem    Therefore by default I do   public class Foo       public int FooId   get  set        public string FooName   get  set         public override bool Equals object obj                Foo fooItem   obj as Foo           if  fooItem    null                       return false                     return fooItem FooId    this FooId             public override int GetHashCode                    Some comment to explain if there is a real problem with providing GetHashCode               or if I just don t see a need for it for the given class         throw new Exception  Sorry I don t know what GetHashCode should do for this class

User · Answer

By overriding Equals you re basically stating that you are the one who knows better how to compare two instances of a given type  so you re likely to be the best candidate to provide the best hash code   This is an example of how ReSharper writes a GetHashCode   function for you   public override int GetHashCode         unchecked               var result   0          result    result   397    m someVar1          result    result   397    m someVar2          result    result   397    m someVar3          result    result   397    m someVar4          return result            As you can see it just tries to guess a good hash code based on all the fields in the class  but since you know your object s domain or value ranges you could still provide a better one

User · Answer

How about   public override int GetHashCode         return string Format   0   1   2    prop1  prop2  prop3  GetHashCode           Assuming performance is not an issue

User · Answer

Hash code is used for hash-based collections like Dictionary  Hashtable  HashSet etc  The purpose of this code is to very quickly pre-sort specific object by putting it into specific group  bucket   This pre-sorting helps tremendously in finding this object when you need to retrieve it back from hash-collection because code has to search for your object in just one bucket instead of in all objects it contains  The better distribution of hash codes  better uniqueness  the faster retrieval  In ideal situation where each object has a unique hash code  finding it is an O 1  operation  In most cases it approaches O 1

User · Answer

Please don  t forget to check the obj parameter against null when overriding Equals    And also compare the type   public override bool Equals object obj        Foo fooItem   obj as Foo       if  fooItem    null               return false             return fooItem FooId    this FooId      The reason for this is  Equals must return false on comparison to null  See also http   msdn microsoft com en-us library bsc2ak47 aspx

User · Answer

It s actually very hard to implement GetHashCode   correctly because  in addition to the rules Marc already mentioned  the hash code should not change during the lifetime of an object  Therefore the fields which are used to calculate the hash code must be immutable   I finally found a solution to this problem when I was working with NHibernate  My approach is to calculate the hash code from the ID of the object  The ID can only be set though the constructor so if you want to change the ID  which is very unlikely  you have to create a new object which has a new ID and therefore a new hash code  This approach works best with GUIDs because you can provide a parameterless constructor which randomly generates an ID

User · Answer

Just to add on above answers   If you don t override Equals then the default behavior is that references of the objects are compared  The same applies to hashcode - the default implmentation is typically based on a memory address of the reference  Because you did override Equals it means the correct behavior is to compare whatever you implemented on Equals and not the references  so you should do the same for the hashcode    Clients of your class will expect the hashcode to have similar logic to the equals method  for example linq methods which use a IEqualityComparer first compare the hashcodes and only if they re equal they ll compare the Equals   method which might be more expensive to run  if we didn t implement hashcode  equal object will probably have different hashcodes  because they have different memory address  and will be determined wrongly as not equal  Equals   won t even hit    In addition  except the problem that you might not be able to find your object if you used it in a dictionary  because it was inserted by one hashcode and when you look for it the default hashcode will probably be different and again the Equals   won t even be called  like Marc Gravell explains in his answer  you also introduce a violation of the dictionary or hashset concept which should not allow identical keys -  you already declared that those objects are essentially the same when you overrode Equals so you don t want both of them as different keys on a data structure which suppose to have a unique key  But because they have a different hashcode the  same  key will be inserted as different one

User · Answer

It is because the framework requires that two objects that are the same must have the same hashcode   If you override the equals method to do a special comparison of two objects and the two objects are considered the same by the method  then the hash code of the two objects must also be the same    Dictionaries and Hashtables rely on this principle

User · Answer

Yes  it is important if your item will be used as a key in a dictionary  or HashSet lt T gt   etc - since this is used  in the absence of a custom IEqualityComparer lt T gt   to group items into buckets  If the hash-code for two items does not match  they may never be considered equal  Equals will simply never be called    The GetHashCode   method should reflect the Equals logic  the rules are    if two things are equal  Equals         true  then they must return the same value for GetHashCode   if the GetHashCode   is equal  it is not necessary for them to be the same  this is a collision  and Equals will be called to see if it is a real equality or not    In this case  it looks like  return FooId   is a suitable GetHashCode   implementation  If you are testing multiple properties  it is common to combine them using code like below  to reduce diagonal collisions  i e  so that new Foo 3 5  has a different hash-code to new Foo 5 3     unchecked    only needed if you re compiling with arithmetic checks enabled       the default compiler behaviour is  disabled   so most folks won t need this      int hash   13      hash    hash   7    field1 GetHashCode        hash    hash   7    field2 GetHashCode                return hash      Oh - for convenience  you might also consider providing    and    operators when overriding Equals and GetHashCode     A demonstration of what happens when you get this wrong is here

User · Answer

You should always guarantee that if two objects are equal  as defined by Equals    they should return the same hash code  As some of the other comments state  in theory this is not mandatory if the object will never be used in a hash based container like HashSet or Dictionary  I would advice you to always follow this rule though  The reason is simply because it is way too easy for someone to change a collection from one type to another with the good intention of actually improving the performance or just conveying the code semantics in a better way   For example  suppose we keep some objects in a List  Sometime later someone actually realizes that a HashSet is a much better alternative because of the better search characteristics for example  This is when we can get into trouble  List would internally use the default equality comparer for the type which means Equals in your case while HashSet makes use of GetHashCode    If the two behave differently  so will your program  And bear in mind that such issues are not the easiest to troubleshoot   I ve summarized this behavior with some other GetHashCode   pitfalls in a blog post where you can find further examples and explanations

User · Answer

Hash code is used for hash-based collections like Dictionary  Hashtable  HashSet etc  The purpose of this code is to very quickly pre-sort specific object by putting it into specific group  bucket   This pre-sorting helps tremendously in finding this object when you need to retrieve it back from hash-collection because code has to search for your object in just one bucket instead of in all objects it contains  The better distribution of hash codes  better uniqueness  the faster retrieval  In ideal situation where each object has a unique hash code  finding it is an O 1  operation  In most cases it approaches O 1

User · Answer

It s actually very hard to implement GetHashCode   correctly because  in addition to the rules Marc already mentioned  the hash code should not change during the lifetime of an object  Therefore the fields which are used to calculate the hash code must be immutable   I finally found a solution to this problem when I was working with NHibernate  My approach is to calculate the hash code from the ID of the object  The ID can only be set though the constructor so if you want to change the ID  which is very unlikely  you have to create a new object which has a new ID and therefore a new hash code  This approach works best with GUIDs because you can provide a parameterless constructor which randomly generates an ID

User · Answer

It is because the framework requires that two objects that are the same must have the same hashcode   If you override the equals method to do a special comparison of two objects and the two objects are considered the same by the method  then the hash code of the two objects must also be the same    Dictionaries and Hashtables rely on this principle

User · Answer

Below using reflection seems to me a better option considering public properties as with this you don t have have to worry about addition   removal of properties  although not so common scenario   This I found to be performing better also  Compared time using Diagonistics stop watch        public int getHashCode                 PropertyInfo   theProperties   this GetType   GetProperties            int hash   31          foreach  PropertyInfo info in theProperties                        if  info    null                                var value   info GetValue this null                   if value    null                  unchecked                                       hash   29   hash   value GetHashCode                                                      return hash

User · Answer

Yes  it is important if your item will be used as a key in a dictionary  or HashSet lt T gt   etc - since this is used  in the absence of a custom IEqualityComparer lt T gt   to group items into buckets  If the hash-code for two items does not match  they may never be considered equal  Equals will simply never be called    The GetHashCode   method should reflect the Equals logic  the rules are    if two things are equal  Equals         true  then they must return the same value for GetHashCode   if the GetHashCode   is equal  it is not necessary for them to be the same  this is a collision  and Equals will be called to see if it is a real equality or not    In this case  it looks like  return FooId   is a suitable GetHashCode   implementation  If you are testing multiple properties  it is common to combine them using code like below  to reduce diagonal collisions  i e  so that new Foo 3 5  has a different hash-code to new Foo 5 3     unchecked    only needed if you re compiling with arithmetic checks enabled       the default compiler behaviour is  disabled   so most folks won t need this      int hash   13      hash    hash   7    field1 GetHashCode        hash    hash   7    field2 GetHashCode                return hash      Oh - for convenience  you might also consider providing    and    operators when overriding Equals and GetHashCode     A demonstration of what happens when you get this wrong is here

[c#] Why is it important to override GetHashCode when Equals method is overridden?

Examples related to c#

Examples related to overriding

Examples related to hashcode