[java] Best implementation for hashCode method for a collection

How do we decide on the best implementation of hashCode() method for a collection (assuming that equals method has been overridden correctly) ?

This question is related to java hash equals hashcode

The answer is


When combining hash values, I usually use the combining method that's used in the boost c++ library, namely:

seed ^= hasher(v) + 0x9e3779b9 + (seed<<6) + (seed>>2);

This does a fairly good job of ensuring an even distribution. For some discussion of how this formula works, see the StackOverflow post: Magic number in boost::hash_combine

There's a good discussion of different hash functions at: http://burtleburtle.net/bob/hash/doobs.html


Here is another JDK 1.7+ approach demonstration with superclass logics accounted. I see it as pretty convinient with Object class hashCode() accounted, pure JDK dependency and no extra manual work. Please note Objects.hash() is null tolerant.

I have not include any equals() implementation but in reality you will of course need it.

import java.util.Objects;

public class Demo {

    public static class A {

        private final String param1;

        public A(final String param1) {
            this.param1 = param1;
        }

        @Override
        public int hashCode() {
            return Objects.hash(
                super.hashCode(),
                this.param1);
        }

    }

    public static class B extends A {

        private final String param2;
        private final String param3;

        public B(
            final String param1,
            final String param2,
            final String param3) {

            super(param1);
            this.param2 = param2;
            this.param3 = param3;
        }

        @Override
        public final int hashCode() {
            return Objects.hash(
                super.hashCode(),
                this.param2,
                this.param3);
        }
    }

    public static void main(String [] args) {

        A a = new A("A");
        B b = new B("A", "B", "C");

        System.out.println("A: " + a.hashCode());
        System.out.println("B: " + b.hashCode());
    }

}

I prefer using utility methods fromm Google Collections lib from class Objects that helps me to keep my code clean. Very often equals and hashcode methods are made from IDE's template, so their are not clean to read.


any hashing method that evenly distributes the hash value over the possible range is a good implementation. See effective java ( http://books.google.com.au/books?id=ZZOiqZQIbRMC&dq=effective+java&pg=PP1&ots=UZMZ2siN25&sig=kR0n73DHJOn-D77qGj0wOxAxiZw&hl=en&sa=X&oi=book_result&resnum=1&ct=result ) , there is a good tip in there for hashcode implementation (item 9 i think...).


There's a good implementation of the Effective Java's hashcode() and equals() logic in Apache Commons Lang. Checkout HashCodeBuilder and EqualsBuilder.


First make sure that equals is implemented correctly. From an IBM DeveloperWorks article:

  • Symmetry: For two references, a and b, a.equals(b) if and only if b.equals(a)
  • Reflexivity: For all non-null references, a.equals(a)
  • Transitivity: If a.equals(b) and b.equals(c), then a.equals(c)

Then make sure that their relation with hashCode respects the contact (from the same article):

  • Consistency with hashCode(): Two equal objects must have the same hashCode() value

Finally a good hash function should strive to approach the ideal hash function.


For a simple class it is often easiest to implement hashCode() based on the class fields which are checked by the equals() implementation.

public class Zam {
    private String foo;
    private String bar;
    private String somethingElse;

    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }

        if (obj == null) {
            return false;
        }

        if (getClass() != obj.getClass()) {
            return false;
        }

        Zam otherObj = (Zam)obj;

        if ((getFoo() == null && otherObj.getFoo() == null) || (getFoo() != null && getFoo().equals(otherObj.getFoo()))) {
            if ((getBar() == null && otherObj. getBar() == null) || (getBar() != null && getBar().equals(otherObj. getBar()))) {
                return true;
            }
        }

        return false;
    }

    public int hashCode() {
        return (getFoo() + getBar()).hashCode();
    }

    public String getFoo() {
        return foo;
    }

    public String getBar() {
        return bar;
    }
}

The most important thing is to keep hashCode() and equals() consistent: if equals() returns true for two objects, then hashCode() should return the same value. If equals() returns false, then hashCode() should return different values.


The standard implementation is weak and using it leads to unnecessary collisions. Imagine a

class ListPair {
    List<Integer> first;
    List<Integer> second;

    ListPair(List<Integer> first, List<Integer> second) {
        this.first = first;
        this.second = second;
    }

    public int hashCode() {
        return Objects.hashCode(first, second);
    }

    ...
}

Now,

new ListPair(List.of(a), List.of(b, c))

and

new ListPair(List.of(b), List.of(a, c))

have the same hashCode, namely 31*(a+b) + c as the multiplier used for List.hashCode gets reused here. Obviously, collisions are unavoidable, but producing needless collisions is just... needless.

There's nothing substantially smart about using 31. The multiplier must be odd in order to avoid losing information (any even multiplier loses at least the most significant bit, multiples of four lose two, etc.). Any odd multiplier is usable. Small multipliers may lead to faster computation (the JIT can use shifts and additions), but given that multiplication has latency of only three cycles on modern Intel/AMD, this hardly matters. Small multipliers also leads to more collision for small inputs, which may be a problem sometimes.

Using a prime is pointless as primes have no meaning in the ring Z/(2**32).

So, I'd recommend using a randomly chosen big odd number (feel free to take a prime). As i86/amd64 CPUs can use a shorter instruction for operands fitting in a single signed byte, there is a tiny speed advantage for multipliers like 109. For minimizing collisions, take something like 0x58a54cf5.

Using different multipliers in different places is helpful, but probably not enough to justify the additional work.


about8.blogspot.com, you said

if equals() returns true for two objects, then hashCode() should return the same value. If equals() returns false, then hashCode() should return different values

I cannot agree with you. If two objects have the same hashcode it doesn't have to mean that they are equal.

If A equals B then A.hashcode must be equal to B.hascode

but

if A.hashcode equals B.hascode it does not mean that A must equals B


If you use eclipse, you can generate equals() and hashCode() using:

Source -> Generate hashCode() and equals().

Using this function you can decide which fields you want to use for equality and hash code calculation, and Eclipse generates the corresponding methods.


It is better to use the functionality provided by Eclipse which does a pretty good job and you can put your efforts and energy in developing the business logic.


Although this is linked to Android documentation (Wayback Machine) and My own code on Github, it will work for Java in general. My answer is an extension of dmeister's Answer with just code that is much easier to read and understand.

@Override 
public int hashCode() {

    // Start with a non-zero constant. Prime is preferred
    int result = 17;

    // Include a hash for each field.

    // Primatives

    result = 31 * result + (booleanField ? 1 : 0);                   // 1 bit   » 32-bit

    result = 31 * result + byteField;                                // 8 bits  » 32-bit 
    result = 31 * result + charField;                                // 16 bits » 32-bit
    result = 31 * result + shortField;                               // 16 bits » 32-bit
    result = 31 * result + intField;                                 // 32 bits » 32-bit

    result = 31 * result + (int)(longField ^ (longField >>> 32));    // 64 bits » 32-bit

    result = 31 * result + Float.floatToIntBits(floatField);         // 32 bits » 32-bit

    long doubleFieldBits = Double.doubleToLongBits(doubleField);     // 64 bits (double) » 64-bit (long) » 32-bit (int)
    result = 31 * result + (int)(doubleFieldBits ^ (doubleFieldBits >>> 32));

    // Objects

    result = 31 * result + Arrays.hashCode(arrayField);              // var bits » 32-bit

    result = 31 * result + referenceField.hashCode();                // var bits » 32-bit (non-nullable)   
    result = 31 * result +                                           // var bits » 32-bit (nullable)   
        (nullableReferenceField == null
            ? 0
            : nullableReferenceField.hashCode());

    return result;

}

EDIT

Typically, when you override hashcode(...), you also want to override equals(...). So for those that will or has already implemented equals, here is a good reference from my Github...

@Override
public boolean equals(Object o) {

    // Optimization (not required).
    if (this == o) {
        return true;
    }

    // Return false if the other object has the wrong type, interface, or is null.
    if (!(o instanceof MyType)) {
        return false;
    }

    MyType lhs = (MyType) o; // lhs means "left hand side"

            // Primitive fields
    return     booleanField == lhs.booleanField
            && byteField    == lhs.byteField
            && charField    == lhs.charField
            && shortField   == lhs.shortField
            && intField     == lhs.intField
            && longField    == lhs.longField
            && floatField   == lhs.floatField
            && doubleField  == lhs.doubleField

            // Arrays

            && Arrays.equals(arrayField, lhs.arrayField)

            // Objects

            && referenceField.equals(lhs.referenceField)
            && (nullableReferenceField == null
                        ? lhs.nullableReferenceField == null
                        : nullableReferenceField.equals(lhs.nullableReferenceField));
}

I use a tiny wrapper around Arrays.deepHashCode(...) because it handles arrays supplied as parameters correctly

public static int hash(final Object... objects) {
    return Arrays.deepHashCode(objects);
}

Use the reflection methods on Apache Commons EqualsBuilder and HashCodeBuilder.


As you specifically asked for collections, I'd like to add an aspect that the other answers haven't mentioned yet: A HashMap doesn't expect their keys to change their hashcode once they are added to the collection. Would defeat the whole purpose...


@about8 : there is a pretty serious bug there.

Zam obj1 = new Zam("foo", "bar", "baz");
Zam obj2 = new Zam("fo", "obar", "baz");

same hashcode

you probably want something like

public int hashCode() {
    return (getFoo().hashCode() + getBar().hashCode()).toString().hashCode();

(can you get hashCode directly from int in Java these days? I think it does some autocasting.. if that's the case, skip the toString, it's ugly.)


It is better to use the functionality provided by Eclipse which does a pretty good job and you can put your efforts and energy in developing the business logic.


Just a quick note for completing other more detailed answer (in term of code):

If I consider the question how-do-i-create-a-hash-table-in-java and especially the jGuru FAQ entry, I believe some other criteria upon which a hash code could be judged are:

  • synchronization (does the algo support concurrent access or not) ?
  • fail safe iteration (does the algo detect a collection which changes during iteration)
  • null value (does the hash code support null value in the collection)

If you're happy with the Effective Java implementation recommended by dmeister, you can use a library call instead of rolling your own:

@Override
public int hashCode() {
    return Objects.hashCode(this.firstName, this.lastName);
}

This requires either Guava (com.google.common.base.Objects.hashCode) or the standard library in Java 7 (java.util.Objects.hash) but works the same way.


If I understand your question correctly, you have a custom collection class (i.e. a new class that extends from the Collection interface) and you want to implement the hashCode() method.

If your collection class extends AbstractList, then you don't have to worry about it, there is already an implementation of equals() and hashCode() that works by iterating through all the objects and adding their hashCodes() together.

   public int hashCode() {
      int hashCode = 1;
      Iterator i = iterator();
      while (i.hasNext()) {
        Object obj = i.next();
        hashCode = 31*hashCode + (obj==null ? 0 : obj.hashCode());
      }
  return hashCode;
   }

Now if what you want is the best way to calculate the hash code for a specific class, I normally use the ^ (bitwise exclusive or) operator to process all fields that I use in the equals method:

public int hashCode(){
   return intMember ^ (stringField != null ? stringField.hashCode() : 0);
}

Examples related to java

Under what circumstances can I call findViewById with an Options Menu / Action Bar item? How much should a function trust another function How to implement a simple scenario the OO way Two constructors How do I get some variable from another class in Java? this in equals method How to split a string in two and store it in a field How to do perspective fixing? String index out of range: 4 My eclipse won't open, i download the bundle pack it keeps saying error log

Examples related to hash

php mysqli_connect: authentication method unknown to the client [caching_sha2_password] What is Hash and Range Primary Key? How to create a laravel hashed password Hashing a file in Python PHP salt and hash SHA256 for login password Append key/value pair to hash with << in Ruby Are there any SHA-256 javascript implementations that are generally considered trustworthy? How do I generate a SALT in Java for Salted-Hash? What does hash do in python? Hashing with SHA1 Algorithm in C#

Examples related to equals

this in equals method Why do we have to override the equals() method in Java? Compare two objects with .equals() and == operator Check if bash variable equals 0 Setting equal heights for div's with jQuery Java, how to compare Strings with String Arrays How can I express that two values are not equal to eachother? How to override equals method in Java How do you say not equal to in Ruby? Getting an element from a Set

Examples related to hashcode

Hashing with SHA1 Algorithm in C# HashMaps and Null values? How to create a HashMap with two keys (Key-Pair, Value)? What is hashCode used for? Is it unique? How does a Java HashMap handle different objects with the same hash code? Hashcode and Equals for Hashset What is the use of hashCode in Java? Good Hash Function for Strings Why do I need to override the equals and hashCode methods in Java? Memory address of variables in Java