[java] Most efficient way to increment a Map value in Java

I hope this question is not considered too basic for this forum, but we'll see. I'm wondering how to refactor some code for better performance that is getting run a bunch of times.

Say I'm creating a word frequency list, using a Map (probably a HashMap), where each key is a String with the word that's being counted and the value is an Integer that's incremented each time a token of the word is found.

In Perl, incrementing such a value would be trivially easy:

$map{$word}++;

But in Java, it's much more complicated. Here the way I'm currently doing it:

int count = map.containsKey(word) ? map.get(word) : 0;
map.put(word, count + 1);

Which of course relies on the autoboxing feature in the newer Java versions. I wonder if you can suggest a more efficient way of incrementing such a value. Are there even good performance reasons for eschewing the Collections framework and using a something else instead?

Update: I've done a test of several of the answers. See below.

This question is related to java optimization collections

The answer is


Quite simple, just use the built-in function in Map.java as followed

map.put(key, map.getOrDefault(key, 0) + 1);

If you're using Eclipse Collections, you can use a HashBag. It will be the most efficient approach in terms of memory usage and it will also perform well in terms of execution speed.

HashBag is backed by a MutableObjectIntMap which stores primitive ints instead of Counter objects. This reduces memory overhead and improves execution speed.

HashBag provides the API you'd need since it's a Collection that also allows you to query for the number of occurrences of an item.

Here's an example from the Eclipse Collections Kata.

MutableBag<String> bag =
  HashBag.newBagWith("one", "two", "two", "three", "three", "three");

Assert.assertEquals(3, bag.occurrencesOf("three"));

bag.add("one");
Assert.assertEquals(2, bag.occurrencesOf("one"));

bag.addOccurrences("one", 4);
Assert.assertEquals(6, bag.occurrencesOf("one"));

Note: I am a committer for Eclipse Collections.


There are a couple of approaches:

  1. Use a Bag alorithm like the sets contained in Google Collections.

  2. Create mutable container which you can use in the Map:


    class My{
        String word;
        int count;
    }

And use put("word", new My("Word") ); Then you can check if it exists and increment when adding.

Avoid rolling your own solution using lists, because if you get innerloop searching and sorting, your performance will stink. The first HashMap solution is actually quite fast, but a proper like that found in Google Collections is probably better.

Counting words using Google Collections, looks something like this:



    HashMultiset s = new HashMultiset();
    s.add("word");
    s.add("word");
    System.out.println(""+s.count("word") );


Using the HashMultiset is quite elegent, because a bag-algorithm is just what you need when counting words.


@Vilmantas Baranauskas: Regarding this answer, I would comment if I had the rep points, but I don't. I wanted to note that the Counter class defined there is NOT thread-safe as it is not sufficient to just synchronize inc() without synchronizing value(). Other threads calling value() are not guaranteed to see the the value unless a happens-before relationship has been established with the update.


Are you sure that this is a bottleneck? Have you done any performance analysis?

Try using the NetBeans profiler (its free and built into NB 6.1) to look at hotspots.

Finally, a JVM upgrade (say from 1.5->1.6) is often a cheap performance booster. Even an upgrade in build number can provide good performance boosts. If you are running on Windows and this is a server class application, use -server on the command line to use the Server Hotspot JVM. On Linux and Solaris machines this is autodetected.


I think your solution would be the standard way, but - as you noted yourself - it is probably not the fastest way possible.

You may look at GNU Trove. That is a library which contains all sorts of fast primitive Collections. Your example would use a TObjectIntHashMap which has a method adjustOrPutValue which does exactly what you want.


I don't know how efficient it is but the below code works as well.You need to define a BiFunction at the beginning. Plus, you can make more than just increment with this method.

public static Map<String, Integer> strInt = new HashMap<String, Integer>();

public static void main(String[] args) {
    BiFunction<Integer, Integer, Integer> bi = (x,y) -> {
        if(x == null)
            return y;
        return x+y;
    };
    strInt.put("abc", 0);


    strInt.merge("abc", 1, bi);
    strInt.merge("abc", 1, bi);
    strInt.merge("abc", 1, bi);
    strInt.merge("abcd", 1, bi);

    System.out.println(strInt.get("abc"));
    System.out.println(strInt.get("abcd"));
}

output is

3
1

You can make use of computeIfAbsent method in Map interface provided in Java 8.

final Map<String,AtomicLong> map = new ConcurrentHashMap<>();
map.computeIfAbsent("A", k->new AtomicLong(0)).incrementAndGet();
map.computeIfAbsent("B", k->new AtomicLong(0)).incrementAndGet();
map.computeIfAbsent("A", k->new AtomicLong(0)).incrementAndGet(); //[A=2, B=1]

The method computeIfAbsent checks if the specified key is already associated with a value or not? If no associated value then it attempts to compute its value using the given mapping function. In any case it returns the current (existing or computed) value associated with the specified key, or null if the computed value is null.

On a side note if you have a situation where multiple threads update a common sum you can have a look at LongAdder class.Under high contention, expected throughput of this class is significantly higher than AtomicLong, at the expense of higher space consumption.


Instead of calling containsKey() it is faster just to call map.get and check if the returned value is null or not.

    Integer count = map.get(word);
    if(count == null){
        count = 0;
    }
    map.put(word, count + 1);

There are a couple of approaches:

  1. Use a Bag alorithm like the sets contained in Google Collections.

  2. Create mutable container which you can use in the Map:


    class My{
        String word;
        int count;
    }

And use put("word", new My("Word") ); Then you can check if it exists and increment when adding.

Avoid rolling your own solution using lists, because if you get innerloop searching and sorting, your performance will stink. The first HashMap solution is actually quite fast, but a proper like that found in Google Collections is probably better.

Counting words using Google Collections, looks something like this:



    HashMultiset s = new HashMultiset();
    s.add("word");
    s.add("word");
    System.out.println(""+s.count("word") );


Using the HashMultiset is quite elegent, because a bag-algorithm is just what you need when counting words.


The Functional Java library's TreeMap datastructure has an update method in the latest trunk head:

public TreeMap<K, V> update(final K k, final F<V, V> f)

Example usage:

import static fj.data.TreeMap.empty;
import static fj.function.Integers.add;
import static fj.pre.Ord.stringOrd;
import fj.data.TreeMap;

public class TreeMap_Update
  {public static void main(String[] a)
    {TreeMap<String, Integer> map = empty(stringOrd);
     map = map.set("foo", 1);
     map = map.update("foo", add.f(1));
     System.out.println(map.get("foo").some());}}

This program prints "2".


"put" need "get" (to ensure no duplicate key).
So directly do a "put",
and if there was a previous value, then do an addition:

Map map = new HashMap ();

MutableInt newValue = new MutableInt (1); // default = inc
MutableInt oldValue = map.put (key, newValue);
if (oldValue != null) {
  newValue.add(oldValue); // old + inc
}

If count starts at 0, then add 1: (or any others values...)

Map map = new HashMap ();

MutableInt newValue = new MutableInt (0); // default
MutableInt oldValue = map.put (key, newValue);
if (oldValue != null) {
  newValue.setValue(oldValue + 1); // old + inc
}

Notice : This code is not thread safe. Use it to build then use the map, not to concurrently update it.

Optimization : In a loop, keep old value to become the new value of next loop.

Map map = new HashMap ();
final int defaut = 0;
final int inc = 1;

MutableInt oldValue = new MutableInt (default);
while(true) {
  MutableInt newValue = oldValue;

  oldValue = map.put (key, newValue); // insert or...
  if (oldValue != null) {
    newValue.setValue(oldValue + inc); // ...update

    oldValue.setValue(default); // reuse
  } else
    oldValue = new MutableInt (default); // renew
  }
}

@Vilmantas Baranauskas: Regarding this answer, I would comment if I had the rep points, but I don't. I wanted to note that the Counter class defined there is NOT thread-safe as it is not sufficient to just synchronize inc() without synchronizing value(). Other threads calling value() are not guaranteed to see the the value unless a happens-before relationship has been established with the update.


I suggest to use Java 8 Map::compute(). It considers the case when a key doesn't exist, too.

Map.compute(num, (k, v) -> (v == null) ? 1 : v + 1);

You should be aware of the fact that your original attempt

int count = map.containsKey(word) ? map.get(word) : 0;

contains two potentially expensive operations on a map, namely containsKey and get. The former performs an operation potentially pretty similar to the latter, so you're doing the same work twice!

If you look at the API for Map, get operations usually return null when the map does not contain the requested element.

Note that this will make a solution like

map.put( key, map.get(key) + 1 );

dangerous, since it might yield NullPointerExceptions. You should check for a null first.

Also note, and this is very important, that HashMaps can contain nulls by definition. So not every returned null says "there is no such element". In this respect, containsKey behaves differently from get in actually telling you whether there is such an element. Refer to the API for details.

For your case, however, you might not want to distinguish between a stored null and "noSuchElement". If you don't want to permit nulls you might prefer a Hashtable. Using a wrapper library as was already proposed in other answers might be a better solution to manual treatment, depending on the complexity of your application.

To complete the answer (and I forgot to put that in at first, thanks to the edit function!), the best way of doing it natively, is to get into a final variable, check for null and put it back in with a 1. The variable should be final because it's immutable anyway. The compiler might not need this hint, but its clearer that way.

final HashMap map = generateRandomHashMap();
final Object key = fetchSomeKey();
final Integer i = map.get(key);
if (i != null) {
    map.put(i + 1);
} else {
    // do something
}

If you do not want to rely on autoboxing, you should say something like map.put(new Integer(1 + i.getValue())); instead.


Google Guava is your friend...

...at least in some cases. They have this nice AtomicLongMap. Especially nice because you are dealing with long as value in your map.

E.g.

AtomicLongMap<String> map = AtomicLongMap.create();
[...]
map.getAndIncrement(word);

Also possible to add more then 1 to the value:

map.getAndAdd(word, 112L); 

As a follow-up to my own comment: Trove looks like the way to go. If, for whatever reason, you wanted to stick with the standard JDK, ConcurrentMap and AtomicLong can make the code a tiny bit nicer, though YMMV.

    final ConcurrentMap<String, AtomicLong> map = new ConcurrentHashMap<String, AtomicLong>();
    map.putIfAbsent("foo", new AtomicLong(0));
    map.get("foo").incrementAndGet();

will leave 1 as the value in the map for foo. Realistically, increased friendliness to threading is all that this approach has to recommend it.


"put" need "get" (to ensure no duplicate key).
So directly do a "put",
and if there was a previous value, then do an addition:

Map map = new HashMap ();

MutableInt newValue = new MutableInt (1); // default = inc
MutableInt oldValue = map.put (key, newValue);
if (oldValue != null) {
  newValue.add(oldValue); // old + inc
}

If count starts at 0, then add 1: (or any others values...)

Map map = new HashMap ();

MutableInt newValue = new MutableInt (0); // default
MutableInt oldValue = map.put (key, newValue);
if (oldValue != null) {
  newValue.setValue(oldValue + 1); // old + inc
}

Notice : This code is not thread safe. Use it to build then use the map, not to concurrently update it.

Optimization : In a loop, keep old value to become the new value of next loop.

Map map = new HashMap ();
final int defaut = 0;
final int inc = 1;

MutableInt oldValue = new MutableInt (default);
while(true) {
  MutableInt newValue = oldValue;

  oldValue = map.put (key, newValue); // insert or...
  if (oldValue != null) {
    newValue.setValue(oldValue + inc); // ...update

    oldValue.setValue(default); // reuse
  } else
    oldValue = new MutableInt (default); // renew
  }
}

Are you sure that this is a bottleneck? Have you done any performance analysis?

Try using the NetBeans profiler (its free and built into NB 6.1) to look at hotspots.

Finally, a JVM upgrade (say from 1.5->1.6) is often a cheap performance booster. Even an upgrade in build number can provide good performance boosts. If you are running on Windows and this is a server class application, use -server on the command line to use the Server Hotspot JVM. On Linux and Solaris machines this is autodetected.


Memory rotation may be an issue here, since every boxing of an int larger than or equal to 128 causes an object allocation (see Integer.valueOf(int)). Although the garbage collector very efficiently deals with short-lived objects, performance will suffer to some degree.

If you know that the number of increments made will largely outnumber the number of keys (=words in this case), consider using an int holder instead. Phax already presented code for this. Here it is again, with two changes (holder class made static and initial value set to 1):

static class MutableInt {
  int value = 1;
  void inc() { ++value; }
  int get() { return value; }
}
...
Map<String,MutableInt> map = new HashMap<String,MutableInt>();
MutableInt value = map.get(key);
if (value == null) {
  value = new MutableInt();
  map.put(key, value);
} else {
  value.inc();
}

If you need extreme performance, look for a Map implementation which is directly tailored towards primitive value types. jrudolph mentioned GNU Trove.

By the way, a good search term for this subject is "histogram".


It's always a good idea to look at the Google Collections Library for this kind of thing. In this case a Multiset will do the trick:

Multiset bag = Multisets.newHashMultiset();
String word = "foo";
bag.add(word);
bag.add(word);
System.out.println(bag.count(word)); // Prints 2

There are Map-like methods for iterating over keys/entries, etc. Internally the implementation currently uses a HashMap<E, AtomicInteger>, so you will not incur boxing costs.


Another way would be creating a mutable integer:

class MutableInt {
  int value = 0;
  public void inc () { ++value; }
  public int get () { return value; }
}
...
Map<String,MutableInt> map = new HashMap<String,MutableInt> ();
MutableInt value = map.get (key);
if (value == null) {
  value = new MutableInt ();
  map.put (key, value);
} else {
  value.inc ();
}

of course this implies creating an additional object but the overhead in comparison to creating an Integer (even with Integer.valueOf) should not be so much.


I'd use Apache Collections Lazy Map (to initialize values to 0) and use MutableIntegers from Apache Lang as values in that map.

Biggest cost is having to serach the map twice in your method. In mine you have to do it just once. Just get the value (it will get initialized if absent) and increment it.


I think your solution would be the standard way, but - as you noted yourself - it is probably not the fastest way possible.

You may look at GNU Trove. That is a library which contains all sorts of fast primitive Collections. Your example would use a TObjectIntHashMap which has a method adjustOrPutValue which does exactly what you want.


Instead of calling containsKey() it is faster just to call map.get and check if the returned value is null or not.

    Integer count = map.get(word);
    if(count == null){
        count = 0;
    }
    map.put(word, count + 1);

A little research in 2016: https://github.com/leventov/java-word-count, benchmark source code

Best results per method (smaller is better):

                 time, ms
kolobokeCompile  18.8
koloboke         19.8
trove            20.8
fastutil         22.7
mutableInt       24.3
atomicInteger    25.3
eclipse          26.9
hashMap          28.0
hppc             33.6
hppcRt           36.5

Time\space results:


As a follow-up to my own comment: Trove looks like the way to go. If, for whatever reason, you wanted to stick with the standard JDK, ConcurrentMap and AtomicLong can make the code a tiny bit nicer, though YMMV.

    final ConcurrentMap<String, AtomicLong> map = new ConcurrentHashMap<String, AtomicLong>();
    map.putIfAbsent("foo", new AtomicLong(0));
    map.get("foo").incrementAndGet();

will leave 1 as the value in the map for foo. Realistically, increased friendliness to threading is all that this approach has to recommend it.


Are you sure that this is a bottleneck? Have you done any performance analysis?

Try using the NetBeans profiler (its free and built into NB 6.1) to look at hotspots.

Finally, a JVM upgrade (say from 1.5->1.6) is often a cheap performance booster. Even an upgrade in build number can provide good performance boosts. If you are running on Windows and this is a server class application, use -server on the command line to use the Server Hotspot JVM. On Linux and Solaris machines this is autodetected.


A variation on the MutableInt approach that might be even faster, if a bit of a hack, is to use a single-element int array:

Map<String,int[]> map = new HashMap<String,int[]>();
...
int[] value = map.get(key);
if (value == null) 
  map.put(key, new int[]{1} );
else
  ++value[0];

It would be interesting if you could rerun your performance tests with this variation. It might be the fastest.


Edit: The above pattern worked fine for me, but eventually I changed to use Trove's collections to reduce memory size in some very large maps I was creating -- and as a bonus it was also faster.

One really nice feature is that the TObjectIntHashMap class has a single adjustOrPutValue call that, depending on whether there is already a value at that key, will either put an initial value or increment the existing value. This is perfect for incrementing:

TObjectIntHashMap<String> map = new TObjectIntHashMap<String>();
...
map.adjustOrPutValue(key, 1, 1);

Hope I'm understanding your question correctly, I'm coming to Java from Python so I can empathize with your struggle.

if you have

map.put(key, 1)

you would do

map.put(key, map.get(key) + 1)

Hope this helps!


The simple and easy way in java 8 is the following:

final ConcurrentMap<String, AtomicLong> map = new ConcurrentHashMap<String, AtomicLong>();
    map.computeIfAbsent("foo", key -> new AtomicLong(0)).incrementAndGet();

I suggest to use Java 8 Map::compute(). It considers the case when a key doesn't exist, too.

Map.compute(num, (k, v) -> (v == null) ? 1 : v + 1);

Another way would be creating a mutable integer:

class MutableInt {
  int value = 0;
  public void inc () { ++value; }
  public int get () { return value; }
}
...
Map<String,MutableInt> map = new HashMap<String,MutableInt> ();
MutableInt value = map.get (key);
if (value == null) {
  value = new MutableInt ();
  map.put (key, value);
} else {
  value.inc ();
}

of course this implies creating an additional object but the overhead in comparison to creating an Integer (even with Integer.valueOf) should not be so much.


As a follow-up to my own comment: Trove looks like the way to go. If, for whatever reason, you wanted to stick with the standard JDK, ConcurrentMap and AtomicLong can make the code a tiny bit nicer, though YMMV.

    final ConcurrentMap<String, AtomicLong> map = new ConcurrentHashMap<String, AtomicLong>();
    map.putIfAbsent("foo", new AtomicLong(0));
    map.get("foo").incrementAndGet();

will leave 1 as the value in the map for foo. Realistically, increased friendliness to threading is all that this approach has to recommend it.


Google Collections HashMultiset :
- quite elegant to use
- but consume CPU and memory

Best would be to have a method like : Entry<K,V> getOrPut(K); (elegant, and low cost)

Such a method will compute hash and index only once, and then we could do what we want with the entry (either replace or update the value).

More elegant:
- take a HashSet<Entry>
- extend it so that get(K) put a new Entry if needed
- Entry could be your own object.
--> (new MyHashSet()).get(k).increment();


Another way would be creating a mutable integer:

class MutableInt {
  int value = 0;
  public void inc () { ++value; }
  public int get () { return value; }
}
...
Map<String,MutableInt> map = new HashMap<String,MutableInt> ();
MutableInt value = map.get (key);
if (value == null) {
  value = new MutableInt ();
  map.put (key, value);
} else {
  value.inc ();
}

of course this implies creating an additional object but the overhead in comparison to creating an Integer (even with Integer.valueOf) should not be so much.


Google Guava is your friend...

...at least in some cases. They have this nice AtomicLongMap. Especially nice because you are dealing with long as value in your map.

E.g.

AtomicLongMap<String> map = AtomicLongMap.create();
[...]
map.getAndIncrement(word);

Also possible to add more then 1 to the value:

map.getAndAdd(word, 112L); 

A little research in 2016: https://github.com/leventov/java-word-count, benchmark source code

Best results per method (smaller is better):

                 time, ms
kolobokeCompile  18.8
koloboke         19.8
trove            20.8
fastutil         22.7
mutableInt       24.3
atomicInteger    25.3
eclipse          26.9
hashMap          28.0
hppc             33.6
hppcRt           36.5

Time\space results:


@Vilmantas Baranauskas: Regarding this answer, I would comment if I had the rep points, but I don't. I wanted to note that the Counter class defined there is NOT thread-safe as it is not sufficient to just synchronize inc() without synchronizing value(). Other threads calling value() are not guaranteed to see the the value unless a happens-before relationship has been established with the update.


"put" need "get" (to ensure no duplicate key).
So directly do a "put",
and if there was a previous value, then do an addition:

Map map = new HashMap ();

MutableInt newValue = new MutableInt (1); // default = inc
MutableInt oldValue = map.put (key, newValue);
if (oldValue != null) {
  newValue.add(oldValue); // old + inc
}

If count starts at 0, then add 1: (or any others values...)

Map map = new HashMap ();

MutableInt newValue = new MutableInt (0); // default
MutableInt oldValue = map.put (key, newValue);
if (oldValue != null) {
  newValue.setValue(oldValue + 1); // old + inc
}

Notice : This code is not thread safe. Use it to build then use the map, not to concurrently update it.

Optimization : In a loop, keep old value to become the new value of next loop.

Map map = new HashMap ();
final int defaut = 0;
final int inc = 1;

MutableInt oldValue = new MutableInt (default);
while(true) {
  MutableInt newValue = oldValue;

  oldValue = map.put (key, newValue); // insert or...
  if (oldValue != null) {
    newValue.setValue(oldValue + inc); // ...update

    oldValue.setValue(default); // reuse
  } else
    oldValue = new MutableInt (default); // renew
  }
}

The simple and easy way in java 8 is the following:

final ConcurrentMap<String, AtomicLong> map = new ConcurrentHashMap<String, AtomicLong>();
    map.computeIfAbsent("foo", key -> new AtomicLong(0)).incrementAndGet();

The various primitive wrappers, e.g., Integer are immutable so there's really not a more concise way to do what you're asking unless you can do it with something like AtomicLong. I can give that a go in a minute and update. BTW, Hashtable is a part of the Collections Framework.


As a follow-up to my own comment: Trove looks like the way to go. If, for whatever reason, you wanted to stick with the standard JDK, ConcurrentMap and AtomicLong can make the code a tiny bit nicer, though YMMV.

    final ConcurrentMap<String, AtomicLong> map = new ConcurrentHashMap<String, AtomicLong>();
    map.putIfAbsent("foo", new AtomicLong(0));
    map.get("foo").incrementAndGet();

will leave 1 as the value in the map for foo. Realistically, increased friendliness to threading is all that this approach has to recommend it.


The various primitive wrappers, e.g., Integer are immutable so there's really not a more concise way to do what you're asking unless you can do it with something like AtomicLong. I can give that a go in a minute and update. BTW, Hashtable is a part of the Collections Framework.


Are you sure that this is a bottleneck? Have you done any performance analysis?

Try using the NetBeans profiler (its free and built into NB 6.1) to look at hotspots.

Finally, a JVM upgrade (say from 1.5->1.6) is often a cheap performance booster. Even an upgrade in build number can provide good performance boosts. If you are running on Windows and this is a server class application, use -server on the command line to use the Server Hotspot JVM. On Linux and Solaris machines this is autodetected.


There are a couple of approaches:

  1. Use a Bag alorithm like the sets contained in Google Collections.

  2. Create mutable container which you can use in the Map:


    class My{
        String word;
        int count;
    }

And use put("word", new My("Word") ); Then you can check if it exists and increment when adding.

Avoid rolling your own solution using lists, because if you get innerloop searching and sorting, your performance will stink. The first HashMap solution is actually quite fast, but a proper like that found in Google Collections is probably better.

Counting words using Google Collections, looks something like this:



    HashMultiset s = new HashMultiset();
    s.add("word");
    s.add("word");
    System.out.println(""+s.count("word") );


Using the HashMultiset is quite elegent, because a bag-algorithm is just what you need when counting words.


As a follow-up to my own comment: Trove looks like the way to go. If, for whatever reason, you wanted to stick with the standard JDK, ConcurrentMap and AtomicLong can make the code a tiny bit nicer, though YMMV.

    final ConcurrentMap<String, AtomicLong> map = new ConcurrentHashMap<String, AtomicLong>();
    map.putIfAbsent("foo", new AtomicLong(0));
    map.get("foo").incrementAndGet();

will leave 1 as the value in the map for foo. Realistically, increased friendliness to threading is all that this approach has to recommend it.


@Vilmantas Baranauskas: Regarding this answer, I would comment if I had the rep points, but I don't. I wanted to note that the Counter class defined there is NOT thread-safe as it is not sufficient to just synchronize inc() without synchronizing value(). Other threads calling value() are not guaranteed to see the the value unless a happens-before relationship has been established with the update.


Memory rotation may be an issue here, since every boxing of an int larger than or equal to 128 causes an object allocation (see Integer.valueOf(int)). Although the garbage collector very efficiently deals with short-lived objects, performance will suffer to some degree.

If you know that the number of increments made will largely outnumber the number of keys (=words in this case), consider using an int holder instead. Phax already presented code for this. Here it is again, with two changes (holder class made static and initial value set to 1):

static class MutableInt {
  int value = 1;
  void inc() { ++value; }
  int get() { return value; }
}
...
Map<String,MutableInt> map = new HashMap<String,MutableInt>();
MutableInt value = map.get(key);
if (value == null) {
  value = new MutableInt();
  map.put(key, value);
} else {
  value.inc();
}

If you need extreme performance, look for a Map implementation which is directly tailored towards primitive value types. jrudolph mentioned GNU Trove.

By the way, a good search term for this subject is "histogram".


@Vilmantas Baranauskas: Regarding this answer, I would comment if I had the rep points, but I don't. I wanted to note that the Counter class defined there is NOT thread-safe as it is not sufficient to just synchronize inc() without synchronizing value(). Other threads calling value() are not guaranteed to see the the value unless a happens-before relationship has been established with the update.


You can make use of computeIfAbsent method in Map interface provided in Java 8.

final Map<String,AtomicLong> map = new ConcurrentHashMap<>();
map.computeIfAbsent("A", k->new AtomicLong(0)).incrementAndGet();
map.computeIfAbsent("B", k->new AtomicLong(0)).incrementAndGet();
map.computeIfAbsent("A", k->new AtomicLong(0)).incrementAndGet(); //[A=2, B=1]

The method computeIfAbsent checks if the specified key is already associated with a value or not? If no associated value then it attempts to compute its value using the given mapping function. In any case it returns the current (existing or computed) value associated with the specified key, or null if the computed value is null.

On a side note if you have a situation where multiple threads update a common sum you can have a look at LongAdder class.Under high contention, expected throughput of this class is significantly higher than AtomicLong, at the expense of higher space consumption.


Google Guava is your friend...

...at least in some cases. They have this nice AtomicLongMap. Especially nice because you are dealing with long as value in your map.

E.g.

AtomicLongMap<String> map = AtomicLongMap.create();
[...]
map.getAndIncrement(word);

Also possible to add more then 1 to the value:

map.getAndAdd(word, 112L); 

Now there is a shorter way with Java 8 using Map::merge.

myMap.merge(key, 1, Integer::sum)

What it does:

  • if key do not exists, put 1 as value
  • otherwise sum 1 to the value linked to key

More information here.


Instead of calling containsKey() it is faster just to call map.get and check if the returned value is null or not.

    Integer count = map.get(word);
    if(count == null){
        count = 0;
    }
    map.put(word, count + 1);

I think your solution would be the standard way, but - as you noted yourself - it is probably not the fastest way possible.

You may look at GNU Trove. That is a library which contains all sorts of fast primitive Collections. Your example would use a TObjectIntHashMap which has a method adjustOrPutValue which does exactly what you want.


The simple and easy way in java 8 is the following:

final ConcurrentMap<String, AtomicLong> map = new ConcurrentHashMap<String, AtomicLong>();
    map.computeIfAbsent("foo", key -> new AtomicLong(0)).incrementAndGet();

Now there is a shorter way with Java 8 using Map::merge.

myMap.merge(key, 1, Integer::sum)

What it does:

  • if key do not exists, put 1 as value
  • otherwise sum 1 to the value linked to key

More information here.


I don't know how efficient it is but the below code works as well.You need to define a BiFunction at the beginning. Plus, you can make more than just increment with this method.

public static Map<String, Integer> strInt = new HashMap<String, Integer>();

public static void main(String[] args) {
    BiFunction<Integer, Integer, Integer> bi = (x,y) -> {
        if(x == null)
            return y;
        return x+y;
    };
    strInt.put("abc", 0);


    strInt.merge("abc", 1, bi);
    strInt.merge("abc", 1, bi);
    strInt.merge("abc", 1, bi);
    strInt.merge("abcd", 1, bi);

    System.out.println(strInt.get("abc"));
    System.out.println(strInt.get("abcd"));
}

output is

3
1

Since a lot of people search Java topics for Groovy answers, here's how you can do it in Groovy:

dev map = new HashMap<String, Integer>()
map.put("key1", 3)

map.merge("key1", 1) {a, b -> a + b}
map.merge("key2", 1) {a, b -> a + b}

The various primitive wrappers, e.g., Integer are immutable so there's really not a more concise way to do what you're asking unless you can do it with something like AtomicLong. I can give that a go in a minute and update. BTW, Hashtable is a part of the Collections Framework.


It's always a good idea to look at the Google Collections Library for this kind of thing. In this case a Multiset will do the trick:

Multiset bag = Multisets.newHashMultiset();
String word = "foo";
bag.add(word);
bag.add(word);
System.out.println(bag.count(word)); // Prints 2

There are Map-like methods for iterating over keys/entries, etc. Internally the implementation currently uses a HashMap<E, AtomicInteger>, so you will not incur boxing costs.


I think your solution would be the standard way, but - as you noted yourself - it is probably not the fastest way possible.

You may look at GNU Trove. That is a library which contains all sorts of fast primitive Collections. Your example would use a TObjectIntHashMap which has a method adjustOrPutValue which does exactly what you want.


If you're using Eclipse Collections, you can use a HashBag. It will be the most efficient approach in terms of memory usage and it will also perform well in terms of execution speed.

HashBag is backed by a MutableObjectIntMap which stores primitive ints instead of Counter objects. This reduces memory overhead and improves execution speed.

HashBag provides the API you'd need since it's a Collection that also allows you to query for the number of occurrences of an item.

Here's an example from the Eclipse Collections Kata.

MutableBag<String> bag =
  HashBag.newBagWith("one", "two", "two", "three", "three", "three");

Assert.assertEquals(3, bag.occurrencesOf("three"));

bag.add("one");
Assert.assertEquals(2, bag.occurrencesOf("one"));

bag.addOccurrences("one", 4);
Assert.assertEquals(6, bag.occurrencesOf("one"));

Note: I am a committer for Eclipse Collections.


Google Collections HashMultiset :
- quite elegant to use
- but consume CPU and memory

Best would be to have a method like : Entry<K,V> getOrPut(K); (elegant, and low cost)

Such a method will compute hash and index only once, and then we could do what we want with the entry (either replace or update the value).

More elegant:
- take a HashSet<Entry>
- extend it so that get(K) put a new Entry if needed
- Entry could be your own object.
--> (new MyHashSet()).get(k).increment();


There are a couple of approaches:

  1. Use a Bag alorithm like the sets contained in Google Collections.

  2. Create mutable container which you can use in the Map:


    class My{
        String word;
        int count;
    }

And use put("word", new My("Word") ); Then you can check if it exists and increment when adding.

Avoid rolling your own solution using lists, because if you get innerloop searching and sorting, your performance will stink. The first HashMap solution is actually quite fast, but a proper like that found in Google Collections is probably better.

Counting words using Google Collections, looks something like this:



    HashMultiset s = new HashMultiset();
    s.add("word");
    s.add("word");
    System.out.println(""+s.count("word") );


Using the HashMultiset is quite elegent, because a bag-algorithm is just what you need when counting words.


Are you sure that this is a bottleneck? Have you done any performance analysis?

Try using the NetBeans profiler (its free and built into NB 6.1) to look at hotspots.

Finally, a JVM upgrade (say from 1.5->1.6) is often a cheap performance booster. Even an upgrade in build number can provide good performance boosts. If you are running on Windows and this is a server class application, use -server on the command line to use the Server Hotspot JVM. On Linux and Solaris machines this is autodetected.


It's always a good idea to look at the Google Collections Library for this kind of thing. In this case a Multiset will do the trick:

Multiset bag = Multisets.newHashMultiset();
String word = "foo";
bag.add(word);
bag.add(word);
System.out.println(bag.count(word)); // Prints 2

There are Map-like methods for iterating over keys/entries, etc. Internally the implementation currently uses a HashMap<E, AtomicInteger>, so you will not incur boxing costs.


I'd use Apache Collections Lazy Map (to initialize values to 0) and use MutableIntegers from Apache Lang as values in that map.

Biggest cost is having to serach the map twice in your method. In mine you have to do it just once. Just get the value (it will get initialized if absent) and increment it.


Memory rotation may be an issue here, since every boxing of an int larger than or equal to 128 causes an object allocation (see Integer.valueOf(int)). Although the garbage collector very efficiently deals with short-lived objects, performance will suffer to some degree.

If you know that the number of increments made will largely outnumber the number of keys (=words in this case), consider using an int holder instead. Phax already presented code for this. Here it is again, with two changes (holder class made static and initial value set to 1):

static class MutableInt {
  int value = 1;
  void inc() { ++value; }
  int get() { return value; }
}
...
Map<String,MutableInt> map = new HashMap<String,MutableInt>();
MutableInt value = map.get(key);
if (value == null) {
  value = new MutableInt();
  map.put(key, value);
} else {
  value.inc();
}

If you need extreme performance, look for a Map implementation which is directly tailored towards primitive value types. jrudolph mentioned GNU Trove.

By the way, a good search term for this subject is "histogram".


Are you sure that this is a bottleneck? Have you done any performance analysis?

Try using the NetBeans profiler (its free and built into NB 6.1) to look at hotspots.

Finally, a JVM upgrade (say from 1.5->1.6) is often a cheap performance booster. Even an upgrade in build number can provide good performance boosts. If you are running on Windows and this is a server class application, use -server on the command line to use the Server Hotspot JVM. On Linux and Solaris machines this is autodetected.


Instead of calling containsKey() it is faster just to call map.get and check if the returned value is null or not.

    Integer count = map.get(word);
    if(count == null){
        count = 0;
    }
    map.put(word, count + 1);

The Functional Java library's TreeMap datastructure has an update method in the latest trunk head:

public TreeMap<K, V> update(final K k, final F<V, V> f)

Example usage:

import static fj.data.TreeMap.empty;
import static fj.function.Integers.add;
import static fj.pre.Ord.stringOrd;
import fj.data.TreeMap;

public class TreeMap_Update
  {public static void main(String[] a)
    {TreeMap<String, Integer> map = empty(stringOrd);
     map = map.set("foo", 1);
     map = map.update("foo", add.f(1));
     System.out.println(map.get("foo").some());}}

This program prints "2".


There are a couple of approaches:

  1. Use a Bag alorithm like the sets contained in Google Collections.

  2. Create mutable container which you can use in the Map:


    class My{
        String word;
        int count;
    }

And use put("word", new My("Word") ); Then you can check if it exists and increment when adding.

Avoid rolling your own solution using lists, because if you get innerloop searching and sorting, your performance will stink. The first HashMap solution is actually quite fast, but a proper like that found in Google Collections is probably better.

Counting words using Google Collections, looks something like this:



    HashMultiset s = new HashMultiset();
    s.add("word");
    s.add("word");
    System.out.println(""+s.count("word") );


Using the HashMultiset is quite elegent, because a bag-algorithm is just what you need when counting words.


Hope I'm understanding your question correctly, I'm coming to Java from Python so I can empathize with your struggle.

if you have

map.put(key, 1)

you would do

map.put(key, map.get(key) + 1)

Hope this helps!


You should be aware of the fact that your original attempt

int count = map.containsKey(word) ? map.get(word) : 0;

contains two potentially expensive operations on a map, namely containsKey and get. The former performs an operation potentially pretty similar to the latter, so you're doing the same work twice!

If you look at the API for Map, get operations usually return null when the map does not contain the requested element.

Note that this will make a solution like

map.put( key, map.get(key) + 1 );

dangerous, since it might yield NullPointerExceptions. You should check for a null first.

Also note, and this is very important, that HashMaps can contain nulls by definition. So not every returned null says "there is no such element". In this respect, containsKey behaves differently from get in actually telling you whether there is such an element. Refer to the API for details.

For your case, however, you might not want to distinguish between a stored null and "noSuchElement". If you don't want to permit nulls you might prefer a Hashtable. Using a wrapper library as was already proposed in other answers might be a better solution to manual treatment, depending on the complexity of your application.

To complete the answer (and I forgot to put that in at first, thanks to the edit function!), the best way of doing it natively, is to get into a final variable, check for null and put it back in with a 1. The variable should be final because it's immutable anyway. The compiler might not need this hint, but its clearer that way.

final HashMap map = generateRandomHashMap();
final Object key = fetchSomeKey();
final Integer i = map.get(key);
if (i != null) {
    map.put(i + 1);
} else {
    // do something
}

If you do not want to rely on autoboxing, you should say something like map.put(new Integer(1 + i.getValue())); instead.


Instead of calling containsKey() it is faster just to call map.get and check if the returned value is null or not.

    Integer count = map.get(word);
    if(count == null){
        count = 0;
    }
    map.put(word, count + 1);

Another way would be creating a mutable integer:

class MutableInt {
  int value = 0;
  public void inc () { ++value; }
  public int get () { return value; }
}
...
Map<String,MutableInt> map = new HashMap<String,MutableInt> ();
MutableInt value = map.get (key);
if (value == null) {
  value = new MutableInt ();
  map.put (key, value);
} else {
  value.inc ();
}

of course this implies creating an additional object but the overhead in comparison to creating an Integer (even with Integer.valueOf) should not be so much.


If you're using Eclipse Collections, you can use a HashBag. It will be the most efficient approach in terms of memory usage and it will also perform well in terms of execution speed.

HashBag is backed by a MutableObjectIntMap which stores primitive ints instead of Counter objects. This reduces memory overhead and improves execution speed.

HashBag provides the API you'd need since it's a Collection that also allows you to query for the number of occurrences of an item.

Here's an example from the Eclipse Collections Kata.

MutableBag<String> bag =
  HashBag.newBagWith("one", "two", "two", "three", "three", "three");

Assert.assertEquals(3, bag.occurrencesOf("three"));

bag.add("one");
Assert.assertEquals(2, bag.occurrencesOf("one"));

bag.addOccurrences("one", 4);
Assert.assertEquals(6, bag.occurrencesOf("one"));

Note: I am a committer for Eclipse Collections.


I think your solution would be the standard way, but - as you noted yourself - it is probably not the fastest way possible.

You may look at GNU Trove. That is a library which contains all sorts of fast primitive Collections. Your example would use a TObjectIntHashMap which has a method adjustOrPutValue which does exactly what you want.


Are you sure that this is a bottleneck? Have you done any performance analysis?

Try using the NetBeans profiler (its free and built into NB 6.1) to look at hotspots.

Finally, a JVM upgrade (say from 1.5->1.6) is often a cheap performance booster. Even an upgrade in build number can provide good performance boosts. If you are running on Windows and this is a server class application, use -server on the command line to use the Server Hotspot JVM. On Linux and Solaris machines this is autodetected.


Quite simple, just use the built-in function in Map.java as followed

map.put(key, map.getOrDefault(key, 0) + 1);

It's always a good idea to look at the Google Collections Library for this kind of thing. In this case a Multiset will do the trick:

Multiset bag = Multisets.newHashMultiset();
String word = "foo";
bag.add(word);
bag.add(word);
System.out.println(bag.count(word)); // Prints 2

There are Map-like methods for iterating over keys/entries, etc. Internally the implementation currently uses a HashMap<E, AtomicInteger>, so you will not incur boxing costs.


Since a lot of people search Java topics for Groovy answers, here's how you can do it in Groovy:

dev map = new HashMap<String, Integer>()
map.put("key1", 3)

map.merge("key1", 1) {a, b -> a + b}
map.merge("key2", 1) {a, b -> a + b}

There are a couple of approaches:

  1. Use a Bag alorithm like the sets contained in Google Collections.

  2. Create mutable container which you can use in the Map:


    class My{
        String word;
        int count;
    }

And use put("word", new My("Word") ); Then you can check if it exists and increment when adding.

Avoid rolling your own solution using lists, because if you get innerloop searching and sorting, your performance will stink. The first HashMap solution is actually quite fast, but a proper like that found in Google Collections is probably better.

Counting words using Google Collections, looks something like this:



    HashMultiset s = new HashMultiset();
    s.add("word");
    s.add("word");
    System.out.println(""+s.count("word") );


Using the HashMultiset is quite elegent, because a bag-algorithm is just what you need when counting words.


I'd use Apache Collections Lazy Map (to initialize values to 0) and use MutableIntegers from Apache Lang as values in that map.

Biggest cost is having to serach the map twice in your method. In mine you have to do it just once. Just get the value (it will get initialized if absent) and increment it.


Instead of calling containsKey() it is faster just to call map.get and check if the returned value is null or not.

    Integer count = map.get(word);
    if(count == null){
        count = 0;
    }
    map.put(word, count + 1);

Quite simple, just use the built-in function in Map.java as followed

map.put(key, map.getOrDefault(key, 0) + 1);

If you're using Eclipse Collections, you can use a HashBag. It will be the most efficient approach in terms of memory usage and it will also perform well in terms of execution speed.

HashBag is backed by a MutableObjectIntMap which stores primitive ints instead of Counter objects. This reduces memory overhead and improves execution speed.

HashBag provides the API you'd need since it's a Collection that also allows you to query for the number of occurrences of an item.

Here's an example from the Eclipse Collections Kata.

MutableBag<String> bag =
  HashBag.newBagWith("one", "two", "two", "three", "three", "three");

Assert.assertEquals(3, bag.occurrencesOf("three"));

bag.add("one");
Assert.assertEquals(2, bag.occurrencesOf("one"));

bag.addOccurrences("one", 4);
Assert.assertEquals(6, bag.occurrencesOf("one"));

Note: I am a committer for Eclipse Collections.


Another way would be creating a mutable integer:

class MutableInt {
  int value = 0;
  public void inc () { ++value; }
  public int get () { return value; }
}
...
Map<String,MutableInt> map = new HashMap<String,MutableInt> ();
MutableInt value = map.get (key);
if (value == null) {
  value = new MutableInt ();
  map.put (key, value);
} else {
  value.inc ();
}

of course this implies creating an additional object but the overhead in comparison to creating an Integer (even with Integer.valueOf) should not be so much.


A little research in 2016: https://github.com/leventov/java-word-count, benchmark source code

Best results per method (smaller is better):

                 time, ms
kolobokeCompile  18.8
koloboke         19.8
trove            20.8
fastutil         22.7
mutableInt       24.3
atomicInteger    25.3
eclipse          26.9
hashMap          28.0
hppc             33.6
hppcRt           36.5

Time\space results:


Google Guava is your friend...

...at least in some cases. They have this nice AtomicLongMap. Especially nice because you are dealing with long as value in your map.

E.g.

AtomicLongMap<String> map = AtomicLongMap.create();
[...]
map.getAndIncrement(word);

Also possible to add more then 1 to the value:

map.getAndAdd(word, 112L); 

A variation on the MutableInt approach that might be even faster, if a bit of a hack, is to use a single-element int array:

Map<String,int[]> map = new HashMap<String,int[]>();
...
int[] value = map.get(key);
if (value == null) 
  map.put(key, new int[]{1} );
else
  ++value[0];

It would be interesting if you could rerun your performance tests with this variation. It might be the fastest.


Edit: The above pattern worked fine for me, but eventually I changed to use Trove's collections to reduce memory size in some very large maps I was creating -- and as a bonus it was also faster.

One really nice feature is that the TObjectIntHashMap class has a single adjustOrPutValue call that, depending on whether there is already a value at that key, will either put an initial value or increment the existing value. This is perfect for incrementing:

TObjectIntHashMap<String> map = new TObjectIntHashMap<String>();
...
map.adjustOrPutValue(key, 1, 1);

You should be aware of the fact that your original attempt

int count = map.containsKey(word) ? map.get(word) : 0;

contains two potentially expensive operations on a map, namely containsKey and get. The former performs an operation potentially pretty similar to the latter, so you're doing the same work twice!

If you look at the API for Map, get operations usually return null when the map does not contain the requested element.

Note that this will make a solution like

map.put( key, map.get(key) + 1 );

dangerous, since it might yield NullPointerExceptions. You should check for a null first.

Also note, and this is very important, that HashMaps can contain nulls by definition. So not every returned null says "there is no such element". In this respect, containsKey behaves differently from get in actually telling you whether there is such an element. Refer to the API for details.

For your case, however, you might not want to distinguish between a stored null and "noSuchElement". If you don't want to permit nulls you might prefer a Hashtable. Using a wrapper library as was already proposed in other answers might be a better solution to manual treatment, depending on the complexity of your application.

To complete the answer (and I forgot to put that in at first, thanks to the edit function!), the best way of doing it natively, is to get into a final variable, check for null and put it back in with a 1. The variable should be final because it's immutable anyway. The compiler might not need this hint, but its clearer that way.

final HashMap map = generateRandomHashMap();
final Object key = fetchSomeKey();
final Integer i = map.get(key);
if (i != null) {
    map.put(i + 1);
} else {
    // do something
}

If you do not want to rely on autoboxing, you should say something like map.put(new Integer(1 + i.getValue())); instead.


It's always a good idea to look at the Google Collections Library for this kind of thing. In this case a Multiset will do the trick:

Multiset bag = Multisets.newHashMultiset();
String word = "foo";
bag.add(word);
bag.add(word);
System.out.println(bag.count(word)); // Prints 2

There are Map-like methods for iterating over keys/entries, etc. Internally the implementation currently uses a HashMap<E, AtomicInteger>, so you will not incur boxing costs.


Google Collections HashMultiset :
- quite elegant to use
- but consume CPU and memory

Best would be to have a method like : Entry<K,V> getOrPut(K); (elegant, and low cost)

Such a method will compute hash and index only once, and then we could do what we want with the entry (either replace or update the value).

More elegant:
- take a HashSet<Entry>
- extend it so that get(K) put a new Entry if needed
- Entry could be your own object.
--> (new MyHashSet()).get(k).increment();


@Vilmantas Baranauskas: Regarding this answer, I would comment if I had the rep points, but I don't. I wanted to note that the Counter class defined there is NOT thread-safe as it is not sufficient to just synchronize inc() without synchronizing value(). Other threads calling value() are not guaranteed to see the the value unless a happens-before relationship has been established with the update.


The simple and easy way in java 8 is the following:

final ConcurrentMap<String, AtomicLong> map = new ConcurrentHashMap<String, AtomicLong>();
    map.computeIfAbsent("foo", key -> new AtomicLong(0)).incrementAndGet();

You should be aware of the fact that your original attempt

int count = map.containsKey(word) ? map.get(word) : 0;

contains two potentially expensive operations on a map, namely containsKey and get. The former performs an operation potentially pretty similar to the latter, so you're doing the same work twice!

If you look at the API for Map, get operations usually return null when the map does not contain the requested element.

Note that this will make a solution like

map.put( key, map.get(key) + 1 );

dangerous, since it might yield NullPointerExceptions. You should check for a null first.

Also note, and this is very important, that HashMaps can contain nulls by definition. So not every returned null says "there is no such element". In this respect, containsKey behaves differently from get in actually telling you whether there is such an element. Refer to the API for details.

For your case, however, you might not want to distinguish between a stored null and "noSuchElement". If you don't want to permit nulls you might prefer a Hashtable. Using a wrapper library as was already proposed in other answers might be a better solution to manual treatment, depending on the complexity of your application.

To complete the answer (and I forgot to put that in at first, thanks to the edit function!), the best way of doing it natively, is to get into a final variable, check for null and put it back in with a 1. The variable should be final because it's immutable anyway. The compiler might not need this hint, but its clearer that way.

final HashMap map = generateRandomHashMap();
final Object key = fetchSomeKey();
final Integer i = map.get(key);
if (i != null) {
    map.put(i + 1);
} else {
    // do something
}

If you do not want to rely on autoboxing, you should say something like map.put(new Integer(1 + i.getValue())); instead.


I suggest to use Java 8 Map::compute(). It considers the case when a key doesn't exist, too.

Map.compute(num, (k, v) -> (v == null) ? 1 : v + 1);

The Functional Java library's TreeMap datastructure has an update method in the latest trunk head:

public TreeMap<K, V> update(final K k, final F<V, V> f)

Example usage:

import static fj.data.TreeMap.empty;
import static fj.function.Integers.add;
import static fj.pre.Ord.stringOrd;
import fj.data.TreeMap;

public class TreeMap_Update
  {public static void main(String[] a)
    {TreeMap<String, Integer> map = empty(stringOrd);
     map = map.set("foo", 1);
     map = map.update("foo", add.f(1));
     System.out.println(map.get("foo").some());}}

This program prints "2".


As a follow-up to my own comment: Trove looks like the way to go. If, for whatever reason, you wanted to stick with the standard JDK, ConcurrentMap and AtomicLong can make the code a tiny bit nicer, though YMMV.

    final ConcurrentMap<String, AtomicLong> map = new ConcurrentHashMap<String, AtomicLong>();
    map.putIfAbsent("foo", new AtomicLong(0));
    map.get("foo").incrementAndGet();

will leave 1 as the value in the map for foo. Realistically, increased friendliness to threading is all that this approach has to recommend it.


Instead of calling containsKey() it is faster just to call map.get and check if the returned value is null or not.

    Integer count = map.get(word);
    if(count == null){
        count = 0;
    }
    map.put(word, count + 1);

Google Collections HashMultiset :
- quite elegant to use
- but consume CPU and memory

Best would be to have a method like : Entry<K,V> getOrPut(K); (elegant, and low cost)

Such a method will compute hash and index only once, and then we could do what we want with the entry (either replace or update the value).

More elegant:
- take a HashSet<Entry>
- extend it so that get(K) put a new Entry if needed
- Entry could be your own object.
--> (new MyHashSet()).get(k).increment();


I'd use Apache Collections Lazy Map (to initialize values to 0) and use MutableIntegers from Apache Lang as values in that map.

Biggest cost is having to serach the map twice in your method. In mine you have to do it just once. Just get the value (it will get initialized if absent) and increment it.


Since a lot of people search Java topics for Groovy answers, here's how you can do it in Groovy:

dev map = new HashMap<String, Integer>()
map.put("key1", 3)

map.merge("key1", 1) {a, b -> a + b}
map.merge("key2", 1) {a, b -> a + b}

Hope I'm understanding your question correctly, I'm coming to Java from Python so I can empathize with your struggle.

if you have

map.put(key, 1)

you would do

map.put(key, map.get(key) + 1)

Hope this helps!


You should be aware of the fact that your original attempt

int count = map.containsKey(word) ? map.get(word) : 0;

contains two potentially expensive operations on a map, namely containsKey and get. The former performs an operation potentially pretty similar to the latter, so you're doing the same work twice!

If you look at the API for Map, get operations usually return null when the map does not contain the requested element.

Note that this will make a solution like

map.put( key, map.get(key) + 1 );

dangerous, since it might yield NullPointerExceptions. You should check for a null first.

Also note, and this is very important, that HashMaps can contain nulls by definition. So not every returned null says "there is no such element". In this respect, containsKey behaves differently from get in actually telling you whether there is such an element. Refer to the API for details.

For your case, however, you might not want to distinguish between a stored null and "noSuchElement". If you don't want to permit nulls you might prefer a Hashtable. Using a wrapper library as was already proposed in other answers might be a better solution to manual treatment, depending on the complexity of your application.

To complete the answer (and I forgot to put that in at first, thanks to the edit function!), the best way of doing it natively, is to get into a final variable, check for null and put it back in with a 1. The variable should be final because it's immutable anyway. The compiler might not need this hint, but its clearer that way.

final HashMap map = generateRandomHashMap();
final Object key = fetchSomeKey();
final Integer i = map.get(key);
if (i != null) {
    map.put(i + 1);
} else {
    // do something
}

If you do not want to rely on autoboxing, you should say something like map.put(new Integer(1 + i.getValue())); instead.


Another way would be creating a mutable integer:

class MutableInt {
  int value = 0;
  public void inc () { ++value; }
  public int get () { return value; }
}
...
Map<String,MutableInt> map = new HashMap<String,MutableInt> ();
MutableInt value = map.get (key);
if (value == null) {
  value = new MutableInt ();
  map.put (key, value);
} else {
  value.inc ();
}

of course this implies creating an additional object but the overhead in comparison to creating an Integer (even with Integer.valueOf) should not be so much.


Since a lot of people search Java topics for Groovy answers, here's how you can do it in Groovy:

dev map = new HashMap<String, Integer>()
map.put("key1", 3)

map.merge("key1", 1) {a, b -> a + b}
map.merge("key2", 1) {a, b -> a + b}

Are you sure that this is a bottleneck? Have you done any performance analysis?

Try using the NetBeans profiler (its free and built into NB 6.1) to look at hotspots.

Finally, a JVM upgrade (say from 1.5->1.6) is often a cheap performance booster. Even an upgrade in build number can provide good performance boosts. If you are running on Windows and this is a server class application, use -server on the command line to use the Server Hotspot JVM. On Linux and Solaris machines this is autodetected.


Another way would be creating a mutable integer:

class MutableInt {
  int value = 0;
  public void inc () { ++value; }
  public int get () { return value; }
}
...
Map<String,MutableInt> map = new HashMap<String,MutableInt> ();
MutableInt value = map.get (key);
if (value == null) {
  value = new MutableInt ();
  map.put (key, value);
} else {
  value.inc ();
}

of course this implies creating an additional object but the overhead in comparison to creating an Integer (even with Integer.valueOf) should not be so much.


There are a couple of approaches:

  1. Use a Bag alorithm like the sets contained in Google Collections.

  2. Create mutable container which you can use in the Map:


    class My{
        String word;
        int count;
    }

And use put("word", new My("Word") ); Then you can check if it exists and increment when adding.

Avoid rolling your own solution using lists, because if you get innerloop searching and sorting, your performance will stink. The first HashMap solution is actually quite fast, but a proper like that found in Google Collections is probably better.

Counting words using Google Collections, looks something like this:



    HashMultiset s = new HashMultiset();
    s.add("word");
    s.add("word");
    System.out.println(""+s.count("word") );


Using the HashMultiset is quite elegent, because a bag-algorithm is just what you need when counting words.


Another way would be creating a mutable integer:

class MutableInt {
  int value = 0;
  public void inc () { ++value; }
  public int get () { return value; }
}
...
Map<String,MutableInt> map = new HashMap<String,MutableInt> ();
MutableInt value = map.get (key);
if (value == null) {
  value = new MutableInt ();
  map.put (key, value);
} else {
  value.inc ();
}

of course this implies creating an additional object but the overhead in comparison to creating an Integer (even with Integer.valueOf) should not be so much.


Now there is a shorter way with Java 8 using Map::merge.

myMap.merge(key, 1, Integer::sum)

What it does:

  • if key do not exists, put 1 as value
  • otherwise sum 1 to the value linked to key

More information here.


You should be aware of the fact that your original attempt

int count = map.containsKey(word) ? map.get(word) : 0;

contains two potentially expensive operations on a map, namely containsKey and get. The former performs an operation potentially pretty similar to the latter, so you're doing the same work twice!

If you look at the API for Map, get operations usually return null when the map does not contain the requested element.

Note that this will make a solution like

map.put( key, map.get(key) + 1 );

dangerous, since it might yield NullPointerExceptions. You should check for a null first.

Also note, and this is very important, that HashMaps can contain nulls by definition. So not every returned null says "there is no such element". In this respect, containsKey behaves differently from get in actually telling you whether there is such an element. Refer to the API for details.

For your case, however, you might not want to distinguish between a stored null and "noSuchElement". If you don't want to permit nulls you might prefer a Hashtable. Using a wrapper library as was already proposed in other answers might be a better solution to manual treatment, depending on the complexity of your application.

To complete the answer (and I forgot to put that in at first, thanks to the edit function!), the best way of doing it natively, is to get into a final variable, check for null and put it back in with a 1. The variable should be final because it's immutable anyway. The compiler might not need this hint, but its clearer that way.

final HashMap map = generateRandomHashMap();
final Object key = fetchSomeKey();
final Integer i = map.get(key);
if (i != null) {
    map.put(i + 1);
} else {
    // do something
}

If you do not want to rely on autoboxing, you should say something like map.put(new Integer(1 + i.getValue())); instead.


The Functional Java library's TreeMap datastructure has an update method in the latest trunk head:

public TreeMap<K, V> update(final K k, final F<V, V> f)

Example usage:

import static fj.data.TreeMap.empty;
import static fj.function.Integers.add;
import static fj.pre.Ord.stringOrd;
import fj.data.TreeMap;

public class TreeMap_Update
  {public static void main(String[] a)
    {TreeMap<String, Integer> map = empty(stringOrd);
     map = map.set("foo", 1);
     map = map.update("foo", add.f(1));
     System.out.println(map.get("foo").some());}}

This program prints "2".


The various primitive wrappers, e.g., Integer are immutable so there's really not a more concise way to do what you're asking unless you can do it with something like AtomicLong. I can give that a go in a minute and update. BTW, Hashtable is a part of the Collections Framework.


There are a couple of approaches:

  1. Use a Bag alorithm like the sets contained in Google Collections.

  2. Create mutable container which you can use in the Map:


    class My{
        String word;
        int count;
    }

And use put("word", new My("Word") ); Then you can check if it exists and increment when adding.

Avoid rolling your own solution using lists, because if you get innerloop searching and sorting, your performance will stink. The first HashMap solution is actually quite fast, but a proper like that found in Google Collections is probably better.

Counting words using Google Collections, looks something like this:



    HashMultiset s = new HashMultiset();
    s.add("word");
    s.add("word");
    System.out.println(""+s.count("word") );


Using the HashMultiset is quite elegent, because a bag-algorithm is just what you need when counting words.


A variation on the MutableInt approach that might be even faster, if a bit of a hack, is to use a single-element int array:

Map<String,int[]> map = new HashMap<String,int[]>();
...
int[] value = map.get(key);
if (value == null) 
  map.put(key, new int[]{1} );
else
  ++value[0];

It would be interesting if you could rerun your performance tests with this variation. It might be the fastest.


Edit: The above pattern worked fine for me, but eventually I changed to use Trove's collections to reduce memory size in some very large maps I was creating -- and as a bonus it was also faster.

One really nice feature is that the TObjectIntHashMap class has a single adjustOrPutValue call that, depending on whether there is already a value at that key, will either put an initial value or increment the existing value. This is perfect for incrementing:

TObjectIntHashMap<String> map = new TObjectIntHashMap<String>();
...
map.adjustOrPutValue(key, 1, 1);

Map<String, Integer> map = new HashMap<>();
String key = "a random key";
int count = map.getOrDefault(key, 0); // ensure count will be one of 0,1,2,3,...
map.put(key, count + 1);

And that's how you increment a value with simple code.

Benefit:

  • No need to add a new class or use another concept of mutable int
  • Not relying on any library
  • Easy to understand what's going on exactly (Not too much abstraction)

Downside:

  • The hash map will be searched twice for get() and put(). So it will not be the most performant code.

Theoretically, once you call get(), you already know where to put(), so you should not have to search again. But searching in hash map usually takes a very minimal time that you can kind of ignore this performance issue.

But if you are very serious about the issue, you are a perfectionist, another way is to use merge method, this is (probably) more efficient than the previous code snippet as you will be (theoretically) searching the map only once: (though this code is not obvious from first sight, it's short and performant)

map.merge(key, 1, (a,b) -> a+b);

Suggestion: you should care about code readability more than little performance gain in most of the time. If the first code snippet is easier for you to understand then use it. But if you are able to understand the 2nd one fine then you can also go for it!


Hope I'm understanding your question correctly, I'm coming to Java from Python so I can empathize with your struggle.

if you have

map.put(key, 1)

you would do

map.put(key, map.get(key) + 1)

Hope this helps!


The various primitive wrappers, e.g., Integer are immutable so there's really not a more concise way to do what you're asking unless you can do it with something like AtomicLong. I can give that a go in a minute and update. BTW, Hashtable is a part of the Collections Framework.


I think your solution would be the standard way, but - as you noted yourself - it is probably not the fastest way possible.

You may look at GNU Trove. That is a library which contains all sorts of fast primitive Collections. Your example would use a TObjectIntHashMap which has a method adjustOrPutValue which does exactly what you want.


The various primitive wrappers, e.g., Integer are immutable so there's really not a more concise way to do what you're asking unless you can do it with something like AtomicLong. I can give that a go in a minute and update. BTW, Hashtable is a part of the Collections Framework.


"put" need "get" (to ensure no duplicate key).
So directly do a "put",
and if there was a previous value, then do an addition:

Map map = new HashMap ();

MutableInt newValue = new MutableInt (1); // default = inc
MutableInt oldValue = map.put (key, newValue);
if (oldValue != null) {
  newValue.add(oldValue); // old + inc
}

If count starts at 0, then add 1: (or any others values...)

Map map = new HashMap ();

MutableInt newValue = new MutableInt (0); // default
MutableInt oldValue = map.put (key, newValue);
if (oldValue != null) {
  newValue.setValue(oldValue + 1); // old + inc
}

Notice : This code is not thread safe. Use it to build then use the map, not to concurrently update it.

Optimization : In a loop, keep old value to become the new value of next loop.

Map map = new HashMap ();
final int defaut = 0;
final int inc = 1;

MutableInt oldValue = new MutableInt (default);
while(true) {
  MutableInt newValue = oldValue;

  oldValue = map.put (key, newValue); // insert or...
  if (oldValue != null) {
    newValue.setValue(oldValue + inc); // ...update

    oldValue.setValue(default); // reuse
  } else
    oldValue = new MutableInt (default); // renew
  }
}

I suggest to use Java 8 Map::compute(). It considers the case when a key doesn't exist, too.

Map.compute(num, (k, v) -> (v == null) ? 1 : v + 1);

@Vilmantas Baranauskas: Regarding this answer, I would comment if I had the rep points, but I don't. I wanted to note that the Counter class defined there is NOT thread-safe as it is not sufficient to just synchronize inc() without synchronizing value(). Other threads calling value() are not guaranteed to see the the value unless a happens-before relationship has been established with the update.


A little research in 2016: https://github.com/leventov/java-word-count, benchmark source code

Best results per method (smaller is better):

                 time, ms
kolobokeCompile  18.8
koloboke         19.8
trove            20.8
fastutil         22.7
mutableInt       24.3
atomicInteger    25.3
eclipse          26.9
hashMap          28.0
hppc             33.6
hppcRt           36.5

Time\space results:


The various primitive wrappers, e.g., Integer are immutable so there's really not a more concise way to do what you're asking unless you can do it with something like AtomicLong. I can give that a go in a minute and update. BTW, Hashtable is a part of the Collections Framework.


@Vilmantas Baranauskas: Regarding this answer, I would comment if I had the rep points, but I don't. I wanted to note that the Counter class defined there is NOT thread-safe as it is not sufficient to just synchronize inc() without synchronizing value(). Other threads calling value() are not guaranteed to see the the value unless a happens-before relationship has been established with the update.


Instead of calling containsKey() it is faster just to call map.get and check if the returned value is null or not.

    Integer count = map.get(word);
    if(count == null){
        count = 0;
    }
    map.put(word, count + 1);

As a follow-up to my own comment: Trove looks like the way to go. If, for whatever reason, you wanted to stick with the standard JDK, ConcurrentMap and AtomicLong can make the code a tiny bit nicer, though YMMV.

    final ConcurrentMap<String, AtomicLong> map = new ConcurrentHashMap<String, AtomicLong>();
    map.putIfAbsent("foo", new AtomicLong(0));
    map.get("foo").incrementAndGet();

will leave 1 as the value in the map for foo. Realistically, increased friendliness to threading is all that this approach has to recommend it.


Memory rotation may be an issue here, since every boxing of an int larger than or equal to 128 causes an object allocation (see Integer.valueOf(int)). Although the garbage collector very efficiently deals with short-lived objects, performance will suffer to some degree.

If you know that the number of increments made will largely outnumber the number of keys (=words in this case), consider using an int holder instead. Phax already presented code for this. Here it is again, with two changes (holder class made static and initial value set to 1):

static class MutableInt {
  int value = 1;
  void inc() { ++value; }
  int get() { return value; }
}
...
Map<String,MutableInt> map = new HashMap<String,MutableInt>();
MutableInt value = map.get(key);
if (value == null) {
  value = new MutableInt();
  map.put(key, value);
} else {
  value.inc();
}

If you need extreme performance, look for a Map implementation which is directly tailored towards primitive value types. jrudolph mentioned GNU Trove.

By the way, a good search term for this subject is "histogram".


You can make use of computeIfAbsent method in Map interface provided in Java 8.

final Map<String,AtomicLong> map = new ConcurrentHashMap<>();
map.computeIfAbsent("A", k->new AtomicLong(0)).incrementAndGet();
map.computeIfAbsent("B", k->new AtomicLong(0)).incrementAndGet();
map.computeIfAbsent("A", k->new AtomicLong(0)).incrementAndGet(); //[A=2, B=1]

The method computeIfAbsent checks if the specified key is already associated with a value or not? If no associated value then it attempts to compute its value using the given mapping function. In any case it returns the current (existing or computed) value associated with the specified key, or null if the computed value is null.

On a side note if you have a situation where multiple threads update a common sum you can have a look at LongAdder class.Under high contention, expected throughput of this class is significantly higher than AtomicLong, at the expense of higher space consumption.


I'd use Apache Collections Lazy Map (to initialize values to 0) and use MutableIntegers from Apache Lang as values in that map.

Biggest cost is having to serach the map twice in your method. In mine you have to do it just once. Just get the value (it will get initialized if absent) and increment it.


Map<String, Integer> map = new HashMap<>();
String key = "a random key";
int count = map.getOrDefault(key, 0); // ensure count will be one of 0,1,2,3,...
map.put(key, count + 1);

And that's how you increment a value with simple code.

Benefit:

  • No need to add a new class or use another concept of mutable int
  • Not relying on any library
  • Easy to understand what's going on exactly (Not too much abstraction)

Downside:

  • The hash map will be searched twice for get() and put(). So it will not be the most performant code.

Theoretically, once you call get(), you already know where to put(), so you should not have to search again. But searching in hash map usually takes a very minimal time that you can kind of ignore this performance issue.

But if you are very serious about the issue, you are a perfectionist, another way is to use merge method, this is (probably) more efficient than the previous code snippet as you will be (theoretically) searching the map only once: (though this code is not obvious from first sight, it's short and performant)

map.merge(key, 1, (a,b) -> a+b);

Suggestion: you should care about code readability more than little performance gain in most of the time. If the first code snippet is easier for you to understand then use it. But if you are able to understand the 2nd one fine then you can also go for it!


It's always a good idea to look at the Google Collections Library for this kind of thing. In this case a Multiset will do the trick:

Multiset bag = Multisets.newHashMultiset();
String word = "foo";
bag.add(word);
bag.add(word);
System.out.println(bag.count(word)); // Prints 2

There are Map-like methods for iterating over keys/entries, etc. Internally the implementation currently uses a HashMap<E, AtomicInteger>, so you will not incur boxing costs.


Memory rotation may be an issue here, since every boxing of an int larger than or equal to 128 causes an object allocation (see Integer.valueOf(int)). Although the garbage collector very efficiently deals with short-lived objects, performance will suffer to some degree.

If you know that the number of increments made will largely outnumber the number of keys (=words in this case), consider using an int holder instead. Phax already presented code for this. Here it is again, with two changes (holder class made static and initial value set to 1):

static class MutableInt {
  int value = 1;
  void inc() { ++value; }
  int get() { return value; }
}
...
Map<String,MutableInt> map = new HashMap<String,MutableInt>();
MutableInt value = map.get(key);
if (value == null) {
  value = new MutableInt();
  map.put(key, value);
} else {
  value.inc();
}

If you need extreme performance, look for a Map implementation which is directly tailored towards primitive value types. jrudolph mentioned GNU Trove.

By the way, a good search term for this subject is "histogram".


I don't know how efficient it is but the below code works as well.You need to define a BiFunction at the beginning. Plus, you can make more than just increment with this method.

public static Map<String, Integer> strInt = new HashMap<String, Integer>();

public static void main(String[] args) {
    BiFunction<Integer, Integer, Integer> bi = (x,y) -> {
        if(x == null)
            return y;
        return x+y;
    };
    strInt.put("abc", 0);


    strInt.merge("abc", 1, bi);
    strInt.merge("abc", 1, bi);
    strInt.merge("abc", 1, bi);
    strInt.merge("abcd", 1, bi);

    System.out.println(strInt.get("abc"));
    System.out.println(strInt.get("abcd"));
}

output is

3
1

The various primitive wrappers, e.g., Integer are immutable so there's really not a more concise way to do what you're asking unless you can do it with something like AtomicLong. I can give that a go in a minute and update. BTW, Hashtable is a part of the Collections Framework.


I don't know how efficient it is but the below code works as well.You need to define a BiFunction at the beginning. Plus, you can make more than just increment with this method.

public static Map<String, Integer> strInt = new HashMap<String, Integer>();

public static void main(String[] args) {
    BiFunction<Integer, Integer, Integer> bi = (x,y) -> {
        if(x == null)
            return y;
        return x+y;
    };
    strInt.put("abc", 0);


    strInt.merge("abc", 1, bi);
    strInt.merge("abc", 1, bi);
    strInt.merge("abc", 1, bi);
    strInt.merge("abcd", 1, bi);

    System.out.println(strInt.get("abc"));
    System.out.println(strInt.get("abcd"));
}

output is

3
1

Map<String, Integer> map = new HashMap<>();
String key = "a random key";
int count = map.getOrDefault(key, 0); // ensure count will be one of 0,1,2,3,...
map.put(key, count + 1);

And that's how you increment a value with simple code.

Benefit:

  • No need to add a new class or use another concept of mutable int
  • Not relying on any library
  • Easy to understand what's going on exactly (Not too much abstraction)

Downside:

  • The hash map will be searched twice for get() and put(). So it will not be the most performant code.

Theoretically, once you call get(), you already know where to put(), so you should not have to search again. But searching in hash map usually takes a very minimal time that you can kind of ignore this performance issue.

But if you are very serious about the issue, you are a perfectionist, another way is to use merge method, this is (probably) more efficient than the previous code snippet as you will be (theoretically) searching the map only once: (though this code is not obvious from first sight, it's short and performant)

map.merge(key, 1, (a,b) -> a+b);

Suggestion: you should care about code readability more than little performance gain in most of the time. If the first code snippet is easier for you to understand then use it. But if you are able to understand the 2nd one fine then you can also go for it!


Memory rotation may be an issue here, since every boxing of an int larger than or equal to 128 causes an object allocation (see Integer.valueOf(int)). Although the garbage collector very efficiently deals with short-lived objects, performance will suffer to some degree.

If you know that the number of increments made will largely outnumber the number of keys (=words in this case), consider using an int holder instead. Phax already presented code for this. Here it is again, with two changes (holder class made static and initial value set to 1):

static class MutableInt {
  int value = 1;
  void inc() { ++value; }
  int get() { return value; }
}
...
Map<String,MutableInt> map = new HashMap<String,MutableInt>();
MutableInt value = map.get(key);
if (value == null) {
  value = new MutableInt();
  map.put(key, value);
} else {
  value.inc();
}

If you need extreme performance, look for a Map implementation which is directly tailored towards primitive value types. jrudolph mentioned GNU Trove.

By the way, a good search term for this subject is "histogram".


You can make use of computeIfAbsent method in Map interface provided in Java 8.

final Map<String,AtomicLong> map = new ConcurrentHashMap<>();
map.computeIfAbsent("A", k->new AtomicLong(0)).incrementAndGet();
map.computeIfAbsent("B", k->new AtomicLong(0)).incrementAndGet();
map.computeIfAbsent("A", k->new AtomicLong(0)).incrementAndGet(); //[A=2, B=1]

The method computeIfAbsent checks if the specified key is already associated with a value or not? If no associated value then it attempts to compute its value using the given mapping function. In any case it returns the current (existing or computed) value associated with the specified key, or null if the computed value is null.

On a side note if you have a situation where multiple threads update a common sum you can have a look at LongAdder class.Under high contention, expected throughput of this class is significantly higher than AtomicLong, at the expense of higher space consumption.


As a follow-up to my own comment: Trove looks like the way to go. If, for whatever reason, you wanted to stick with the standard JDK, ConcurrentMap and AtomicLong can make the code a tiny bit nicer, though YMMV.

    final ConcurrentMap<String, AtomicLong> map = new ConcurrentHashMap<String, AtomicLong>();
    map.putIfAbsent("foo", new AtomicLong(0));
    map.get("foo").incrementAndGet();

will leave 1 as the value in the map for foo. Realistically, increased friendliness to threading is all that this approach has to recommend it.


Map<String, Integer> map = new HashMap<>();
String key = "a random key";
int count = map.getOrDefault(key, 0); // ensure count will be one of 0,1,2,3,...
map.put(key, count + 1);

And that's how you increment a value with simple code.

Benefit:

  • No need to add a new class or use another concept of mutable int
  • Not relying on any library
  • Easy to understand what's going on exactly (Not too much abstraction)

Downside:

  • The hash map will be searched twice for get() and put(). So it will not be the most performant code.

Theoretically, once you call get(), you already know where to put(), so you should not have to search again. But searching in hash map usually takes a very minimal time that you can kind of ignore this performance issue.

But if you are very serious about the issue, you are a perfectionist, another way is to use merge method, this is (probably) more efficient than the previous code snippet as you will be (theoretically) searching the map only once: (though this code is not obvious from first sight, it's short and performant)

map.merge(key, 1, (a,b) -> a+b);

Suggestion: you should care about code readability more than little performance gain in most of the time. If the first code snippet is easier for you to understand then use it. But if you are able to understand the 2nd one fine then you can also go for it!


It's always a good idea to look at the Google Collections Library for this kind of thing. In this case a Multiset will do the trick:

Multiset bag = Multisets.newHashMultiset();
String word = "foo";
bag.add(word);
bag.add(word);
System.out.println(bag.count(word)); // Prints 2

There are Map-like methods for iterating over keys/entries, etc. Internally the implementation currently uses a HashMap<E, AtomicInteger>, so you will not incur boxing costs.


You should be aware of the fact that your original attempt

int count = map.containsKey(word) ? map.get(word) : 0;

contains two potentially expensive operations on a map, namely containsKey and get. The former performs an operation potentially pretty similar to the latter, so you're doing the same work twice!

If you look at the API for Map, get operations usually return null when the map does not contain the requested element.

Note that this will make a solution like

map.put( key, map.get(key) + 1 );

dangerous, since it might yield NullPointerExceptions. You should check for a null first.

Also note, and this is very important, that HashMaps can contain nulls by definition. So not every returned null says "there is no such element". In this respect, containsKey behaves differently from get in actually telling you whether there is such an element. Refer to the API for details.

For your case, however, you might not want to distinguish between a stored null and "noSuchElement". If you don't want to permit nulls you might prefer a Hashtable. Using a wrapper library as was already proposed in other answers might be a better solution to manual treatment, depending on the complexity of your application.

To complete the answer (and I forgot to put that in at first, thanks to the edit function!), the best way of doing it natively, is to get into a final variable, check for null and put it back in with a 1. The variable should be final because it's immutable anyway. The compiler might not need this hint, but its clearer that way.

final HashMap map = generateRandomHashMap();
final Object key = fetchSomeKey();
final Integer i = map.get(key);
if (i != null) {
    map.put(i + 1);
} else {
    // do something
}

If you do not want to rely on autoboxing, you should say something like map.put(new Integer(1 + i.getValue())); instead.


Quite simple, just use the built-in function in Map.java as followed

map.put(key, map.getOrDefault(key, 0) + 1);

Now there is a shorter way with Java 8 using Map::merge.

myMap.merge(key, 1, Integer::sum)

What it does:

  • if key do not exists, put 1 as value
  • otherwise sum 1 to the value linked to key

More information here.


I'd use Apache Collections Lazy Map (to initialize values to 0) and use MutableIntegers from Apache Lang as values in that map.

Biggest cost is having to serach the map twice in your method. In mine you have to do it just once. Just get the value (it will get initialized if absent) and increment it.


A variation on the MutableInt approach that might be even faster, if a bit of a hack, is to use a single-element int array:

Map<String,int[]> map = new HashMap<String,int[]>();
...
int[] value = map.get(key);
if (value == null) 
  map.put(key, new int[]{1} );
else
  ++value[0];

It would be interesting if you could rerun your performance tests with this variation. It might be the fastest.


Edit: The above pattern worked fine for me, but eventually I changed to use Trove's collections to reduce memory size in some very large maps I was creating -- and as a bonus it was also faster.

One really nice feature is that the TObjectIntHashMap class has a single adjustOrPutValue call that, depending on whether there is already a value at that key, will either put an initial value or increment the existing value. This is perfect for incrementing:

TObjectIntHashMap<String> map = new TObjectIntHashMap<String>();
...
map.adjustOrPutValue(key, 1, 1);

You should be aware of the fact that your original attempt

int count = map.containsKey(word) ? map.get(word) : 0;

contains two potentially expensive operations on a map, namely containsKey and get. The former performs an operation potentially pretty similar to the latter, so you're doing the same work twice!

If you look at the API for Map, get operations usually return null when the map does not contain the requested element.

Note that this will make a solution like

map.put( key, map.get(key) + 1 );

dangerous, since it might yield NullPointerExceptions. You should check for a null first.

Also note, and this is very important, that HashMaps can contain nulls by definition. So not every returned null says "there is no such element". In this respect, containsKey behaves differently from get in actually telling you whether there is such an element. Refer to the API for details.

For your case, however, you might not want to distinguish between a stored null and "noSuchElement". If you don't want to permit nulls you might prefer a Hashtable. Using a wrapper library as was already proposed in other answers might be a better solution to manual treatment, depending on the complexity of your application.

To complete the answer (and I forgot to put that in at first, thanks to the edit function!), the best way of doing it natively, is to get into a final variable, check for null and put it back in with a 1. The variable should be final because it's immutable anyway. The compiler might not need this hint, but its clearer that way.

final HashMap map = generateRandomHashMap();
final Object key = fetchSomeKey();
final Integer i = map.get(key);
if (i != null) {
    map.put(i + 1);
} else {
    // do something
}

If you do not want to rely on autoboxing, you should say something like map.put(new Integer(1 + i.getValue())); instead.


I'd use Apache Collections Lazy Map (to initialize values to 0) and use MutableIntegers from Apache Lang as values in that map.

Biggest cost is having to serach the map twice in your method. In mine you have to do it just once. Just get the value (it will get initialized if absent) and increment it.


Memory rotation may be an issue here, since every boxing of an int larger than or equal to 128 causes an object allocation (see Integer.valueOf(int)). Although the garbage collector very efficiently deals with short-lived objects, performance will suffer to some degree.

If you know that the number of increments made will largely outnumber the number of keys (=words in this case), consider using an int holder instead. Phax already presented code for this. Here it is again, with two changes (holder class made static and initial value set to 1):

static class MutableInt {
  int value = 1;
  void inc() { ++value; }
  int get() { return value; }
}
...
Map<String,MutableInt> map = new HashMap<String,MutableInt>();
MutableInt value = map.get(key);
if (value == null) {
  value = new MutableInt();
  map.put(key, value);
} else {
  value.inc();
}

If you need extreme performance, look for a Map implementation which is directly tailored towards primitive value types. jrudolph mentioned GNU Trove.

By the way, a good search term for this subject is "histogram".


Examples related to java

Under what circumstances can I call findViewById with an Options Menu / Action Bar item? How much should a function trust another function How to implement a simple scenario the OO way Two constructors How do I get some variable from another class in Java? this in equals method How to split a string in two and store it in a field How to do perspective fixing? String index out of range: 4 My eclipse won't open, i download the bundle pack it keeps saying error log

Examples related to optimization

Why does C++ code for testing the Collatz conjecture run faster than hand-written assembly? Measuring execution time of a function in C++ GROUP BY having MAX date How to efficiently remove duplicates from an array without using Set Storing JSON in database vs. having a new column for each key Read file As String How to write a large buffer into a binary file in C++, fast? Is optimisation level -O3 dangerous in g++? Why is processing a sorted array faster than processing an unsorted array? MySQL my.cnf performance tuning recommendations

Examples related to collections

Kotlin's List missing "add", "remove", Map missing "put", etc? How to unset (remove) a collection element after fetching it? How can I get a List from some class properties with Java 8 Stream? Java 8 stream map to list of keys sorted by values How to convert String into Hashmap in java How can I turn a List of Lists into a List in Java 8? MongoDB Show all contents from all collections Get nth character of a string in Swift programming language Java 8 Distinct by property Is there a typescript List<> and/or Map<> class/library?