[java] What is String pool in Java?

I am confused about StringPool in Java. I came across this while reading the String chapter in Java. Please help me understand, in layman terms, what StringPool actually does.

This question is related to java

The answer is


This prints true (even though we don't use equals method: correct way to compare strings)

    String s = "a" + "bc";
    String t = "ab" + "c";
    System.out.println(s == t);

When compiler optimizes your string literals, it sees that both s and t have same value and thus you need only one string object. It's safe because String is immutable in Java.
As result, both s and t point to the same object and some little memory saved.

Name 'string pool' comes from the idea that all already defined string are stored in some 'pool' and before creating new String object compiler checks if such string is already defined.


Let's start with a quote from the virtual machine spec:

Loading of a class or interface that contains a String literal may create a new String object (ยง2.4.8) to represent that literal. This may not occur if the a String object has already been created to represent a previous occurrence of that literal, or if the String.intern method has been invoked on a String object representing the same string as the literal.

This may not occur - This is a hint, that there's something special about String objects. Usually, invoking a constructor will always create a new instance of the class. This is not the case with Strings, especially when String objects are 'created' with literals. Those Strings are stored in a global store (pool) - or at least the references are kept in a pool, and whenever a new instance of an already known Strings is needed, the vm returns a reference to the object from the pool. In pseudo code, it may go like that:

1: a := "one" 
   --> if(pool[hash("one")] == null)  // true
           pool[hash("one") --> "one"]
       return pool[hash("one")]

2: b := "one" 
  --> if(pool[hash("one")] == null)   // false, "one" already in pool
        pool[hash("one") --> "one"]
      return pool[hash("one")] 

So in this case, variables a and b hold references to the same object. IN this case, we have (a == b) && (a.equals(b)) == true.

This is not the case if we use the constructor:

1: a := "one"
2: b := new String("one")

Again, "one" is created on the pool but then we create a new instance from the same literal, and in this case, it leads to (a == b) && (a.equals(b)) == false

So why do we have a String pool? Strings and especially String literals are widely used in typical Java code. And they are immutable. And being immutable allowed to cache String to save memory and increase performance (less effort for creation, less garbage to be collected).

As programmers we don't have to care much about the String pool, as long as we keep in mind:

  • (a == b) && (a.equals(b)) may be true or false (always use equals to compare Strings)
  • Don't use reflection to change the backing char[] of a String (as you don't know who is actualling using that String)

I don't think it actually does much, it looks like it's just a cache for string literals. If you have multiple Strings who's values are the same, they'll all point to the same string literal in the string pool.

String s1 = "Arul"; //case 1 
String s2 = "Arul"; //case 2 

In case 1, literal s1 is created newly and kept in the pool. But in case 2, literal s2 refer the s1, it will not create new one instead.

if(s1 == s2) System.out.println("equal"); //Prints equal. 

String n1 = new String("Arul"); 
String n2 = new String("Arul"); 
if(n1 == n2) System.out.println("equal"); //No output.  

http://p2p.wrox.com/java-espanol/29312-string-pooling.html


When the JVM loads classes, or otherwise sees a literal string, or some code interns a string, it adds the string to a mostly-hidden lookup table that has one copy of each such string. If another copy is added, the runtime arranges it so that all the literals refer to the same string object. This is called "interning". If you say something like

String s = "test";
return (s == "test");

it'll return true, because the first and second "test" are actually the same object. Comparing interned strings this way can be much, much faster than String.equals, as there's a single reference comparison rather than a bunch of char comparisons.

You can add a string to the pool by calling String.intern(), which will give you back the pooled version of the string (which could be the same string you're interning, but you'd be crazy to rely on that -- you often can't be sure exactly what code has been loaded and run up til now and interned the same string). The pooled version (the string returned from intern) will be equal to any identical literal. For example:

String s1 = "test";
String s2 = new String("test");  // "new String" guarantees a different object

System.out.println(s1 == s2);  // should print "false"

s2 = s2.intern();
System.out.println(s1 == s2);  // should print "true"