[java] Fastest way to iterate over all the chars in a String

In Java, what would the fastest way to iterate over all the chars in a String, this:

String str = "a really, really long string";
for (int i = 0, n = str.length(); i < n; i++) {
    char c = str.charAt(i);
}

Or this:

char[] chars = str.toCharArray();
for (int i = 0, n = chars.length; i < n; i++) {
    char c = chars[i];
}

EDIT :

What I'd like to know is if the cost of repeatedly calling the charAt method during a long iteration ends up being either less than or greater than the cost of performing a single call to toCharArray at the beginning and then directly accessing the array during the iteration.

It'd be great if someone could provide a robust benchmark for different string lengths, having in mind JIT warm-up time, JVM start-up time, etc. and not just the difference between two calls to System.currentTimeMillis().

This question is related to java string performance char iteration

The answer is


String.toCharArray() creates new char array, means allocation of memory of string length, then copies original char array of string using System.arraycopy() and then returns this copy to caller. String.charAt() returns character at position i from original copy, that's why String.charAt() will be faster than String.toCharArray(). Although, String.toCharArray() returns copy and not char from original String array, where String.charAt() returns character from original char array. Code below returns value at the specified index of this string.

public char charAt(int index) {
    if ((index < 0) || (index >= value.length)) {
        throw new StringIndexOutOfBoundsException(index);
    }
    return value[index];
}

code below returns a newly allocated character array whose length is the length of this string

public char[] toCharArray() {
    // Cannot use Arrays.copyOf because of class initialization order issues
    char result[] = new char[value.length];
    System.arraycopy(value, 0, result, 0, value.length);
    return result;
}

The second one causes a new char array to be created, and all chars from the String to be copied to this new char array, so I would guess that the first one is faster (and less memory-hungry).


The first one using str.charAt should be faster.

If you dig inside the source code of String class, we can see that charAt is implemented as follows:

public char charAt(int index) {
    if ((index < 0) || (index >= count)) {
        throw new StringIndexOutOfBoundsException(index);
    }
    return value[index + offset];
}

Here, all it does is index an array and return the value.

Now, if we see the implementation of toCharArray, we will find the below:

public char[] toCharArray() {
    char result[] = new char[count];
    getChars(0, count, result, 0);
    return result;
}

public void getChars(int srcBegin, int srcEnd, char dst[], int dstBegin) {
    if (srcBegin < 0) {
        throw new StringIndexOutOfBoundsException(srcBegin);
    }
    if (srcEnd > count) {
        throw new StringIndexOutOfBoundsException(srcEnd);
    }
    if (srcBegin > srcEnd) {
        throw new StringIndexOutOfBoundsException(srcEnd - srcBegin);
    }
    System.arraycopy(value, offset + srcBegin, dst, dstBegin,
         srcEnd - srcBegin);
}

As you see, it is doing a System.arraycopy which is definitely going to be a tad slower than not doing it.


This is just micro-optimisation that you shouldn't worry about.

char[] chars = str.toCharArray();

returns you a copy of str character arrays (in JDK, it returns a copy of characters by calling System.arrayCopy).

Other than that, str.charAt() only checks if the index is indeed in bounds and returns a character within the array index.

The first one doesn't create additional memory in JVM.


Just for curiosity and to compare with Saint Hill's answer.

If you need to process heavy data you should not use JVM in client mode. Client mode is not made for optimizations.

Let's compare results of @Saint Hill benchmarks using a JVM in Client mode and Server mode.

Core2Quad Q6600 G0 @ 2.4GHz
JavaSE 1.7.0_40

See also: Real differences between "java -server" and "java -client"?


CLIENT MODE:

len =      2:    111k charAt(i),  105k cbuff[i],   62k new[i],   17k field access.   (chars/ms) 
len =      4:    285k charAt(i),  166k cbuff[i],  114k new[i],   43k field access.   (chars/ms) 
len =      6:    315k charAt(i),  230k cbuff[i],  162k new[i],   69k field access.   (chars/ms) 
len =      8:    333k charAt(i),  275k cbuff[i],  181k new[i],   85k field access.   (chars/ms) 
len =     12:    342k charAt(i),  342k cbuff[i],  222k new[i],  117k field access.   (chars/ms) 
len =     16:    363k charAt(i),  347k cbuff[i],  275k new[i],  152k field access.   (chars/ms) 
len =     20:    363k charAt(i),  392k cbuff[i],  289k new[i],  180k field access.   (chars/ms) 
len =     24:    375k charAt(i),  428k cbuff[i],  311k new[i],  205k field access.   (chars/ms) 
len =     28:    378k charAt(i),  474k cbuff[i],  341k new[i],  233k field access.   (chars/ms) 
len =     32:    376k charAt(i),  492k cbuff[i],  340k new[i],  251k field access.   (chars/ms) 
len =     64:    374k charAt(i),  551k cbuff[i],  374k new[i],  367k field access.   (chars/ms) 
len =    128:    385k charAt(i),  624k cbuff[i],  415k new[i],  509k field access.   (chars/ms) 
len =    256:    390k charAt(i),  675k cbuff[i],  436k new[i],  619k field access.   (chars/ms) 
len =    512:    394k charAt(i),  703k cbuff[i],  439k new[i],  695k field access.   (chars/ms) 
len =   1024:    395k charAt(i),  718k cbuff[i],  462k new[i],  742k field access.   (chars/ms) 
len =   2048:    396k charAt(i),  725k cbuff[i],  471k new[i],  767k field access.   (chars/ms) 
len =   4096:    396k charAt(i),  727k cbuff[i],  459k new[i],  780k field access.   (chars/ms) 
len =   8192:    397k charAt(i),  712k cbuff[i],  446k new[i],  772k field access.   (chars/ms) 

SERVER MODE:

len =      2:     86k charAt(i),   41k cbuff[i],   46k new[i],   80k field access.   (chars/ms) 
len =      4:    571k charAt(i),  250k cbuff[i],   97k new[i],  222k field access.   (chars/ms) 
len =      6:    666k charAt(i),  333k cbuff[i],  125k new[i],  315k field access.   (chars/ms) 
len =      8:    800k charAt(i),  400k cbuff[i],  181k new[i],  380k field access.   (chars/ms) 
len =     12:    800k charAt(i),  521k cbuff[i],  260k new[i],  545k field access.   (chars/ms) 
len =     16:    800k charAt(i),  592k cbuff[i],  296k new[i],  640k field access.   (chars/ms) 
len =     20:    800k charAt(i),  666k cbuff[i],  408k new[i],  800k field access.   (chars/ms) 
len =     24:    800k charAt(i),  705k cbuff[i],  452k new[i],  800k field access.   (chars/ms) 
len =     28:    777k charAt(i),  736k cbuff[i],  368k new[i],  933k field access.   (chars/ms) 
len =     32:    800k charAt(i),  780k cbuff[i],  571k new[i],  969k field access.   (chars/ms) 
len =     64:    800k charAt(i),  901k cbuff[i],  800k new[i],  1306k field access.   (chars/ms) 
len =    128:    1084k charAt(i),  888k cbuff[i],  633k new[i],  1620k field access.   (chars/ms) 
len =    256:    1122k charAt(i),  966k cbuff[i],  729k new[i],  1790k field access.   (chars/ms) 
len =    512:    1163k charAt(i),  1007k cbuff[i],  676k new[i],  1910k field access.   (chars/ms) 
len =   1024:    1179k charAt(i),  1027k cbuff[i],  698k new[i],  1954k field access.   (chars/ms) 
len =   2048:    1184k charAt(i),  1043k cbuff[i],  732k new[i],  2007k field access.   (chars/ms) 
len =   4096:    1188k charAt(i),  1049k cbuff[i],  742k new[i],  2031k field access.   (chars/ms) 
len =   8192:    1157k charAt(i),  1032k cbuff[i],  723k new[i],  2048k field access.   (chars/ms) 

CONCLUSION:

As you can see, server mode is much faster.


Despite @Saint Hill's answer if you consider the time complexity of str.toCharArray(),

the first one is faster even for very large strings. You can run the code below to see it for yourself.

        char [] ch = new char[1_000_000_00];
    String str = new String(ch); // to create a large string

    // ---> from here
    long currentTime = System.nanoTime();
    for (int i = 0, n = str.length(); i < n; i++) {
        char c = str.charAt(i);
    }
    // ---> to here
    System.out.println("str.charAt(i):"+(System.nanoTime()-currentTime)/1000000.0 +" (ms)");

    /**
     *   ch = str.toCharArray() itself takes lots of time   
     */
    // ---> from here
    currentTime = System.nanoTime();
    ch = str.toCharArray();
    for (int i = 0, n = str.length(); i < n; i++) {
        char c = ch[i];
    }
    // ---> to  here
    System.out.println("ch = str.toCharArray() + c = ch[i] :"+(System.nanoTime()-currentTime)/1000000.0 +" (ms)");

output:

str.charAt(i):5.492102 (ms)
ch = str.toCharArray() + c = ch[i] :79.400064 (ms)

Looks like niether is faster or slower

    public static void main(String arguments[]) {


        //Build a long string
        StringBuilder sb = new StringBuilder();
        for(int j = 0; j < 10000; j++) {
            sb.append("a really, really long string");
        }
        String str = sb.toString();
        for (int testscount = 0; testscount < 10; testscount ++) {


            //Test 1
            long start = System.currentTimeMillis();
            for(int c = 0; c < 10000000; c++) {
                for (int i = 0, n = str.length(); i < n; i++) {
                    char chr = str.charAt(i);
                    doSomethingWithChar(chr);//To trick JIT optimistaion
                }
            }

            System.out.println("1: " + (System.currentTimeMillis() - start));

            //Test 2
            start = System.currentTimeMillis();
            char[] chars = str.toCharArray();
            for(int c = 0; c < 10000000; c++) {
                for (int i = 0, n = chars.length; i < n; i++) {
                    char chr = chars[i];
                    doSomethingWithChar(chr);//To trick JIT optimistaion
                }
            }
            System.out.println("2: " + (System.currentTimeMillis() - start));
            System.out.println();
        }


    }


    public static void doSomethingWithChar(char chr) {
        int newInt = chr << 2;
    }

For long strings I'll chose the first one. Why copy around long strings? Documentations says:

public char[] toCharArray() Converts this string to a new character array.

Returns: a newly allocated character array whose length is the length of this string and whose contents are initialized to contain the character sequence represented by this string.

//Edit 1

I've changed the test to trick JIT optimisation.

//Edit 2

Repeat test 10 times to let JVM warm up.

//Edit 3

Conclusions:

First of all str.toCharArray(); copies entire string in memory. It can be memory consuming for long strings. Method String.charAt( ) looks up char in char array inside String class checking index before. It looks like for short enough Strings first method (i.e. chatAt method) is a bit slower due to this index check. But if the String is long enough, copying whole char array gets slower, and the first method is faster. The longer the string is, the slower toCharArray performs. Try to change limit in for(int j = 0; j < 10000; j++) loop to see it. If we let JVM warm up code runs faster, but proportions are the same.

After all it's just micro-optimisation.


Examples related to java

Under what circumstances can I call findViewById with an Options Menu / Action Bar item? How much should a function trust another function How to implement a simple scenario the OO way Two constructors How do I get some variable from another class in Java? this in equals method How to split a string in two and store it in a field How to do perspective fixing? String index out of range: 4 My eclipse won't open, i download the bundle pack it keeps saying error log

Examples related to string

How to split a string in two and store it in a field String method cannot be found in a main class method Kotlin - How to correctly concatenate a String Replacing a character from a certain index Remove quotes from String in Python Detect whether a Python string is a number or a letter How does String substring work in Swift How does String.Index work in Swift swift 3.0 Data to String? How to parse JSON string in Typescript

Examples related to performance

Why is 2 * (i * i) faster than 2 * i * i in Java? What is the difference between spark.sql.shuffle.partitions and spark.default.parallelism? How to check if a key exists in Json Object and get its value Why does C++ code for testing the Collatz conjecture run faster than hand-written assembly? Most efficient way to map function over numpy array The most efficient way to remove first N elements in a list? Fastest way to get the first n elements of a List into an Array Why is "1000000000000000 in range(1000000000000001)" so fast in Python 3? pandas loc vs. iloc vs. at vs. iat? Android Recyclerview vs ListView with Viewholder

Examples related to char

How can I convert a char to int in Java? C# - How to convert string to char? How to take character input in java Char Comparison in C Convert Char to String in C cannot convert 'std::basic_string<char>' to 'const char*' for argument '1' to 'int system(const char*)' How to get the real and total length of char * (char array)? Why is conversion from string constant to 'char*' valid in C but invalid in C++ char *array and char array[] C++ - How to append a char to char*?

Examples related to iteration

Is there a way in Pandas to use previous row value in dataframe.apply when previous value is also calculated in the apply? How to loop over grouped Pandas dataframe? How to iterate through a list of dictionaries in Jinja template? How to iterate through an ArrayList of Objects of ArrayList of Objects? Ways to iterate over a list in Java Python list iterator behavior and next(iterator) How to loop through an array containing objects and access their properties recursion versus iteration What is the perfect counterpart in Python for "while not EOF" How to iterate over a JavaScript object?