[c#] Most efficient way to concatenate strings?

What's the most efficient way to concatenate strings?

This question is related to c# .net string optimization

The answer is


From Chinh Do - StringBuilder is not always faster:

Rules of Thumb

  • When concatenating three dynamic string values or less, use traditional string concatenation.

  • When concatenating more than three dynamic string values, use StringBuilder.

  • When building a big string from several string literals, use either the @ string literal or the inline + operator.

Most of the time StringBuilder is your best bet, but there are cases as shown in that post that you should at least think about each situation.


The most efficient is to use StringBuilder, like so:

StringBuilder sb = new StringBuilder();
sb.Append("string1");
sb.Append("string2");
...etc...
String strResult = sb.ToString();

@jonezy: String.Concat is fine if you have a couple of small things. But if you're concatenating megabytes of data, your program will likely tank.


If you're operating in a loop, StringBuilder is probably the way to go; it saves you the overhead of creating new strings regularly. In code that'll only run once, though, String.Concat is probably fine.

However, Rico Mariani (.NET optimization guru) made up a quiz in which he stated at the end that, in most cases, he recommends String.Format.


Try this 2 pieces of code and you will find the solution.

 static void Main(string[] args)
    {
        StringBuilder s = new StringBuilder();
        for (int i = 0; i < 10000000; i++)
        {
            s.Append( i.ToString());
        }
        Console.Write("End");
        Console.Read();
    }

Vs

static void Main(string[] args)
    {
        string s = "";
        for (int i = 0; i < 10000000; i++)
        {
            s += i.ToString();
        }
        Console.Write("End");
        Console.Read();
    }

You will find that 1st code will end really quick and the memory will be in a good amount.

The second code maybe the memory will be ok, but it will take longer... much longer. So if you have an application for a lot of users and you need speed, use the 1st. If you have an app for a short term one user app, maybe you can use both or the 2nd will be more "natural" for developers.

Cheers.


There are 6 types of string concatenations:

  1. Using the plus (+) symbol.
  2. Using string.Concat().
  3. Using string.Join().
  4. Using string.Format().
  5. Using string.Append().
  6. Using StringBuilder.

In an experiment, it has been proved that string.Concat() is the best way to approach if the words are less than 1000(approximately) and if the words are more than 1000 then StringBuilder should be used.

For more information, check this site.

string.Join() vs string.Concat()

The string.Concat method here is equivalent to the string.Join method invocation with an empty separator. Appending an empty string is fast, but not doing so is even faster, so the string.Concat method would be superior here.


Another solution:

inside the loop, use List instead of string.

List<string> lst= new List<string>();

for(int i=0; i<100000; i++){
    ...........
    lst.Add(...);
}
return String.Join("", lst.ToArray());;

it is very very fast.


For just two strings, you definitely do not want to use StringBuilder. There is some threshold above which the StringBuilder overhead is less than the overhead of allocating multiple strings.

So, for more that 2-3 strings, use DannySmurf's code. Otherwise, just use the + operator.


Here is the fastest method I've evolved over a decade for my large-scale NLP app. I have variations for IEnumerable<T> and other input types, with and without separators of different types (Char, String), but here I show the simple case of concatenating all strings in an array into a single string, with no separator. Latest version here is developed and unit-tested on C# 7 and .NET 4.7.

There are two keys to higher performance; the first is to pre-compute the exact total size required. This step is trivial when the input is an array as shown here. For handling IEnumerable<T> instead, it is worth first gathering the strings into a temporary array for computing that total (The array is required to avoid calling ToString() more than once per element since technically, given the possibility of side-effects, doing so could change the expected semantics of a 'string join' operation).

Next, given the total allocation size of the final string, the biggest boost in performance is gained by building the result string in-place. Doing this requires the (perhaps controversial) technique of temporarily suspending the immutability of a new String which is initially allocated full of zeros. Any such controversy aside, however...

...note that this is the only bulk-concatenation solution on this page which entirely avoids an extra round of allocation and copying by the String constructor.

Complete code:

/// <summary>
/// Concatenate the strings in 'rg', none of which may be null, into a single String.
/// </summary>
public static unsafe String StringJoin(this String[] rg)
{
    int i;
    if (rg == null || (i = rg.Length) == 0)
        return String.Empty;

    if (i == 1)
        return rg[0];

    String s, t;
    int cch = 0;
    do
        cch += rg[--i].Length;
    while (i > 0);
    if (cch == 0)
        return String.Empty;

    i = rg.Length;
    fixed (Char* _p = (s = new String(default(Char), cch)))
    {
        Char* pDst = _p + cch;
        do
            if ((t = rg[--i]).Length > 0)
                fixed (Char* pSrc = t)
                    memcpy(pDst -= t.Length, pSrc, (UIntPtr)(t.Length << 1));
        while (pDst > _p);
    }
    return s;
}

[DllImport("MSVCR120_CLR0400", CallingConvention = CallingConvention.Cdecl)]
static extern unsafe void* memcpy(void* dest, void* src, UIntPtr cb);

I should mention that this code has a slight modification from what I use myself. In the original, I call the cpblk IL instruction from C# to do the actual copying. For simplicity and portability in the code here, I replaced that with P/Invoke memcpy instead, as you can see. For highest performance on x64 (but maybe not x86) you may want to use the cpblk method instead.


System.String is immutable. When we modify the value of a string variable then a new memory is allocated to the new value and the previous memory allocation released. System.StringBuilder was designed to have concept of a mutable string where a variety of operations can be performed without allocation separate memory location for the modified string.


Adding to the other answers, please keep in mind that StringBuilder can be told an initial amount of memory to allocate.

The capacity parameter defines the maximum number of characters that can be stored in the memory allocated by the current instance. Its value is assigned to the Capacity property. If the number of characters to be stored in the current instance exceeds this capacity value, the StringBuilder object allocates additional memory to store them.

If capacity is zero, the implementation-specific default capacity is used.

Repeatedly appending to a StringBuilder that hasn't been pre-allocated can result in a lot of unnecessary allocations just like repeatedly concatenating regular strings.

If you know how long the final string will be, can trivially calculate it, or can make an educated guess about the common case (allocating too much isn't necessarily a bad thing), you should be providing this information to the constructor or the Capacity property. Especially when running performance tests to compare StringBuilder with other methods like String.Concat, which do the same thing internally. Any test you see online which doesn't include StringBuilder pre-allocation in its comparisons is wrong.

If you can't make any kind of guess about the size, you're probably writing a utility function which should have its own optional argument for controlling pre-allocation.


Following may be one more alternate solution to concatenate multiple strings.

String str1 = "sometext";
string str2 = "some other text";

string afterConcate = $"{str1}{str2}";

string interpolation


It's also important to point it out that you should use the + operator if you are concatenating string literals.

When you concatenate string literals or string constants by using the + operator, the compiler creates a single string. No run time concatenation occurs.

How to: Concatenate Multiple Strings (C# Programming Guide)


From this MSDN article:

There is some overhead associated with creating a StringBuilder object, both in time and memory. On a machine with fast memory, a StringBuilder becomes worthwhile if you're doing about five operations. As a rule of thumb, I would say 10 or more string operations is a justification for the overhead on any machine, even a slower one.

So if you trust MSDN go with StringBuilder if you have to do more than 10 strings operations/concatenations - otherwise simple string concat with '+' is fine.


It would depend on the code. StringBuilder is more efficient generally, but if you're only concatenating a few strings and doing it all in one line, code optimizations will likely take care of it for you. It's important to think about how the code looks too: for larger sets StringBuilder will make it easier to read, for small ones StringBuilder will just add needless clutter.


It really depends on your usage pattern. A detailed benchmark between string.Join, string,Concat and string.Format can be found here: String.Format Isn't Suitable for Intensive Logging

(This is actually the same answer I gave to this question)


Rico Mariani, the .NET Performance guru, had an article on this very subject. It's not as simple as one might suspect. The basic advice is this:

If your pattern looks like:

x = f1(...) + f2(...) + f3(...) + f4(...)

that's one concat and it's zippy, StringBuilder probably won't help.

If your pattern looks like:

if (...) x += f1(...)
if (...) x += f2(...)
if (...) x += f3(...)
if (...) x += f4(...)

then you probably want StringBuilder.

Yet another article to support this claim comes from Eric Lippert where he describes the optimizations performed on one line + concatenations in a detailed manner.


Examples related to c#

How can I convert this one line of ActionScript to C#? Microsoft Advertising SDK doesn't deliverer ads How to use a global array in C#? How to correctly write async method? C# - insert values from file into two arrays Uploading into folder in FTP? Are these methods thread safe? dotnet ef not found in .NET Core 3 HTTP Error 500.30 - ANCM In-Process Start Failure Best way to "push" into C# array

Examples related to .net

You must add a reference to assembly 'netstandard, Version=2.0.0.0 How to use Bootstrap 4 in ASP.NET Core No authenticationScheme was specified, and there was no DefaultChallengeScheme found with default authentification and custom authorization .net Core 2.0 - Package was restored using .NetFramework 4.6.1 instead of target framework .netCore 2.0. The package may not be fully compatible Update .NET web service to use TLS 1.2 EF Core add-migration Build Failed What is the difference between .NET Core and .NET Standard Class Library project types? Visual Studio 2017 - Could not load file or assembly 'System.Runtime, Version=4.1.0.0' or one of its dependencies Nuget connection attempt failed "Unable to load the service index for source" Token based authentication in Web API without any user interface

Examples related to string

How to split a string in two and store it in a field String method cannot be found in a main class method Kotlin - How to correctly concatenate a String Replacing a character from a certain index Remove quotes from String in Python Detect whether a Python string is a number or a letter How does String substring work in Swift How does String.Index work in Swift swift 3.0 Data to String? How to parse JSON string in Typescript

Examples related to optimization

Why does C++ code for testing the Collatz conjecture run faster than hand-written assembly? Measuring execution time of a function in C++ GROUP BY having MAX date How to efficiently remove duplicates from an array without using Set Storing JSON in database vs. having a new column for each key Read file As String How to write a large buffer into a binary file in C++, fast? Is optimisation level -O3 dangerous in g++? Why is processing a sorted array faster than processing an unsorted array? MySQL my.cnf performance tuning recommendations