[c#] Difference between InvariantCulture and Ordinal string comparison

When comparing two strings in c# for equality, what is the difference between InvariantCulture and Ordinal comparison?

This question is related to c# .net string-comparison ordinal

The answer is


No need to use fancy unicode char exmaples to show the difference. Here's one simple example I found out today which is surprising, consisting of only ASCII characters.

According to the ASCII table, 0 (0x48) is smaller than _ (0x95) when compared ordinally. InvariantCulture would say the opposite (PowerShell code below):

PS> [System.StringComparer]::Ordinal.Compare("_", "0")
47
PS> [System.StringComparer]::InvariantCulture.Compare("_", "0")
-1

Always try to use InvariantCulture in those string methods that accept it as overload. By using InvariantCulture you are on a safe side. Many .NET programmers may not use this functionality but if your software will be used by different cultures, InvariantCulture is an extremely handy feature.


Here is an example where string equality comparison using InvariantCultureIgnoreCase and OrdinalIgnoreCase will not give the same results:

string str = "\xC4"; //A with umlaut, Ä
string A = str.Normalize(NormalizationForm.FormC);
//Length is 1, this will contain the single A with umlaut character (Ä)
string B = str.Normalize(NormalizationForm.FormD);
//Length is 2, this will contain an uppercase A followed by an umlaut combining character
bool equals1 = A.Equals(B, StringComparison.OrdinalIgnoreCase);
bool equals2 = A.Equals(B, StringComparison.InvariantCultureIgnoreCase);

If you run this, equals1 will be false, and equals2 will be true.


Pointing to Best Practices for Using Strings in the .NET Framework:

  • Use StringComparison.Ordinal or StringComparison.OrdinalIgnoreCase for comparisons as your safe default for culture-agnostic string matching.
  • Use comparisons with StringComparison.Ordinal or StringComparison.OrdinalIgnoreCase for better performance.
  • Use the non-linguistic StringComparison.Ordinal or StringComparison.OrdinalIgnoreCase values instead of string operations based on CultureInfo.InvariantCulture when the comparison is linguistically irrelevant (symbolic, for example).

And finally:

  • Do not use string operations based on StringComparison.InvariantCulture in most cases. One of the few exceptions is when you are persisting linguistically meaningful but culturally agnostic data.

Another handy difference (in English where accents are uncommon) is that an InvariantCulture comparison compares the entire strings by case-insensitive first, and then if necessary (and requested) distinguishes by case after first comparing only on the distinct letters. (You can also do a case-insensitive comparison, of course, which won't distinguish by case.) Corrected: Accented letters are considered to be another flavor of the same letters and the string is compared first ignoring accents and then accounting for them if the general letters all match (much as with differing case except not ultimately ignored in a case-insensitive compare). This groups accented versions of the otherwise same word near each other instead of completely separate at the first accent difference. This is the sort order you would typically find in a dictionary, with capitalized words appearing right next to their lowercase equivalents, and accented letters being near the corresponding unaccented letter.

An ordinal comparison compares strictly on the numeric character values, stopping at the first difference. This sorts capitalized letters completely separate from the lowercase letters (and accented letters presumably separate from those), so capitalized words would sort nowhere near their lowercase equivalents.

InvariantCulture also considers capitals to be greater than lower case, whereas Ordinal considers capitals to be less than lowercase (a holdover of ASCII from the old days before computers had lowercase letters, the uppercase letters were allocated first and thus had lower values than the lowercase letters added later).

For example, by Ordinal: "0" < "9" < "A" < "Ab" < "Z" < "a" < "aB" < "ab" < "z" < "Á" < "Áb" < "á" < "áb"

And by InvariantCulture: "0" < "9" < "a" < "A" < "á" < "Á" < "ab" < "aB" < "Ab" < "áb" < "Áb" < "z" < "Z"


Always try to use InvariantCulture in those string methods that accept it as overload. By using InvariantCulture you are on a safe side. Many .NET programmers may not use this functionality but if your software will be used by different cultures, InvariantCulture is an extremely handy feature.


Although the question is about equality, for quick visual reference, here the order of some strings sorted using a couple of cultures illustrating some of the idiosyncrasies out there.

Ordinal          0 9 A Ab a aB aa ab ss Ä Äb ß ä äb ? ? ? ? ? A
IgnoreCase       0 9 a A aa ab Ab aB ss ä Ä äb Äb ß ? ? ? ? ? A
--------------------------------------------------------------------
InvariantCulture 0 9 a A A ä Ä aa ab aB Ab äb Äb ss ß ? ? ? ? ?
IgnoreCase       0 9 A a A Ä ä aa Ab aB ab Äb äb ß ss ? ? ? ? ?
--------------------------------------------------------------------
da-DK            0 9 a A A ab aB Ab ss ß ä Ä äb Äb aa ? ? ? ? ?
IgnoreCase       0 9 A a A Ab aB ab ß ss Ä ä Äb äb aa ? ? ? ? ?
--------------------------------------------------------------------
de-DE            0 9 a A A ä Ä aa ab aB Ab äb Äb ß ss ? ? ? ? ?
IgnoreCase       0 9 A a A Ä ä aa Ab aB ab Äb äb ss ß ? ? ? ? ?
--------------------------------------------------------------------
en-US            0 9 a A A ä Ä aa ab aB Ab äb Äb ß ss ? ? ? ? ?
IgnoreCase       0 9 A a A Ä ä aa Ab aB ab Äb äb ss ß ? ? ? ? ?
--------------------------------------------------------------------
ja-JP            0 9 a A A ä Ä aa ab aB Ab äb Äb ß ss ? ? ? ? ?
IgnoreCase       0 9 A a A Ä ä aa Ab aB ab Äb äb ss ß ? ? ? ? ?

Observations:

  • de-DE, ja-JP, and en-US sort the same way
  • Invariant only sorts ss and ß differently from the above three cultures
  • da-DK sorts quite differently
  • the IgnoreCase flag matters for all sampled cultures

The code used to generate above table:

var l = new List<string>
    { "0", "9", "A", "Ab", "a", "aB", "aa", "ab", "ss", "ß",
      "Ä", "Äb", "ä", "äb", "?", "?", "?", "?", "A", "?" };

foreach (var comparer in new[]
{
    StringComparer.Ordinal,
    StringComparer.OrdinalIgnoreCase,
    StringComparer.InvariantCulture,
    StringComparer.InvariantCultureIgnoreCase,
    StringComparer.Create(new CultureInfo("da-DK"), false),
    StringComparer.Create(new CultureInfo("da-DK"), true),
    StringComparer.Create(new CultureInfo("de-DE"), false),
    StringComparer.Create(new CultureInfo("de-DE"), true),
    StringComparer.Create(new CultureInfo("en-US"), false),
    StringComparer.Create(new CultureInfo("en-US"), true),
    StringComparer.Create(new CultureInfo("ja-JP"), false),
    StringComparer.Create(new CultureInfo("ja-JP"), true),
})
{
    l.Sort(comparer);
    Console.WriteLine(string.Join(" ", l));
}

Here is an example where string equality comparison using InvariantCultureIgnoreCase and OrdinalIgnoreCase will not give the same results:

string str = "\xC4"; //A with umlaut, Ä
string A = str.Normalize(NormalizationForm.FormC);
//Length is 1, this will contain the single A with umlaut character (Ä)
string B = str.Normalize(NormalizationForm.FormD);
//Length is 2, this will contain an uppercase A followed by an umlaut combining character
bool equals1 = A.Equals(B, StringComparison.OrdinalIgnoreCase);
bool equals2 = A.Equals(B, StringComparison.InvariantCultureIgnoreCase);

If you run this, equals1 will be false, and equals2 will be true.


Invariant is a linguistically appropriate type of comparison.
Ordinal is a binary type of comparison. (faster)
See http://www.siao2.com/2004/12/29/344136.aspx


It does matter, for example - there is a thing called character expansion

var s1 = "Strasse";
var s2 = "Straße";

s1.Equals(s2, StringComparison.Ordinal);           //false
s1.Equals(s2, StringComparison.InvariantCulture);  //true

With InvariantCulture the ß character gets expanded to ss.


Another handy difference (in English where accents are uncommon) is that an InvariantCulture comparison compares the entire strings by case-insensitive first, and then if necessary (and requested) distinguishes by case after first comparing only on the distinct letters. (You can also do a case-insensitive comparison, of course, which won't distinguish by case.) Corrected: Accented letters are considered to be another flavor of the same letters and the string is compared first ignoring accents and then accounting for them if the general letters all match (much as with differing case except not ultimately ignored in a case-insensitive compare). This groups accented versions of the otherwise same word near each other instead of completely separate at the first accent difference. This is the sort order you would typically find in a dictionary, with capitalized words appearing right next to their lowercase equivalents, and accented letters being near the corresponding unaccented letter.

An ordinal comparison compares strictly on the numeric character values, stopping at the first difference. This sorts capitalized letters completely separate from the lowercase letters (and accented letters presumably separate from those), so capitalized words would sort nowhere near their lowercase equivalents.

InvariantCulture also considers capitals to be greater than lower case, whereas Ordinal considers capitals to be less than lowercase (a holdover of ASCII from the old days before computers had lowercase letters, the uppercase letters were allocated first and thus had lower values than the lowercase letters added later).

For example, by Ordinal: "0" < "9" < "A" < "Ab" < "Z" < "a" < "aB" < "ab" < "z" < "Á" < "Áb" < "á" < "áb"

And by InvariantCulture: "0" < "9" < "a" < "A" < "á" < "Á" < "ab" < "aB" < "Ab" < "áb" < "Áb" < "z" < "Z"


Invariant is a linguistically appropriate type of comparison.
Ordinal is a binary type of comparison. (faster)
See http://www.siao2.com/2004/12/29/344136.aspx


Pointing to Best Practices for Using Strings in the .NET Framework:

  • Use StringComparison.Ordinal or StringComparison.OrdinalIgnoreCase for comparisons as your safe default for culture-agnostic string matching.
  • Use comparisons with StringComparison.Ordinal or StringComparison.OrdinalIgnoreCase for better performance.
  • Use the non-linguistic StringComparison.Ordinal or StringComparison.OrdinalIgnoreCase values instead of string operations based on CultureInfo.InvariantCulture when the comparison is linguistically irrelevant (symbolic, for example).

And finally:

  • Do not use string operations based on StringComparison.InvariantCulture in most cases. One of the few exceptions is when you are persisting linguistically meaningful but culturally agnostic data.

It does matter, for example - there is a thing called character expansion

var s1 = "Strasse";
var s2 = "Straße";

s1.Equals(s2, StringComparison.Ordinal);           //false
s1.Equals(s2, StringComparison.InvariantCulture);  //true

With InvariantCulture the ß character gets expanded to ss.


Examples related to c#

How can I convert this one line of ActionScript to C#? Microsoft Advertising SDK doesn't deliverer ads How to use a global array in C#? How to correctly write async method? C# - insert values from file into two arrays Uploading into folder in FTP? Are these methods thread safe? dotnet ef not found in .NET Core 3 HTTP Error 500.30 - ANCM In-Process Start Failure Best way to "push" into C# array

Examples related to .net

You must add a reference to assembly 'netstandard, Version=2.0.0.0 How to use Bootstrap 4 in ASP.NET Core No authenticationScheme was specified, and there was no DefaultChallengeScheme found with default authentification and custom authorization .net Core 2.0 - Package was restored using .NetFramework 4.6.1 instead of target framework .netCore 2.0. The package may not be fully compatible Update .NET web service to use TLS 1.2 EF Core add-migration Build Failed What is the difference between .NET Core and .NET Standard Class Library project types? Visual Studio 2017 - Could not load file or assembly 'System.Runtime, Version=4.1.0.0' or one of its dependencies Nuget connection attempt failed "Unable to load the service index for source" Token based authentication in Web API without any user interface

Examples related to string-comparison

How do I compare two strings in python? How to compare the contents of two string objects in PowerShell String comparison in bash. [[: not found How do I compare version numbers in Python? Test if a string contains a word in PHP? Checking whether a string starts with XXXX comparing two strings in SQL Server Getting the closest string match How can I make SQL case sensitive string comparison on MySQL? String Comparison in Java

Examples related to ordinal

Cast Int to enum in Java How do you format the day of the month to say "11th", "21st" or "23rd" (ordinal indicator)? Difference between InvariantCulture and Ordinal string comparison