[c#] Why is the default value of the string type null instead of an empty string?

It's quite annoying to test all my strings for null before I can safely apply methods like ToUpper(), StartsWith(), etc.

If the default value of string were the empty string, I would not have to test, and it would feel more consistent with the other value types, such as int or double. Additionally Nullable<String> would make sense.

So why did the designers of C# choose to use null as the default value of strings?

Note: This relates to this question, but is more focused on the why instead of what to do with it.

This question is related to c# string default-value

The answer is


Because a string variable is a reference, not an instance.

Initializing it to Empty by default would have been possible, but it would have introduced a lot of inconsistencies across the board.


Habib is right -- because string is a reference type.

But more importantly, you don't have to check for null each time you use it. You probably should throw an ArgumentNullException if someone passes your function a null reference, though.

Here's the thing -- the framework would throw a NullReferenceException for you anyway if you tried to call .ToUpper() on a null string. Remember that this can still happen even if you test your arguments for null, since any property or method on the objects passed to your function as parameters may evaluate to null.

That being said, checking for empty strings or nulls is a common thing to do, so they provide String.IsNullOrEmpty() and String.IsNullOrWhiteSpace() for just this purpose.
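As a minimal sketch of the two ideas above (guarding an argument and using the built-in checks), something like the following could work. CityFormatter, Normalize and HasValue are hypothetical names made up for illustration; only ArgumentNullException and String.IsNullOrWhiteSpace come from the framework.

```csharp
using System;

// Hypothetical helper class, not part of the framework.
public static class CityFormatter
{
    // Rejects null up front instead of failing later with a NullReferenceException.
    public static string Normalize(string city)
    {
        if (city == null)
        {
            throw new ArgumentNullException(nameof(city));
        }
        return city.Trim().ToUpper();
    }

    // Treats null, "" and whitespace-only input alike as "no value".
    public static bool HasValue(string text)
    {
        return !string.IsNullOrWhiteSpace(text);
    }
}
```

The guard makes the failure point at the caller's mistake, while HasValue is for cases where a missing value is acceptable and just needs to be detected.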


Since you mentioned ToUpper(), and this usage is how I found this thread, I will share this shortcut (string ?? "").ToUpper():

    private string _city;
    public string City
    {
        get
        {
            return (this._city ?? "").ToUpper();
        }
        set
        {
            this._city = value;
        }
    }

Seems better than:

        if(null != this._city)
        { this._city = this._city.ToUpper(); }

Because string is a reference type, and the default value of any reference type is null.


Maybe the string keyword confused you, as it looks exactly like any other value type declaration, but it is actually an alias for System.String, as explained in this question.
Also, the dark blue color in Visual Studio and the lowercase first letter may mislead you into thinking it is a struct.


You could also use the following, as of C# 6.0:

string myString = null;
string result = myString?.ToUpper();

The string result will be null.
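If a null result is not what you want, the null-conditional operator can be combined with ?? to fall back to an empty string. A small sketch, where NullConditionalDemo and UpperOrEmpty are hypothetical names:

```csharp
// Hypothetical helper class for illustration only.
public static class NullConditionalDemo
{
    public static string UpperOrEmpty(string input)
    {
        // input?.ToUpper() evaluates to null when input is null;
        // the ?? operator then substitutes the empty string.
        return input?.ToUpper() ?? "";
    }
}
```

With this, UpperOrEmpty(null) gives back "" rather than null, so the caller can keep chaining string methods.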


Why did the designers of C# choose to use null as the default value of strings?

Because strings are reference types, and the default value of a reference type is null. Variables of reference types store references to the actual data.

Let's use the default keyword to illustrate:

string str = default(string); 

str is a string, so it is a reference type, so its default value is null.

int number = default(int);

number is an int, so it is a value type, so its default value is zero.
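The same defaults apply to fields and array elements that are never assigned, which is usually where a null string sneaks in. A minimal sketch, with Person and DefaultsDemo as hypothetical names:

```csharp
// Hypothetical types for illustration only.
public class Person
{
    public string Name;  // reference type field, never assigned: null
    public int Age;      // value type field, never assigned: 0
}

public static class DefaultsDemo
{
    // Every element of a freshly allocated string[] is null, not "".
    public static string[] NewNames(int count)
    {
        return new string[count];
    }
}
```

So new Person().Name is null while new Person().Age is 0, and each element of DefaultsDemo.NewNames(3) is null until something is stored in it.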


If the default value of string were the empty string, I would not have to test

Wrong! Changing the default value doesn't change the fact that it's a reference type and someone can still explicitly set the reference to be null.
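To make that concrete, here is a sketch of a type whose string starts out empty but can still end up null. Customer is a hypothetical class invented for this example:

```csharp
// Hypothetical type for illustration only.
public class Customer
{
    // The backing field starts out as "" rather than null.
    private string _name = "";

    public string Name
    {
        get { return _name; }
        set { _name = value; }  // nothing stops a caller from passing null here
    }
}
```

A fresh Customer has Name equal to "", but after someone writes customer.Name = null, a call such as customer.Name.ToUpper() throws a NullReferenceException just as before.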

Additionally Nullable<String> would make sense.

True point. It would make more sense to not allow null for any reference types, instead requiring Nullable<TheRefType> for that feature.

So why did the designers of C# choose to use null as the default value of strings?

Consistency with other reference types. Now, why allow null in reference types at all? Probably so that it feels like C, even though this is a questionable design decision in a language that also provides Nullable.


The fundamental reason/problem is that the designers of the CLS specification (which defines how languages interact with .net) did not define a means by which class members could specify that they must be called directly, rather than via callvirt, without the caller performing a null-reference check; nor did it provide a means of defining structures which would not be subject to "normal" boxing.

Had the CLS specification defined such a means, then it would be possible for .net to consistently follow the lead established by the Component Object Model (COM), under which a null string reference was considered semantically equivalent to an empty string, and for other user-defined immutable class types which are supposed to have value semantics to likewise define default values. Essentially, each member of String, e.g. Length, would be written as something like [InvokableOnNull()] int String Length { get { if (this==null) return 0; else return _Length;} }. This approach would have offered very nice semantics for things which should behave like values but, because of implementation issues, need to be stored on the heap. The biggest difficulty with this approach is that the semantics of conversion between such types and Object could get a little murky.

An alternative approach would have been to allow the definition of special structure types which did not inherit from Object but instead had custom boxing and unboxing operations (which would convert to/from some other class type). Under such an approach, there would be a class type NullableString which behaves as string does now, and a custom-boxed struct type String, which would hold a single private field Value of type String. Attempting to convert a String to NullableString or Object would return Value if non-null, or String.Empty if null. Casting a non-null reference to a NullableString instance to String would store the reference in Value (perhaps storing null if the length was zero); casting any other reference would throw an exception.

Even though strings have to be stored on the heap, there is conceptually no reason why they shouldn't behave like value types that have a non-null default value. Having them be stored as a "normal" structure which held a reference would have been efficient for code that used them as type "string", but would have added an extra layer of indirection and inefficiency when casting to "object". While I don't foresee .net adding either of the above features at this late date, perhaps designers of future frameworks might consider including them.
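The "normal structure which holds a reference" idea can be approximated in today's C# with an ordinary struct, though without the custom boxing described above. This is only a rough sketch: NonNullString is a hypothetical type, and it assumes C# 7.2 or later for readonly struct.

```csharp
// Hypothetical wrapper, not part of .NET: a struct holding a single string reference.
public readonly struct NonNullString
{
    // Stays null when the struct is default-initialized, since no constructor runs.
    private readonly string _value;

    public NonNullString(string value)
    {
        _value = value;
    }

    // A null reference is presented to callers as the empty string.
    public string Value => _value ?? string.Empty;

    public int Length => Value.Length;

    public NonNullString ToUpper() => new NonNullString(Value.ToUpper());

    public override string ToString() => Value;

    public static implicit operator NonNullString(string s) => new NonNullString(s);
}
```

With this, default(NonNullString).Length is 0, and the elements of a new NonNullString[10] can be used without null checks. Converting to object still boxes the struct as usual, which is the extra cost the answer mentions.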


Empty strings and nulls are fundamentally different. A null is an absence of a value and an empty string is a value that is empty.

If the programming language assumed a "value" for a variable, in this case an empty string, that would be no better than initializing the string with any other arbitrary value that avoids a null reference problem.

Also, if you pass the handle to that string variable to other parts of the application, then that code will have no way of telling whether you intentionally passed a blank value or forgot to populate the variable.

Another occasion where this would be a problem is when the string is a return value from some function. Since string is a reference type, it can technically be either null or empty, so the function can return either (there is nothing to stop it from doing so). Now, since there are two notions of the "absence of a value", i.e. an empty string and a null, all the code that consumes this function will have to do two checks: one for empty and the other for null.

In short, it's always good to have only one representation for a single state. For a broader discussion on empty and nulls, see the links below.

https://softwareengineering.stackexchange.com/questions/32578/sql-empty-string-vs-null-value

NULL vs Empty when dealing with user input


You could write an extension method (for what it's worth):

public static string EmptyNull(this string str)
{
    return str ?? "";
}

Now this works safely:

string str = null;
string upper = str.EmptyNull().ToUpper();

Nullable types did not come in until 2.0.

If nullable types had been there from the beginning of the language, then string would have been non-nullable and string? would have been nullable. But they could not do this due to backward compatibility.

A lot of people talk about ref-type or not ref type, but string is an out of the ordinary class and solutions would have been found to make it possible.


Perhaps using the ?? operator when assigning your string variable might help you.

string str = SomeMethodThatReturnsaString() ?? "";
// if SomeMethodThatReturnsaString() returns a null value, "" is assigned to str.

A String is an immutable object which means when given a value, the old value doesn't get wiped out of memory, but remains in the old location, and the new value is put in a new location. So if the default value of String a was String.Empty, it would waste the String.Empty block in memory when it was given its first value.

Although it seems minuscule, it could turn into a problem when initializing a large array of strings with default values of String.Empty. Of course, you could always use the mutable StringBuilder class if this was going to be a problem.

