[c#] Default string initialization: NULL or Empty?

I have always initialized my strings to NULL, on the thinking that NULL means the absence of a value and "" or String.Empty is a valid value. Lately I have seen more examples of code where String.Empty is treated as the default value or as representing no value. This strikes me as odd: with the newly added nullable types in C#, it seems like we are taking strides backwards with strings by not using NULL to represent 'No Value'.

What do you use as the default initializer and why?

Edit: Based on the answers, here are my further thoughts:

  1. Avoiding error handling: If the value shouldn't be null, why did it get set to NULL in the first place? Perhaps it would be better to identify the error at the place where it occurs rather than cover it up throughout the rest of your codebase?

  2. Avoiding null checks: If you are tired of doing null checks in code, wouldn't it be better to abstract the null checks? Perhaps wrap (or extend!) the string methods to make them NULL safe? What happens if you constantly use String.Empty and a null happens to work its way into your system? Do you start adding NULL checks anyway?
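As a rough sketch of what "extending" the string methods could look like: extension methods compile to static calls, so they can be invoked on a null reference without throwing. The names SafeLength and SafeTrim are made up for illustration and are not part of the framework.

```csharp
using System;

// Hypothetical helpers; the class and method names are illustrative only.
public static class NullSafeStringExtensions
{
    // Treats a null string as having length 0 instead of throwing.
    public static int SafeLength(this string s)
    {
        return s == null ? 0 : s.Length;
    }

    // Trims when there is something to trim; passes null through unchanged.
    public static string SafeTrim(this string s)
    {
        return s == null ? null : s.Trim();
    }
}
```

With these in scope, calling SafeLength() on a string variable that holds null gives 0 rather than a NullReferenceException. Whether hiding the null this way is a good idea is exactly the question raised in point 1.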

I can't help but return to the opinion that it is laziness. Any DBA would slap you nine ways to silly if you used '' instead of null in his/her database. I think the same principles apply in programming, and there should be somebody to smack those who use String.Empty rather than NULL to represent no value upside the head.

The answer is


This seems like a special case of the Null Object pattern: http://en.wikipedia.org/wiki/Null_Object_pattern


I think there's no reason not to use null for a value that is unassigned (or that does not occur at this point in the program flow). If you need to distinguish, there's == null. If you just want to check for a certain value and don't care whether the variable is null or something different, String.Equals("XXX", MyStringVar) does just fine.
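A small sketch of both checks, assuming a hypothetical wrapper class (the EqualsDemo name and its method names are mine). The static String.Equals(string, string) overload accepts null on either side, so no guard is needed before the comparison.

```csharp
using System;

// Illustrative wrapper; only String.Equals and == come from the answer above.
public static class EqualsDemo
{
    // Safe for null input: the static overload never dereferences its arguments blindly.
    public static bool IsXxx(string myStringVar)
    {
        return String.Equals("XXX", myStringVar);
    }

    // Distinguishes "never assigned" from every real value, including "".
    public static bool IsUnassigned(string myStringVar)
    {
        return myStringVar == null;
    }
}
```

Calling IsXxx with null simply returns false, while IsUnassigned returns true only for null and not for an empty string.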


This is actually a gaping hole in the C# language. There is no way to define a string that cannot be null. This causes problems as simple as the one you are describing, which forces programmers to make a decision they shouldn't have to make, since in many cases NULL and String.Empty mean the same thing. That, in turn, can later force other programmers to have to handle both NULL and String.Empty, which is annoying.

A bigger problem is that databases allow you to define fields that map to a C# string, but database fields can be defined as NOT NULL. So, there is no way to accurately represent, say, a varchar(100) NOT NULL field in SQL Server using a C# type.

Other languages, such as Spec#, do allow this.

In my opinion, C#'s inability to define a string that doesn't allow null is just as bad as its previous inability to define an int that does allow null.

To completely answer your question: I always use empty string for default initialization because it is more similar to how database data types work. (Edit: This statement was very unclear. It should read "I use empty string for default initialization when NULL is a superfluous state, much in the same way I set up a database column as NOT NULL if NULL would be a superfluous state. Similarly, many of my DB columns are set up as NOT NULL, so when I bring those into a C# string, the string will be empty or have a value, but will never be NULL. In other words, I only initialize a string to NULL if null has a meaning that is distinct from the meaning of String.Empty, and I find that case to be less than common (but people here have given legitimate examples of this case).")
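One way to picture that mapping, as a sketch only: the DbStringMapper class and its method names are hypothetical, and the raw value is assumed to arrive as an object that may be DBNull.Value, the way ADO.NET readers hand back column values.

```csharp
using System;

// Hypothetical mapping helpers; not taken from any particular data-access library.
public static class DbStringMapper
{
    // Nullable column: NULL in the database carries meaning, so keep it as null.
    public static string FromNullableColumn(object dbValue)
    {
        return (dbValue == null || dbValue is DBNull) ? null : (string)dbValue;
    }

    // NOT NULL column: the result is empty or has a value, but is never null.
    // A missing value here contradicts the schema, so it is reported instead of hidden.
    public static string FromNotNullColumn(object dbValue)
    {
        if (dbValue == null || dbValue is DBNull)
        {
            throw new InvalidOperationException("Column is declared NOT NULL but no value was returned.");
        }
        return (string)dbValue;
    }
}
```

The C# type is string in both cases, which is the complaint above: the difference between the two columns lives only in the helper you choose, not in the type.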


It depends on the situation. In most cases I use String.Empty because I don't want to be doing null checks every time I attempt to use a string. It makes the code a lot simpler and you are less likely to introduce unwanted NullReferenceException crashes.

I only set the string to null when I need to know if it has been set or not and where an empty string is something valid to set it to. In practice, I find these situations rare.


I either set it to "" or null - I always check by using String.IsNullOrEmpty, so either is fine.

But the inner geek in me says I should set it to null before I have a proper value for it...


It depends.

Do you need to be able to tell if the value is missing (is it possible for it to not be defined)?

Is the empty string a valid value for the usage of that string?

If you answered "yes" to both, then you'll want to use null. Otherwise you can't tell the difference between "no value" and "empty string".

If you don't need to know if there's no value then the empty string is probably safer, as it allows you to skip null checks wherever you use it.
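To make the two questions concrete, here is a small sketch with an invented Person class. MiddleName answers "yes" to both questions (it can be unknown, and "no middle name" is a legitimate answer), so it starts as null; DisplayName never needs a "missing" state, so it starts empty.

```csharp
using System;

// Hypothetical example type; the field names are chosen only for illustration.
public class Person
{
    // null = not asked yet, "" = asked and there is none, anything else = the name.
    public string MiddleName = null;

    // No "missing" state needed, so empty is the safer default.
    public string DisplayName = String.Empty;

    public string DescribeMiddleName()
    {
        if (MiddleName == null) return "unknown";
        if (MiddleName.Length == 0) return "none";
        return MiddleName;
    }
}
```

If MiddleName were initialized to String.Empty instead, "unknown" and "none" would collapse into the same state.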


Is it possible that this is an error-avoidance technique (advisable or not)? Since "" is still a string, you would be able to call string functions on it that would result in an exception if it were NULL?
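The difference is easy to demonstrate. In this sketch the EmptyVersusNull class and ThrowsOnMemberAccess method are names I made up; the behavior of calling an instance member on null is standard.

```csharp
using System;

// Illustrative helper that reports whether an instance call on the argument throws.
public static class EmptyVersusNull
{
    public static bool ThrowsOnMemberAccess(string s)
    {
        try
        {
            s.Trim(); // instance method: needs an actual string object behind the reference
            return false;
        }
        catch (NullReferenceException)
        {
            return true;
        }
    }
}
```

Passing "" returns false because the call succeeds on an empty string; passing null returns true because there is no object to call Trim on.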


Null should only be used in cases where a value is optional. If the value is not optional (like 'Name' or 'Address'), then the value should never be null. This applies to databases as well as POCOs and the user interface. Null means "this value is optional, and is currently absent."

If your field is not optional, then you should initialize it as the empty string. To initialize it as null would place your object into an invalid state (invalid by your own data model).

Personally, I would rather strings were not nullable by default, and only nullable if we declare a "string?". Although perhaps this is not feasible or logical at a deeper level; not sure.
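For what it's worth, later versions of the language (C# 8.0 onward) added an opt-in feature along these lines, called nullable reference types. A minimal sketch, assuming a compiler that supports it; the Customer class is invented for the example, and the checks are compile-time warnings rather than runtime guarantees.

```csharp
#nullable enable

// Hypothetical data class showing a required and an optional string.
public class Customer
{
    // Not optional: declared as plain string and initialized to empty.
    // Assigning null here produces a compiler warning.
    public string Name { get; set; } = string.Empty;

    // Optional: the "?" states that null ("currently absent") is allowed.
    public string? Nickname { get; set; }
}
```

A new Customer starts with an empty Name and a null Nickname, which matches the "optional means nullable" rule described above.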


I always initialise them as NULL.

I always use string.IsNullOrEmpty(someString) to check its value.

Simple.


For most software that isn't actually string-processing software, program logic ought not to depend on the content of string variables. Whenever I see something like this in a program:

if (s == "value")

I get a bad feeling. Why is there a string literal in this method? What's setting s? Does it know that logic depends on the value of the string? Does it know that it has to be lower case to work? Should I be fixing this by changing it to use String.Compare? Should I be creating an Enum and parsing into it?

From this perspective, one gets to a philosophy of code that's pretty simple: you avoid examining a string's contents wherever possible. Comparing a string to String.Empty is really just a special case of comparing it to a literal: it's something to avoid doing unless you really have to.

Knowing this, I don't blink when I see something like this in our code base:

string msg = Validate(item);
if (msg != null)
{
   DisplayErrorMessage(msg);
   return;
}

I know that Validate would never return String.Empty, because we write better code than that.

Of course, the rest of the world doesn't work like this. When your program is dealing with user input, databases, files, and so on, you have to account for other philosophies. There, it's the job of your code to impose order on chaos. Part of that order is knowing when an empty string should mean String.Empty and when it should mean null.

(Just to make sure I wasn't talking out of my ass, I just searched our codebase for 'String.IsNullOrEmpty'. All 54 occurrences of it are in methods that process user input, return values from Python scripts, examine values retrieved from external APIs, etc.)
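A sketch of what "imposing order" at the boundary might look like. The InputNormalizer class and both method names are hypothetical; the point is only that the null-versus-empty decision is made once, where outside data enters, so the rest of the code can rely on it.

```csharp
using System;

// Hypothetical boundary helpers for values coming from user input, files, or external APIs.
public static class InputNormalizer
{
    // Optional field: blank input means nothing was supplied, so it becomes null.
    public static string ToOptional(string raw)
    {
        return String.IsNullOrEmpty(raw) ? null : raw;
    }

    // Text that must always be a usable string: a missing value becomes empty.
    public static string ToRequiredText(string raw)
    {
        return raw ?? String.Empty;
    }
}
```

After this step, internal code like the Validate example above can test against null alone without also worrying about String.Empty.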


I always initialize strings with string.Empty.


Strings aren't value types, and never will be ;-)


According to MSDN:

By initializing strings with the Empty value instead of null, you can reduce the chances of a NullReferenceException occurring.

Always using IsNullOrEmpty() is good practice nevertheless.


Why do you want your string to be initialized at all? You don't have to initialize a variable when you declare one, and IMO, you should only do so when the value you are assigning is valid in the context of the code block.

I see this a lot:

string name = null; // or String.Empty
if (condition)
{
  name = "foo";
}
else
{
  name = "bar";
}

return name;

Not initializing to null would be just as effective. Furthermore, most often you want a value to be assigned. By initializing to null, you can potentially miss code paths that don't assign a value. Like so:

string name = null; // or String.Empty
if (condition)
{
  name = "foo";
}
else if (othercondition)
{
  name = "bar";
}

return name; //returns null when condition and othercondition are false

When you don't initialize to null, the compiler will generate an error (use of unassigned local variable) because not all code paths assign a value. Of course, this is a very simple example...
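For comparison, here is a version that compiles without any initializer, as a sketch. The NamePicker class and the "baz" fallback are my additions; the final else is what satisfies the compiler's definite-assignment check.

```csharp
using System;

// Illustrative wrapper around the snippet above.
public static class NamePicker
{
    public static string Pick(bool condition, bool otherCondition)
    {
        string name; // deliberately not initialized

        if (condition)
        {
            name = "foo";
        }
        else if (otherCondition)
        {
            name = "bar";
        }
        else
        {
            // Hypothetical fallback. Delete this branch and "return name" no longer
            // compiles (CS0165: use of unassigned local variable).
            name = "baz";
        }

        return name;
    }
}
```

The forgotten code path is caught at compile time instead of showing up later as an unexpected null.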

Matthijs


Reiterating Tomalak's response, keep in mind that when you initialize a string variable to null, the variable does not refer to any string object; the same goes for any reference type in C#. So, if you attempt to access any methods or properties on the variable, assuming it is a string object, you will get a NullReferenceException.


An empty string is a value (a piece of text which, incidentally, happens not to contain any letters). Null signifies no-value.

I initialize variables to null when I wish to indicate that they do not point to or contain actual values - when the intent is for no-value.

