[c#] Can I convert a C# string value to an escaped string literal

In C#, can I convert a string value to a string literal, the way I would see it in code? I would like to replace tabs, newlines, etc. with their escape sequences.

If this code:

Console.WriteLine(someString);

produces:

Hello
World!

I want this code:

Console.WriteLine(ToLiteral(someString));

to produce:

\tHello\r\n\tWorld!\r\n

This question is related to c# string escaping

The answer is


Hallgrim's answer was excellent. Here's a small tweak in case you need to parse out additional whitespace characters and linebreaks with a c# regular expression. I needed this in the case of a serialized Json value for insertion into google sheets and ran into trouble as the code was inserting tabs, +, spaces, etc.

  provider.GenerateCodeFromExpression(new CodePrimitiveExpression(input), writer, null);
  var literal = writer.ToString();
  var r2 = new Regex(@"\"" \+.\n[\s]+\""", RegexOptions.ECMAScript);
  literal = r2.Replace(literal, "");
  return literal;

EDIT: A more structured approach, including all escape sequences for strings and chars.
Doesn't replace unicode characters with their literal equivalent. Doesn't cook eggs, either.

public class ReplaceString
{
    static readonly IDictionary<string, string> m_replaceDict 
        = new Dictionary<string, string>();

    const string ms_regexEscapes = @"[\a\b\f\n\r\t\v\\""]";

    public static string StringLiteral(string i_string)
    {
        return Regex.Replace(i_string, ms_regexEscapes, match);
    }

    public static string CharLiteral(char c)
    {
        return c == '\'' ? @"'\''" : string.Format("'{0}'", c);
    }

    private static string match(Match m)
    {
        string match = m.ToString();
        if (m_replaceDict.ContainsKey(match))
        {
            return m_replaceDict[match];
        }

        throw new NotSupportedException();
    }

    static ReplaceString()
    {
        m_replaceDict.Add("\a", @"\a");
        m_replaceDict.Add("\b", @"\b");
        m_replaceDict.Add("\f", @"\f");
        m_replaceDict.Add("\n", @"\n");
        m_replaceDict.Add("\r", @"\r");
        m_replaceDict.Add("\t", @"\t");
        m_replaceDict.Add("\v", @"\v");

        m_replaceDict.Add("\\", @"\\");
        m_replaceDict.Add("\0", @"\0");

        //The SO parser gets fooled by the verbatim version 
        //of the string to replace - @"\"""
        //so use the 'regular' version
        m_replaceDict.Add("\"", "\\\""); 
    }

    static void Main(string[] args){

        string s = "here's a \"\n\tstring\" to test";
        Console.WriteLine(ReplaceString.StringLiteral(s));
        Console.WriteLine(ReplaceString.CharLiteral('c'));
        Console.WriteLine(ReplaceString.CharLiteral('\''));

    }
}

I submit my own implementation, which handles null values and should be more performant on account of using array lookup tables, manual hex conversion, and avoiding switch statements.

using System;
using System.Text;
using System.Linq;

public static class StringLiteralEncoding {
  private static readonly char[] HEX_DIGIT_LOWER = "0123456789abcdef".ToCharArray();
  private static readonly char[] LITERALENCODE_ESCAPE_CHARS;

  static StringLiteralEncoding() {
    // Per http://msdn.microsoft.com/en-us/library/h21280bw.aspx
    var escapes = new string[] { "\aa", "\bb", "\ff", "\nn", "\rr", "\tt", "\vv", "\"\"", "\\\\", "??", "\00" };
    LITERALENCODE_ESCAPE_CHARS = new char[escapes.Max(e => e[0]) + 1];
    foreach(var escape in escapes)
      LITERALENCODE_ESCAPE_CHARS[escape[0]] = escape[1];
  }

  /// <summary>
  /// Convert the string to the equivalent C# string literal, enclosing the string in double quotes and inserting
  /// escape sequences as necessary.
  /// </summary>
  /// <param name="s">The string to be converted to a C# string literal.</param>
  /// <returns><paramref name="s"/> represented as a C# string literal.</returns>
  public static string Encode(string s) {
    if(null == s) return "null";

    var sb = new StringBuilder(s.Length + 2).Append('"');
    for(var rp = 0; rp < s.Length; rp++) {
      var c = s[rp];
      if(c < LITERALENCODE_ESCAPE_CHARS.Length && '\0' != LITERALENCODE_ESCAPE_CHARS[c])
        sb.Append('\\').Append(LITERALENCODE_ESCAPE_CHARS[c]);
      else if('~' >= c && c >= ' ')
        sb.Append(c);
      else
        sb.Append(@"\x")
          .Append(HEX_DIGIT_LOWER[c >> 12 & 0x0F])
          .Append(HEX_DIGIT_LOWER[c >>  8 & 0x0F])
          .Append(HEX_DIGIT_LOWER[c >>  4 & 0x0F])
          .Append(HEX_DIGIT_LOWER[c       & 0x0F]);
    }

    return sb.Append('"').ToString();
  }
}

EDIT: A more structured approach, including all escape sequences for strings and chars.
Doesn't replace unicode characters with their literal equivalent. Doesn't cook eggs, either.

public class ReplaceString
{
    static readonly IDictionary<string, string> m_replaceDict 
        = new Dictionary<string, string>();

    const string ms_regexEscapes = @"[\a\b\f\n\r\t\v\\""]";

    public static string StringLiteral(string i_string)
    {
        return Regex.Replace(i_string, ms_regexEscapes, match);
    }

    public static string CharLiteral(char c)
    {
        return c == '\'' ? @"'\''" : string.Format("'{0}'", c);
    }

    private static string match(Match m)
    {
        string match = m.ToString();
        if (m_replaceDict.ContainsKey(match))
        {
            return m_replaceDict[match];
        }

        throw new NotSupportedException();
    }

    static ReplaceString()
    {
        m_replaceDict.Add("\a", @"\a");
        m_replaceDict.Add("\b", @"\b");
        m_replaceDict.Add("\f", @"\f");
        m_replaceDict.Add("\n", @"\n");
        m_replaceDict.Add("\r", @"\r");
        m_replaceDict.Add("\t", @"\t");
        m_replaceDict.Add("\v", @"\v");

        m_replaceDict.Add("\\", @"\\");
        m_replaceDict.Add("\0", @"\0");

        //The SO parser gets fooled by the verbatim version 
        //of the string to replace - @"\"""
        //so use the 'regular' version
        m_replaceDict.Add("\"", "\\\""); 
    }

    static void Main(string[] args){

        string s = "here's a \"\n\tstring\" to test";
        Console.WriteLine(ReplaceString.StringLiteral(s));
        Console.WriteLine(ReplaceString.CharLiteral('c'));
        Console.WriteLine(ReplaceString.CharLiteral('\''));

    }
}

What about Regex.Escape(String) ?

Regex.Escape escapes a minimal set of characters (\, *, +, ?, |, {, [, (,), ^, $,., #, and white space) by replacing them with their escape codes.


My attempt at adding ToVerbatim to Hallgrim's accepted answer above:

private static string ToLiteral(string input)
{
    using (var writer = new StringWriter())
    {
        using (var provider = CodeDomProvider.CreateProvider("CSharp"))
        {
            provider.GenerateCodeFromExpression(new CodePrimitiveExpression(input), writer, new CodeGeneratorOptions { IndentString = "\t" });
            var literal = writer.ToString();
            literal = literal.Replace(string.Format("\" +{0}\t\"", Environment.NewLine), "");           
            return literal;
        }
    }
}

private static string ToVerbatim( string input )
{
    string literal = ToLiteral( input );
    string verbatim = "@" + literal.Replace( @"\r\n", Environment.NewLine );
    return verbatim;
}

Interesting question.

If you can't find a better method, you can always replace.
In case you're opting for it, you could use this C# Escape Sequence List:

  • \' - single quote, needed for character literals
  • \" - double quote, needed for string literals
  • \ - backslash
  • \0 - Unicode character 0
  • \a - Alert (character 7)
  • \b - Backspace (character 8)
  • \f - Form feed (character 12)
  • \n - New line (character 10)
  • \r - Carriage return (character 13)
  • \t - Horizontal tab (character 9)
  • \v - Vertical quote (character 11)
  • \uxxxx - Unicode escape sequence for character with hex value xxxx
  • \xn[n][n][n] - Unicode escape sequence for character with hex value nnnn (variable length version of \uxxxx)
  • \Uxxxxxxxx - Unicode escape sequence for character with hex value xxxxxxxx (for generating surrogates)

This list can be found in the C# Frequently Asked Questions What character escape sequences are available?


If JSON conventions are enough for the unescaped strings you want to get escaped and you already use Newtonsoft.Jsonin your project (it has a pretty large overhead) you may use this package like the following:

using System;
using Newtonsoft.Json;

public class Program
{
    public static void Main()
    {
    Console.WriteLine(ToLiteral( @"abc\n123") );
    }

    private static string ToLiteral(string input){
        return JsonConvert.DeserializeObject<string>("\"" + input + "\"");
    }
}

public static class StringEscape
{
  static char[] toEscape = "\0\x1\x2\x3\x4\x5\x6\a\b\t\n\v\f\r\xe\xf\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f\"\\".ToCharArray();
  static string[] literals = @"\0,\x0001,\x0002,\x0003,\x0004,\x0005,\x0006,\a,\b,\t,\n,\v,\f,\r,\x000e,\x000f,\x0010,\x0011,\x0012,\x0013,\x0014,\x0015,\x0016,\x0017,\x0018,\x0019,\x001a,\x001b,\x001c,\x001d,\x001e,\x001f".Split(new char[] { ',' });

  public static string Escape(this string input)
  {
    int i = input.IndexOfAny(toEscape);
    if (i < 0) return input;

    var sb = new System.Text.StringBuilder(input.Length + 5);
    int j = 0;
    do
    {
      sb.Append(input, j, i - j);
      var c = input[i];
      if (c < 0x20) sb.Append(literals[c]); else sb.Append(@"\").Append(c);
    } while ((i = input.IndexOfAny(toEscape, j = ++i)) > 0);

    return sb.Append(input, j, input.Length - j).ToString();
  }
}

public static class StringHelpers
{
    private static Dictionary<string, string> escapeMapping = new Dictionary<string, string>()
    {
        {"\"", @"\\\"""},
        {"\\\\", @"\\"},
        {"\a", @"\a"},
        {"\b", @"\b"},
        {"\f", @"\f"},
        {"\n", @"\n"},
        {"\r", @"\r"},
        {"\t", @"\t"},
        {"\v", @"\v"},
        {"\0", @"\0"},
    };

    private static Regex escapeRegex = new Regex(string.Join("|", escapeMapping.Keys.ToArray()));

    public static string Escape(this string s)
    {
        return escapeRegex.Replace(s, EscapeMatchEval);
    }

    private static string EscapeMatchEval(Match m)
    {
        if (escapeMapping.ContainsKey(m.Value))
        {
            return escapeMapping[m.Value];
        }
        return escapeMapping[Regex.Escape(m.Value)];
    }
}

What about Regex.Escape(String) ?

Regex.Escape escapes a minimal set of characters (\, *, +, ?, |, {, [, (,), ^, $,., #, and white space) by replacing them with their escape codes.


try:

var t = HttpUtility.JavaScriptStringEncode(s);

EDIT: A more structured approach, including all escape sequences for strings and chars.
Doesn't replace unicode characters with their literal equivalent. Doesn't cook eggs, either.

public class ReplaceString
{
    static readonly IDictionary<string, string> m_replaceDict 
        = new Dictionary<string, string>();

    const string ms_regexEscapes = @"[\a\b\f\n\r\t\v\\""]";

    public static string StringLiteral(string i_string)
    {
        return Regex.Replace(i_string, ms_regexEscapes, match);
    }

    public static string CharLiteral(char c)
    {
        return c == '\'' ? @"'\''" : string.Format("'{0}'", c);
    }

    private static string match(Match m)
    {
        string match = m.ToString();
        if (m_replaceDict.ContainsKey(match))
        {
            return m_replaceDict[match];
        }

        throw new NotSupportedException();
    }

    static ReplaceString()
    {
        m_replaceDict.Add("\a", @"\a");
        m_replaceDict.Add("\b", @"\b");
        m_replaceDict.Add("\f", @"\f");
        m_replaceDict.Add("\n", @"\n");
        m_replaceDict.Add("\r", @"\r");
        m_replaceDict.Add("\t", @"\t");
        m_replaceDict.Add("\v", @"\v");

        m_replaceDict.Add("\\", @"\\");
        m_replaceDict.Add("\0", @"\0");

        //The SO parser gets fooled by the verbatim version 
        //of the string to replace - @"\"""
        //so use the 'regular' version
        m_replaceDict.Add("\"", "\\\""); 
    }

    static void Main(string[] args){

        string s = "here's a \"\n\tstring\" to test";
        Console.WriteLine(ReplaceString.StringLiteral(s));
        Console.WriteLine(ReplaceString.CharLiteral('c'));
        Console.WriteLine(ReplaceString.CharLiteral('\''));

    }
}

Code:

string someString1 = "\tHello\r\n\tWorld!\r\n";
string someString2 = @"\tHello\r\n\tWorld!\r\n";

Console.WriteLine(someString1);
Console.WriteLine(someString2);

Output:

    Hello
    World!

\tHello\r\n\tWorld!\r\n

Is this what you want?


Here is a little improvement for Smilediver's answer, it will not escape all no-ASCII chars but only these are really needed.

using System;
using System.Globalization;
using System.Text;

public static class CodeHelper
{
    public static string ToLiteral(this string input)
    {
        var literal = new StringBuilder(input.Length + 2);
        literal.Append("\"");
        foreach (var c in input)
        {
            switch (c)
            {
                case '\'': literal.Append(@"\'"); break;
                case '\"': literal.Append("\\\""); break;
                case '\\': literal.Append(@"\\"); break;
                case '\0': literal.Append(@"\0"); break;
                case '\a': literal.Append(@"\a"); break;
                case '\b': literal.Append(@"\b"); break;
                case '\f': literal.Append(@"\f"); break;
                case '\n': literal.Append(@"\n"); break;
                case '\r': literal.Append(@"\r"); break;
                case '\t': literal.Append(@"\t"); break;
                case '\v': literal.Append(@"\v"); break;
                default:
                    if (Char.GetUnicodeCategory(c) != UnicodeCategory.Control)
                    {
                        literal.Append(c);
                    }
                    else
                    {
                        literal.Append(@"\u");
                        literal.Append(((ushort)c).ToString("x4"));
                    }
                    break;
            }
        }
        literal.Append("\"");
        return literal.ToString();
    }
}

Fully working implementation, including escaping of Unicode and ASCII non printable characters. Does not insert "+" signs like Hallgrim's answer.

    static string ToLiteral(string input) {
        StringBuilder literal = new StringBuilder(input.Length + 2);
        literal.Append("\"");
        foreach (var c in input) {
            switch (c) {
                case '\'': literal.Append(@"\'"); break;
                case '\"': literal.Append("\\\""); break;
                case '\\': literal.Append(@"\\"); break;
                case '\0': literal.Append(@"\0"); break;
                case '\a': literal.Append(@"\a"); break;
                case '\b': literal.Append(@"\b"); break;
                case '\f': literal.Append(@"\f"); break;
                case '\n': literal.Append(@"\n"); break;
                case '\r': literal.Append(@"\r"); break;
                case '\t': literal.Append(@"\t"); break;
                case '\v': literal.Append(@"\v"); break;
                default:
                    // ASCII printable character
                    if (c >= 0x20 && c <= 0x7e) {
                        literal.Append(c);
                    // As UTF16 escaped character
                    } else {
                        literal.Append(@"\u");
                        literal.Append(((int)c).ToString("x4"));
                    }
                    break;
            }
        }
        literal.Append("\"");
        return literal.ToString();
    }

Fully working implementation, including escaping of Unicode and ASCII non printable characters. Does not insert "+" signs like Hallgrim's answer.

    static string ToLiteral(string input) {
        StringBuilder literal = new StringBuilder(input.Length + 2);
        literal.Append("\"");
        foreach (var c in input) {
            switch (c) {
                case '\'': literal.Append(@"\'"); break;
                case '\"': literal.Append("\\\""); break;
                case '\\': literal.Append(@"\\"); break;
                case '\0': literal.Append(@"\0"); break;
                case '\a': literal.Append(@"\a"); break;
                case '\b': literal.Append(@"\b"); break;
                case '\f': literal.Append(@"\f"); break;
                case '\n': literal.Append(@"\n"); break;
                case '\r': literal.Append(@"\r"); break;
                case '\t': literal.Append(@"\t"); break;
                case '\v': literal.Append(@"\v"); break;
                default:
                    // ASCII printable character
                    if (c >= 0x20 && c <= 0x7e) {
                        literal.Append(c);
                    // As UTF16 escaped character
                    } else {
                        literal.Append(@"\u");
                        literal.Append(((int)c).ToString("x4"));
                    }
                    break;
            }
        }
        literal.Append("\"");
        return literal.ToString();
    }

Code:

string someString1 = "\tHello\r\n\tWorld!\r\n";
string someString2 = @"\tHello\r\n\tWorld!\r\n";

Console.WriteLine(someString1);
Console.WriteLine(someString2);

Output:

    Hello
    World!

\tHello\r\n\tWorld!\r\n

Is this what you want?


Here is a little improvement for Smilediver's answer, it will not escape all no-ASCII chars but only these are really needed.

using System;
using System.Globalization;
using System.Text;

public static class CodeHelper
{
    public static string ToLiteral(this string input)
    {
        var literal = new StringBuilder(input.Length + 2);
        literal.Append("\"");
        foreach (var c in input)
        {
            switch (c)
            {
                case '\'': literal.Append(@"\'"); break;
                case '\"': literal.Append("\\\""); break;
                case '\\': literal.Append(@"\\"); break;
                case '\0': literal.Append(@"\0"); break;
                case '\a': literal.Append(@"\a"); break;
                case '\b': literal.Append(@"\b"); break;
                case '\f': literal.Append(@"\f"); break;
                case '\n': literal.Append(@"\n"); break;
                case '\r': literal.Append(@"\r"); break;
                case '\t': literal.Append(@"\t"); break;
                case '\v': literal.Append(@"\v"); break;
                default:
                    if (Char.GetUnicodeCategory(c) != UnicodeCategory.Control)
                    {
                        literal.Append(c);
                    }
                    else
                    {
                        literal.Append(@"\u");
                        literal.Append(((ushort)c).ToString("x4"));
                    }
                    break;
            }
        }
        literal.Append("\"");
        return literal.ToString();
    }
}

Interesting question.

If you can't find a better method, you can always replace.
In case you're opting for it, you could use this C# Escape Sequence List:

  • \' - single quote, needed for character literals
  • \" - double quote, needed for string literals
  • \ - backslash
  • \0 - Unicode character 0
  • \a - Alert (character 7)
  • \b - Backspace (character 8)
  • \f - Form feed (character 12)
  • \n - New line (character 10)
  • \r - Carriage return (character 13)
  • \t - Horizontal tab (character 9)
  • \v - Vertical quote (character 11)
  • \uxxxx - Unicode escape sequence for character with hex value xxxx
  • \xn[n][n][n] - Unicode escape sequence for character with hex value nnnn (variable length version of \uxxxx)
  • \Uxxxxxxxx - Unicode escape sequence for character with hex value xxxxxxxx (for generating surrogates)

This list can be found in the C# Frequently Asked Questions What character escape sequences are available?


If JSON conventions are enough for the unescaped strings you want to get escaped and you already use Newtonsoft.Jsonin your project (it has a pretty large overhead) you may use this package like the following:

using System;
using Newtonsoft.Json;

public class Program
{
    public static void Main()
    {
    Console.WriteLine(ToLiteral( @"abc\n123") );
    }

    private static string ToLiteral(string input){
        return JsonConvert.DeserializeObject<string>("\"" + input + "\"");
    }
}

EDIT: A more structured approach, including all escape sequences for strings and chars.
Doesn't replace unicode characters with their literal equivalent. Doesn't cook eggs, either.

public class ReplaceString
{
    static readonly IDictionary<string, string> m_replaceDict 
        = new Dictionary<string, string>();

    const string ms_regexEscapes = @"[\a\b\f\n\r\t\v\\""]";

    public static string StringLiteral(string i_string)
    {
        return Regex.Replace(i_string, ms_regexEscapes, match);
    }

    public static string CharLiteral(char c)
    {
        return c == '\'' ? @"'\''" : string.Format("'{0}'", c);
    }

    private static string match(Match m)
    {
        string match = m.ToString();
        if (m_replaceDict.ContainsKey(match))
        {
            return m_replaceDict[match];
        }

        throw new NotSupportedException();
    }

    static ReplaceString()
    {
        m_replaceDict.Add("\a", @"\a");
        m_replaceDict.Add("\b", @"\b");
        m_replaceDict.Add("\f", @"\f");
        m_replaceDict.Add("\n", @"\n");
        m_replaceDict.Add("\r", @"\r");
        m_replaceDict.Add("\t", @"\t");
        m_replaceDict.Add("\v", @"\v");

        m_replaceDict.Add("\\", @"\\");
        m_replaceDict.Add("\0", @"\0");

        //The SO parser gets fooled by the verbatim version 
        //of the string to replace - @"\"""
        //so use the 'regular' version
        m_replaceDict.Add("\"", "\\\""); 
    }

    static void Main(string[] args){

        string s = "here's a \"\n\tstring\" to test";
        Console.WriteLine(ReplaceString.StringLiteral(s));
        Console.WriteLine(ReplaceString.CharLiteral('c'));
        Console.WriteLine(ReplaceString.CharLiteral('\''));

    }
}

Hallgrim's answer was excellent. Here's a small tweak in case you need to parse out additional whitespace characters and linebreaks with a c# regular expression. I needed this in the case of a serialized Json value for insertion into google sheets and ran into trouble as the code was inserting tabs, +, spaces, etc.

  provider.GenerateCodeFromExpression(new CodePrimitiveExpression(input), writer, null);
  var literal = writer.ToString();
  var r2 = new Regex(@"\"" \+.\n[\s]+\""", RegexOptions.ECMAScript);
  literal = r2.Replace(literal, "");
  return literal;

Hallgrim's answer is excellent, but the "+", newline and indent additions were breaking functionality for me. An easy way around it is:

private static string ToLiteral(string input)
{
    using (var writer = new StringWriter())
    {
        using (var provider = CodeDomProvider.CreateProvider("CSharp"))
        {
            provider.GenerateCodeFromExpression(new CodePrimitiveExpression(input), writer, new CodeGeneratorOptions {IndentString = "\t"});
            var literal = writer.ToString();
            literal = literal.Replace(string.Format("\" +{0}\t\"", Environment.NewLine), "");
            return literal;
        }
    }
}

try:

var t = HttpUtility.JavaScriptStringEncode(s);

public static class StringHelpers
{
    private static Dictionary<string, string> escapeMapping = new Dictionary<string, string>()
    {
        {"\"", @"\\\"""},
        {"\\\\", @"\\"},
        {"\a", @"\a"},
        {"\b", @"\b"},
        {"\f", @"\f"},
        {"\n", @"\n"},
        {"\r", @"\r"},
        {"\t", @"\t"},
        {"\v", @"\v"},
        {"\0", @"\0"},
    };

    private static Regex escapeRegex = new Regex(string.Join("|", escapeMapping.Keys.ToArray()));

    public static string Escape(this string s)
    {
        return escapeRegex.Replace(s, EscapeMatchEval);
    }

    private static string EscapeMatchEval(Match m)
    {
        if (escapeMapping.ContainsKey(m.Value))
        {
            return escapeMapping[m.Value];
        }
        return escapeMapping[Regex.Escape(m.Value)];
    }
}

Interesting question.

If you can't find a better method, you can always replace.
In case you're opting for it, you could use this C# Escape Sequence List:

  • \' - single quote, needed for character literals
  • \" - double quote, needed for string literals
  • \ - backslash
  • \0 - Unicode character 0
  • \a - Alert (character 7)
  • \b - Backspace (character 8)
  • \f - Form feed (character 12)
  • \n - New line (character 10)
  • \r - Carriage return (character 13)
  • \t - Horizontal tab (character 9)
  • \v - Vertical quote (character 11)
  • \uxxxx - Unicode escape sequence for character with hex value xxxx
  • \xn[n][n][n] - Unicode escape sequence for character with hex value nnnn (variable length version of \uxxxx)
  • \Uxxxxxxxx - Unicode escape sequence for character with hex value xxxxxxxx (for generating surrogates)

This list can be found in the C# Frequently Asked Questions What character escape sequences are available?


public static class StringHelpers
{
    private static Dictionary<string, string> escapeMapping = new Dictionary<string, string>()
    {
        {"\"", @"\\\"""},
        {"\\\\", @"\\"},
        {"\a", @"\a"},
        {"\b", @"\b"},
        {"\f", @"\f"},
        {"\n", @"\n"},
        {"\r", @"\r"},
        {"\t", @"\t"},
        {"\v", @"\v"},
        {"\0", @"\0"},
    };

    private static Regex escapeRegex = new Regex(string.Join("|", escapeMapping.Keys.ToArray()));

    public static string Escape(this string s)
    {
        return escapeRegex.Replace(s, EscapeMatchEval);
    }

    private static string EscapeMatchEval(Match m)
    {
        if (escapeMapping.ContainsKey(m.Value))
        {
            return escapeMapping[m.Value];
        }
        return escapeMapping[Regex.Escape(m.Value)];
    }
}

Code:

string someString1 = "\tHello\r\n\tWorld!\r\n";
string someString2 = @"\tHello\r\n\tWorld!\r\n";

Console.WriteLine(someString1);
Console.WriteLine(someString2);

Output:

    Hello
    World!

\tHello\r\n\tWorld!\r\n

Is this what you want?


Interesting question.

If you can't find a better method, you can always replace.
In case you're opting for it, you could use this C# Escape Sequence List:

  • \' - single quote, needed for character literals
  • \" - double quote, needed for string literals
  • \ - backslash
  • \0 - Unicode character 0
  • \a - Alert (character 7)
  • \b - Backspace (character 8)
  • \f - Form feed (character 12)
  • \n - New line (character 10)
  • \r - Carriage return (character 13)
  • \t - Horizontal tab (character 9)
  • \v - Vertical quote (character 11)
  • \uxxxx - Unicode escape sequence for character with hex value xxxx
  • \xn[n][n][n] - Unicode escape sequence for character with hex value nnnn (variable length version of \uxxxx)
  • \Uxxxxxxxx - Unicode escape sequence for character with hex value xxxxxxxx (for generating surrogates)

This list can be found in the C# Frequently Asked Questions What character escape sequences are available?


My attempt at adding ToVerbatim to Hallgrim's accepted answer above:

private static string ToLiteral(string input)
{
    using (var writer = new StringWriter())
    {
        using (var provider = CodeDomProvider.CreateProvider("CSharp"))
        {
            provider.GenerateCodeFromExpression(new CodePrimitiveExpression(input), writer, new CodeGeneratorOptions { IndentString = "\t" });
            var literal = writer.ToString();
            literal = literal.Replace(string.Format("\" +{0}\t\"", Environment.NewLine), "");           
            return literal;
        }
    }
}

private static string ToVerbatim( string input )
{
    string literal = ToLiteral( input );
    string verbatim = "@" + literal.Replace( @"\r\n", Environment.NewLine );
    return verbatim;
}

I submit my own implementation, which handles null values and should be more performant on account of using array lookup tables, manual hex conversion, and avoiding switch statements.

using System;
using System.Text;
using System.Linq;

public static class StringLiteralEncoding {
  private static readonly char[] HEX_DIGIT_LOWER = "0123456789abcdef".ToCharArray();
  private static readonly char[] LITERALENCODE_ESCAPE_CHARS;

  static StringLiteralEncoding() {
    // Per http://msdn.microsoft.com/en-us/library/h21280bw.aspx
    var escapes = new string[] { "\aa", "\bb", "\ff", "\nn", "\rr", "\tt", "\vv", "\"\"", "\\\\", "??", "\00" };
    LITERALENCODE_ESCAPE_CHARS = new char[escapes.Max(e => e[0]) + 1];
    foreach(var escape in escapes)
      LITERALENCODE_ESCAPE_CHARS[escape[0]] = escape[1];
  }

  /// <summary>
  /// Convert the string to the equivalent C# string literal, enclosing the string in double quotes and inserting
  /// escape sequences as necessary.
  /// </summary>
  /// <param name="s">The string to be converted to a C# string literal.</param>
  /// <returns><paramref name="s"/> represented as a C# string literal.</returns>
  public static string Encode(string s) {
    if(null == s) return "null";

    var sb = new StringBuilder(s.Length + 2).Append('"');
    for(var rp = 0; rp < s.Length; rp++) {
      var c = s[rp];
      if(c < LITERALENCODE_ESCAPE_CHARS.Length && '\0' != LITERALENCODE_ESCAPE_CHARS[c])
        sb.Append('\\').Append(LITERALENCODE_ESCAPE_CHARS[c]);
      else if('~' >= c && c >= ' ')
        sb.Append(c);
      else
        sb.Append(@"\x")
          .Append(HEX_DIGIT_LOWER[c >> 12 & 0x0F])
          .Append(HEX_DIGIT_LOWER[c >>  8 & 0x0F])
          .Append(HEX_DIGIT_LOWER[c >>  4 & 0x0F])
          .Append(HEX_DIGIT_LOWER[c       & 0x0F]);
    }

    return sb.Append('"').ToString();
  }
}

There's a method for this in Roslyn's Microsoft.CodeAnalysis.CSharp package on nuget :

    private static string ToLiteral(string valueTextForCompiler)
    {
        return Microsoft.CodeAnalysis.CSharp.SymbolDisplay.FormatLiteral(valueTextForCompiler, false);
    }

Obviously this didn't exist at the time of the original question, but might help people who end up here from Google.


public static class StringHelpers
{
    private static Dictionary<string, string> escapeMapping = new Dictionary<string, string>()
    {
        {"\"", @"\\\"""},
        {"\\\\", @"\\"},
        {"\a", @"\a"},
        {"\b", @"\b"},
        {"\f", @"\f"},
        {"\n", @"\n"},
        {"\r", @"\r"},
        {"\t", @"\t"},
        {"\v", @"\v"},
        {"\0", @"\0"},
    };

    private static Regex escapeRegex = new Regex(string.Join("|", escapeMapping.Keys.ToArray()));

    public static string Escape(this string s)
    {
        return escapeRegex.Replace(s, EscapeMatchEval);
    }

    private static string EscapeMatchEval(Match m)
    {
        if (escapeMapping.ContainsKey(m.Value))
        {
            return escapeMapping[m.Value];
        }
        return escapeMapping[Regex.Escape(m.Value)];
    }
}

public static class StringEscape
{
  static char[] toEscape = "\0\x1\x2\x3\x4\x5\x6\a\b\t\n\v\f\r\xe\xf\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f\"\\".ToCharArray();
  static string[] literals = @"\0,\x0001,\x0002,\x0003,\x0004,\x0005,\x0006,\a,\b,\t,\n,\v,\f,\r,\x000e,\x000f,\x0010,\x0011,\x0012,\x0013,\x0014,\x0015,\x0016,\x0017,\x0018,\x0019,\x001a,\x001b,\x001c,\x001d,\x001e,\x001f".Split(new char[] { ',' });

  public static string Escape(this string input)
  {
    int i = input.IndexOfAny(toEscape);
    if (i < 0) return input;

    var sb = new System.Text.StringBuilder(input.Length + 5);
    int j = 0;
    do
    {
      sb.Append(input, j, i - j);
      var c = input[i];
      if (c < 0x20) sb.Append(literals[c]); else sb.Append(@"\").Append(c);
    } while ((i = input.IndexOfAny(toEscape, j = ++i)) > 0);

    return sb.Append(input, j, input.Length - j).ToString();
  }
}

Code:

string someString1 = "\tHello\r\n\tWorld!\r\n";
string someString2 = @"\tHello\r\n\tWorld!\r\n";

Console.WriteLine(someString1);
Console.WriteLine(someString2);

Output:

    Hello
    World!

\tHello\r\n\tWorld!\r\n

Is this what you want?


There's a method for this in Roslyn's Microsoft.CodeAnalysis.CSharp package on nuget :

    private static string ToLiteral(string valueTextForCompiler)
    {
        return Microsoft.CodeAnalysis.CSharp.SymbolDisplay.FormatLiteral(valueTextForCompiler, false);
    }

Obviously this didn't exist at the time of the original question, but might help people who end up here from Google.


Hallgrim's answer is excellent, but the "+", newline and indent additions were breaking functionality for me. An easy way around it is:

private static string ToLiteral(string input)
{
    using (var writer = new StringWriter())
    {
        using (var provider = CodeDomProvider.CreateProvider("CSharp"))
        {
            provider.GenerateCodeFromExpression(new CodePrimitiveExpression(input), writer, new CodeGeneratorOptions {IndentString = "\t"});
            var literal = writer.ToString();
            literal = literal.Replace(string.Format("\" +{0}\t\"", Environment.NewLine), "");
            return literal;
        }
    }
}

Examples related to c#

How can I convert this one line of ActionScript to C#? Microsoft Advertising SDK doesn't deliverer ads How to use a global array in C#? How to correctly write async method? C# - insert values from file into two arrays Uploading into folder in FTP? Are these methods thread safe? dotnet ef not found in .NET Core 3 HTTP Error 500.30 - ANCM In-Process Start Failure Best way to "push" into C# array

Examples related to string

How to split a string in two and store it in a field String method cannot be found in a main class method Kotlin - How to correctly concatenate a String Replacing a character from a certain index Remove quotes from String in Python Detect whether a Python string is a number or a letter How does String substring work in Swift How does String.Index work in Swift swift 3.0 Data to String? How to parse JSON string in Typescript

Examples related to escaping

Uses for the '&quot;' entity in HTML Javascript - How to show escape characters in a string? How to print a single backslash? How to escape special characters of a string with single backslashes Saving utf-8 texts with json.dumps as UTF8, not as \u escape sequence Properly escape a double quote in CSV How to Git stash pop specific stash in 1.8.3? In Java, should I escape a single quotation mark (') in String (double quoted)? How do I escape a single quote ( ' ) in JavaScript? Which characters need to be escaped when using Bash?