[c#] Converting string to byte array in C#

I'm converting something from VB into C#. Having a problem with the syntax of this statement:

if ((searchResult.Properties["user"].Count > 0))
{
    profile.User = System.Text.Encoding.UTF8.GetString(searchResult.Properties["user"][0]);
}

I then see the following errors:

Argument 1: cannot convert from 'object' to 'byte[]'

The best overloaded method match for 'System.Text.Encoding.GetString(byte[])' has some invalid arguments

I tried to fix the code based on this post, but still no success

string User = Encoding.UTF8.GetString("user", 0);

Any suggestions?

This question is related to c# string vb.net encoding byte

The answer is


A refinement to JustinStolle's edit (Eran Yogev's use of BlockCopy).

The proposed solution is indeed faster than using Encoding. Problem is that it doesn't work for encoding byte arrays of uneven length. As given, it raises an out-of-bound exception. Increasing the length by 1 leaves a trailing byte when decoding from string.

For me, the need came when I wanted to encode from DataTable to JSON. I was looking for a way to encode binary fields into strings and decode from string back to byte[].

I therefore created two classes - one that wraps the above solution (when encoding from strings it's fine, because the lengths are always even), and another that handles byte[] encoding.

I solved the uneven length problem by adding a single character that tells me if the original length of the binary array was odd ('1') or even ('0')

As follows:

public static class StringEncoder
{
    static byte[] EncodeToBytes(string str)
    {
        byte[] bytes = new byte[str.Length * sizeof(char)];
        System.Buffer.BlockCopy(str.ToCharArray(), 0, bytes, 0, bytes.Length);
        return bytes;
    }
    static string DecodeToString(byte[] bytes)
    {
        char[] chars = new char[bytes.Length / sizeof(char)];
        System.Buffer.BlockCopy(bytes, 0, chars, 0, bytes.Length);
        return new string(chars);
    }
}

public static class BytesEncoder
{
    public static string EncodeToString(byte[] bytes)
    {
        bool even = (bytes.Length % 2 == 0);
        char[] chars = new char[1 + bytes.Length / sizeof(char) + (even ? 0 : 1)];
        chars[0] = (even ? '0' : '1');
        System.Buffer.BlockCopy(bytes, 0, chars, 2, bytes.Length);

        return new string(chars);
    }
    public static byte[] DecodeToBytes(string str)
    {
        bool even = str[0] == '0';
        byte[] bytes = new byte[(str.Length - 1) * sizeof(char) + (even ? 0 : -1)];
        char[] chars = str.ToCharArray();
        System.Buffer.BlockCopy(chars, 2, bytes, 0, bytes.Length);

        return bytes;
    }
}

First of all, add the System.Text namespace

using System.Text;

Then use this code

string input = "some text"; 
byte[] array = Encoding.ASCII.GetBytes(input);

Hope to fix it!


This work for me, after that I could convert put my picture in a bytea field in my database.

using (MemoryStream s = new MemoryStream(DirEntry.Properties["thumbnailphoto"].Value as byte[]))
{
    return s.ToArray();
}

This what worked for me

byte[] bytes = Convert.FromBase64String(textString);

And in reverse:

string str = Convert.ToBase64String(bytes);

use this

byte[] myByte= System.Text.ASCIIEncoding.Default.GetBytes(myString);

Encoding.Default should not be used...

@Randall's answer uses Encoding.Default, however Microsoft raises a warning against it:

Different computers can use different encodings as the default, and the default encoding can change on a single computer. If you use the Default encoding to encode and decode data streamed between computers or retrieved at different times on the same computer, it may translate that data incorrectly. In addition, the encoding returned by the Default property uses best-fit fallback to map unsupported characters to characters supported by the code page. For these reasons, using the default encoding is not recommended. To ensure that encoded bytes are decoded properly, you should use a Unicode encoding, such as UTF8Encoding or UnicodeEncoding. You could also use a higher-level protocol to ensure that the same format is used for encoding and decoding.

To check what the default encoding is, use Encoding.Default.WindowsCodePage (1250 in my case - and sadly, there is no predefined class of CP1250 encoding, but the object could be retrieved as Encoding.GetEncoding(1250)).

...UTF-8 encoding should be used instead...

Encoding.ASCII is 7bit, so it doesn't work either, in my case:

byte[] pass = Encoding.ASCII.GetBytes("šarže");
Console.WriteLine(Encoding.ASCII.GetString(pass)); // ?ar?e

Following Microsoft's recommendation:

var utf8 = new UTF8Encoding();
byte[] pass = utf8.GetBytes("šarže");
Console.WriteLine(utf8.GetString(pass)); // šarže

Encoding.UTF8 recommended by others is an instance uf UTF-8 encoding and can be also used directly or as

var utf8 = Encoding.UTF8 as UTF8Encoding;

...but it is not used always

Default encoding is misleading: .NET uses UTF-8 everywhere (including strings hardcoded in the source code), but Windows actually uses 2 other non-UTF8 non-standard defaults: ANSI codepage (for GUI apps before .NET) and OEM codepage (aka DOS standard). These differs from country to country (for instance, Windows Czech edition uses CP1250 and CP852) and are oftentimes hardcoded in windows API libraries. So if you just set UTF-8 to console by chcp 65001 (as .NET implicitly does and pretends it is the default) and run some localized command (like ping), it works in English version, but you get tofu text in Czech Republic.

Let me share my real world experience: I created WinForms application customizing git scripts for teachers. The output is obtained on the background anynchronously by a process described by Microsoft as (bold text added by me):

The word "shell" in this context (UseShellExecute) refers to a graphical shell (ANSI CP) (similar to the Windows shell) rather than command shells (for example, bash or sh) (OEM CP) and lets users launch graphical applications or open documents (with messed output in non-US environment).

So effectively GUI defaults to UTF-8, process defaults to CP1250 and console defaults to 852. So the output is in 852 interpreted as UTF-8 interpreted as CP1250. I got tofu text from which I could not deduce the original codepage due to the double conversion. I was pulling my hair for a week to figure out to explicitly set UTF-8 for process script and convert the output from CP1250 to UTF-8 in the main thread. Now it works here in the Eastern Europe, but Western Europe Windows uses 1252. ANSI CP is not determined easily as many commands like systeminfo are also localized and other methods differs from version to version: in such environment displaying national characters reliably is almost unfeasible.

So until the half of 21st century, please DO NOT use any "Default Codepage" and set it explicitly (to UTF-8 if possible).


The following approach will work only if the chars are 1 byte. (Default unicode will not work since it is 2 bytes)

public static byte[] ToByteArray(string value)
{            
    char[] charArr = value.ToCharArray();
    byte[] bytes = new byte[charArr.Length];
    for (int i = 0; i < charArr.Length; i++)
    {
        byte current = Convert.ToByte(charArr[i]);
        bytes[i] = current;
    }

    return bytes;
}

Keeping it simple


If you already have a byte array then you will need to know what type of encoding was used to make it into that byte array.

For example, if the byte array was created like this:

byte[] bytes = Encoding.ASCII.GetBytes(someString);

You will need to turn it back into a string like this:

string someString = Encoding.ASCII.GetString(bytes);

If you can find in the code you inherited, the encoding used to create the byte array then you should be set.


Building off Ali's answer, I would recommend an extension method that allows you to optionally pass in the encoding you want to use:

using System.Text;
public static class StringExtensions
{
    /// <summary>
    /// Creates a byte array from the string, using the 
    /// System.Text.Encoding.Default encoding unless another is specified.
    /// </summary>
    public static byte[] ToByteArray(this string str, Encoding encoding = Encoding.Default)
    {
        return encoding.GetBytes(str);
    }
}

And use it like below:

string foo = "bla bla";

// default encoding
byte[] default = foo.ToByteArray();

// custom encoding
byte[] unicode = foo.ToByteArray(Encoding.Unicode);

static byte[] GetBytes(string str)
{
     byte[] bytes = new byte[str.Length * sizeof(char)];
     System.Buffer.BlockCopy(str.ToCharArray(), 0, bytes, 0, bytes.Length);
     return bytes;
}

static string GetString(byte[] bytes)
{
     char[] chars = new char[bytes.Length / sizeof(char)];
     System.Buffer.BlockCopy(bytes, 0, chars, 0, bytes.Length);
     return new string(chars);
}

Does anyone see any reason why not to do this?

mystring.Select(Convert.ToByte).ToArray()

This question has been answered sufficiently many times, but with C# 7.2 and the introduction of the Span type, there is a faster way to do this in unsafe code:

public static class StringSupport
{
    private static readonly int _charSize = sizeof(char);

    public static unsafe byte[] GetBytes(string str)
    {
        if (str == null) throw new ArgumentNullException(nameof(str));
        if (str.Length == 0) return new byte[0];

        fixed (char* p = str)
        {
            return new Span<byte>(p, str.Length * _charSize).ToArray();
        }
    }

    public static unsafe string GetString(byte[] bytes)
    {
        if (bytes == null) throw new ArgumentNullException(nameof(bytes));
        if (bytes.Length % _charSize != 0) throw new ArgumentException($"Invalid {nameof(bytes)} length");
        if (bytes.Length == 0) return string.Empty;

        fixed (byte* p = bytes)
        {
            return new string(new Span<char>(p, bytes.Length / _charSize));
        }
    }
}

Keep in mind that the bytes represent a UTF-16 encoded string (called "Unicode" in C# land).

Some quick benchmarking shows that the above methods are roughly 5x faster than their Encoding.Unicode.GetBytes(...)/GetString(...) implementations for medium sized strings (30-50 chars), and even faster for larger strings. These methods also seem to be faster than using pointers with Marshal.Copy(..) or Buffer.MemoryCopy(...).


This has been answered quite a lot, but for me, the only working method is this one:

    public static byte[] StringToByteArray(string str)
    {
        byte[] array = Convert.FromBase64String(str);
        return array;
    }

You could use MemoryMarshal API to perform very fast and efficient conversion. String will implicitly be cast to ReadOnlySpan<byte>, as MemoryMarshal.Cast accepts either Span<byte> or ReadOnlySpan<byte> as an input parameter.

public static class StringExtensions
{
    public static byte[] ToByteArray(this string s) => s.ToByteSpan().ToArray(); //  heap allocation, use only when you cannot operate on spans
    public static ReadOnlySpan<byte> ToByteSpan(this string s) => MemoryMarshal.Cast<char, byte>(s);
}

Following benchmark shows the difference:

Input: "Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s,"

|                       Method |       Mean |     Error |    StdDev |  Gen 0 | Gen 1 | Gen 2 | Allocated |
|----------------------------- |-----------:|----------:|----------:|-------:|------:|------:|----------:|
| UsingEncodingUnicodeGetBytes | 160.042 ns | 3.2864 ns | 6.4099 ns | 0.0780 |     - |     - |     328 B |
| UsingMemoryMarshalAndToArray |  31.977 ns | 0.7177 ns | 1.5753 ns | 0.0781 |     - |     - |     328 B |
|           UsingMemoryMarshal |   1.027 ns | 0.0565 ns | 0.1630 ns |      - |     - |     - |         - |

var result = System.Text.Encoding.Unicode.GetBytes(text);

If the result of, 'searchResult.Properties [ "user" ] [ 0 ]', is a string:

if ( ( searchResult.Properties [ "user" ].Count > 0 ) ) {

   profile.User = System.Text.Encoding.UTF8.GetString ( searchResult.Properties [ "user" ] [ 0 ].ToCharArray ().Select ( character => ( byte ) character ).ToArray () );

}

The key point being that converting a string to a byte [] can be done using LINQ:

.ToCharArray ().Select ( character => ( byte ) character ).ToArray () )

And the inverse:

.Select ( character => ( char ) character ).ToArray () )

Also you can use an Extension Method to add a method to the string type as below:

static class Helper
{
   public static byte[] ToByteArray(this string str)
   {
      return System.Text.Encoding.ASCII.GetBytes(str);
   }
}

And use it like below:

string foo = "bla bla";
byte[] result = foo.ToByteArray();

Examples related to c#

How can I convert this one line of ActionScript to C#? Microsoft Advertising SDK doesn't deliverer ads How to use a global array in C#? How to correctly write async method? C# - insert values from file into two arrays Uploading into folder in FTP? Are these methods thread safe? dotnet ef not found in .NET Core 3 HTTP Error 500.30 - ANCM In-Process Start Failure Best way to "push" into C# array

Examples related to string

How to split a string in two and store it in a field String method cannot be found in a main class method Kotlin - How to correctly concatenate a String Replacing a character from a certain index Remove quotes from String in Python Detect whether a Python string is a number or a letter How does String substring work in Swift How does String.Index work in Swift swift 3.0 Data to String? How to parse JSON string in Typescript

Examples related to vb.net

How to get parameter value for date/time column from empty MaskedTextBox HTTP 415 unsupported media type error when calling Web API 2 endpoint variable is not declared it may be inaccessible due to its protection level Differences Between vbLf, vbCrLf & vbCr Constants Simple working Example of json.net in VB.net How to open up a form from another form in VB.NET? Delete a row in DataGridView Control in VB.NET How to get cell value from DataGridView in VB.Net? Set default format of datetimepicker as dd-MM-yyyy How to configure SMTP settings in web.config

Examples related to encoding

How to check encoding of a CSV file UnicodeEncodeError: 'ascii' codec can't encode character at special name Using Javascript's atob to decode base64 doesn't properly decode utf-8 strings What is the difference between utf8mb4 and utf8 charsets in MySQL? The character encoding of the plain text document was not declared - mootool script UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 23: ordinal not in range(128) How to encode text to base64 in python UTF-8 output from PowerShell Set Encoding of File to UTF8 With BOM in Sublime Text 3 Replace non-ASCII characters with a single space

Examples related to byte

Convert bytes to int? TypeError: a bytes-like object is required, not 'str' when writing to a file in Python3 How many bits is a "word"? How many characters can you store with 1 byte? Save byte array to file Convert dictionary to bytes and back again python? How do I get total physical memory size using PowerShell without WMI? Hashing with SHA1 Algorithm in C# How to create a byte array in C++? Converting string to byte array in C#