[c#] How to convert an object to a byte array in C#

I have a collection of objects that I need to write to a binary file.

I need the bytes in the file to be compact, so I can't use BinaryFormatter. BinaryFormatter throws in all sorts of info for deserialization needs.

If I try

byte[] myBytes = (byte[]) myObject 

I get a runtime exception.

I need this to be fast so I'd rather not be copying arrays of bytes around. I'd just like the cast byte[] myBytes = (byte[]) myObject to work!

OK just to be clear, I cannot have any metadata in the output file. Just the object bytes. Packed object-to-object. Based on answers received, it looks like I'll be writing low-level Buffer.BlockCopy code. Perhaps using unsafe code.

This question is related to c#

The answer is


I believe what you're trying to do is impossible.

The junk that BinaryFormatter creates is necessary to recover the object from the file after your program stopped.
However it is possible to get the object data, you just need to know the exact size of it (more difficult than it sounds) :

public static unsafe byte[] Binarize(object obj, int size)
{
    var r = new byte[size];
    var rf = __makeref(obj);
    var a = **(IntPtr**)(&rf);
    Marshal.Copy(a, r, 0, size);
    return res;
}

this can be recovered via:

public unsafe static dynamic ToObject(byte[] bytes)
{
    var rf = __makeref(bytes);
    **(int**)(&rf) += 8;
    return GCHandle.Alloc(bytes).Target;
}

The reason why the above methods don't work for serialization is that the first four bytes in the returned data correspond to a RuntimeTypeHandle. The RuntimeTypeHandle describes the layout/type of the object but the value of it changes every time the program is ran.

EDIT: that is stupid don't do that --> If you already know the type of the object to be deserialized for certain you can switch those bytes for BitConvertes.GetBytes((int)typeof(yourtype).TypeHandle.Value) at the time of deserialization.


To access the memory of an object directly (to do a "core dump") you'll need to head into unsafe code.

If you want something more compact than BinaryWriter or a raw memory dump will give you, then you need to write some custom serialisation code that extracts the critical information from the object and packs it in an optimal way.

edit P.S. It's very easy to wrap the BinaryWriter approach into a DeflateStream to compress the data, which will usually roughly halve the size of the data.


I found another way to convert an object to a byte[], here is my solution:

IEnumerable en = (IEnumerable) myObject;
byte[] myBytes = en.OfType<byte>().ToArray();

Regards


If you want the serialized data to be really compact, you can write serialization methods yourself. That way you will have a minimum of overhead.

Example:

public class MyClass {

   public int Id { get; set; }
   public string Name { get; set; }

   public byte[] Serialize() {
      using (MemoryStream m = new MemoryStream()) {
         using (BinaryWriter writer = new BinaryWriter(m)) {
            writer.Write(Id);
            writer.Write(Name);
         }
         return m.ToArray();
      }
   }

   public static MyClass Desserialize(byte[] data) {
      MyClass result = new MyClass();
      using (MemoryStream m = new MemoryStream(data)) {
         using (BinaryReader reader = new BinaryReader(m)) {
            result.Id = reader.ReadInt32();
            result.Name = reader.ReadString();
         }
      }
      return result;
   }

}

I took Crystalonics' answer and turned them into extension methods. I hope someone else will find them useful:

public static byte[] SerializeToByteArray(this object obj)
{
    if (obj == null)
    {
        return null;
    }
    var bf = new BinaryFormatter();
    using (var ms = new MemoryStream())
    {
        bf.Serialize(ms, obj);
        return ms.ToArray();
    }
}

public static T Deserialize<T>(this byte[] byteArray) where T : class
{
    if (byteArray == null)
    {
        return null;
    }
    using (var memStream = new MemoryStream())
    {
        var binForm = new BinaryFormatter();
        memStream.Write(byteArray, 0, byteArray.Length);
        memStream.Seek(0, SeekOrigin.Begin);
        var obj = (T)binForm.Deserialize(memStream);
        return obj;
    }
}

Well a cast from myObject to byte[] is never going to work unless you've got an explicit conversion or if myObject is a byte[]. You need a serialization framework of some kind. There are plenty out there, including Protocol Buffers which is near and dear to me. It's pretty "lean and mean" in terms of both space and time.

You'll find that almost all serialization frameworks have significant restrictions on what you can serialize, however - Protocol Buffers more than some, due to being cross-platform.

If you can give more requirements, we can help you out more - but it's never going to be as simple as casting...

EDIT: Just to respond to this:

I need my binary file to contain the object's bytes. Only the bytes, no metadata whatsoever. Packed object-to-object. So I'll be implementing custom serialization.

Please bear in mind that the bytes in your objects are quite often references... so you'll need to work out what to do with them.

I suspect you'll find that designing and implementing your own custom serialization framework is harder than you imagine.

I would personally recommend that if you only need to do this for a few specific types, you don't bother trying to come up with a general serialization framework. Just implement an instance method and a static method in all the types you need:

public void WriteTo(Stream stream)
public static WhateverType ReadFrom(Stream stream)

One thing to bear in mind: everything becomes more tricky if you've got inheritance involved. Without inheritance, if you know what type you're starting with, you don't need to include any type information. Of course, there's also the matter of versioning - do you need to worry about backward and forward compatibility with different versions of your types?


To convert an object to a byte array:

// Convert an object to a byte array
public static byte[] ObjectToByteArray(Object obj)
{
    BinaryFormatter bf = new BinaryFormatter();
    using (var ms = new MemoryStream())
    {
        bf.Serialize(ms, obj);
        return ms.ToArray();
    }
}

You just need copy this function to your code and send to it the object that you need to convert to a byte array. If you need convert the byte array to an object again you can use the function below:

// Convert a byte array to an Object
public static Object ByteArrayToObject(byte[] arrBytes)
{
    using (var memStream = new MemoryStream())
    {
        var binForm = new BinaryFormatter();
        memStream.Write(arrBytes, 0, arrBytes.Length);
        memStream.Seek(0, SeekOrigin.Begin);
        var obj = binForm.Deserialize(memStream);
        return obj;
    }
}

You can use these functions with custom classes. You just need add the [Serializable] attribute in your class to enable serialization


Take a look at Serialization, a technique to "convert" an entire object to a byte stream. You may send it to the network or write it into a file and then restore it back to an object later.


You are really talking about serialization, which can take many forms. Since you want small and binary, protocol buffers may be a viable option - giving version tolerance and portability as well. Unlike BinaryFormatter, the protocol buffers wire format doesn't include all the type metadata; just very terse markers to identify data.

In .NET there are a few implementations; in particular

I'd humbly argue that protobuf-net (which I wrote) allows more .NET-idiomatic usage with typical C# classes ("regular" protocol-buffers tends to demand code-generation); for example:

[ProtoContract]
public class Person {
   [ProtoMember(1)]
   public int Id {get;set;}
   [ProtoMember(2)]
   public string Name {get;set;}
}
....
Person person = new Person { Id = 123, Name = "abc" };
Serializer.Serialize(destStream, person);
...
Person anotherPerson = Serializer.Deserialize<Person>(sourceStream);

This worked for me:

byte[] bfoo = (byte[])foo;

foo is an Object that I'm 100% certain that is a byte array.