145

I have a collection of objects that I need to write to a binary file.

I need the bytes in the file to be compact, so I can't use BinaryFormatter. BinaryFormatter throws in all sorts of info for deserialization needs.

If I try

byte[] myBytes = (byte[]) myObject 

I get a runtime exception.

I need this to be fast so I'd rather not be copying arrays of bytes around. I'd just like the cast byte[] myBytes = (byte[]) myObject to work!

OK just to be clear, I cannot have any metadata in the output file. Just the object bytes. Packed object-to-object. Based on answers received, it looks like I'll be writing low-level Buffer.BlockCopy code. Perhaps using unsafe code.
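For reference, a minimal sketch of the Buffer.BlockCopy route for primitive arrays (custom structs still need unsafe code or marshalling, since Buffer.BlockCopy only accepts arrays of primitive types):

```csharp
using System;

// Hypothetical example: packing an int[] into a byte[] with no metadata.
// Note: Buffer.BlockCopy only accepts arrays of primitive types, so an
// array of custom structs still needs unsafe code or MemoryMarshal.
int[] values = { 1, 2, 3 };
byte[] bytes = new byte[values.Length * sizeof(int)];
Buffer.BlockCopy(values, 0, bytes, 0, bytes.Length);

// And back again -- byte-for-byte, no headers:
int[] roundTrip = new int[values.Length];
Buffer.BlockCopy(bytes, 0, roundTrip, 0, bytes.Length);
```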

1
  • BinaryFormatter is now obsolete and cannot be made safe. Commented Aug 20, 2024 at 18:24

17 Answers

230

To convert an object to a byte array:

// Convert an object to a byte array
public static byte[] ObjectToByteArray(Object obj)
{
    BinaryFormatter bf = new BinaryFormatter();
    using (var ms = new MemoryStream())
    {
        bf.Serialize(ms, obj);
        return ms.ToArray();
    }
}

Just copy this function into your code and pass it the object you need to convert to a byte array. If you need to convert the byte array back to an object, you can use the function below:

// Convert a byte array to an Object
public static Object ByteArrayToObject(byte[] arrBytes)
{
    using (var memStream = new MemoryStream())
    {
        var binForm = new BinaryFormatter();
        memStream.Write(arrBytes, 0, arrBytes.Length);
        memStream.Seek(0, SeekOrigin.Begin);
        var obj = binForm.Deserialize(memStream);
        return obj;
    }
}

You can use these functions with custom classes. You just need to add the [Serializable] attribute to your class to enable serialization.


7 Comments

I tried this and it added all sorts of metadata. The OP said he did not want metadata.
Not to mention, everyone seems to assume that what you're trying to serialize is something that you have written, or has already been pre set up to be serialized.
You can pass the byte array directly to the constructor of MemoryStream in the second code example. This would eliminate the use of Write(...) and Seek(...).
Use of binary formatter is now considered unsafe. learn.microsoft.com/en-us/dotnet/standard/serialization/…
I am working with a PDF object from a third-party vendor that is marked as sealed, so I can't mark the class as [Serializable].
59

If you want the serialized data to be really compact, you can write serialization methods yourself. That way you will have a minimum of overhead.

Example:

public class MyClass {

   public int Id { get; set; }
   public string Name { get; set; }

   public byte[] Serialize() {
      using (MemoryStream m = new MemoryStream()) {
         using (BinaryWriter writer = new BinaryWriter(m)) {
            writer.Write(Id);
            writer.Write(Name);
         }
         return m.ToArray();
      }
   }

   public static MyClass Deserialize(byte[] data) {
      MyClass result = new MyClass();
      using (MemoryStream m = new MemoryStream(data)) {
         using (BinaryReader reader = new BinaryReader(m)) {
            result.Id = reader.ReadInt32();
            result.Name = reader.ReadString();
         }
      }
      return result;
   }

}
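For example, the same pattern extends to any number of fields, and enums can be written by casting to their underlying integer type. A sketch using a hypothetical Color enum:

```csharp
using System;
using System.IO;

public enum Color { Red, Green, Blue }

public class Shape
{
    public int Sides { get; set; }
    public string Label { get; set; }
    public Color Fill { get; set; }

    public byte[] Serialize()
    {
        using (var m = new MemoryStream())
        {
            using (var writer = new BinaryWriter(m))
            {
                writer.Write(Sides);
                writer.Write(Label);
                writer.Write((int)Fill); // enum stored as its underlying int
            }
            return m.ToArray();
        }
    }

    public static Shape Deserialize(byte[] data)
    {
        var result = new Shape();
        using (var m = new MemoryStream(data))
        using (var reader = new BinaryReader(m))
        {
            result.Sides = reader.ReadInt32();
            result.Label = reader.ReadString();
            result.Fill = (Color)reader.ReadInt32(); // read back in the same order
        }
        return result;
    }
}
```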

7 Comments

What if I have several ints to write, and several strings?
@Smith: Yes, you can do that, just write them after each other. The BinaryWriter will write them in a format that the BinaryReader can read, as long as you write and read them in the same order.
What is the difference between BinaryWriter/Reader and using a BinaryFormatter?
@Smith: Using BinaryWriter/Reader you do the serialisation/deserialisation yourself, and you can write/read only the data that is absolutely needed, as compact as possible. The BinaryFormatter uses reflection to find out what data to write/read, and uses a format that works for all possible cases. It also includes the meta information about the format in the stream, so that adds even more overhead.
@Smith: You can cast the enum to int (or if you have specified any other type as storage for the enum) and write it. When you read it you can cast it to the enum type.
38

Use of BinaryFormatter is now considered unsafe; see the Microsoft Docs.

Just use System.Text.Json:

To serialize to bytes:

JsonSerializer.SerializeToUtf8Bytes(obj);

To deserialize to your type:

JsonSerializer.Deserialize&lt;YourType&gt;(byteArray);
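A round-trip sketch, assuming a hypothetical Person class (any plain type with public properties works):

```csharp
using System.Text.Json;

var person = new Person { Id = 123, Name = "abc" };

// Serialize straight to UTF-8 bytes, skipping the intermediate string.
byte[] utf8Bytes = JsonSerializer.SerializeToUtf8Bytes(person);

// Deserialize back to the known type.
Person restored = JsonSerializer.Deserialize<Person>(utf8Bytes);

public class Person
{
    public int Id { get; set; }
    public string Name { get; set; }
}
```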

3 Comments

It seems that System.Text.Json is not available for .NET Framework out of the box (it can be added as a NuGet package).
As of .NET 5 this is recommended.
The System.Text.Json package is built in from .NET Core 3.0.
32

Well a cast from myObject to byte[] is never going to work unless you've got an explicit conversion or if myObject is a byte[]. You need a serialization framework of some kind. There are plenty out there, including Protocol Buffers which is near and dear to me. It's pretty "lean and mean" in terms of both space and time.

You'll find that almost all serialization frameworks have significant restrictions on what you can serialize, however - Protocol Buffers more than some, due to being cross-platform.

If you can give more requirements, we can help you out more - but it's never going to be as simple as casting...

EDIT: Just to respond to this:

I need my binary file to contain the object's bytes. Only the bytes, no metadata whatsoever. Packed object-to-object. So I'll be implementing custom serialization.

Please bear in mind that the bytes in your objects are quite often references... so you'll need to work out what to do with them.

I suspect you'll find that designing and implementing your own custom serialization framework is harder than you imagine.

I would personally recommend that if you only need to do this for a few specific types, you don't bother trying to come up with a general serialization framework. Just implement an instance method and a static method in all the types you need:

public void WriteTo(Stream stream)
public static WhateverType ReadFrom(Stream stream)

One thing to bear in mind: everything becomes more tricky if you've got inheritance involved. Without inheritance, if you know what type you're starting with, you don't need to include any type information. Of course, there's also the matter of versioning - do you need to worry about backward and forward compatibility with different versions of your types?
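A minimal sketch of that pattern for a hypothetical type (leaveOpen: true keeps the caller's stream usable afterwards):

```csharp
using System.IO;
using System.Text;

public class Measurement
{
    public int Id { get; set; }
    public double Value { get; set; }

    // Writes only the raw field bytes -- no type metadata.
    public void WriteTo(Stream stream)
    {
        using (var writer = new BinaryWriter(stream, Encoding.UTF8, leaveOpen: true))
        {
            writer.Write(Id);
            writer.Write(Value);
        }
    }

    // Reads fields back in the same order they were written.
    public static Measurement ReadFrom(Stream stream)
    {
        using (var reader = new BinaryReader(stream, Encoding.UTF8, leaveOpen: true))
        {
            return new Measurement
            {
                Id = reader.ReadInt32(),
                Value = reader.ReadDouble()
            };
        }
    }
}
```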

4 Comments

Is it more correct for me to refer to this as "protobuf-csharp-port" (Google-code), or "dotnet-protobufs" (Git)?
I need my binary file to contain the object's bytes. Only the bytes, no metadata whatsoever. Packed object-to-object. So I'll be implementing custom serialization.
The risk of zero metadata is that you are then very version-intolerant, as it has very few ways of allowing flexibility before it is too late. Protocol buffers is pretty data-dense. Do you really need that extra turn of the screw?
@Marc: And of course for integers, PB can end up being denser than the raw bytes...
22

I took Crystalonics' answer and turned it into extension methods. I hope someone else will find them useful:

public static byte[] SerializeToByteArray(this object obj)
{
    if (obj == null)
    {
        return null;
    }
    var bf = new BinaryFormatter();
    using (var ms = new MemoryStream())
    {
        bf.Serialize(ms, obj);
        return ms.ToArray();
    }
}

public static T Deserialize<T>(this byte[] byteArray) where T : class
{
    if (byteArray == null)
    {
        return null;
    }
    using (var memStream = new MemoryStream())
    {
        var binForm = new BinaryFormatter();
        memStream.Write(byteArray, 0, byteArray.Length);
        memStream.Seek(0, SeekOrigin.Begin);
        var obj = (T)binForm.Deserialize(memStream);
        return obj;
    }
}

1 Comment

14

You are really talking about serialization, which can take many forms. Since you want small and binary, protocol buffers may be a viable option - giving version tolerance and portability as well. Unlike BinaryFormatter, the protocol buffers wire format doesn't include all the type metadata; just very terse markers to identify data.

In .NET there are a few implementations. I'd humbly argue that protobuf-net (which I wrote) allows more .NET-idiomatic usage with typical C# classes ("regular" protocol-buffers tends to demand code-generation); for example:

[ProtoContract]
public class Person {
   [ProtoMember(1)]
   public int Id {get;set;}
   [ProtoMember(2)]
   public string Name {get;set;}
}
....
Person person = new Person { Id = 123, Name = "abc" };
Serializer.Serialize(destStream, person);
...
Person anotherPerson = Serializer.Deserialize<Person>(sourceStream);

4 Comments

Even "terse markers" are still metadata. My understanding of what the OP wanted was nothing but the data in the object. So, for example, if the object was a struct with 2 32-bit integers, then he would expect the result to be a byte array of 8 bytes.
@user316117 which is then a real pain for versioning. Each approach has advantages and disadvantages.
Is there a way to avoid using the Proto* attributes? The entities I want to use are in a 3rd-party library.
4

This worked for me:

byte[] bfoo = (byte[])foo;

foo is an Object that I'm 100% certain is a byte array.

Comments

4

I found that this method worked correctly for me, using Newtonsoft.Json:

public TData ByteToObj<TData>(byte[] arr)
{
    return JsonConvert.DeserializeObject<TData>(Encoding.UTF8.GetString(arr));
}

public byte[] ObjToByte<TData>(TData data)
{
    var json = JsonConvert.SerializeObject(data);
    return Encoding.UTF8.GetBytes(json);
}

Comments

2

Take a look at Serialization, a technique to "convert" an entire object to a byte stream. You may send it to the network or write it into a file and then restore it back to an object later.

2 Comments

I think chuckhlogan explicitly declined that (Formatter==Serialization).
@Henk - it depends what the reasons are; he mentioned the extra info, which I take to be type metadata and field info; you can use serialization without that overhead; just not with BinaryFormatter.
1

To access the memory of an object directly (to do a "core dump") you'll need to head into unsafe code.

If you want something more compact than BinaryWriter or a raw memory dump will give you, then you need to write some custom serialisation code that extracts the critical information from the object and packs it in an optimal way.

edit P.S. It's very easy to wrap the BinaryWriter approach into a DeflateStream to compress the data, which will usually roughly halve the size of the data.
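A sketch of that wrapping (using a MemoryStream here for brevity; the actual compression ratio depends entirely on the data):

```csharp
using System.IO;
using System.IO.Compression;

// Compress while writing: wrap the destination stream in a DeflateStream.
byte[] compressed;
using (var buffer = new MemoryStream())
{
    using (var deflate = new DeflateStream(buffer, CompressionMode.Compress, leaveOpen: true))
    using (var writer = new BinaryWriter(deflate))
    {
        writer.Write(42);
        writer.Write("hello");
    }
    compressed = buffer.ToArray();
}

// Decompress while reading.
int number;
string text;
using (var source = new MemoryStream(compressed))
using (var inflate = new DeflateStream(source, CompressionMode.Decompress))
using (var reader = new BinaryReader(inflate))
{
    number = reader.ReadInt32();   // 42
    text = reader.ReadString();    // "hello"
}
```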

1 Comment

Unsafe code isn't enough. C# and CLR still won't let you take a raw pointer to a managed object even in unsafe code, or put two object references in a union.
1

I believe what you're trying to do is impossible.

The junk that BinaryFormatter creates is necessary to recover the object from the file after your program has stopped.
However, it is possible to get the object data; you just need to know its exact size (more difficult than it sounds):

public static unsafe byte[] Binarize(object obj, int size)
{
    var r = new byte[size];
    var rf = __makeref(obj);
    var a = **(IntPtr**)(&rf);
    Marshal.Copy(a, r, 0, size);
    return r;
}

This can be recovered via:

public unsafe static dynamic ToObject(byte[] bytes)
{
    var rf = __makeref(bytes);
    **(int**)(&rf) += 8;
    return GCHandle.Alloc(bytes).Target;
}

The reason why the above methods don't work for serialization is that the first four bytes in the returned data correspond to a RuntimeTypeHandle. The RuntimeTypeHandle describes the layout/type of the object, but its value changes every time the program is run.

EDIT: that is stupid, don't do that --> If you already know for certain the type of the object to be deserialized, you can switch those bytes for BitConverter.GetBytes((int)typeof(yourtype).TypeHandle.Value) at the time of deserialization.

Comments

1

This method returns an array of bytes from an object.

private byte[] ConvertBody(object model)
{
    return Encoding.UTF8.GetBytes(JsonConvert.SerializeObject(model));
}

Comments

0

I found another way to convert an object to a byte[], here is my solution:

IEnumerable en = (IEnumerable) myObject;
byte[] myBytes = en.OfType<byte>().ToArray();


1 Comment

I don't think this method converts an object to byte[]; rather, it filters the object's elements for the given type (here byte) and returns those.
0

Because of this compiler warning:

SerialDeserializerDefaultConcrete.cs(50, 17): [SYSLIB0011] 'BinaryFormatter.Serialize(Stream, object)' is obsolete: 'BinaryFormatter serialization is obsolete and should not be used. See https://aka.ms/binaryformatter for more information.'

I have moved to the json solutions for later .net (core based) target frameworks.

And because of the tension between pre-System.Text.Json and System.Text.Json targets, I have created a "both" answer.

Note: my answer is a mixture of everything else above.

I have added interface and concrete encapsulation. I believe in "write to an interface, not a concrete".

namespace MyStuff.Interfaces
{
    public interface ISerialDeserializer<T> where T : new()
    {
        byte[] SerializeToByteArray(T obj);

        T Deserialize(byte[] byteArray);
    }
}



#if NET6_0_OR_GREATER
using System.Text.Json;
#endif

#if !NET6_0_OR_GREATER
using System;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;
#endif


using MyStuff.Interfaces;

namespace MyStuff.Concrete
{
    public class SerialDeserializerDefaultConcrete<T> : ISerialDeserializer<T> where T : new()
    {
#if NET6_0_OR_GREATER
        public byte[] SerializeToByteArray(T obj)
        {
            if (obj == null)
            {
                return null;
            }

            return JsonSerializer.SerializeToUtf8Bytes(obj);
        }

        public T Deserialize(byte[] byteArray)
        {
            if (byteArray == null)
            {
                return default(T);
            }

            return JsonSerializer.Deserialize<T>(byteArray);
        }
#endif

#if !NET6_0_OR_GREATER
        public byte[] SerializeToByteArray(T obj)
        {
            if (obj == null)
            {
                return null;
            }

            var bf = new BinaryFormatter();
            using (var ms = new MemoryStream())
            {
                bf.Serialize(ms, obj);
                return ms.ToArray();
            }
        }

        public T Deserialize(byte[] byteArray)
        {
            if (byteArray == null)
            {
                return default(T);
            }

            using (var memStream = new MemoryStream())
            {
                var binForm = new BinaryFormatter();
                memStream.Write(byteArray, 0, byteArray.Length);
                memStream.Seek(0, SeekOrigin.Begin);
                var obj = (T) binForm.Deserialize(memStream);
                return obj;
            }
        }
#endif
    }
}

and the target-frameworks of my csproj:

<PropertyGroup>
    <TargetFrameworks>netstandard2.0;netstandard2.1;net6.0</TargetFrameworks>
</PropertyGroup>

You could also (probably) use Newtonsoft for pre-6.0 frameworks. Newtonsoft references have sometimes been problematic, which is why I went with the MemoryStream version for pre-System.Text.Json frameworks.

Comments

0

Ever since spans have been added, you have been able to use two MemoryMarshal functions that can get all bytes of a value type. Under the hood, it is just a little bit of casting. Just like you asked, there are no extra allocations going down to the bytes unless you copy them to an array or another span.

Be careful: structs passed here should have an explicit layout, because otherwise there is no guarantee that the layout stays the same across runtimes.

Here is an example of the two functions in use to get the bytes of one:

public static Span<byte> GetBytes<T>(ref T o)
    where T : unmanaged
{
    var singletonSpan = MemoryMarshal.CreateSpan(ref o, 1);
    var bytes = MemoryMarshal.AsBytes(singletonSpan);
    return bytes;
}

The unmanaged constraint guarantees the object is not and does not contain a reference type. If the constraint was any degree less, this method would lead to undefined behavior. Imagine trying to deserialize an object and it contains a dangling pointer now.

The first function, MemoryMarshal.CreateSpan, creates a span whose single element is the passed o argument. This is a view over the same area of memory, so any changes made through the span are reflected in o, and vice versa.

The second function, MemoryMarshal.AsBytes, takes a span and creates a span of bytes. This span is a view over the argument span so any changes to the bytes will be reflected within the given span and vice versa.
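Going the other way, the bytes can be read back into a value with MemoryMarshal.Read, assuming the same struct layout on both sides. A sketch pairing it with the GetBytes method above:

```csharp
using System;
using System.Runtime.InteropServices;

public static class SpanRoundTrip
{
    // View a value's memory as bytes, with no copying or allocation.
    public static Span<byte> GetBytes<T>(ref T o) where T : unmanaged
        => MemoryMarshal.AsBytes(MemoryMarshal.CreateSpan(ref o, 1));

    // Reinterpret the first sizeof(T) bytes of the span as a T.
    public static T FromBytes<T>(ReadOnlySpan<byte> bytes) where T : unmanaged
        => MemoryMarshal.Read<T>(bytes);
}
```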

2 Comments

Are you sure this works for objects/reference types? I get the error T must be a non-nullable value type so it seems like this only works for value types and not reference types.
@kkuilla It won't work with objects, hence the generic's struct constraint. There aren't enough guarantees to say that an object will contain only the fields you've declared in an order that will remain the same across runtimes. You can force it with a struct using an explicit struct layout with attributes. I believe that's about as safe as it gets in the unsafe area. JSON should be fine for almost all use cases though.
0

The methods in BitConverter will convert the various types (bool, short, int, long) etc to/from an array of bytes. BinaryWriter and BinaryReader have ReadBytes and Write methods that you can pass the results from BitConverter.

I have a serialiser class that takes an object and converts to and from byte[]. A snippet of it is here:

public class Serialiser
{
        public static byte[] GetBytes<T>(T value) => GetBytes(typeof(T), value);


        /// <summary>
        /// Converts the supplied value to an array of bytes.
        /// </summary>
        /// <param name="type">The type of the value.</param>
        /// <param name="value">The value.</param>
        /// <returns>An array of bytes that is the object.</returns>
        /// <exception cref="NotSupportedException">The type of object or it members is not a supported type.</exception>
        public static byte[] GetBytes(Type type, object value)
        {
            switch (Type.GetTypeCode(type))
            {
                case TypeCode.Boolean: return BitConverter.GetBytes((bool)value);
                case TypeCode.Byte: return new[] { (byte)value };
                case TypeCode.SByte: return new[] { unchecked((byte)(sbyte)value) };
                case TypeCode.Char: return BitConverter.GetBytes((char)value);
                case TypeCode.Int16: return BitConverter.GetBytes((short)value);
                case TypeCode.UInt16: return BitConverter.GetBytes((ushort)value);
                case TypeCode.Int32: return BitConverter.GetBytes((int)value);
                case TypeCode.UInt32: return BitConverter.GetBytes((uint)value);
                case TypeCode.Int64: return BitConverter.GetBytes((long)value);
                case TypeCode.UInt64: return BitConverter.GetBytes((ulong)value);
                case TypeCode.Single: return BitConverter.GetBytes((float)value);
                case TypeCode.Double: return BitConverter.GetBytes((double)value);
                case TypeCode.String:
                    var strBytes = ASCIIEncoding.ASCII.GetBytes((string)value);
                    return strBytes;
                case TypeCode.Object:
                    var builder = new ByteBuilder();
                    foreach (var prop in type.GetProperties(BindingFlags.Public | BindingFlags.Instance))
                    {
                        var bytes = GetBytes(prop.PropertyType, prop.GetValue(value));
                        if (!builder.HasSpace((ushort)bytes.Length))
                            builder.IncreaseCount(bytes.Length);
                        builder.Add(bytes);
                    }
                    return builder.ToByteArray();
                case TypeCode.DateTime:
                default:
                    throw new NotSupportedException($"{type.Name} is not a supported type");
            }
        }

        public static T To<T>(byte[] bytes) => (T) To(typeof(T), bytes);


        /// <summary>
        /// Converts the supplied array of <see cref="System.Byte"/>s to the specified type.
        /// </summary>
        /// <param name="type">The type of object</param>
        /// <param name="bytes">The bytes.</param>
        /// <param name="start">The start index into the array of bytes.</param>
        /// <returns>The object of the specified type.</returns>
        /// <exception cref="NotSupportedException">The object or one of its properties are not supported.</exception>
        public static object To(Type type, byte[] bytes, int start = 0)
        {
            switch (Type.GetTypeCode(type))
            {
                case TypeCode.Boolean: return BitConverter.ToBoolean(bytes, start);
                case TypeCode.Byte: return bytes[start];
                case TypeCode.SByte: return (sbyte)bytes[start];
                case TypeCode.Char: return BitConverter.ToChar(bytes, start);
                case TypeCode.Int16: return BitConverter.ToInt16(bytes, start);
                case TypeCode.UInt16: return BitConverter.ToUInt16(bytes, start);
                case TypeCode.Int32: return BitConverter.ToInt32(bytes, start);
                case TypeCode.UInt32: return BitConverter.ToUInt32(bytes, start);
                case TypeCode.Int64: return BitConverter.ToInt64(bytes, start);
                case TypeCode.UInt64: return BitConverter.ToUInt64(bytes, start);
                case TypeCode.Single: return BitConverter.ToSingle(bytes, start);
                case TypeCode.Double: return BitConverter.ToDouble(bytes, start);

                case TypeCode.Object:
                    var target = Activator.CreateInstance(type);
                    foreach (var prop in type.GetProperties(BindingFlags.Public | BindingFlags.Instance))
                    {
                        prop.SetValue(target, To(prop.PropertyType, bytes, start));
                        start += SizeOf(prop.PropertyType);
                    }
                    return target;

                case TypeCode.String:
                case TypeCode.DateTime:
                default:
                    throw new NotSupportedException($"{type.Name} is not a supported type");
            }
        }
}

Extension methods to BinaryReader and BinaryWriter complete the support for reading & writing to your stream of choice. In the code above, ByteBuilder is one of our classes for handling byte arrays with similar methods to StringBuilder, alternatively use arrays and resize when needed.

Comments

-1

You can use the method below to convert a list of objects into a byte array using System.Text.Json serialization.

private static byte[] ConvertToByteArray(List<object> mergedResponse)
{
    var options = new JsonSerializerOptions
    {
        PropertyNameCaseInsensitive = true,
    };

    if (mergedResponse != null && mergedResponse.Any())
    {
        return JsonSerializer.SerializeToUtf8Bytes(mergedResponse, options);
    }

    return new byte[] { };
}

1 Comment

OP asked for a compact representation. This is not possible with JSON.
