I have this piece of code:

byte[] bytes = ...

// Here bytes.Length is 181 (for example)

var str = UTF8Encoding.UTF8.GetString(bytes);
bytes = UTF8Encoding.UTF8.GetBytes(str);

// Here bytes.Length is 189

Why?
How can I correctly convert the string back to byte[]?

Edit: An example

public class Person 
{
    public string Name { get; set; }
    public uint Age { get; set; }
}

...

Person p = new Person { Name = "Mary", Age = 24 };

string str;
byte[] b1, b2;

using (var stream = new MemoryStream())
{
    new BinaryFormatter().Serialize(stream, p);
    b1 = stream.ToArray();
    str = UTF8Encoding.UTF8.GetString(b1);
}

b2 = UTF8Encoding.UTF8.GetBytes(str);
    And were the original 181 bytes a valid UTF8 sequence? There is a syntax and a set of rules involved here. Commented Oct 23, 2012 at 14:31

3 Answers

// Here bytes.Length is 181 (for example)
// Here bytes.Length is 189

That can happen.

How can I correctly convert the string back to byte[]?

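A difference in size does not by itself mean the conversion is invalid. The initial byte sequence might not have been valid UTF-8, though.

If you want to preserve the size, use ASCII encoding. Note that ASCII only keeps the length: any byte above 0x7F is decoded as '?', so arbitrary binary data still does not survive the round trip.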


After the expanding edit:

new BinaryFormatter().Serialize(stream, p);
b1 = stream.ToArray();
str = UTF8Encoding.UTF8.GetString(b1);
b2 = UTF8Encoding.UTF8.GetBytes(str);

You make the assumption that a BinaryFormatter will apply UTF8 encoding to strings.
It probably does not. It will add extra data (markers and size fields) to the stream.

So your two conversions (Serialize and GetString) are just not compatible.

Aside from a difference in size, when you display the result it will probably contain some 'strange' characters.
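
To illustrate where the extra bytes can come from, here is a minimal sketch. It assumes a made-up three-byte input (not the asker's actual data) and a hypothetical helper named RoundTrip. A byte that is not valid UTF-8 is decoded as the replacement character U+FFFD, and that character takes three bytes when encoded again, so each invalid byte grows by two.

```csharp
using System;
using System.Text;

class Utf8RoundTripDemo
{
    // Hypothetical helper: decode the bytes as UTF-8, then encode the string again.
    public static byte[] RoundTrip(byte[] data)
    {
        string decoded = Encoding.UTF8.GetString(data);
        return Encoding.UTF8.GetBytes(decoded);
    }

    static void Main()
    {
        // 0x41 is 'A' and 0x42 is 'B'; 0xFF can never appear in well-formed UTF-8.
        byte[] original = { 0x41, 0xFF, 0x42 };

        string decoded = Encoding.UTF8.GetString(original);
        byte[] reencoded = RoundTrip(original);

        Console.WriteLine(original.Length);          // 3
        Console.WriteLine(decoded[1] == '\uFFFD');   // True: the invalid byte became U+FFFD
        Console.WriteLine(reencoded.Length);         // 5: U+FFFD is EF BF BD in UTF-8
    }
}
```

With this default behaviour the original byte is gone after decoding, which is why b2 cannot be deserialized.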


Second Edit:

When I deserialize the new byte array (b2) it throws an exception

Right. What you actually need is Convert.ToBase64String(), not UTF8.GetString()

Base64 strings can be stored and transported as strings and then converted back to byte[] again.
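
A rough sketch of that round trip, assuming an arbitrary sample byte array in place of the BinaryFormatter output; the class and method names (Base64RoundTripDemo, ToText, FromText) are only illustrative:

```csharp
using System;
using System.Linq;

class Base64RoundTripDemo
{
    // Turn arbitrary bytes into text that is safe to store or send as a string.
    public static string ToText(byte[] data)
    {
        return Convert.ToBase64String(data);
    }

    // Recover exactly the original bytes from the Base64 text.
    public static byte[] FromText(string text)
    {
        return Convert.FromBase64String(text);
    }

    static void Main()
    {
        // Sample binary data containing bytes that are not valid UTF-8.
        byte[] b1 = { 0x00, 0x01, 0xFF, 0xFE, 0x80 };

        string str = ToText(b1);
        byte[] b2 = FromText(str);

        Console.WriteLine(str);
        Console.WriteLine(b1.SequenceEqual(b2)); // True: nothing was lost
    }
}
```

The string is about a third longer than the raw bytes, but b2 is identical to b1, so it should deserialize the same way b1 does.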


1 Comment

When I deserialize the new byte array (b2) it throws an exception, so I have to convert my source (the string) correctly.

If you want to serialize an arbitrary byte[] to and from a string, don't use UTF8 encoding, use Base64.


Don't try to convert binary data to a string with UTF8.GetString (or any other encoding). Use Convert.ToBase64String and Convert.FromBase64String instead.

1 Comment

@Nick if your bytes are not a valid UTF-8 sequence, it may result in strange behaviour, as you see. Not every byte sequence has a valid string representation. See this for Base64 encoding.
