1

I'm trying to convert a byte[] to a string and back using Encoding.Unicode. Sometimes the round trip works, and sometimes the output bytes differ from the input. What am I doing wrong?

Thanks for your help.

public static void Main(string[] args)
{
    Random rnd = new Random();
    while(true)
    {
        Int32 random = rnd.Next(10, 20);
        Byte[] inBytes = new Byte[random];
        for(int i = 0; i < random; i++)
            inBytes[i] = (Byte)rnd.Next(0, 9);

        String inBytesString = Encoding.Unicode.GetString(inBytes, 0, inBytes.Length);
        Byte[] outBytes = Encoding.Unicode.GetBytes(inBytesString);

        if(inBytes.Length != outBytes.Length)
            throw new Exception("?");
        else
        {
            for(int i = 0; i < inBytes.Length; i++)
            {
                if(inBytes[i] != outBytes[i])
                    throw new Exception("?");
            }
        }
        Console.WriteLine("OK");
    }
}
2

3 Answers

6

You cannot use Encoding for that: you must use something like Convert.ToBase64String / Convert.FromBase64String.

Encoding assumes the byte[] is formatted according to specific rules, which are not the case for a random non-string byte[].

To summarise:

An Encoding turns an arbitrary string to/from a formatted byte[]

Base-64 turns an arbitrary byte[] to/from a formatted string
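As a rough illustration of the summary above, the sketch below round-trips the same bytes through Encoding.Unicode and through Base64. The class and method names (RoundTripDemo, ViaUnicode, ViaBase64) are made up for this example and are not part of the question's code. It assumes the default replacement fallback of Encoding.Unicode, and it uses an odd-length input because UTF-16 works in 2-byte code units, so an odd number of bytes can never come back unchanged.

```csharp
using System;
using System.Linq;
using System.Text;

public static class RoundTripDemo
{
    // Decodes the bytes as UTF-16 text, then re-encodes that text.
    // Bytes that do not form valid UTF-16 are replaced while decoding.
    public static byte[] ViaUnicode(byte[] data) =>
        Encoding.Unicode.GetBytes(Encoding.Unicode.GetString(data));

    // Encodes the bytes as Base64 text, then decodes that text.
    public static byte[] ViaBase64(byte[] data) =>
        Convert.FromBase64String(Convert.ToBase64String(data));

    public static void Main()
    {
        // Odd length: the trailing byte cannot form a UTF-16 code unit.
        byte[] odd = { 1, 2, 3 };

        Console.WriteLine(ViaUnicode(odd).SequenceEqual(odd)); // False
        Console.WriteLine(ViaBase64(odd).SequenceEqual(odd));  // True
    }
}
```

This would also explain the "sometimes" in the question: the test arrays have a random length between 10 and 19, and only the odd lengths fail, since byte values 0-8 cannot form surrogates.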


0
You cannot use Encoding for this; use Base64 instead.

With Base64 you can safely convert bytes to a string and back. Base64 output is guaranteed not to contain invalid Unicode sequences, such as the first half of a surrogate pair without the second half. Use it like this:

string base64 = Convert.ToBase64String(bytes);
byte[] decoded = Convert.FromBase64String(base64);

1 Comment

Strictly speaking, Base64 is an encoding.
0

Here is an example where I converted an image to a byte array and then converted that back to a readable string.

protected bool isImageCMYK(HttpPostedFile image, Stream fileContent)
{
    // Create a byte array sized to the uploaded file
    byte[] imageToByteArray = new byte[image.ContentLength];

    // Fill the byte array from the stream
    fileContent.Read(imageToByteArray, 0, image.ContentLength);

    // Convert the byte array to a readable string
    UTF8Encoding byteToString = new UTF8Encoding();
    string imageString = byteToString.GetString(imageToByteArray);

    return imageString.ToLower().Contains("cmyk");
}

Here is the edited code, which results in an output of "OK":

public static void Main(string[] args)
        {
            Random rnd = new Random();
            while (true)
            {
                Int32 random = rnd.Next(10, 20);
                Byte[] inBytes = new Byte[random];
                for (int i = 0; i < random; i++)
                    inBytes[i] = (Byte)rnd.Next(0, 9);

                UTF8Encoding inBytesString = new UTF8Encoding(); 
                string byteString = inBytesString.GetString(inBytes, 0, inBytes.Length);
                //Byte[] outBytes = Encoding.Unicode.GetBytes(inBytesString);
                Byte[] outBytes = inBytesString.GetBytes(byteString);

                if (inBytes.Length != outBytes.Length)
                    throw new Exception("?");
                else
                {
                    for (int i = 0; i < inBytes.Length; i++)
                    {
                        if (inBytes[i] != outBytes[i])
                            throw new Exception("?");
                    }
                }
                Console.WriteLine("OK");
            }
        }

5 Comments

If the incoming data is binary data (rather than UTF-8 data), this is not a valid implementation. This is only correct if the data is UTF-8 text (or ASCII text).
Right, so if you are using integers like above you should be fine, right?
Integers how? Strings of integer values? Integers encoded into binary? If so, what encoding? 4-byte LE? 4-byte BE? "varint"? Something else?
I read it wrong, but I edited my code using the example and received identical byte arrays.
Yes, but that is a coincidence: the single-byte values 0-9 all happen to be valid in UTF-8. Now try round-tripping the three bytes (hex) ff ff ff, or just a single (hex) ff.
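
The experiment suggested in the last comment can be sketched as follows. The names Utf8RoundTrip and ViaUtf8 are hypothetical, and the sketch assumes a default-constructed UTF8Encoding, which substitutes U+FFFD for invalid bytes instead of throwing.

```csharp
using System;
using System.Text;

public static class Utf8RoundTrip
{
    // Decodes the bytes as UTF-8 text, then re-encodes that text.
    public static byte[] ViaUtf8(byte[] data)
    {
        var utf8 = new UTF8Encoding();
        return utf8.GetBytes(utf8.GetString(data));
    }

    public static void Main()
    {
        // 0x05 is a valid single-byte UTF-8 sequence, so it survives.
        Console.WriteLine(BitConverter.ToString(ViaUtf8(new byte[] { 0x05 }))); // 05

        // 0xFF is never valid in UTF-8; it decodes to U+FFFD,
        // which re-encodes as the three bytes EF BF BD.
        Console.WriteLine(BitConverter.ToString(ViaUtf8(new byte[] { 0xFF }))); // EF-BF-BD
    }
}
```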
