How to split byte array

Question

It's a simple piece of code that splits a byte array so I can see how it works. The problem is that I get weird output.

    public static void SplitArayUsingLinq()
    {
        int i = 3;
        string data = "123456789";
        byte[] largeBytes = Encoding.Unicode.GetBytes(data);
        byte[] first = largeBytes.Take(i).ToArray();
        byte[] second = largeBytes.Skip(i).ToArray();
        string firststring = Encoding.Unicode.GetString(first);
        string secondstring = Encoding.Unicode.GetString(second);
        Console.WriteLine(" first : " + firststring);
        Console.WriteLine(" second : " + secondstring);
    }

When i = 3, I get this:

[screenshot of console output, not preserved]

And when i = 4, I get this:

[screenshot of console output, not preserved]

In both cases the output is weird. Whatever value I give i, the program seems to use only half of it. Can anyone tell me why this is happening and exactly where the problem is?

Scott Chamberlain · Accepted Answer · 2014-05-19 20:30:28Z

4

Encoding.Unicode is UTF-16, which uses two bytes per character for text like this, so only even values of i will work, and each split takes i/2 characters. If you just want to split a string, String.Substring will be a lot easier.

    int i = 3;
    string data = "123456789";
    string firststring = data.Substring(0, i);  // the first i characters
    string secondstring = data.Substring(i);    // everything from index i onward

    Console.WriteLine(" first : " + firststring);
    Console.WriteLine(" second : " + secondstring);
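If the goal is still to split the bytes rather than the string, one option is to turn the character count into a byte count first. The following is only a minimal sketch, assuming every character in the text fits in a single two-byte UTF-16 code unit (true for the digits used here). `SplitAtChar` is a hypothetical helper name, not something from the original code.

```csharp
using System;
using System.Linq;
using System.Text;

public static class Utf16SplitSketch
{
    // Hypothetical helper: splits the UTF-16 bytes of data after charCount characters.
    // Assumption: every character occupies exactly one 2-byte code unit.
    public static (string First, string Second) SplitAtChar(string data, int charCount)
    {
        byte[] bytes = Encoding.Unicode.GetBytes(data);
        int byteCount = charCount * 2; // two bytes per UTF-16 code unit

        string first = Encoding.Unicode.GetString(bytes.Take(byteCount).ToArray());
        string second = Encoding.Unicode.GetString(bytes.Skip(byteCount).ToArray());
        return (first, second);
    }

    public static void Main()
    {
        var (first, second) = SplitAtChar("123456789", 3);
        Console.WriteLine(" first : " + first);   // 123
        Console.WriteLine(" second : " + second); // 456789
    }
}
```

Because the split point is always an even byte offset, neither half starts or ends in the middle of a character.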

answered May 19, 2014 at 20:30

Scott Chamberlain


6 Comments

Giliweed Over a year ago

No, I don't want to split a string. I just want to see how a byte array is split and how Take() and Skip() work. That's why my code is the way it is. Thanks anyway.

Scott Chamberlain Over a year ago

I would recommend printing out largeBytes in hex so you can understand what is going on.
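A minimal sketch of that suggestion, assuming `BitConverter.ToString` is an acceptable way to format the bytes. The class and method names are made up for illustration.

```csharp
using System;
using System.Linq;
using System.Text;

public static class HexDumpSketch
{
    // Formats bytes as hyphen-separated hex pairs, e.g. "31-00".
    public static string ToHex(byte[] bytes) => BitConverter.ToString(bytes);

    public static void Main()
    {
        byte[] largeBytes = Encoding.Unicode.GetBytes("123456789");

        // Every digit is followed by a 00 byte: two bytes per character.
        Console.WriteLine(ToHex(largeBytes));
        // 31-00-32-00-33-00-34-00-35-00-36-00-37-00-38-00-39-00

        // Take(3) stops halfway through the character '2'.
        Console.WriteLine(ToHex(largeBytes.Take(3).ToArray()));
        // 31-00-32

        // Skip(3) starts with the second byte of '2', so every later pair is misaligned.
        Console.WriteLine(ToHex(largeBytes.Skip(3).ToArray()));
        // 00-33-00-34-00-35-00-36-00-37-00-38-00-39-00
    }
}
```

With an odd split point, the second array pairs each 00 with the following digit byte, which is why the decoded text looks like garbage.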

BradleyDotNET Over a year ago

@Giliweed Just to make sure you know, this isn't magic. UTF8 uses one byte per character.

Scott Chamberlain Over a year ago

Well, UTF8 uses one byte per character only if the character is in the standard ASCII set. As soon as you start doing stuff like 今日は it will break again.

Giliweed Over a year ago

@ScottChamberlain, so what is the solution for Chinese characters? What should I use to make it work for any character?
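The thread does not answer this directly. One possible approach, sketched here under assumptions, is to ask the encoding how many bytes the first few characters occupy and split the byte array at that offset. `SplitBytesAtChar` is a hypothetical helper. It counts UTF-16 code units the same way `Substring` does, so it could still cut a surrogate pair (some emoji, for example) in half.

```csharp
using System;
using System.Linq;
using System.Text;

public static class EncodingAwareSplitSketch
{
    // Hypothetical helper: splits the encoded bytes of data after charCount characters.
    // The byte offset comes from the encoding itself, so it works for both
    // Encoding.UTF8 and Encoding.Unicode.
    public static (byte[] First, byte[] Second) SplitBytesAtChar(
        string data, int charCount, Encoding encoding)
    {
        byte[] bytes = encoding.GetBytes(data);
        int byteCount = encoding.GetByteCount(data.Substring(0, charCount));
        return (bytes.Take(byteCount).ToArray(), bytes.Skip(byteCount).ToArray());
    }

    public static void Main()
    {
        var (first, second) = SplitBytesAtChar("今日は", 1, Encoding.UTF8);

        Console.WriteLine(first.Length);   // 3, because '今' takes three bytes in UTF-8
        Console.WriteLine(second.Length);  // 6

        // Both halves decode cleanly because the split is on a character boundary.
        Console.WriteLine(Encoding.UTF8.GetString(first));
        Console.WriteLine(Encoding.UTF8.GetString(second));
    }
}
```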


Giliweed · Accepted Answer · 2014-05-19 20:57:12Z

0

I just changed Encoding.Unicode to Encoding.UTF8 and the problem is solved. Thanks to everyone who answered and commented.

UPDATE:

The correct code is:

    public static void SplitArayUsingLinq()
    {
        int i = 3;
        string data = "123456789";
        byte[] largeBytes = Encoding.UTF8.GetBytes(data);
        byte[] first = largeBytes.Take(i).ToArray();
        byte[] second = largeBytes.Skip(i).ToArray();
        string firststring = Encoding.UTF8.GetString(first);
        string secondstring = Encoding.UTF8.GetString(second);
        Console.WriteLine(" first : " + firststring);
        Console.WriteLine(" second : " + secondstring);
    }

edited May 19, 2014 at 20:57

answered May 19, 2014 at 20:52

Giliweed

