This is a simple piece of code that splits a byte array so I can see how it works. The problem is that I get weird output.

    public static void SplitArayUsingLinq()
    {
        int i = 3;
        string data = "123456789";
        byte[] largeBytes = Encoding.Unicode.GetBytes(data);
        byte[] first = largeBytes.Take(i).ToArray();
        byte[] second = largeBytes.Skip(i).ToArray();
        string firststring = Encoding.Unicode.GetString(first);
        string secondstring = Encoding.Unicode.GetString(second);
        Console.WriteLine(" first : " + firststring);
        Console.WriteLine(" second : " + secondstring);
    }

When i = 3, I get this:

[screenshot of console output]

When i = 4, I get this:

[screenshot of console output]

In both cases I get weird output. Whatever value I give i, the program seems to use only half of it. Can anyone tell me why this is happening and exactly where the problem is?

2 Answers

Encoding.Unicode is UTF-16, which uses two bytes for each of these characters, so only even values of i split on a character boundary, and the first part then holds i / 2 characters. If you just want to split a string, String.Substring is a lot easier.

    int i = 3;
    string data = "123456789";
    string firststring = data.Substring(0, i);  // first i characters
    string secondstring = data.Substring(i);    // everything from index i onward

    Console.WriteLine(" first : " + firststring);
    Console.WriteLine(" second : " + secondstring);
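
To see why odd values of i misbehave, it may help to dump the bytes in hex. The following is a minimal sketch, not part of the original code: `HexDumpDemo` and `ToHex` are hypothetical names introduced only for illustration.

```csharp
using System;
using System.Linq;
using System.Text;

public static class HexDumpDemo
{
    // Hypothetical helper: formats a byte array as space-separated hex pairs.
    public static string ToHex(byte[] bytes)
    {
        return string.Join(" ", bytes.Select(b => b.ToString("X2")));
    }

    public static void Main()
    {
        // Encoding.Unicode is UTF-16 little-endian: each digit becomes its ASCII byte followed by 00.
        byte[] largeBytes = Encoding.Unicode.GetBytes("123456789");

        Console.WriteLine(ToHex(largeBytes));                   // 31 00 32 00 33 00 ... 39 00
        Console.WriteLine(ToHex(largeBytes.Take(3).ToArray())); // 31 00 32
        Console.WriteLine(ToHex(largeBytes.Skip(3).ToArray())); // 00 33 00 34 ... 39 00
    }
}
```

With i = 3 the first array ends in the middle of the character "2", and the second array starts on the high byte of that character, so every following byte pair is misaligned and decodes to an unrelated character.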

6 Comments

No, I don't want to split a string. I just want to see how a byte array is split and how Take() and Skip() work. That's why my code is the way it is. Thanks anyway.
I would recommend printing out largeBytes in hex so you can understand what is going on.
@Giliweed Just to make sure you know, this isn't magic. UTF8 uses one byte per character.
Well, UTF8 uses one byte per character if the character is in the standard ASCII set. As soon as you start doing stuff like 今日は it will break again.
@ScottChamberlain, so what is the solution for the Chinese characters? What should I use to make it work for any character?

I just changed Encoding.Unicode to Encoding.UTF8 and the problem is solved. Thanks to everyone who answered and commented.

UPDATE:

The corrected code is:

    public static void SplitArayUsingLinq()
    {
        int i = 3;
        string data = "123456789";
        byte[] largeBytes = Encoding.UTF8.GetBytes(data);
        byte[] first = largeBytes.Take(i).ToArray();
        byte[] second = largeBytes.Skip(i).ToArray();
        string firststring = Encoding.UTF8.GetString(first);
        string secondstring = Encoding.UTF8.GetString(second);
        Console.WriteLine(" first : " + firststring);
        Console.WriteLine(" second : " + secondstring);
    }
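
As the comments on the other answer point out, this only works because every character in "123456789" is a single byte in UTF-8. For text such as 今日は, one possible approach is to move the split index back to the start of a character before calling Take() and Skip(). The following is only a sketch under that assumption: `Utf8SplitDemo` and `AlignToCharStart` are hypothetical names, not framework APIs, and the input is assumed to be valid UTF-8.

```csharp
using System;
using System.Linq;
using System.Text;

public static class Utf8SplitDemo
{
    // Hypothetical helper: moves a split index back until it points at the
    // first byte of a UTF-8 sequence, so no character is cut in half.
    public static int AlignToCharStart(byte[] utf8, int index)
    {
        // UTF-8 continuation bytes have the bit pattern 10xxxxxx.
        while (index > 0 && index < utf8.Length && (utf8[index] & 0xC0) == 0x80)
        {
            index--;
        }
        return index;
    }

    public static void Main()
    {
        string data = "今日は123";
        byte[] largeBytes = Encoding.UTF8.GetBytes(data); // each of the first three characters takes 3 bytes

        // Index 4 falls inside the second character, so it is moved back to 3.
        int i = AlignToCharStart(largeBytes, 4);

        string firststring = Encoding.UTF8.GetString(largeBytes.Take(i).ToArray());
        string secondstring = Encoding.UTF8.GetString(largeBytes.Skip(i).ToArray());
        Console.WriteLine(" first : " + firststring);   // 今
        Console.WriteLine(" second : " + secondstring); // 日は123
    }
}
```

This keeps each encoded code point intact, but it does not keep combining sequences together, and the console needs a font and output encoding that can display the characters.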
