1

I have a Unicode text containing some Unicode characters, say, "Hello, world! this paragraph has some unicode characters."

I want to convert this paragraph to a binary string, i.e. binary digits stored in a value of type string. After converting, I also want to convert that binary string back to the original Unicode string.

  • Duplicate of stackoverflow.com/questions/1615559/… Commented Jun 20, 2016 at 11:44
  • @buffjape That is something else; it's not a duplicate of what I want. What I want is shown in the following example: Input: Hi, this text is in unicode. Output: 11000010111100101111 (digits as a string) Output2: Hi, this text is in unicode. I hope this explains my problem. Commented Jun 20, 2016 at 18:11
  • Is that example that you are providing here exact? "Hi, this text is in unicode." is in no way equal to any possible representation of "11000010111100101111" Commented Jun 21, 2016 at 10:54
  • @pijemcolu If you look at the marked answer, it is exactly what I wanted. Commented Jun 21, 2016 at 10:58

2 Answers

3

If you're simply looking for a way to encode a string into a byte[] and decode it back, rather than into actual binary digits, then I would use System.Text.

The example from MSDN:

  string unicodeString = "This string contains the unicode character Pi (\u03a0)";

  // Create two different encodings.
  Encoding ascii = Encoding.ASCII;
  Encoding unicode = Encoding.Unicode;

  // Convert the string into a byte array.
  byte[] unicodeBytes = unicode.GetBytes(unicodeString);

  // Perform the conversion from one encoding to the other.
  byte[] asciiBytes = Encoding.Convert(unicode, ascii, unicodeBytes);

  // Convert the new byte[] into a char[] and then into a string.
  char[] asciiChars = new char[ascii.GetCharCount(asciiBytes, 0, asciiBytes.Length)];
  ascii.GetChars(asciiBytes, 0, asciiBytes.Length, asciiChars, 0);
  string asciiString = new string(asciiChars);

  // Display the strings created before and after the conversion.
  Console.WriteLine("Original string: {0}", unicodeString);
  Console.WriteLine("Ascii converted string: {0}", asciiString);

Don't forget the using directives:

using System;
using System.Text;
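
Note that the MSDN example converts to ASCII, which cannot represent Π, so that character comes back as `?`. If the goal is a lossless round trip, a minimal sketch could look like the following. It assumes UTF-8 is acceptable as the byte encoding; the class and method names (`ByteRoundTrip`, `RoundTrip`) are made up for illustration and are not part of the MSDN sample.

```csharp
using System;
using System.Text;

public static class ByteRoundTrip
{
    // Hypothetical helper: encode with the given encoding, then decode with the same one.
    public static string RoundTrip(string text, Encoding encoding)
    {
        byte[] bytes = encoding.GetBytes(text);
        return encoding.GetString(bytes);
    }

    public static void Main()
    {
        string original = "This string contains the unicode character Pi (\u03a0)";

        // UTF-8 can represent Pi, so the text survives unchanged.
        Console.WriteLine(RoundTrip(original, Encoding.UTF8) == original);  // True

        // ASCII has no Pi; the encoder substitutes '?' and the character is lost.
        Console.WriteLine(RoundTrip(original, Encoding.ASCII) == original); // False
    }
}
```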

2

Since there are several encodings for the Unicode character set, you have to pick one: UTF-8, UTF-16, UTF-32, etc. Say you pick UTF-8. You then have to use that same encoding in both directions.
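
To illustrate why the encoding has to match, here is a small sketch that encodes with one encoding and decodes with another. The class and method names (`EncodingMismatch`, `Transcode`) are hypothetical and only serve the example.

```csharp
using System;
using System.Text;

public static class EncodingMismatch
{
    // Hypothetical helper: encode with one encoding, decode with a possibly different one.
    public static string Transcode(string text, Encoding encodeWith, Encoding decodeWith)
    {
        return decodeWith.GetString(encodeWith.GetBytes(text));
    }

    public static void Main()
    {
        string text = "Hi";

        // The same text occupies a different number of bytes under each encoding.
        Console.WriteLine(Encoding.UTF8.GetByteCount(text));    // 2
        Console.WriteLine(Encoding.Unicode.GetByteCount(text)); // 4 (UTF-16)
        Console.WriteLine(Encoding.UTF32.GetByteCount(text));   // 8

        // Encoding as UTF-16 but decoding as UTF-8 does not restore the text:
        // each zero byte from UTF-16 is decoded as a NUL character.
        string wrong = Transcode(text, Encoding.Unicode, Encoding.UTF8);
        Console.WriteLine(wrong == text); // False
        Console.WriteLine(wrong.Length);  // 4
    }
}
```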

To convert to a binary string (this needs using System.Linq and using System.Text):

String.Join(
    String.Empty, // running them all together makes it tricky.
    Encoding.UTF8
        .GetBytes("Hello, world! this paragraph has some unicode characters.")
        .Select(byt => Convert.ToString(byt, 2).PadLeft(8, '0'))) // must ensure 8 digits.

And back again (this additionally needs using System.Text.RegularExpressions):

Encoding.UTF8.GetString(
    Regex.Split(
        "010010000110010101101100011011000110111100101100001000000111011101101111011100100110110001100100001000010010000001110100011010000110100101110011001000000111000001100001011100100110000101100111011100100110000101110000011010000010000001101000011000010111001100100000011100110110111101101101011001010010000001110101011011100110100101100011011011110110010001100101001000000110001101101000011000010111001001100001011000110111010001100101011100100111001100101110"
        ,"(.{8})") // this is the consequence of running them all together.
    .Where(binary => !String.IsNullOrEmpty(binary)) // keeps the matches; drops empty parts 
    .Select(binary => Convert.ToByte(binary, 2))
    .ToArray())
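
The two expressions above can be wrapped into a pair of helper methods. The following is only a sketch under the same UTF-8 assumption: the names `BinaryText`, `ToBinaryString`, and `FromBinaryString` are invented for this example, and it slices the digits with `Substring` instead of `Regex.Split`, which is a substitution on my part rather than the approach shown above.

```csharp
using System;
using System.Linq;
using System.Text;

public static class BinaryText
{
    // Encode the text as UTF-8, then write each byte as exactly 8 binary digits.
    public static string ToBinaryString(string text)
    {
        return string.Concat(
            Encoding.UTF8.GetBytes(text)
                .Select(b => Convert.ToString(b, 2).PadLeft(8, '0')));
    }

    // Read the digits back 8 at a time and decode the resulting bytes as UTF-8.
    public static string FromBinaryString(string bits)
    {
        if (bits.Length % 8 != 0)
            throw new ArgumentException("Length must be a multiple of 8.", nameof(bits));

        byte[] bytes = Enumerable.Range(0, bits.Length / 8)
            .Select(i => Convert.ToByte(bits.Substring(i * 8, 8), 2))
            .ToArray();
        return Encoding.UTF8.GetString(bytes);
    }

    public static void Main()
    {
        string bits = ToBinaryString("Hi");
        Console.WriteLine(bits);                   // 0100100001101001
        Console.WriteLine(FromBinaryString(bits)); // Hi
    }
}
```

Padding every byte to 8 digits is what makes the fixed-width slicing on the way back possible; without it the byte boundaries in the run-together string would be ambiguous.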
