To send a chunk of bits derived from a four-word String, I get the byte array from the String and build the bit string from it.

StringBuilder binaryStr = new StringBuilder();

byte[] bytesFromStr = str.getBytes("UTF-8");
for (int i = 0, l = bytesFromStr.length; i < l; i++) {
    binaryStr.append(Integer.toBinaryString(bytesFromStr[i]));
}

String result = binaryStr.toString();

The problem appears when I want to do the reverse operation: converting a bit string to a Java String encoded using UTF-8.

Can someone explain the best way to do that?

Thanks in advance!

  • I think this is a duplicate of: stackoverflow.com/questions/5499924/…, at the very least I think it will help. Commented Jul 9, 2016 at 17:23
  • It's impossible to reverse that operation. You can't possibly know if 100011010100110101100100 is the representation of 3 bytes, or 4, or 5, or... What are you trying to achieve? Why are you doing that? Commented Jul 9, 2016 at 17:27
  • If you have the string "1a", it is built from the characters 1 and a, which sit in the Unicode table at positions 49 and 97. In binary form they should be represented as 0110001 1100001. But the result of Integer.toBinaryString(49) is 110001, not 0110001 (the leading 0 is dropped). So, as JB Nizet pointed out, it is impossible to tell whether 111 represents 1 1 1, or 11 1, or 1 11, or 111. Anyway, what you are doing here looks like an XY problem. Commented Jul 9, 2016 at 17:33
  • If I have 4 words encoded with UTF-8, that means I have 4 bytes, if I'm not wrong. In that case I think I can reverse the operation. This is for a PoC about steganography and data exfiltration. Commented Jul 9, 2016 at 17:36
  • "If I have 4 words encoded with UTF-8, that means I have 4 bytes": what makes you think so? Can you point us to a resource that gave you that idea? What you are saying can be read as "UTF-8 writes one word in one byte", but think about how many words exist and how many values a byte can hold. Commented Jul 9, 2016 at 17:37

2 Answers

TL;DR Don't use toBinaryString(). See solution at the end.


Your problem is that Integer.toBinaryString() doesn't return leading zeroes, e.g.

System.out.println(Integer.toBinaryString(1));   // prints: 1
System.out.println(Integer.toBinaryString(10));  // prints: 1010
System.out.println(Integer.toBinaryString(100)); // prints: 1100100

For your purpose, you want to always get 8 bits for each byte.
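
To see why a fixed width matters, here is a small sketch showing two different ASCII strings that produce the same bit string when the leading zeroes are dropped. The class and method names (UnpaddedAmbiguity, unpadded) are made up for this illustration.

```java
import java.nio.charset.StandardCharsets;

final class UnpaddedAmbiguity {
    // Concatenates Integer.toBinaryString() of each byte, with no padding,
    // the same way the code in the question does.
    static String unpadded(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            sb.append(Integer.toBinaryString(b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // 'a' = 97 -> 1100001, '1' = 49 -> 110001
        System.out.println(unpadded("a1"));
        // '0' = 48 -> 110000, 'q' = 113 -> 1110001
        System.out.println(unpadded("0q"));
        // Both lines are 1100001110001, so the input cannot be recovered.
        System.out.println(unpadded("a1").equals(unpadded("0q")));
    }
}
```

Once every byte is written with exactly 8 digits, the boundaries are known and this collision goes away.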

You also need to handle negative values: byte is signed, so any value above 127 turns negative and is sign-extended to 32 bits when widened to int, e.g.

System.out.println(Integer.toBinaryString((byte)129)); // prints: 11111111111111111111111110000001

Easiest way to accomplish that is like this:

Integer.toBinaryString((b & 0xFF) | 0x100).substring(1)

First, it coerces the byte b to int, then retains only lower 8 bits, and finally sets the 9th bit, e.g. 129 (decimal) becomes 1 1000 0001 (binary, spaces added for clarity). It then excludes that 9th bit, in effect ensuring that leading zeroes are in place.
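
As a rough illustration of those steps, the sketch below records each intermediate value for a given byte. MaskTrace and trace are hypothetical names used only here.

```java
final class MaskTrace {
    // Returns the intermediate values of the expression, in order:
    // [widened int, after & 0xFF, 9-digit binary string, final 8-digit string]
    static String[] trace(byte b) {
        int widened = b;              // plain widening keeps the sign: (byte) 129 is -127
        int masked = b & 0xFF;        // keep only the low 8 bits: 129
        int marked = masked | 0x100;  // set the 9th bit: 385
        String nineDigits = Integer.toBinaryString(marked); // always 9 digits long
        return new String[] {
            String.valueOf(widened),
            String.valueOf(masked),
            nineDigits,
            nineDigits.substring(1)   // drop the marker bit, leaving 8 digits
        };
    }

    public static void main(String[] args) {
        // Prints: -127, 129, 110000001, 10000001
        System.out.println(String.join(", ", trace((byte) 129)));
        // Prints: 1, 1, 100000001, 00000001
        System.out.println(String.join(", ", trace((byte) 1)));
    }
}
```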

It's better to have that as a helper method:

private static String toBinary(byte b) {
    return Integer.toBinaryString((b & 0xFF) | 0x100).substring(1);
}

In which case your code becomes:

StringBuilder binaryStr = new StringBuilder();
for (byte b : str.getBytes("UTF-8"))
    binaryStr.append(toBinary(b));
String result = binaryStr.toString();

E.g. if str = "Hello World", you get:

0100100001100101011011000110110001101111001000000101011101101111011100100110110001100100

You could of course just do it yourself, without resorting to toBinaryString():

StringBuilder binaryStr = new StringBuilder();
for (byte b : str.getBytes("UTF-8"))
    for (int i = 7; i >= 0; i--)
        binaryStr.append((b >> i) & 1);
String result = binaryStr.toString();

That will probably run faster too.

1 Comment

Thanks, @Andreas. I will run some tests with your implementation that avoids toBinaryString() and try to recover the information.

Thanks, @Andreas, for your code. I tested it with your function and decoded the result back to text using this:

StringBuilder revealStr = new StringBuilder();
for (int i = 0; i < result.length(); i += 8) {
    // Note: casting each byte value to char only round-trips single-byte (ASCII) characters.
    revealStr.append((char) Integer.parseUnsignedInt(result.substring(i, i + 8), 2));
}
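
For input that may contain non-ASCII characters, a minimal sketch of a round trip that collects all the bytes first and then decodes them together as UTF-8 could look like the following. It assumes the bit string was produced with exactly 8 digits per byte; BitStringCodec, encode and decode are illustrative names, not part of the answers above.

```java
import java.nio.charset.StandardCharsets;

final class BitStringCodec {
    // Encodes text as UTF-8 and writes every byte as exactly 8 binary digits.
    static String encode(String text) {
        StringBuilder bits = new StringBuilder();
        for (byte b : text.getBytes(StandardCharsets.UTF_8)) {
            for (int i = 7; i >= 0; i--) {
                bits.append((b >> i) & 1);
            }
        }
        return bits.toString();
    }

    // Reverses encode(): parses 8 digits at a time into a byte array,
    // then decodes the whole array as UTF-8 in one step.
    static String decode(String bits) {
        if (bits.length() % 8 != 0) {
            throw new IllegalArgumentException("bit string length must be a multiple of 8");
        }
        byte[] bytes = new byte[bits.length() / 8];
        for (int i = 0; i < bytes.length; i++) {
            bytes[i] = (byte) Integer.parseInt(bits.substring(i * 8, i * 8 + 8), 2);
        }
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "h\u00e9llo \u20ac"; // includes 2-byte and 3-byte characters
        String bits = encode(original);
        System.out.println(bits);
        System.out.println(decode(bits).equals(original)); // true
    }
}
```

Decoding the byte array as a whole matters because a single character can span several bytes in UTF-8, so converting byte by byte would split it apart.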

Thanks to everyone for helping me.
