1

I have a byte array that contains a string at the end of the array, and the beginning of the array is padded with zeroes. I'm using the following code to convert it to a string:

String myText = new String(byteArray, "UTF-8");

However, I'm getting a bunch of weird characters prepended to the string, due to the 0 padding. How do I get rid of it?

2
  • 1
    Just look through the array to find where you start getting non-zero bytes, then call the constructor overload that allows you to specify the part of the array to convert? Have you tried that yet? Commented Mar 3, 2017 at 8:39
  • @user1118764 Hey, did you find a solution for this? In my case, the null bytes could be anywhere in the array. Commented Dec 31, 2019 at 6:33

5 Answers

2

Use the String(byte[], int, int, String) constructor.

The first int is the offset into the byte[] at which decoding starts: just look for the index of the first non-zero byte. The second int is the number of bytes to decode. So, call it like this:

new String(
    byteArray, firstNonNullByte, byteArray.length - firstNonNullByte, "UTF-8");
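As a minimal sketch of how firstNonNullByte could be computed, assuming the padding consists only of zero bytes at the start of the array: the class name LeadingZeroDecode and both helper methods below are hypothetical names used for illustration, and StandardCharsets.UTF_8 is swapped in for the "UTF-8" string so that no checked exception has to be handled.

```java
import java.nio.charset.StandardCharsets;

class LeadingZeroDecode {

    // Hypothetical helper: index of the first non-zero byte,
    // or bytes.length if the array contains only zeroes.
    static int firstNonZeroIndex(byte[] bytes) {
        int i = 0;
        while (i < bytes.length && bytes[i] == 0) {
            i++;
        }
        return i;
    }

    // Decodes only the part of the array that follows the zero padding.
    static String decodeSkippingLeadingZeros(byte[] bytes) {
        int start = firstNonZeroIndex(bytes);
        return new String(bytes, start, bytes.length - start, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Build an 8-byte array with "Bye" at the end and zeroes in front.
        byte[] padded = new byte[8];
        byte[] text = "Bye".getBytes(StandardCharsets.UTF_8);
        System.arraycopy(text, 0, padded, padded.length - text.length, text.length);

        System.out.println(decodeSkippingLeadingZeros(padded)); // prints Bye
    }
}
```

An array that contains nothing but zeroes yields an empty string here, because the offset equals the array length and the byte count is zero.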

4 Comments

What if I don't know how long the valid string within the String is, and hence, don't know how many zeroes are padded?
@user1118764 "I have a byte array that's contains a string at the end of the array, and the beginning of the array is padded with zeroes." If that's not the case, you've stated your problem poorly.
What I mean is, let's say my byteArray is 128 bytes. It could contain "Hello" or "Bye", or any other string up to 128 characters. I do not know what the string is in advance. I guess I could loop through the byteArray and figure out the index to the first non-zero character, is that what you mean?
Well, yes. To find the first non-zero byte, loop through until you find a non-zero byte.
-1

No need to loop to find where the padding ends; you can fix the string using a regex. Index juggling with loops is dangerous, because it is a perfect place to introduce an off-by-one error one day.

String myText = (new String(byteArray, "UTF-8")).replaceAll("^\\x00*", "");

Regex means:

  • at the beginning of string (^)
  • character with hexadecimal code 0 (\x00; the backslash must be escaped in a Java string literal, so \\x00)
  • zero or more times (*)
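A small, self-contained sketch of this regex approach, under the assumption that the padding bytes are zero bytes, which UTF-8 decodes to the character U+0000. The class and method names are hypothetical, and StandardCharsets.UTF_8 is used in place of the "UTF-8" string to avoid the checked exception. Note that the padding is still decoded first and only stripped afterwards.

```java
import java.nio.charset.StandardCharsets;

class RegexStripDemo {

    // Decodes the whole array, then removes the leading U+0000 characters
    // that the zero padding was decoded into.
    static String decodeAndStrip(byte[] bytes) {
        String decoded = new String(bytes, StandardCharsets.UTF_8);
        return decoded.replaceAll("^\\x00*", "");
    }

    public static void main(String[] args) {
        byte[] padded = {0, 0, 0, 'H', 'e', 'l', 'l', 'o'};
        System.out.println(decodeAndStrip(padded)); // prints Hello
    }
}
```

Because the pattern is anchored with ^, only the leading run is removed; a zero byte in the middle of the text is left in place.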

Comments

-1

My solution would be to remove zeros from the beginning of the array:

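// Requires: import java.util.Arrays;

// Returns a copy of the array without its leading zero bytes.
public byte[] trim(byte[] bytes) {
    int i = 0;
    // The bounds check comes first, so an all-zero array cannot overrun.
    while (i < bytes.length && bytes[i] == 0) {
        i++;
    }
    return Arrays.copyOfRange(bytes, i, bytes.length);
}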

2 Comments

Aside from the fact that copying the array is unnecessary, this code is broken (it might fail with an ArrayIndexOutOfBoundsException).
@AndyTurner OK. There was a bug, but I fixed it. And yes, copying the array is not necessary, but it can be useful when you want to do some additional operations on it after converting to a string. Of course, it depends on your needs.
-1

You could use Apache Commons Lang's org.apache.commons.lang3.ArrayUtils.

int firstNonNullByte = ArrayUtils.lastIndexOf(byteArray, (byte) 0) + 1;

7 Comments

The null bytes are at the start, not the end.
What if the array ends with \0?
it wasn't specified. What if array has 0 in the middle?
Indeed, another problem with using lastIndexOf. The question states "the beginning of the array is padded with zeroes", these are the ones you need to get rid of.
+ "I have a byte array that's contains a string at the end of the array". I don't see any issues with zeros at the beginning and the string at the end. Otherwise, you should just compress the array by removing all the zeros.
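To make the objection in these comments concrete, here is a hedged sketch in plain Java (no Commons Lang dependency). The method lastZeroIndex is a hypothetical stand-in for a lastIndexOf-style search, and the sample array is made up for illustration. It compares "index of the last zero, plus one" with "index of the first non-zero byte" when a zero also appears inside the text.

```java
class PaddingIndexDemo {

    // Hypothetical stand-in for a lastIndexOf-style search:
    // index of the last zero byte, or -1 if there is none.
    static int lastZeroIndex(byte[] bytes) {
        for (int i = bytes.length - 1; i >= 0; i--) {
            if (bytes[i] == 0) {
                return i;
            }
        }
        return -1;
    }

    // Index of the first non-zero byte, or bytes.length if all bytes are zero.
    static int firstNonZeroIndex(byte[] bytes) {
        int i = 0;
        while (i < bytes.length && bytes[i] == 0) {
            i++;
        }
        return i;
    }

    public static void main(String[] args) {
        // Two padding zeroes, then 'a', an embedded zero, then 'b'.
        byte[] data = {0, 0, 'a', 0, 'b'};

        System.out.println(lastZeroIndex(data) + 1);  // prints 4: would skip 'a'
        System.out.println(firstNonZeroIndex(data));  // prints 2: start of the text
    }
}
```

The two approaches agree only when every zero byte in the array belongs to the leading padding.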
-1

I would try to remove the leading zeroes and then just use the remaining part of the byte array that is useful:

import java.io.UnsupportedEncodingException;
import java.util.Arrays;

public class Test {

    // Returns a copy of data without its leading zero bytes.
    public static byte[] removeZeroes(byte[] data) {
        int i;
        for (i = 0; i < data.length; i++) {
            if (data[i] != '\0') {
                break;
            }
        }
        return Arrays.copyOfRange(data, i, data.length);
    }

    public static void main(String args[]) {
        // Four zero bytes of padding followed by the text "string".
        byte[] byteArray = new byte[10];
        byteArray[0] = '\0';
        byteArray[1] = '\0';
        byteArray[2] = '\0';
        byteArray[3] = '\0';
        byteArray[4] = 's';
        byteArray[5] = 't';
        byteArray[6] = 'r';
        byteArray[7] = 'i';
        byteArray[8] = 'n';
        byteArray[9] = 'g';
        byteArray = removeZeroes(byteArray);

        try {
            String myText = new String(byteArray, "UTF-8");
            System.out.println(myText);
        }
        catch (UnsupportedEncodingException e) {
            e.printStackTrace();
        }
    }
}

1 Comment

I think the OP means the byte array is padded with 0 rather than '0' (i.e. byte value of 48).
