I have the following problem. I'm using Java to create a byte array from a file. So I do the following:

byte[] myByteArray = Files.readAllBytes(filename);

However, many of the bytes come back with values I did not expect, in particular negative ones.

For instance, if I test using JavaScript to read every byte of a file, e.g.:

    function readbytes(s) {
        var f = new File(s);
        var i, a, c;
        var d = [];
        if (f.isopen) {
            c = f.eof;
            for (i = 0; i < c; i++) {
                a = f.readbytes(1);
                d.push(a);
            }
            f.close();
            return d;
        } else {
            post("could not open file: " + s + "\n");
        }
    }

(readbytes is a function in the program I'm using that gives the byte at a specific position.)

This returns the correct bytes.

So I'm wondering, why does Java return incorrect values? Is this something to do with unsigned values?

  • No, but it has everything to do with signed values, since Java doesn't have unsigned values (barring char). Commented May 24, 2016 at 10:37

1 Answer

Java has no unsigned byte type: byte is always signed, with a range of -128 to 127. For instance, the unsigned byte 255 is printed as its signed equivalent, -1. In memory, however, the bit pattern is identical (0xFF); only the interpretation differs.

If you'd like to convert a byte to its unsigned representation, you may use the bitwise AND operator.

For instance:

bytes[x] & 0xff

Java doesn't operate on bytes at runtime either: any operand narrower than int is widened to int before it is pushed onto the Java virtual machine's stack. In fact, every arithmetic or bitwise operation on byte, short, or char operands produces an int. That's why ((byte) -1) & 0xff results in an int whose value is 255. If you want to store that value back into a byte, you have to cast it to byte again, which of course gives -1.

byte x = -1; // the constant -1 fits in a byte, so Java narrows it implicitly
System.out.println(x); // -1
System.out.println(x & 0xff); // 255
byte y = (byte)(x & 0xff); // must add (byte) cast
System.out.println(y); // -1
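
To get the same 0 to 255 values for a whole array, such as the one returned by Files.readAllBytes, the mask can be applied element by element. The following is only a minimal sketch: the class name UnsignedBytes, the helper toUnsignedInts, and the sample bytes are made up for illustration. Byte.toUnsignedInt (available since Java 8) is a standard-library alternative to writing the mask by hand.

```java
import java.util.Arrays;

public class UnsignedBytes {

    // Hypothetical helper: copies each signed byte into an int in the range 0..255.
    static int[] toUnsignedInts(byte[] bytes) {
        int[] result = new int[bytes.length];
        for (int i = 0; i < bytes.length; i++) {
            // The byte is widened to int first, then the mask keeps only the low 8 bits.
            result[i] = bytes[i] & 0xff;
        }
        return result;
    }

    public static void main(String[] args) {
        // Sample data standing in for the contents of a file.
        byte[] raw = {(byte) 0xFF, 0x00, (byte) 0x80, 0x7F};

        System.out.println(Arrays.toString(raw));                 // [-1, 0, -128, 127]
        System.out.println(Arrays.toString(toUnsignedInts(raw))); // [255, 0, 128, 127]

        // Same result for a single value, using the standard library.
        System.out.println(Byte.toUnsignedInt(raw[0]));           // 255
    }
}
```

The result is stored in an int[] because a byte cannot hold values above 127.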

Also keep in mind that only the printed output differs; the content is the same, since every signed Java byte maps to exactly one unsigned value. Ideally, you'd use something like DataInputStream, which offers int readUnsignedByte().
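
As a rough sketch of the DataInputStream route, the code below reads every byte of a stream as an unsigned value. The class name ReadUnsigned and the helper readAllUnsigned are hypothetical, and an in-memory ByteArrayInputStream stands in for the file so the example runs on its own; for a real file you would presumably pass a FileInputStream instead.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

public class ReadUnsigned {

    // Hypothetical helper: reads all remaining bytes of the stream as values 0..255.
    static List<Integer> readAllUnsigned(InputStream source) throws IOException {
        List<Integer> values = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(source)) {
            while (true) {
                // readUnsignedByte returns an int in the range 0..255.
                values.add(in.readUnsignedByte());
            }
        } catch (EOFException endOfStream) {
            // readUnsignedByte signals the end of the input with EOFException.
        }
        return values;
    }

    public static void main(String[] args) throws IOException {
        // Sample data standing in for the contents of a file.
        byte[] sample = {(byte) 0xFF, 0x00, (byte) 0x80, 0x7F};
        System.out.println(readAllUnsigned(new ByteArrayInputStream(sample)));
        // prints [255, 0, 128, 127]
    }
}
```

Reading one byte at a time from an unbuffered file stream can be slow, so wrapping the source in a BufferedInputStream is probably worthwhile for larger files.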


3 Comments

Hi Joa, I'm wondering how to convert to unsigned bytes. I'm hoping to end up with the same array of ints that I get in the JavaScript case. Is there a way to loop through the byte array and change each value to its unsigned equivalent?
You can always create a new array of integers, and manually copy all values. int[] ubytes = new int[bytes.length]; for(int i = 0; i < bytes.length; ++i) { ubytes[i] = bytes[i] & 0xff; }
This was very helpful Joa. Thank you.
