2

We are processing a byte[] as shown below (the file is POST'ed to a web server, this code is running in Glassfish) and have found that some files have a byte-order mark (BOM, a three-byte sequence 0xEF,0xBB,0xBF, see: http://en.wikipedia.org/wiki/Byte_order_mark) at the beginning, and we want to remove this BOM. How would we detect and remove a BOM in this code? Thanks.

  private final void serializePayloadToFile(File file, byte[] payload) throws IOException {

    FileOutputStream fos;
    DataOutputStream dos;

    fos = new FileOutputStream(file, true); // true for append
    dos = new DataOutputStream(fos);

    dos.write(payload);
    dos.flush();
    dos.close();
    fos.close();

    return;
  }  

3 Answers

2

How would we detect [...]

There's obviously no way to tell for sure if the three bytes are three random bytes or three bytes representing a BOM.

You could check if the array starts with 0xEF, 0xBB, 0xBF and in that case skip them.

[...] and remove a BOM in this code?

Something like this should do:

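// Java bytes are signed, so the literals must be cast to byte;
// payload[0] == 0xEF would compare against the int 239 and never match.
int off = payload.length >= 3
       && payload[0] == (byte) 0xEF
       && payload[1] == (byte) 0xBB
       && payload[2] == (byte) 0xBF ? 3 : 0;

dos.write(payload, off, payload.length - off);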
2 Comments

Don't forget to test payload.length>2
There is a way to be sure that the bytes are a BOM; if the file is encoded using UTF-8 and it starts with 0xEF 0xBB 0xBF, then those three bytes are a BOM.
1

DataOutputStream has a write() method that takes an offset and a length:

public void write(byte[] b, int off, int len);

So test for the byte order mark and set off (and len) appropriately.
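
As a rough sketch of that idea applied to the method from the question: the class name BomStripper and the helper bomLength are made up for illustration, and the code only handles the three-byte UTF-8 BOM. It also assumes the payload really is UTF-8 text, since the same three bytes could in principle start arbitrary binary data.

```java
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

// Hypothetical helper class; the name is not from the original code.
final class BomStripper {

    /** Returns 3 if the array starts with the UTF-8 BOM (EF BB BF), otherwise 0. */
    static int bomLength(byte[] payload) {
        // Java bytes are signed, so the literals are cast to byte before comparing.
        boolean hasBom = payload.length >= 3
                && payload[0] == (byte) 0xEF
                && payload[1] == (byte) 0xBB
                && payload[2] == (byte) 0xBF;
        return hasBom ? 3 : 0;
    }

    /** Appends the payload to the file, skipping a leading UTF-8 BOM if present. */
    static void serializePayloadToFile(File file, byte[] payload) throws IOException {
        int off = bomLength(payload);
        // try-with-resources closes (and flushes) both streams, even on failure.
        try (DataOutputStream dos = new DataOutputStream(new FileOutputStream(file, true))) {
            dos.write(payload, off, payload.length - off);
        }
    }
}
```

Because the file is opened in append mode, the BOM is stripped from every payload, not only from the first one written to the file; that is probably what you want, since a BOM in the middle of a file is never meaningful.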

0

The simplest solution seems to be adding another OutputStream implementation between dos and fos and buffering the first few bytes there, before actually committing them to fos. You might or might not want to throw them away, depending on their values.
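
One possible shape for such a stream, as a sketch only: the class name BomSkippingOutputStream is invented here, and it assumes that only the three-byte UTF-8 BOM needs to be dropped. It holds back leading bytes for as long as they match the BOM, discards them on a full match, and passes them through unchanged on the first mismatch or when the stream is closed early.

```java
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical wrapper; not part of the JDK or of the original code.
final class BomSkippingOutputStream extends FilterOutputStream {
    private static final byte[] BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};

    private final byte[] head = new byte[BOM.length]; // leading bytes held back so far
    private int held = 0;                             // how many bytes are in head
    private boolean decided = false;                  // true once the BOM question is settled

    BomSkippingOutputStream(OutputStream out) {
        super(out);
    }

    @Override
    public void write(int b) throws IOException {
        if (decided) {
            out.write(b);
            return;
        }
        head[held++] = (byte) b;
        if (head[held - 1] != BOM[held - 1]) {
            releaseHeld();          // mismatch: not a BOM, pass the held bytes through
        } else if (held == BOM.length) {
            held = 0;               // full match: drop the BOM
            decided = true;
        }
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        int i = 0;
        // Feed bytes one at a time only until the BOM question is settled.
        while (i < len && !decided) {
            write(b[off + i++]);
        }
        if (i < len) {
            out.write(b, off + i, len - i); // the rest goes through in one call
        }
    }

    private void releaseHeld() throws IOException {
        decided = true;
        out.write(head, 0, held);
        held = 0;
    }

    @Override
    public void close() throws IOException {
        if (!decided) {
            releaseHeld();          // stream ended before three bytes arrived
        }
        super.close();
    }
}
```

In the code from the question this would sit between the two existing streams, e.g. new DataOutputStream(new BomSkippingOutputStream(fos)). Note that a flush() before three bytes have arrived does not release the held bytes; they are only written on a mismatch or on close().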

2 Comments

Thanks, but sounds perhaps a little too complicated...?
Yet it will work in more complicated cases than handling a byte array (like redirecting streams, etc.) :)
