
I was expecting that new Buffer(buffer.toString()) would always be byte-for-byte equal. However, I am encountering a case where it is not true.

First, a case where it is true:

var buf1 = new Buffer(32);
for (var i = 0; i < 32; i++) {
  buf1[i] = i;
}

console.log(buf1);
console.log(new Buffer(buf1.toString()));

<Buffer 00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f>
<Buffer 00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f>

However, here is a case where it is not true:

var crypto = require('crypto');

var buf2 = crypto.createHmac('sha256', 'key')
    .update('string')
    .digest();

console.log(buf2);
console.log(new Buffer(buf2.toString()));

<Buffer 97 d1 5b ea ba 06 0d 07 38 ec 75 9e a3 18 65 17 8a b8 bb 78 1b 2d 21 07 64 4b a8 81 f3 99 d8 d6>
<Buffer ef bf bd ef bf bd 5b ef bf bd ef bf bd 06 0d 07 38 ef bf bd 75 ef bf bd ef bf bd 18 65 17 ef bf bd ef bf bd ef bf bd 78 1b 2d 21 07 64 4b ef bf bd ef ... >

What is different about buf2 that makes new Buffer(buf2.toString()) not byte-equivalent to buf2?

  • @DaveNewton huh? It encodes into utf8, and then new Buffer(str) decodes utf8 by default—I thought. The toString() implementation here is Buffer.prototype.toString(). Commented Dec 2, 2015 at 17:18
  • @dimadima Totally wasn't paying attention; sorry. Commented Dec 2, 2015 at 17:23
  • Wouldn't you want to convert it to something more string-y like hex or base64 before converting the buffer into a string? And what precisely are you trying to accomplish? The string representation of the buffer isn't going to have identical bytes to the buffer itself unless the buffer contains a reasonable string already. Commented Dec 2, 2015 at 17:30
  • @DaveNewton: yeah the problem cropped up when I was trying to use github.com/joaquimserafim/base64-url/blob/… and then just wanted to get that back into a Buffer(). Commented Dec 2, 2015 at 17:32
  • @DaveNewton why wouldn't the string representation encode/decode 1-to-1 to the original buffer? I mean, the encoded, printed string may contain bytes that don't have glyphs you can read, but I thought that didn't matter in terms of a round-trip coding. Commented Dec 2, 2015 at 17:34

1 Answer


A Buffer is an object as far as JS is concerned, so you're comparing object references. Since the two Buffers are not actually the same instance, that kind of equality check (== or ===) will never be true.

For comparing Buffer contents you could use something like buffer.equals(buffer2) if you have node v0.12 or newer. For older node versions, you will have to use a loop to check byte-by-byte.
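For those older versions, the loop might look like the sketch below. Note that buffersEqual is a hypothetical helper name, not part of Node's API, and Buffer.from is used here only to build the sample data (on very old releases you would construct the buffers with new Buffer(...), as in the question).

```javascript
// Hypothetical helper (not part of Node's API): compares two Buffers byte by byte.
function buffersEqual(a, b) {
  if (a.length !== b.length) {
    return false;
  }
  for (var i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) {
      return false;
    }
  }
  return true;
}

// Two distinct Buffer instances holding the same three bytes.
var x = Buffer.from([0x97, 0xd1, 0x5b]);
var y = Buffer.from([0x97, 0xd1, 0x5b]);

console.log(x === y);            // false: different object references
console.log(buffersEqual(x, y)); // true: same contents
```

The length check comes first so that a shorter buffer that happens to be a prefix of a longer one is not reported as equal.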

Additional explanation:

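Calling .toString() decodes the bytes as UTF-8 (the default encoding). Any byte sequence that is not valid UTF-8 will typically be replaced by the replacement character \uFFFD. new Buffer(str) then encodes each \uFFFD as the three bytes ef bf bd, so the content is now different and equals() returns false. You can see this in the second Buffer (the instances of ef bf bd). buf1 survives the round trip only because every byte from 00 to 1f is a valid single-byte (ASCII) UTF-8 sequence, whereas the HMAC digest contains bytes such as 97 and d1 that do not form valid sequences.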
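The replacement can be reproduced with just the first three bytes of the digest from the question. This is a minimal sketch, assuming a Node version that has Buffer.from and buf.equals(). The variable names are illustrative only. It also shows that the 'hex' and 'base64' encodings can represent every byte value, so they round-trip without loss.

```javascript
// First three bytes of the HMAC digest shown in the question.
var raw = Buffer.from([0x97, 0xd1, 0x5b]);

// UTF-8 round trip: 0x97 and 0xd1 are not valid UTF-8 here, so each one
// decodes to U+FFFD, which re-encodes as the three bytes ef bf bd.
var viaUtf8 = Buffer.from(raw.toString('utf8'), 'utf8');
console.log(viaUtf8.toString('hex')); // efbfbdefbfbd5b
console.log(raw.equals(viaUtf8));     // false

// Hex and base64 round trips preserve the original bytes.
var viaHex = Buffer.from(raw.toString('hex'), 'hex');
var viaBase64 = Buffer.from(raw.toString('base64'), 'base64');
console.log(raw.equals(viaHex));      // true
console.log(raw.equals(viaBase64));   // true
```

If the goal is to carry binary data such as a digest through a string and back, passing an explicit encoding like 'hex' or 'base64' on both sides avoids the lossy UTF-8 step.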


3 Comments

In the second case above, console.log(buf2.equals(new Buffer(buf2.toString()))) prints false. Do you know why?
Calling .toString() converts the binary data to UTF-8. If there are invalid UTF-8 characters in that data, those characters will typically be replaced by the replacement character of \uFFFD. When this replacement occurs, the content is now different, causing equals() to return false. In fact, you can see this in the second Buffer (the instances of ef bf bd).
Oh my goodness. Yes, I forgot how UTF-8 works. How embarrassing :D. If you feel like updating your answer I'll upvote/accept.
