1

I am trying to write a program that reads in binary files using C++. I am running into some unusual output that I was hoping for some help with.

I have a binary file that starts with these 4 bytes:

A1 B2 C3 D4 (verified using hexdump -C)

And here is the code I am using to read these 4 bytes in:

#include <iostream> // for reading and writing files
#include <fstream>

using namespace std;

char * buffer;
unsigned int LENGTH = 4;

int main(int argc, char ** argv)
{
    // open the binary file
    ifstream infile ("dump", ios::binary | ios::in);

    // create a buffer
    buffer = new char[LENGTH];

    // read LENGTH bytes from the file
    infile.read(buffer, LENGTH);

    // append the null byte to terminate the string
    buffer[LENGTH] = '\0';

    // loop over the 4 bytes read, and print
    for(int i = 0; i < LENGTH; i++)
    {
       printf("Buffer[%d] is %X\n", i, buffer[i]);
    }

    delete[] buffer;
    infile.close();


    return 0;
}

This program gives me these actual results:

Buffer[0] is FFFFFFA1
Buffer[1] is FFFFFFB2
Buffer[2] is FFFFFFC3
Buffer[3] is FFFFFFD4

BUT, I would expect these results:

Buffer[0] is A1
Buffer[1] is B2
Buffer[2] is C3
Buffer[3] is D4

Can anyone explain to me where the 3 0xFF bytes are coming from ? It only seems to be affecting the first 4 bytes of the file, the next 4 bytes print out as expected without any 0xFF bytes prepended.

1
  • 1
    You are reading bytes and printing them as integers. If the bytes are larger than 127 per value, the integers expanded will be negative then (the sign is the first bit of the byte, which is set in all values you read). The conversion is done in the call to the printf function (default conversion char to int). The off-by-one error doesn't matter here, but should be of course corrected. Commented Sep 3, 2012 at 14:45

2 Answers 2

8

The char buffer[i] is default promoted when passed as a variable argument. To get the right value, say static_cast<unsigned char>(buffer[i]).

Moreover, buffer[LENGTH] is out of bounds and thus undefined behaviour.

Sign up to request clarification or add additional context in comments.

5 Comments

Thanks, that was exactly the problem. Just out of curiosity why is a static_cast here better than a regular cast. (unsigned char)?
@HunterMcMillen: You're confused: static_cast is the regular cast. What you said is barely C++, and mostly evil. :-)
I haven't really had much experience with C++. I have worked with C a little bit and am just used to seeing those C style casts. So static_cast is the recommended way of casting in C++?
@HunterMcMillen: Yes. And especially if you come from C, it's worthwhile to always step back, have a coffee and say to yourself three times that "C++ is not C".
@HunterMcMillen - static_cast is the recommended way of doing the particular casts that it supports. That's the benefit of the four built-in C++ casts: each does a particular set of things, unlike a C-style cast, which does pretty much everything and sometimes overreaches.
5

You can't read LENGTH bytes of content, and terminate at offset LENGTH in a buffer of size LENGTH. This is an off-by-one error.

buffer = new char[LENGTH];

This gives you space for LENGTH characters, with indices 0 through LENGTH - 1. So this:

buffer[LENGTH] = '\0';

writes outside the allocated memory, invoking undefined behavior.

1 Comment

Thank you for pointing this out, I was more concerned about with the numbers were wrapping around to negative. But this is an important piece that I missed also.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.