2

While reworking my logic in response to this question, I decided to serialize protocol buffer objects as pairs of message size followed by the output of SerializeToArray (don't worry if you don't get what I am talking about). Anyhow, my implementation doesn't work, so I decided to see how C++ fstream works. It's a semantic nightmare: I can't be sure whether I need to use seekg to reposition the position handle after each read (or perhaps even after each write). I am only using the write() and get() methods. The following contrived program is failing. Why is it failing, and would I need seekg in this context?

#include <fstream>
#include <boost/cstdint.hpp>
#include <iostream>

void write()
{
  boost::uint8_t one = (boost::uint32_t )255;
  boost::uint8_t two = (boost::uint32_t) 254;
  boost::uint8_t three =(boost::uint32_t) 253;

  std::fstream file("test", std::fstream::out | std::fstream::binary | std::fstream::trunc);

  file.write( (char *) &one, sizeof(one));
  file.write( (char *) &two, sizeof(two));
  file.write( (char *) &three, sizeof(three));

  std::cout << file.tellg() << std::endl;
  file.flush();
  file.close();
}

void read()
{
  boost::uint8_t one=0;
  boost::uint8_t two=0;
  boost::uint8_t three=0;

  std::fstream file("test", std::fstream::in | std::fstream::binary);


  file.get((char *) & one, sizeof(one)); 
  file.get((char *) & two, sizeof(two)); 
  file.get((char *) & three, sizeof(three)); 

  std::cout << file.tellg() << std::endl;

  std::cout << (boost::uint32_t) one << ":" << (boost::uint32_t) two  << ":" << (boost::uint32_t)three<< std::endl;
  file.close();
}


int main()
{
  write();
  read();  
}

The output is:

3
-1
0:0:0

C++ binary file io is making me feel sad and foolish :(

5
  • 1
    boost::uint8_t is unsigned char. To output the value on std::cout, you should cast it to int (or any other non-char integer type). Commented Dec 1, 2011 at 22:02
  • Yeah I just figured that out as well, thanks, updated the question. Commented Dec 1, 2011 at 22:08
  • Hexdumping the resulting file gives me feff 00fd (keep in mind endianness :), so the writing part seems to be working, at least. Commented Dec 1, 2011 at 22:09
  • 1
    Use a debugger instead of relying on cout, and check fstream state after all usages in the read code. Commented Dec 1, 2011 at 22:10
  • @steve The debugger reasons about things in memory, not things on file, or logical usage issues of libraries ? :D Commented Dec 1, 2011 at 22:12

3 Answers

4

Instead of istream::get, you should use istream::read.

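The former extracts characters until either (n - 1) characters have been extracted or the delimiting character is found. With n = sizeof(one) = 1 that leaves room for zero characters, so nothing is extracted and the stream's failbit is set. The latter just reads unformatted data from the file.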
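
The difference can be reproduced without a file. The following is a minimal sketch, assuming a C++11 compiler; ByteRead, read_with_get and read_with_read are hypothetical names introduced only for this illustration, and an in-memory stream stands in for the file from the question.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical result type: what one attempt to read a single byte produced.
struct ByteRead {
    bool ok;              // true if the stream did not enter a failed state
    unsigned char value;  // byte left in the one-byte buffer after the call
};

// Text-oriented overload get(char*, n). With n == 1 it may extract at most
// n - 1 == 0 characters, so it only stores the terminating '\0' and sets failbit.
ByteRead read_with_get(std::istream& in) {
    char buf = 0;
    in.get(&buf, sizeof(buf));
    return { !in.fail(), static_cast<unsigned char>(buf) };
}

// Unformatted read(char*, n). Copies exactly n bytes, with no delimiter
// and no terminator.
ByteRead read_with_read(std::istream& in) {
    char buf = 0;
    in.read(&buf, sizeof(buf));
    return { !in.fail(), static_cast<unsigned char>(buf) };
}
```

Called on a stream holding the bytes 255, 254, 253, read_with_get should report a failed stream and a value of 0, while read_with_read should report success and 255. A failed stream also explains the -1 from tellg() in the question's output.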


2

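fstream::get() is tailored towards text. It expects the size parameter to account for a trailing nul in the buffer, so with a size of sizeof(one) it reads nothing at all. Passing sizeof(one) + 1 would make it read one byte, but the nul would then be written past the end of a one-byte variable, so the buffer itself needs that extra byte too. It will also stop reading on a '\n'. You can change which character is considered the delimiter, but there doesn't seem to be a way to say "no delimiter, please". If you want raw binary data, use fstream::read().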

When reading single bytes, you can also use

one = (boost::uint8_t) file.get();
two = (boost::uint8_t) file.get();
three = (boost::uint8_t) file.get();

But that is naturally no good for data of size > 1. You'll need fstream::read() for those.

file.read((char *) & one, sizeof(one));
file.read((char *) & two, sizeof(two));
file.read((char *) & three, sizeof(three));

Result:

3
3
255:254:253

7 Comments

Jesus Christ, back to ANSI C for file IO I guess :/ This is a minefield. The problem is I have to write the protocol buffer binary string and then prepend the size as binary. I remember being able to alter the behaviour of cout to print integers as hex and such; is there a way to get fstream to behave dumber (i.e., no tokenizing of any sort)?
No no, I don't know the C++ standard library well enough, always using Qt :P Use fstream::read(), like J. Calleja says.
But that causes some tokenization to happen in the stdlib/STL code :( I'm preprocessing a text file into a binary format to get away from text processing :D
No, fstream::read() does no tokenization. fstream::get() does.
thanks for the perseverance :D celt got it right first tho, and he is new, so I will give him the tick :D
1

istream::get is not meant for binary I/O, but for textual I/O. In particular, file.get(ptr, n) reads a C string, i.e. it reads at most n-1 characters and then null-terminates. Moreover, reading will stop if you ever encounter a '\n' in your stream (not what you want in binary I/O). Note that if you had checked the stream state (always a good idea when doing I/O), you'd have found that the first read attempt already resulted in an error.
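
One way to make such state checks hard to forget is to wrap each read. This is only a sketch: read_exact is a hypothetical helper, not part of the standard library, and throwing an exception is just one possible way to report the failure.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: read exactly `size` bytes into `dest`, or throw
// with a message saying how far the read got.
void read_exact(std::istream& in, char* dest, std::size_t size) {
    in.read(dest, static_cast<std::streamsize>(size));
    if (!in) {  // failbit or badbit is set
        std::ostringstream msg;
        msg << "short read: wanted " << size << " bytes, got " << in.gcount()
            << (in.eof() ? " (hit end of stream)" : " (stream error)");
        throw std::runtime_error(msg.str());
    }
}
```

In the question's read() function, a call such as read_exact(file, (char *) &one, sizeof(one)) would replace each get() call, and a truncated or missing file would then surface as an exception rather than as silent zeros.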

You should instead use read and write for binary I/O (or alternatively, work with the corresponding stream buffers directly).
