1

I've written several ints, char[]s, and the like to a data file with BinaryWriter in C#. Reading the file back in (in C#) with BinaryReader, I can recreate all of the pieces of the file perfectly.

However, attempting to read them back in with C++ yields some scary results. I was using fstream to read back the data, and the data was not reading in correctly. In C++, I set up an fstream with ios::in|ios::binary|ios::ate and used seekg to target my location. I then read the next four bytes, which were written as the integer "16" (and which read correctly in C#). This reads as 1244780 in C++ (not the memory address, I checked). Why would this be? Is there an equivalent to BinaryReader in C++? I noticed it mentioned on MSDN, but that's Visual C++, and the IntelliSense doesn't even look like C++ to me.

Example code for writing the file (C#):

    public static void OpenFile(string filename)
    {
        fs = new FileStream(filename, FileMode.Create);
        w = new BinaryWriter(fs);

    }

    public static void WriteHeader()
    {
        w.Write('A');
        w.Write('B');
    }

    public static byte[] RawSerialize(object structure)
    {
        Int32 size = Marshal.SizeOf(structure);
        IntPtr buffer = Marshal.AllocHGlobal(size);
        Marshal.StructureToPtr(structure, buffer, true);
        byte[] data = new byte[size];
        Marshal.Copy(buffer, data, 0, size);
        Marshal.FreeHGlobal(buffer);
        return data;
    }

    public static void WriteToFile(Structures.SomeData data)
    {
        byte[] buffer = Serializer.RawSerialize(data);
        w.Write(buffer);
    }

I'm not sure how I could show you the data file.

Example of reading the data back (C#):

        BinaryReader reader = new BinaryReader(new FileStream("C://chris.dat", FileMode.Open));
        char[] a = new char[2];
        a = reader.ReadChars(2);
        Int32 numberoffiles;
        numberoffiles = reader.ReadInt32();
        Console.Write("Reading: ");
        Console.WriteLine(a);
        Console.Write("NumberOfFiles: ");
        Console.WriteLine(numberoffiles);

I want to do the same thing in C++. Initial attempt (fails at the first integer):

 fstream fin("C://datafile.dat", ios::in|ios::binary|ios::ate);
 char *memblock = 0;
 int size;
 size = 0;
 if (fin.is_open())
 {
  size = static_cast<int>(fin.tellg());
  memblock = new char[static_cast<int>(size+1)];
  memset(memblock, 0, static_cast<int>(size + 1));

  fin.seekg(0, ios::beg);
  fin.read(memblock, size);
  fin.close();
  if(!strncmp("AB", memblock, 2)){ 
   printf("test. This works."); 
  }
  fin.seekg(2); //read the stream starting from after the second byte.
  int i;
  fin >> i;

Edit: It seems that no matter what location I use "seekg" to, I receive the exact same value.

5
  • Can you show us a fragment of code (or whole code) and an example of binary file? Commented Oct 6, 2009 at 13:39
  • I've posted some code. Not sure where I could upload the binary file to. Commented Oct 6, 2009 at 13:50
  • You're reading chris.dat in your C# reader and datafile.dat in your C++ reader... Commented Oct 6, 2009 at 13:53
  • @Andy, the name discrepancies are just from my testing back and forth. Commented Oct 6, 2009 at 13:58
  • Try writing just a single int to avoid having to worry about character sizes. Write it out, see if you can read it, and report back what the file looks like in a hex editor. Commented Oct 6, 2009 at 14:20

4 Answers

5

Note that a char is 16 bits in C#, rather than the 8 bits it usually is in C. This is because a char in C# is designed to handle Unicode text rather than raw data. Therefore, writing chars with BinaryWriter writes encoded Unicode text rather than raw bytes, so a char does not necessarily occupy a fixed number of bytes in the file.

This may have led you to calculate the offset of the integer incorrectly. I recommend you take a look at the file in a hex editor, and if you cannot work out the issue, post the file and the code here.

EDIT1
Regarding your C++ code, do not use the >> operator to read from a binary stream. Use read() with the address of the int that you want to read into.

int i;
fin.read((char*)&i, sizeof(int));

EDIT2
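Reading from a closed stream is not going to work either. You cannot call fin.close() and then still expect to be able to read from it: the extraction fails and i is never assigned, which most likely explains why you see the same garbage value no matter where you seek to.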
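Putting EDIT1 and EDIT2 together, here is a minimal sketch of how the header could be read. It assumes the layout implied by the question (two one-byte characters "AB" followed by a 4-byte integer) and a little-endian machine. `Header`, `readHeader` and `demoReadHeader` are names made up for this illustration, and an in-memory stream stands in for the file.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <istream>
#include <sstream>
#include <string>

// Assumed layout: 2 one-byte chars ("AB"), then a 4-byte integer.
struct Header {
    char magic[2];
    std::int32_t numberOfFiles;
};

// Reads the header with unformatted read() calls while the stream is still
// open. Returns false if the stream ends early or the magic does not match.
bool readHeader(std::istream& in, Header& out)
{
    if (!in.read(out.magic, sizeof out.magic)) return false;
    if (std::memcmp(out.magic, "AB", 2) != 0) return false;

    // Copies the raw bytes straight into the integer, so the result is only
    // right when the host byte order matches the file (little-endian).
    return static_cast<bool>(
        in.read(reinterpret_cast<char*>(&out.numberOfFiles),
                sizeof out.numberOfFiles));
}

// Usage sketch: builds the six header bytes in memory and reads them back.
void demoReadHeader()
{
    const std::int32_t sixteen = 16;
    std::string bytes = "AB";
    bytes.append(reinterpret_cast<const char*>(&sixteen), sizeof sixteen);

    std::istringstream in(bytes);
    Header h{};
    assert(readHeader(in, h));
    assert(h.numberOfFiles == 16);
}
```

With a real file you would pass a std::ifstream opened with ios::in|ios::binary and only close it once all reads are done.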


1 Comment

A C/C++ char can handle Unicode in the form of a UTF-8 string.
3

This may or may not be related to the problem, but...

When you create the BinaryWriter, it defaults to writing chars in UTF-8. This means that some of them may be longer than one byte, throwing off your seeks.

You can avoid this by using the two-argument constructor to specify the encoding. An instance of System.Text.ASCIIEncoding writes exactly one byte per character, which matches a plain C/C++ char.
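To see how variable-width characters shift the offsets, here is a rough sketch that computes how many bytes UTF-8 needs per character. `utf8Length` and `offsetAfter` are hypothetical helpers written for this illustration, not part of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Number of bytes UTF-8 needs for one Unicode code point (up to U+10FFFF).
std::size_t utf8Length(char32_t codePoint)
{
    assert(codePoint <= 0x10FFFF);
    if (codePoint < 0x80)    return 1;  // ASCII, e.g. 'A'
    if (codePoint < 0x800)   return 2;  // e.g. U+00E9
    if (codePoint < 0x10000) return 3;  // e.g. U+20AC
    return 4;
}

// Byte offset of whatever follows a run of characters that were each
// written as UTF-8.
std::size_t offsetAfter(const std::u32string& chars)
{
    std::size_t offset = 0;
    for (char32_t c : chars) offset += utf8Length(c);
    return offset;
}
```

For the "AB" header in the question both characters are ASCII, so the integer would start at offset 2. Any non-ASCII character in front of it moves it further out.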

2 Comments

The problem with ASCIIEncoding is that it silently corrupts non-ASCII characters.
You must NEVER use C#'s string type for such an interop, unless you know what you are doing, which is not the case now. Use byte[]. Even advanced programmers like me are scared of the String type. Use the Encoding variants to convert a string to a byte array, then write its size and its data. Read it back in C++ using Yacoby's approach and some Unicode library, like ICU.
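A minimal sketch of the C++ side of that "size, then data" scheme. It assumes the C# side wrote the byte count with w.Write(bytes.Length) (4 bytes, little-endian) followed by w.Write(bytes). `readLengthPrefixed`, `demoLengthPrefixed` and the `maxSize` sanity limit are made up for this illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>

// Assumed record layout: 4-byte little-endian length, then that many bytes.
// Returns false on a short read or when the length exceeds maxSize.
bool readLengthPrefixed(std::istream& in, std::string& out,
                        std::uint32_t maxSize = 1u << 20)
{
    unsigned char len[4];
    if (!in.read(reinterpret_cast<char*>(len), sizeof len)) return false;

    // Assemble the length byte by byte so host byte order does not matter.
    const std::uint32_t size = static_cast<std::uint32_t>(len[0])
                             | static_cast<std::uint32_t>(len[1]) << 8
                             | static_cast<std::uint32_t>(len[2]) << 16
                             | static_cast<std::uint32_t>(len[3]) << 24;
    if (size > maxSize) return false;  // guard against a corrupt length

    out.resize(size);
    if (size == 0) return true;
    return static_cast<bool>(in.read(&out[0], size));
}

// Usage sketch with an in-memory stream standing in for the file.
void demoLengthPrefixed()
{
    std::istringstream in(std::string("\x03\x00\x00\x00" "abc", 7));
    std::string text;
    assert(readLengthPrefixed(in, text));
    assert(text == "abc");
}
```

The bytes come back undecoded. If they are UTF-8, they can stay in a std::string as they are, or be handed to a Unicode library for conversion.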
1

There are many things going wrong in your C++ snippet. You shouldn't read from a closed file, and you shouldn't mix binary reading with formatted reading:

  // The file is closed after this line. It is WRONG to read from a closed file.
  fin.close();

  if(!strncmp("AB", memblock, 2)){ 
   printf("test. This works."); 
  }

  fin.seekg(2); // You are moving the "get pointer" of a closed file
  int i;

  // Even if the file is opened, you should not mix formatted reading
  // with binary reading. ">>" is just an operator for reading formatted data.
  // In other words, it is for reading "text" and converting it to a 
  // variable of a specific data type.
  fin >> i;
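
To make the difference concrete, here is a small sketch that feeds the same four raw bytes to both kinds of read. In-memory streams stand in for the file, and `parseAsText`, `readAsBinary` and `demoFormattedVsBinary` are names invented for this illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Formatted extraction: expects decimal text such as "16".
bool parseAsText(const std::string& bytes, std::int32_t& value)
{
    std::istringstream in(bytes);
    return static_cast<bool>(in >> value);
}

// Unformatted read: copies sizeof(value) raw bytes in host byte order.
bool readAsBinary(const std::string& bytes, std::int32_t& value)
{
    std::istringstream in(bytes);
    return static_cast<bool>(
        in.read(reinterpret_cast<char*>(&value), sizeof value));
}

// Usage sketch: the raw bytes of the integer 16 are not digits, so ">>"
// fails on them, while read() recovers the value.
void demoFormattedVsBinary()
{
    const std::int32_t sixteen = 16;
    const std::string raw(reinterpret_cast<const char*>(&sixteen),
                          sizeof sixteen);

    std::int32_t v = -1;
    assert(!parseAsText(raw, v));             // not text, extraction fails
    assert(readAsBinary(raw, v) && v == 16);  // raw bytes, read() works
    assert(parseAsText("16", v) && v == 16);  // ">>" is for text like this
}
```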

1 Comment

Much thanks. I haven't worked with this kind of stuff in a long time and need these pointed out :)
1

It's been a while but I'll quote it and hope it's accurate:

  • Int16 is written as 2 bytes and padded.
  • Int32 is written as Little Endian and zero padded
  • Floats are more complicated: it takes the float value and dereferences it, getting the memory address's contents which is a hexadecimal

5 Comments

The int32 being little endian and 0 padded, could this be causing some issues? Can you elaborate at all? (Sorry, haven't checked the link yet. it might be elaborated on in there)
Looks like it was C++ char related and nothing to do with integers, except for the offset
The page is 404 not found.
@ChangmingSun thanks i've updated the link, if you could un-downvote
@ChrisS, your answer will scare him. Int16 and Int32, just like floats, are directly blittable since they are native types. BinaryWriter writes them as is, and memcpy()-ing them around into the corresponding C++ types (int16_t, int32_t and float) is the proper way.
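A sketch of what that looks like on the C++ side, given that BinaryWriter writes an Int32 as exactly 4 bytes in little-endian order and a float as 4 bytes. `decodeInt32LE` and `decodeFloatHostOrder` are hypothetical helper names. The float version assumes a little-endian host with IEEE 754 single-precision floats.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Portable decode: assembles a little-endian 32-bit value byte by byte,
// so the result is the same on any host.
std::int32_t decodeInt32LE(const unsigned char* bytes)
{
    assert(bytes != nullptr);
    const std::uint32_t u = static_cast<std::uint32_t>(bytes[0])
                          | static_cast<std::uint32_t>(bytes[1]) << 8
                          | static_cast<std::uint32_t>(bytes[2]) << 16
                          | static_cast<std::uint32_t>(bytes[3]) << 24;
    std::int32_t value;
    std::memcpy(&value, &u, sizeof value);  // reinterpret the bit pattern
    return value;
}

// The memcpy approach from the comment: copies 4 raw bytes into a float.
// Only valid when the host byte order and float format match the file.
float decodeFloatHostOrder(const unsigned char* bytes)
{
    static_assert(sizeof(float) == 4, "assumes a 32-bit float");
    assert(bytes != nullptr);
    float value;
    std::memcpy(&value, bytes, sizeof value);
    return value;
}
```

The four bytes 10 00 00 00 decode to 16, which is the value from the question.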
