Read binary file of unknown size with mixed data types in python

Question

I need to read binary files which consist of 19 float32 numbers followed by an unknown number of uint32 numbers. How can I read such a file in Python?

In Matlab the equivalent looks like this:

fid = fopen('myFile.bin','r');
params = fread(fid,19,'float');
data = fread(fid,'uint32');
fclose(fid);

AGN Gazer · Accepted Answer · 2017-07-20 19:54:53Z

6

Use the numpy.fromfile() function: pass it an open file handle together with the dtype and the number of items to read.

import numpy as np

with open('myFile.bin', 'rb') as f:
    params = np.fromfile(f, dtype=np.float32, count=19)
    # uint32 to match the question; count=-1 reads up to the end of the file
    data = np.fromfile(f, dtype=np.uint32, count=-1)

Append .tolist() after the closing parenthesis of fromfile() (like this: np.fromfile(...).tolist()) if you want to get standard Python lists instead of numpy arrays.
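To try the approach above without a real data file, here is a minimal round-trip sketch. The helper names read_params_and_data and write_demo_file are hypothetical, and the explicit little-endian dtypes "<f4" and "<u4" are an assumption about how the file was written; adjust them if your data uses another byte order.

```python
import os
import tempfile

import numpy as np


def read_params_and_data(path, n_params=19):
    """Read n_params float32 values, then every remaining uint32 value."""
    with open(path, "rb") as f:
        # "<f4" / "<u4" = little-endian float32 / uint32 (assumed byte order)
        params = np.fromfile(f, dtype="<f4", count=n_params)
        if params.size != n_params:
            raise ValueError(f"expected {n_params} floats, found {params.size}")
        # count=-1 (the default) reads from the current position to end of file
        data = np.fromfile(f, dtype="<u4", count=-1)
    return params, data


def write_demo_file(path, n_ints):
    """Write a made-up test file: 19 floats followed by n_ints uint32 values."""
    with open(path, "wb") as f:
        np.arange(19, dtype="<f4").tofile(f)
        np.arange(n_ints, dtype="<u4").tofile(f)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        demo_path = os.path.join(tmp, "demo.bin")
        write_demo_file(demo_path, n_ints=1000)
        params, data = read_params_and_data(demo_path)
        print(params.shape, data.shape, data.dtype)
```

The number of integers never has to be known in advance: the second fromfile() call simply consumes whatever bytes follow the 19 floats.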

edited Jul 20, 2017 at 19:54

answered Jul 20, 2017 at 19:23

AGN Gazer


7 Comments

mcExchange Over a year ago

np.fromfile does require the number of elements it should read, so this method doesn't work in my case since the file size is unknown

AGN Gazer Over a year ago

@mcExchange Not true. Notice -1 in the argument list - I forgot to add it to my original answer - editing right now.

AGN Gazer Over a year ago

@mcExchange Actually, if you read the docs (see link in the answer), count is set to -1 by default, which is why I did not explicitly put it in the second call. From the docs: "count : int. Number of items to read. -1 means all items (i.e., the complete file)." The signature is numpy.fromfile(file, dtype=float, count=-1, sep=''). So, technically, the last -1 in the second fromfile() call is not required to read the rest of the integers.

AGN Gazer Over a year ago

@mcExchange "...this method doesn't work in my case since the file size is unknown" Have you actually tried it?

Radosław Załuska Over a year ago

Note that using numpy as proposed in this answer will give you a performance boost compared to the solution from my answer.


Radosław Załuska · Accepted Answer · 2017-07-20 19:19:21Z

1

For reading binary files I recommend using the struct package.

The solution can be written as follows:

import struct

with open("myFile.bin", "rb") as f:
    floats_bytes = f.read(19 * 4)
    # here I assume that we read exactly 19 floats without errors

    # 19 floats; convert to list because struct.unpack returns a tuple
    floats_array = list(struct.unpack("<19f", floats_bytes))

    # array of ints
    ints_array = []

    while True:
        int_bytes = f.read(4)

        # stop at end of file (or at a truncated trailing value)
        if len(int_bytes) < 4:
            break

        int_value = struct.unpack("<I", int_bytes)[0]
        ints_array.append(int_value)

Note that I assumed your numbers are stored in little endian byte order, so I used "<" in format strings.
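The unknown number of integers can also be handled without a per-value loop. As a sketch under the same little-endian assumption: read the rest of the file in one call, derive the integer count from the number of bytes, and unpack everything with a single struct.unpack. The function name read_with_struct is hypothetical.

```python
import struct


def read_with_struct(path, n_floats=19):
    """Return (floats, ints) from a file of n_floats float32 + N uint32 values."""
    with open(path, "rb") as f:
        header = f.read(4 * n_floats)
        if len(header) != 4 * n_floats:
            raise ValueError("file is too short for the float header")
        floats = list(struct.unpack(f"<{n_floats}f", header))
        rest = f.read()  # everything from here to end of file

    # The count of uint32 values follows from the remaining byte count;
    # a trailing partial value (fewer than 4 bytes) is ignored.
    n_ints = len(rest) // 4
    ints = list(struct.unpack(f"<{n_ints}I", rest[: 4 * n_ints]))
    return floats, ints
```

This avoids one read() and one unpack() call per integer, which is likely where most of the time in the loop version goes; for large files the numpy approach is still the more natural fit.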

answered Jul 20, 2017 at 19:19

Radosław Załuska


2 Comments

mcExchange Over a year ago

Wow, this looks intricate. The timing seems to be a bit slow though. Loading the same file takes 36 ms in Python while in Matlab it takes < 1 ms. Still, thanks for your proposal. I will check if it's fast enough for my application and let you know.

Whynote Over a year ago

I get how this answers the first part of the question about the fixed amount of float, but how do you deal with an "unknown number of int" as mentioned in the question?
