Reading several arrays in a binary file with numpy

Question

I'm trying to read a binary file composed of several matrices of floats, each one preceded and followed by a single int. The Matlab code that reads it is the following:

fid1=fopen(fname1,'r');
for i=1:xx
    Rstart= fread(fid1,1,'int32');        % read blank at the beginning
    ZZ1 = fread(fid1,[Nx Ny],'real*4');   % read z
    Rend  = fread(fid1,1,'int32');        % read blank at the end
end

As you can see, each matrix size is Nx by Ny. Rstart and Rend are just dummy values. ZZ1 is the matrix I'm interested in.

I am trying to do the same in Python with the following:

Rstart = np.fromfile(fname1,dtype='int32',count=1)
ZZ1 = np.fromfile(fname1,dtype='float32',count=Ny1*Nx1).reshape(Ny1,Nx1)
Rend = np.fromfile(fname1,dtype='int32',count=1)

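Then I have to iterate to read the subsequent matrices, but when np.fromfile is given a file name it opens the file afresh on every call, so the position in the file is not retained and each call starts again from the beginning.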

Another option:

with open(fname1,'r') as f:
   ZZ1=np.memmap(f, dtype='float32', mode='r', offset = 4,shape=(Ny1,Nx1))
   plt.pcolor(ZZ1)

This works fine for the first array, but doesn't read the next matrices. Any idea how I can do this?

I searched for similar questions but didn't find a suitable answer.

Thanks

numpy.fromfile also accepts a file object as its first argument.
– user2379410, Commented May 5, 2016 at 10:37
Perfect. That was exactly what I was looking for. It is important to point out that the file should be opened in binary mode, e.g. fid=open(fname,'rb'). Thank you!
– jcdoming, Commented May 5, 2016 at 15:47

Eelco Hoogendoorn · Accepted Answer · 2016-05-05 18:29:44Z

2

The cleanest way to read all your matrices in a single vectorized statement is to use a structured array, with one record per matrix and its two surrounding ints:

dtype = [('start', np.int32), ('ZZ', np.float32, (Ny1, Nx1)), ('end', np.int32)]
with open(fname1, 'rb') as fh:
    data = np.fromfile(fh, dtype)
print(data['ZZ'])



3 Comments

jcdoming Over a year ago

Beautiful answer! I was not sure how composite dtype worked. Thank you.

jcdoming Over a year ago

The downside I think is that it loads all the arrays at once, which I should avoid with big files.

Eelco Hoogendoorn Over a year ago

You can still pass a count parameter to fromfile, I think, no?
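
To make the count suggestion concrete, here is a minimal sketch that reads the file a few records at a time, so only one block of matrices is in memory at once. The helper name iter_record_chunks, the chunk size and the small demo file are made up for illustration; the record layout is the one from the answer above.

```python
import os
import tempfile

import numpy as np


def iter_record_chunks(path, ny, nx, chunk=2):
    """Yield arrays of shape (n, ny, nx) holding at most `chunk` matrices each."""
    record = np.dtype([('start', np.int32),
                       ('ZZ', np.float32, (ny, nx)),
                       ('end', np.int32)])
    with open(path, 'rb') as fh:          # binary mode is required
        while True:
            # count limits how many records are read; the file object
            # remembers its position between calls
            block = np.fromfile(fh, dtype=record, count=chunk)
            if block.size == 0:           # nothing left to read
                break
            yield block['ZZ']


# Build a small demo file with the same layout (hypothetical data).
ny, nx, n_records = 2, 3, 5
record = np.dtype([('start', np.int32),
                   ('ZZ', np.float32, (ny, nx)),
                   ('end', np.int32)])
demo = np.zeros(n_records, dtype=record)
demo['ZZ'] = np.arange(n_records * ny * nx, dtype=np.float32).reshape(n_records, ny, nx)
path = os.path.join(tempfile.mkdtemp(), 'demo.bin')
demo.tofile(path)

chunk_shapes = [zz.shape for zz in iter_record_chunks(path, ny, nx, chunk=2)]
print(chunk_shapes)   # [(2, 2, 3), (2, 2, 3), (1, 2, 3)]
```

The last block is simply shorter when the number of records is not a multiple of the chunk size.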

jcdoming · Accepted Answer · 2016-05-05 15:51:04Z

1

There are two solutions to this problem.

The first one:

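for i in range(x):
    # each record is 4 + Nx1*Ny1*4 + 4 bytes; skip i whole records plus the leading int
    ZZ1 = np.memmap(fname1, dtype='float32', mode='r', offset=4 + 8*i + (Nx1*Ny1)*4*i, shape=(Ny1,Nx1))

Here i is the zero-based index of the array you want to get.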
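
As a sanity check on that offset, here is a small sketch that writes a demo file and maps one matrix out of it. It also shows a possible variation that is not part of the original answer: mapping the whole file with the structured dtype, so that any matrix can be picked by index without computing byte offsets. The names data_offset, record and the demo data are assumptions made for the example.

```python
import os
import tempfile

import numpy as np

ny, nx, n_records = 2, 3, 4
record = np.dtype([('start', np.int32),
                   ('ZZ', np.float32, (ny, nx)),
                   ('end', np.int32)])

# Write a small demo file: int32, ny*nx float32 values, int32, repeated.
demo = np.zeros(n_records, dtype=record)
demo['ZZ'] = np.arange(n_records * ny * nx, dtype=np.float32).reshape(n_records, ny, nx)
path = os.path.join(tempfile.mkdtemp(), 'demo.bin')
demo.tofile(path)


def data_offset(i):
    """Byte offset of matrix i: skip i full records plus the leading int32."""
    return 4 + i * (8 + 4 * nx * ny)


# Offset-based access, as in the loop above.
zz2 = np.memmap(path, dtype='float32', mode='r',
                offset=data_offset(2), shape=(ny, nx))

# Variation: map every record at once; data is only read when it is accessed.
records = np.memmap(path, dtype=record, mode='r')
zz2_alt = records['ZZ'][2]

print(np.array_equal(zz2, zz2_alt))   # True
```

Both give the same matrix; the second form leaves the offset arithmetic to the dtype.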

The second one:

with open(fname1,'rb') as fid:   # binary mode; the file is closed automatically
    for i in range(x):
        Rstart = np.fromfile(fid,dtype='int32',count=1)   # leading dummy int
        ZZ1 = np.fromfile(fid,dtype='float32',count=Ny1*Nx1).reshape(Ny1,Nx1)
        Rend = np.fromfile(fid,dtype='int32',count=1)     # trailing dummy int

So, as morningsun points out, np.fromfile can take a file object as its first argument, and the file object keeps track of the position between calls. Note that you must open the file in binary mode ('rb').
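
One thing worth checking when comparing against the Matlab result: fread with [Nx Ny] fills the matrix column by column, whereas reshape(Ny1,Nx1) in NumPy fills row by row. So the array read in Python should come out as the transpose of Matlab's ZZ1. A minimal sketch with made-up values standing in for the numbers on disk:

```python
import numpy as np

nx, ny = 3, 2
flat = np.arange(nx * ny, dtype=np.float32)   # values in on-disk order

# What the Python loop produces: shape (ny, nx), filled row by row.
zz_python = flat.reshape(ny, nx)

# The layout Matlab's fread(fid, [Nx Ny], ...) gives: shape (nx, ny),
# filled column by column (Fortran order).
zz_matlab = flat.reshape((nx, ny), order='F')

print(np.array_equal(zz_python.T, zz_matlab))   # True
```

If the Matlab orientation is needed, either take .T of the reshaped array or reshape with order='F' as above.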

edited May 5, 2016 at 15:51

answered May 4, 2016 at 21:03

jcdoming

