I have to read a binary file that contains 1300 images of 320*256 uint16 pixels (2 bytes per pixel) and convert it to a numpy array. The raw bytes that I decode with struct.unpack look like this: b'\xbb\x17\xb4\x17\xe2\x17\xc3\x17\xd3\x17'. The data is saved in the following layout:
Main header / Frame header1 / Frame1 / Frame header2 / Frame2 / etc.
Sorry I can't give you the file.
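Since the real file is not available, a small synthetic file with the same layout can stand in for testing. This is only a sketch: the header sizes (4000 and 100 bytes) and the little-endian 16-bit pixel format are taken from the code in this question, while the function name `write_synthetic_file`, the zero-filled headers and the pixel values are made up for illustration.

```python
import os
import tempfile

import numpy as np


def write_synthetic_file(path, count_frame=3, height=256, width=320,
                         main_header_size=4000, frame_header_size=100):
    """Write a dummy file: main header / frame header 1 / frame 1 / ..."""
    n = count_frame * height * width
    # Hypothetical pixel values: a ramp wrapped to the 16-bit range.
    frames = (np.arange(n) % 65536).astype("<u2").reshape(count_frame, height, width)
    with open(path, "wb") as f:
        f.write(bytes(main_header_size))       # zero-filled main header
        for frame in frames:
            f.write(bytes(frame_header_size))  # zero-filled frame header
            f.write(frame.tobytes())           # 2 bytes per pixel, little endian
    return frames


path = os.path.join(tempfile.mkdtemp(), "synthetic.bin")
frames = write_synthetic_file(path)
print(os.path.getsize(path))  # main header + 3 * (frame header + frame bytes)
```

Only 3 frames are written by default to keep the file small; passing `count_frame=1300` would give a file of the same size as the real one.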
EDIT: new version of the code (3 GB of RAM during processing, 1.5 GB at the end), thanks to Paul:
import struct, numpy as np, matplotlib.pyplot as plt

filename = 'blabla'

# Initialize variables
width = 320
height = 256
frame_nb_pixels = width * height
frame_nb_octet = frame_nb_pixels * 2  # 2 bytes per uint16 pixel
count_frame = 1300
fmt = "<" + "H" * width * height  # little endian and unsigned short
main_header_size = 4000
frame_header_size = 100

# Read all images (i.e. the whole file, in a single read)
with open(filename, mode="rb") as f:
    data = f.read()

# -------------- BEFORE --------------
# # Convert bytes into int (be careful to skip main/frame headers)
# tab = []
# for indice in range(count_frame):
#     ind_start = main_header_size + indice * (frame_header_size + frame_nb_octet) + frame_header_size
#     ind_end = ind_start + frame_nb_octet
#     tab.append(struct.unpack(fmt, data[ind_start:ind_end]))
# images = np.resize(np.array(tab), (count_frame, height, width))
# ------------------------------------

# Convert bytes into float (needed later for mean, etc.), skipping main/frame headers
dt = np.dtype(np.uint16)
dt = dt.newbyteorder('<')
array = np.empty((frame_nb_pixels, count_frame), dtype=float)
for indice in range(count_frame):
    offset = main_header_size + indice * (frame_header_size + frame_nb_octet) + frame_header_size
    # count is a number of items (pixels), not a number of bytes
    array[:, indice] = np.frombuffer(data, dtype=dt, count=frame_nb_pixels, offset=offset)
array = array.reshape(height, width, count_frame)

# Plotting first image to verify data
fig = plt.figure()
# plt.imshow(np.squeeze(images[0, :, :]))
plt.imshow(np.squeeze(array[:, :, 0]))
plt.show()
Performance:
- Before: 4 GB of RAM and 10 seconds
- After first edit: 3 GB of RAM during processing, 1.5 GB at the end, and 4 seconds
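For reference, a rough estimate of how much memory the decoded pixels alone would need, depending on the dtype they are stored in. This is plain arithmetic on the image dimensions given above; it ignores the raw bytes read from the file and any temporary copies, so measured figures will be higher.

```python
import numpy as np

width, height, count_frame = 320, 256, 1300
n_pixels = width * height * count_frame

# Size in bytes of the full image stack for a few candidate dtypes.
sizes_bytes = {
    np.dtype(dtype).name: n_pixels * np.dtype(dtype).itemsize
    for dtype in (np.uint16, np.float32, np.float64)
}

for name, size in sizes_bytes.items():
    print(f"{name}: {size / 1e9:.2f} GB")
# uint16: 0.21 GB
# float32: 0.43 GB
# float64: 0.85 GB
```

If full float64 precision is not needed everywhere, keeping the stack as uint16 and converting only at computation time (for example with the `dtype` argument of `mean`) might reduce the final footprint.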
Is there another way to convert my data faster, or with less RAM?
Thank you in advance for your help/advice.
Instead of struct.unpack, try np.frombuffer(buf, dtype) directly on the bytes object.
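A minimal sketch of that idea, assuming the layout described in the question (4000-byte main header, then 1300 records made of a 100-byte frame header followed by a 320x256 little-endian uint16 frame). A structured dtype describes one record, so np.frombuffer can skip the frame headers without a Python loop. The names `frame_dtype` and `read_frames` are hypothetical, and the self-check runs on a tiny in-memory buffer rather than the real file.

```python
import numpy as np

# Layout values taken from the question's code.
width, height, count_frame = 320, 256, 1300
main_header_size, frame_header_size = 4000, 100

# One record = one frame header (opaque bytes) followed by one frame of pixels.
frame_dtype = np.dtype([
    ("header", f"V{frame_header_size}"),
    ("pixels", "<u2", (height, width)),
])


def read_frames(data, count=count_frame):
    """Return a (count, height, width) uint16 view of the pixels, without copying."""
    records = np.frombuffer(data, dtype=frame_dtype, count=count,
                            offset=main_header_size)
    return records["pixels"]


# Self-check on a small buffer with the same layout (2 frames only).
expected = (np.arange(2 * height * width) % 65536).astype("<u2")
expected = expected.reshape(2, height, width)
buf = bytes(main_header_size) + b"".join(
    bytes(frame_header_size) + frame.tobytes() for frame in expected
)
images = read_frames(buf, count=2)
print(images.shape, images.dtype)
```

With the real file this would be called as `images = read_frames(data)` after `data = f.read()`. The result is a read-only view on the original bytes, so no extra memory is used until it is converted, e.g. with `images.astype(np.float32)`. The same dtype and offset should also work with np.memmap if the file should not be loaded into RAM at all, though that is untested here.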