5

I want to convert a Python float into a byte array, encoding it as a 32 bit little-endian IEEE floating point number, in order to write it to a binary file.

What is the modern Pythonic way to do that in Python 3? For ints I can do my_int.to_bytes(4,'little'), but there is no to_bytes method for floats.

It's even better if I can do this in one shot for every float in a numpy array (with dtype numpy.float32). But note that I need to get it as a byte array, not just write the array to a file immediately.

There are some similar-sounding questions, but they seem mostly to be about getting the hex digits, not writing to a binary file.

5
  • docs.python.org/3.7/library/struct.html Commented Nov 12, 2019 at 7:16
  • The right tools for manipulating individual, native Python scalars are usually not the right tools for manipulating NumPy arrays. If you want a NumPy solution, I recommend specifically asking about NumPy and leaving regular Python types out (and expect to get non-NumPy answers anyway from people who don't know NumPy). Commented Nov 12, 2019 at 7:18
  • @user2357112 I'd be happy with a non-numpy answer, since I'm writing the floats one at a time. I mentioned numpy mostly because a numpy solution won't hurt (I'm importing it anyway) and might be useful to know in the future. Commented Nov 12, 2019 at 7:22
  • 1
    You might want to try to find a way to avoid writing them one at a time. That'll be slow. Commented Nov 12, 2019 at 7:24
  • @user2357112 you're right. Luckily, the numpy solutions have enabled me to do that :) Commented Nov 12, 2019 at 7:37

4 Answers

6

NumPy arrays come with a tobytes method that gives you a dump of their raw data bytes:

arr.tobytes()

You can specify an order argument to use either C-order (row major) or F-order (column major) for multidimensional arrays.
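As a minimal sketch of the order argument (the array contents are made up for illustration), the two dumps below contain the same elements but in a different sequence:

```python
import numpy as np

# Hypothetical example data: a 2x2 float32 array
arr = np.array([[1.0, 2.0], [3.0, 4.0]], dtype=np.float32)

c_bytes = arr.tobytes()           # C-order (row major) is the default
f_bytes = arr.tobytes(order='F')  # F-order (column major)

print(len(c_bytes))  # 4 elements * 4 bytes each

# Decode both dumps to see the element order each one used
c_vals = np.frombuffer(c_bytes, dtype=np.float32).tolist()
f_vals = np.frombuffer(f_bytes, dtype=np.float32).tolist()
print(c_vals)
print(f_vals)
```

For a one-dimensional array the two orders give identical bytes, so the argument only matters for multidimensional data.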

Since you want to dump the bytes to a file, you may also be interested in the tofile method, which dumps the bytes to a file directly:

arr.tofile(your_file)

tofile always uses C-order.

If you need to change endianness, you can use the byteswap method. (newbyteorder has a more convenient signature, but doesn't change the underlying bytes, so it won't affect tobytes.)

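import sys

# Assumes arr has the machine's native byte order,
# so a swap is only needed on big-endian machines
if sys.byteorder == 'big':
    arr = arr.byteswap()
data_bytes = arr.tobytes()  # little-endian bytes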
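To go back from such a dump to an array, np.frombuffer can reinterpret the bytes, provided you supply the dtype yourself (a raw dump records neither dtype nor shape). The following is a sketch with made-up data; it uses an explicit '<f4' dtype instead of the byteswap check above, which is an alternative way to pin the byte order:

```python
import numpy as np

# Hypothetical example data
arr = np.array([0.5, -1.25, 3.0], dtype=np.float32)

# '<f4' forces little-endian float32 whatever the native byte order is
data_bytes = arr.astype('<f4').tobytes()

# The raw dump carries no dtype or shape, so the dtype must be given again
restored = np.frombuffer(data_bytes, dtype='<f4')

print(restored.tolist())
```

The array returned by np.frombuffer is a read-only view of the bytes object; call .copy() on it if you need to modify the values, and .reshape(...) if the original was multidimensional.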

4 Comments

How can I specify the endian-ness?
@Nathaniel: Answer expanded.
Thanks! From hpaulj's answer it seems I can also specify endianness in the dtype, as another way to do it
How do you go back to the numpy array from this dump?
3

You can use struct.pack to encode the float as 4 bytes:

>>> import struct
>>> struct.pack('<f', 3.14) # little-endian
b'\xc3\xf5H@'
>>> struct.pack('>f', 3.14) # big-endian
b'@H\xf5\xc3'
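The same format string can carry a repeat count, which avoids one pack call per value. A small sketch, with hypothetical example values:

```python
import struct

values = [3.14, 1.0, -2.5]  # hypothetical example data

# '<3f' means: little-endian, three 32-bit floats
packed = struct.pack(f'<{len(values)}f', *values)
print(len(packed))  # 3 values * 4 bytes each

# Unpacking returns a tuple; 3.14 comes back rounded to float32 precision
unpacked = struct.unpack(f'<{len(values)}f', packed)
print(unpacked)
```

The resulting bytes object can be passed directly to the write method of a file opened in binary mode.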

Comments

1

With the right dtype you can write the array's data buffer to a bytestring or to a binary file:

In [449]: x = np.arange(4., dtype='<f4')
In [450]: x
Out[450]: array([0., 1., 2., 3.], dtype=float32)
In [451]: txt = x.tobytes()  # tostring() is a deprecated alias, removed in NumPy 2.0
In [452]: txt
Out[452]: b'\x00\x00\x00\x00\x00\x00\x80?\x00\x00\x00@\x00\x00@@'
In [453]: x.tofile('test')
In [455]: np.fromfile('test','<f4')
Out[455]: array([0., 1., 2., 3.], dtype=float32)
In [459]: with open('test','br') as f: print(f.read())
b'\x00\x00\x00\x00\x00\x00\x80?\x00\x00\x00@\x00\x00@@'

Change endianness:

In [460]: x.astype('>f4').tobytes()
Out[460]: b'\x00\x00\x00\x00?\x80\x00\x00@\x00\x00\x00@@\x00\x00'
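As a sanity check, the bytes produced through the dtype can be compared with what struct produces for the same values. This is only a sketch that cross-checks the two approaches; it is not needed in real code:

```python
import struct
import numpy as np

x = np.arange(4., dtype='<f4')

little = x.astype('<f4').tobytes()
big = x.astype('>f4').tobytes()

# Cross-check against struct, which encodes the same IEEE 754 values
assert little == struct.pack('<4f', *x.tolist())
assert big == struct.pack('>4f', *x.tolist())

print(little)
print(big)
```

Because the byte order is part of the dtype here, the output does not depend on the byte order of the machine running the code.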

3 Comments

How can I specify the endian-ness?
'<f4' versus '>f4'
Great, thanks, this works very well. For future reference, tostring is a compatibility alias for tobytes.
0

NumPy also provides the save and savez functions:

Store data to disk, and load it again:

>>> np.save('/tmp/123', np.array([[1, 2, 3], [4, 5, 6]]))
>>> np.load('/tmp/123.npy')
array([[1, 2, 3],
       [4, 5, 6]])

Store several arrays in a single .npz archive and load them again (savez does not compress; use np.savez_compressed for a compressed archive):

>>> a=np.array([[1, 2, 3], [4, 5, 6]])
>>> b=np.array([1, 2])
>>> np.savez('/tmp/123.npz', a=a, b=b)
>>> data = np.load('/tmp/123.npz')
>>> data['a']
array([[1, 2, 3],
       [4, 5, 6]])
>>> data['b']
array([1, 2])
>>> data.close()
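Since the question asks for the bytes themselves rather than a file on disk, note that np.save also accepts a file-like object. The sketch below (with made-up data) writes to an in-memory io.BytesIO buffer and compares the result with a raw tobytes dump:

```python
import io
import numpy as np

# Hypothetical example data, stored as little-endian float32
a = np.array([[1, 2, 3], [4, 5, 6]], dtype='<f4')

raw = a.tobytes()          # data only: 6 elements * 4 bytes

buf = io.BytesIO()
np.save(buf, a)            # .npy format: header (dtype, shape, order) + data
npy = buf.getvalue()

print(len(raw), len(npy))

# Loading the .npy bytes restores shape and dtype without extra arguments
buf.seek(0)
loaded = np.load(buf)
print(loaded.shape, loaded.dtype)
```

The .npy bytes are longer than the raw dump because of the header, so this form suits a self-describing file better than a fixed binary layout defined elsewhere.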

3 Comments

save differs from tofile in that it also saves the shape and dtype in an initial data block.
@hpaulj shape and dtype don't take much space, but play the crucial role in flawless reading and converting the data back instead of getting a bunch of binary garbage.
As a complete file format, .npy is definitely more informative and useful than just a raw byte dump. As an intermediate representation or a component of a larger file, a raw byte dump may be significantly more useful.
