Serializing float32 values in Python and deserializing them in C++

Question

I have a bunch of 32-bit floating-point values in a Python script and need to store them to disk and load them in a C++ tool. Currently they are written in a human-readable format. However, the loss of precision is too large for my (numeric) application.

How do I best store (and load) them without loss?

JSON only has one way to represent numbers (see json.org), so that pretty much limits your options, no? I would recommend using a serialization library, however. Don't try to figure this stuff out yourself; it can get complicated.
– xaxxon, Commented Sep 26, 2017 at 9:33
@xaxxon Thanks for the remark. I can use base64 to store any binary format in JSON. I have edited my question accordingly.
– Tobias Hermann, Commented Sep 26, 2017 at 10:03
The JSON part is completely meaningless to your question at that point, since you can literally store anything as a base64-encoded string. I suggest just removing that part entirely, as it doesn't add anything except potential confusion.
– xaxxon, Commented Sep 26, 2017 at 10:06

Alexis Pierru · Accepted Answer · 2017-09-26 11:17:01Z

2

You can use float.hex in Python to get the hexadecimal representation of your number, then read it in C++ using the std::hexfloat stream manipulator.
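A minimal sketch of the Python side, assuming the values are ordinary Python floats that should first be rounded to float32. The helper names to_float32, dump_hex and load_hex are made up for this illustration, and the one-value-per-line layout is only an assumption about the file format.

```python
import struct

def to_float32(x: float) -> float:
    """Round a Python float (a double) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def dump_hex(values):
    """Serialize floats as one hexadecimal literal per line (hypothetical layout)."""
    return "\n".join(float.hex(v) for v in values)

def load_hex(text):
    """Parse the text produced by dump_hex back into floats."""
    return [float.fromhex(line) for line in text.splitlines()]

values = [to_float32(v) for v in (0.1, 3.14159265, 1e-3, -2.5)]
text = dump_hex(values)
restored = load_hex(text)

# The hexadecimal text encodes the binary value exactly, so nothing is lost.
print(restored == values)
```

On the C++ side, std::strtof also accepts this hexadecimal notation, so it may be a useful fallback; it is probably worth checking with your own toolchain that reading through a stream with std::hexfloat behaves as expected.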

answered Sep 26, 2017 at 11:17


4 Comments

Tobias Hermann Over a year ago

Cool, thanks. Do you know if this is portable between different systems? The documentation for std::hexfloat does not mention IEEE etc.

Alexis Pierru Over a year ago

According to the C++14 standard, std::hexfloat is equivalent to calling str.setf(ios_base::fixed | ios_base::scientific, ios_base::floatfield), and ios_base::fixed | ios_base::scientific corresponds to stdio's %a (or %A with uppercase). According to the C11 standard, when no precision is given and FLT_RADIX is a power of 2, the precision is sufficient to represent the value exactly.

Alexis Pierru Over a year ago

Also, the Python documentation states that float.hex and float.fromhex are compatible with stdio's %a/%A.

Netch Over a year ago

Well, this is better than a decimal representation because it avoids an expensive conversion.

Netch · Accepted Answer · 2021-08-02 05:52:22Z

2

Currently they are written in the default human-readable format.

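That's the reason. If you request a format such as "%.9e" or "%.9g", float32 values are printed with enough precision to be restored exactly (provided the conversions themselves are free of errors). "%.9g" yields 9 significant digits, which is the minimum that is always sufficient for float32.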
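A rough way to check this claim in Python, assuming float32 is emulated by rounding through struct (the helpers to_float32 and roundtrip_decimal are hypothetical names, and the sample range is arbitrary):

```python
import random
import struct

def to_float32(x: float) -> float:
    """Round a Python float (a double) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def roundtrip_decimal(x32: float, fmt: str = "%.9g") -> float:
    """Format a float32 value as decimal text, parse it, and round back to float32."""
    return to_float32(float(fmt % x32))

random.seed(0)  # fixed seed so the run is repeatable
samples = [to_float32(random.uniform(-1e6, 1e6)) for _ in range(10000)]

# Every sample should survive the trip through 9 significant decimal digits.
all_ok = all(roundtrip_decimal(v) == v for v in samples)
print(all_ok)
```

This only samples a limited range, so it is a sanity check rather than a proof.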

Also, in Python 3, repr() gives the shortest decimal representation that converts back to the same binary value. Python 2.7 does this as well, but older Python 2 releases do not. Note that this applies to the Python float itself, which is double precision.

Answering a question from comments,

If I understand correctly, there are values that have a finite floating point representation in base 2, but the base 10 representation is infinite,

No, it is always finite, just longer. For example (Python 3):

>>> import math
>>> math.pi
3.141592653589793
>>> '%.60g' % math.pi
'3.141592653589793115997963468544185161590576171875'

However, the digits after the last significant one do not matter (remember that a Python float is double precision):

>>> float('3.1415926535897931')
3.141592653589793
>>> float('3.1415926535897932')
3.141592653589793
>>> float('3.1415926535897933')
3.141592653589793
>>> float('3.1415926535897933') - float('3.1415926535897931')
0.0

Finally, hexadecimal is more reliable if it is available in your code. I expect your Python and C++ versions are recent enough to support %a (stdio) and float.hex (Python).

edited Aug 2, 2021 at 5:52

answered Sep 26, 2017 at 10:53


7 Comments

Tobias Hermann Over a year ago

Thanks, but I would like to store them without any loss. I edited my question accordingly.

Netch Over a year ago

Yep. For float32, 8 significant decimal digits are enough for no loss.

Tobias Hermann Over a year ago

If I understand correctly, there are values that have a finite floating point representation in base 2, but the base 10 representation is infinite, i.e. one would need an infinite number of digits after the decimal separator to represent them. Short: No, there should always be a loss in precision when using your approach.

Mark Dickinson Over a year ago

@Netch: you need 9 significant digits, not 8. For example, 0x1.fffffep+9 = 1023.99993896484375 and 0x1.fffffcp+9 = 1023.9998779296875 are distinct exactly-representable float32 values, but their decimal representations agree when rounded to 8 significant digits: both give 1023.9999.
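The two values from this comment can be checked in Python; this is just an illustrative sketch using the standard % formatting:

```python
# Two adjacent float32 values, written as exact hexadecimal literals.
a = float.fromhex("0x1.fffffep+9")  # 1023.99993896484375
b = float.fromhex("0x1.fffffcp+9")  # 1023.9998779296875

eight = ("%.8g" % a, "%.8g" % b)  # 8 significant digits
nine = ("%.9g" % a, "%.9g" % b)   # 9 significant digits

print(eight)  # ('1023.9999', '1023.9999'): the two values collide
print(nine)   # ('1023.99994', '1023.99988'): the two values stay distinct
```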

Tobias Hermann Over a year ago

@Netch Ah OK, thanks. I think I understand now. 2 is a prime factor of 10 (the decimal base), so every binary float can be expressed exactly in decimal. Is it only the other direction that can cause problems? For example, 1/5 can be expressed cleanly in decimal but not in binary, since 5 is a prime factor of 10 but not of 2.
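The asymmetry described in this comment can be checked with Python's exact-arithmetic types. This is a small sketch, not part of the original thread:

```python
from decimal import Decimal
from fractions import Fraction

# Decimal(float) converts the stored binary value exactly, with no rounding.
exact = Decimal(0.2)
print(exact)  # a long but finite decimal expansion, not exactly 0.2

# The stored double is a fraction whose denominator is a power of two,
# so it cannot be equal to 1/5.
stored = Fraction(0.2)
is_fifth = stored == Fraction(1, 5)
denominator = stored.denominator
is_power_of_two = (denominator & (denominator - 1)) == 0

print(is_fifth, is_power_of_two)  # False True
```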
