pandas assert_frame_equal behavior

Question

I am attempting to compare two DataFrames with pandas.testing.assert_frame_equal. These frames contain floats that I want to compare to some user-defined precision.

The check_less_precise argument of assert_frame_equal seems to suggest that I can specify the number of digits after the decimal point to compare. To quote the API reference page:

check_less_precise: Specify comparison precision. Only used when check_exact is False. 5 digits (False) or 3 digits (True) after decimal points are compared. If int, then specify the digits to compare

API Reference

However, this doesn't seem to work when the floats are less than 1.

This raises an AssertionError:

import pandas as pd

expected = pd.DataFrame([{"col": 0.1}])
output = pd.DataFrame([{"col": 0.12}])
pd.testing.assert_frame_equal(expected, output, check_less_precise=1)

while this does not:

expected = pd.DataFrame([{"col": 1.1}])
output = pd.DataFrame([{"col": 1.12}])
pd.testing.assert_frame_equal(expected, output, check_less_precise=1)

Can someone help explain this behavior? Is this a bug?

Tim · Accepted Answer · 2022-03-22 20:46:25Z

check_less_precise works more like a relative tolerance. See details below.

I dug through the source code and found out what is happening. Eventually the function decimal_almost_equal gets called, which looks like this in plain Python (it's in Cython).

def decimal_almost_equal(desired, actual, decimal):
    return abs(desired - actual) < (0.5 * 10.0 ** -decimal)

See the source code here. Here is the actual call to the function:

decimal_almost_equal(1, fb / fa, decimal)

Where in this example

fa = .1
fb = .12
decimal = 1

So the function call becomes

decimal_almost_equal(1, 1.2, 1)

Which decimal_almost_equal evaluates as

abs(1 - 1.2) < .5  * 10 ** -1

Or

.2 < .05

Which is False.

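So it seems the comparison is based on the relative (percentage) difference, not the absolute difference. For the second example in the question the ratio is 1.12 / 1.1 ≈ 1.018, so the difference from 1 is about .018, which is below .05, and the check passes.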
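Both cases from the question can be reproduced in plain Python. This is only a sketch built from the two snippets quoted above, not the actual pandas source; relative_check is a hypothetical helper name, and it assumes fa is not zero.

```python
def decimal_almost_equal(desired, actual, decimal):
    # Same inequality as the snippet quoted in the answer.
    return abs(desired - actual) < (0.5 * 10.0 ** -decimal)


def relative_check(fa, fb, decimal):
    # Hypothetical helper: compares the ratio fb / fa against 1,
    # mirroring the call quoted in the answer. Assumes fa != 0.
    return decimal_almost_equal(1, fb / fa, decimal)


small = relative_check(0.1, 0.12, 1)  # ratio is about 1.2, off from 1 by about 0.2
large = relative_check(1.1, 1.12, 1)  # ratio is about 1.018, off from 1 by about 0.018

print(small, large)  # prints: False True
```

The absolute difference is .02 in both cases, but only the second pair is within 5% of each other, which is why only the first assertion fails.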

If you want an absolute comparison, check out np.allclose.

import numpy as np

np.allclose(expected, output, atol=.1)
True
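As an alternative that stays inside pandas, newer releases expose explicit tolerances on assert_frame_equal. The sketch below assumes pandas 1.1 or later, where the rtol and atol parameters were added and check_less_precise was deprecated (it was later removed in 2.0). Setting rtol=0 is a choice made here so that only the absolute tolerance applies; tight_passes is just an illustrative variable name.

```python
import pandas as pd

expected = pd.DataFrame([{"col": 0.1}])
output = pd.DataFrame([{"col": 0.12}])

# Absolute check only: with rtol=0 the values match when
# |expected - output| <= atol, so 0.02 <= 0.1 passes silently.
pd.testing.assert_frame_equal(
    expected, output, check_exact=False, rtol=0, atol=0.1
)

# A tighter absolute tolerance rejects the same pair (0.02 > 0.01).
try:
    pd.testing.assert_frame_equal(
        expected, output, check_exact=False, rtol=0, atol=0.01
    )
    tight_passes = True
except AssertionError:
    tight_passes = False

print(tight_passes)
```

Unlike np.allclose, this still checks dtypes, index, and column labels, which is usually what a DataFrame test wants.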

edited Mar 22, 2022 at 20:46

Tim

answered Sep 19, 2017 at 16:38

Ted Petrou

4 Comments

RoachLord Over a year ago

So you're not going to have much luck comparing anything to fa = 0.0

Ted Petrou Over a year ago

I think it takes a different path for 0. There is much more to the story than this little snippet.

michcio1234 Over a year ago

Thanks Ted, I was starting to think that I was blind or something, trying to figure out why I was getting assertion errors when they shouldn't appear...

Tim Over a year ago

I was confused by decimal_almost_equal, which seemed like an absolute tolerance, but upon closer inspection, the call decimal_almost_equal(1, fb / fa, decimal) is really relative tolerance! Just want to make this note here in case others may be confused as well.
