6

I'm writing a unittest test for a method that returns a DataFrame, but when I check the output using:

self.assertEqual(mock_df, result)

I'm getting ValueError:

ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
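For context, here is a minimal sketch of where this error likely comes from, assuming `assertEqual` evaluates `first == second` and then takes the truth value of the result. The frames and the column name below are made up for illustration:

```python
import pandas as pd

# Two small frames with identical contents (illustrative data only).
df1 = pd.DataFrame({"col_a": [1, 2]})
df2 = pd.DataFrame({"col_a": [1, 2]})

# `==` on DataFrames is element-wise: the result is a DataFrame of booleans,
# not a single True/False.
elementwise = df1 == df2

# Asking for the truth value of that DataFrame is what raises the ValueError.
try:
    bool(elementwise)
    raised = False
except ValueError:
    raised = True
```

Reducing the element-wise result explicitly, for example with `elementwise.all().all()`, gives a single boolean instead.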

Right now I'm comparing individual properties, which serves the purpose for now:

self.assertEqual(mock_df.size, result.size)
self.assertEqual(mock_df.col_a.to_list(), result.col_a.to_list())
self.assertEqual(mock_df.col_b.to_list(), result.col_b.to_list())
self.assertEqual(mock_df.col_c.to_list(), result.col_c.to_list())

but I'm curious how to assert that two DataFrames are equal.

3 Answers

8

The accepted answer from @Mahi did not work for me. It failed for two DataFrames that should have been equal. I'm not sure why.

As I discovered here under "DataFrame equality", there are some functions built into Pandas for testing.

The following worked for me. I tested it several times, but not exhaustively, to make sure it would work repeatedly.

import unittest
import pandas as pd

class TestSomething(unittest.TestCase):
    def test_method(self):
        # ... create DataFrames df1 and df2 ...
        pd.testing.assert_frame_equal(df1, df2)

Here is the related pandas reference for the above function.
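A fuller sketch of the same idea, in case it helps. `build_frame` is a hypothetical stand-in for the method under test, and the column names are made up. The second test assumes you want to tolerate an int/float dtype difference, which `check_dtype=False` allows:

```python
import unittest
import pandas as pd


def build_frame():
    """Hypothetical method under test; returns a small DataFrame."""
    return pd.DataFrame({"col_a": [1, 2], "col_b": [3.0, 4.0]})


class TestBuildFrame(unittest.TestCase):
    def test_matches_expected(self):
        expected = pd.DataFrame({"col_a": [1, 2], "col_b": [3.0, 4.0]})
        # Raises AssertionError with a description of the mismatch on failure.
        pd.testing.assert_frame_equal(expected, build_frame())

    def test_matches_ignoring_dtype(self):
        # col_b holds integers here, while build_frame() returns floats.
        expected = pd.DataFrame({"col_a": [1, 2], "col_b": [3, 4]})
        pd.testing.assert_frame_equal(expected, build_frame(), check_dtype=False)


if __name__ == "__main__":
    unittest.main()
```

By default `assert_frame_equal` also checks dtypes, so the second test would fail without `check_dtype=False`.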


1 Comment

assert_frame_equal is good for testing. If the test fails, it also shows the differences between the two dataframes.
6
import unittest
import pandas as pd

class TestDataFrame(unittest.TestCase):
    def test_dataframe(self):
        df1 = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})
        # equals() also compares dtypes, so 'b' must hold integers in both frames
        df2 = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})
        self.assertEqual(True, df1.equals(df2))

if __name__ == '__main__':
    unittest.main()
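One possible reason `equals` reports a mismatch for frames that look identical is that it compares dtypes as well as values. A minimal sketch with made-up data (whether this is the cause in any particular case is an assumption):

```python
import pandas as pd

ints = pd.DataFrame({"a": [1, 2], "b": [3, 4]})        # 'b' is int64
floats = pd.DataFrame({"a": [1, 2], "b": [3.0, 4.0]})  # 'b' is float64

# Element-wise, every value compares equal ...
same_values = bool((ints == floats).all().all())

# ... but equals() is dtype-strict, so it reports a mismatch.
strict_equal = ints.equals(floats)

# Casting one frame to the other's column dtypes makes equals() agree.
after_cast = ints.astype(floats.dtypes.to_dict()).equals(floats)
```

Printing `df1.dtypes` and `df2.dtypes` side by side is a quick way to check for this.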

3 Comments

Thanks, Mahi. It works in some cases, but for a few I get an AssertionError even though the data looks the same visually and is indexed properly.
I also had an issue with this approach not working. I found an alternate approach that worked and posted that as an alternate answer.
I think you can simplify slightly using self.assertTrue(df1.equals(df2))
0

The answer from @BioData41 works for me; however, assert_frame_equal raises an exception when the comparison fails.

For a simple boolean result, I am using this:

test_df1 = df1.reset_index(drop=True)[list(df2.columns)]
test_df2 = df2.reset_index(drop=True)
test_df1.compare(test_df2).empty  # True if the frames match, False otherwise
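The same approach wrapped in a function, as a rough sketch. `frames_match` is a hypothetical helper name, and the up-front shape and column check is my addition, on the assumption that `compare` raises a `ValueError` for frames that are not identically labeled:

```python
import pandas as pd


def frames_match(left, right):
    """Hypothetical helper: True if both frames hold the same values,
    ignoring the index and the column order."""
    # compare() needs identically labeled frames, so rule out obvious
    # mismatches first instead of letting it raise.
    if left.shape != right.shape or set(left.columns) != set(right.columns):
        return False
    left_aligned = left.reset_index(drop=True)[list(right.columns)]
    right_aligned = right.reset_index(drop=True)
    # compare() returns only the differing cells; empty means no differences.
    return bool(left_aligned.compare(right_aligned).empty)


base = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
reordered = base[["b", "a"]]
changed = pd.DataFrame({"a": [1, 2], "b": [3, 5]})
```

Here `frames_match(base, reordered)` is True and `frames_match(base, changed)` is False. Note that this ignores the index entirely, which may or may not be what you want.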
