I am trying to unit test my code. I have a method that, given a MySQL query, returns the result as a pandas DataFrame. Note that in the database, every value in the created and external_id columns is NULL. Here is the test:

def test_get_data(self):

    ### SET UP

    self.report._query = "SELECT * FROM floor LIMIT 3"
    self.report._columns = ['id', 'facility_id', 'name', 'created', 'modified', 'external_id']
    self.d = {'id': p.Series([1, 2, 3]),
              'facility_id': p.Series([1, 1, 1]),
              'name': p.Series(['1st Floor', '2nd Floor', '3rd Floor']),
              'created': p.Series(['None', 'None', 'None']),
              'modified': p.Series([datetime.strptime('2012-10-06 01:08:27', '%Y-%m-%d %H:%M:%S'),
                                    datetime.strptime('2012-10-06 01:08:27', '%Y-%m-%d %H:%M:%S'),
                                    datetime.strptime('2012-10-06 01:08:27', '%Y-%m-%d %H:%M:%S')]),
              'external_id': p.Series(['None', 'None', 'None'])
              }
    self.df = p.DataFrame(data=self.d, columns=['id', 'facility_id', 'name', 'created', 'modified', 'external_id'])
    self.df = self.df.fillna('None')  # fillna returns a new frame, so keep the result
    print(self.df)
    ### CODE UNDER TEST

    result = self.report.get_data(self.report._cursor_web)
    print(result)
    ### ASSERTIONS

    assert_frame_equal(result, self.df)

Here is the console output. Note the print statements in the test code: the manually constructed dataframe is on top, and the one returned by the function under test is on the bottom.

.   id  facility_id       name created            modified external_id
0   1            1  1st Floor    None 2012-10-06 01:08:27        None
1   2            1  2nd Floor    None 2012-10-06 01:08:27        None
2   3            1  3rd Floor    None 2012-10-06 01:08:27        None
   id  facility_id       name created            modified external_id
0   1            1  1st Floor    None 2012-10-06 01:08:27        None
1   2            1  2nd Floor    None 2012-10-06 01:08:27        None
2   3            1  3rd Floor    None 2012-10-06 01:08:27        None
F
======================================================================
FAIL: test_get_data (__main__.ReportTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/path/to/file/ReportsTestCase.py", line 46, in test_get_data
    assert_frame_equal(result, self.df)
  File "/usr/local/lib/python2.7/site-packages/pandas/util/testing.py", line 1313, in assert_frame_equal
    obj='DataFrame.iloc[:, {0}]'.format(i))
  File "/usr/local/lib/python2.7/site-packages/pandas/util/testing.py", line 1181, in assert_series_equal
    obj='{0}'.format(obj))
  File "pandas/src/testing.pyx", line 59, in pandas._testing.assert_almost_equal (pandas/src/testing.c:4156)
  File "pandas/src/testing.pyx", line 173, in pandas._testing.assert_almost_equal (pandas/src/testing.c:3274)
  File "/usr/local/lib/python2.7/site-packages/pandas/util/testing.py", line 1018, in raise_assert_detail
    raise AssertionError(msg)

AssertionError: DataFrame.iloc[:, 3] are different

DataFrame.iloc[:, 3] values are different (100.0 %)
[left]:  [None, None, None]
[right]: [None, None, None]

----------------------------------------------------------------------
Ran 1 test in 0.354s

FAILED (failures=1)

By my reckoning, column 'created' contains three string values of 'None' in both the left and right dataframes. Why is it asserting not equal?

  • Maybe one of them is not the string "None" but just NoneType None? Commented Jun 6, 2017 at 20:25
  • @ayhan that was it, thank you! I hadn't yet encountered NoneType. Please post your answer as an answer and I'll accept it! Commented Jun 6, 2017 at 20:31

1 Answer

Python also has a built-in constant None that is different from the string 'None'. From the docs:

None

The sole value of the type NoneType. None is frequently used to represent the absence of a value, as when default arguments are not passed to a function. Assignments to None are illegal and raise a SyntaxError.

In the case of comparing None against 'None' (None == 'None') the result will be False. Therefore, assert_frame_equal will raise an AssertionError if one of the DataFrames contains None but the other contains 'None'.
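To make the difference concrete, here is a minimal sketch that reproduces the mismatch on a single column. It assumes a recent pandas where assert_frame_equal is imported from pandas.testing (the traceback in the question shows the older pandas.util.testing location), and the names left, right, and expected are made up for illustration:

```python
import pandas as pd
from pandas.testing import assert_frame_equal  # assumption: recent pandas

# Frame holding real None objects, as a database driver returns for NULL.
left = pd.DataFrame({'created': pd.Series([None, None, None], dtype=object)})
# Frame holding the four-character string 'None'.
right = pd.DataFrame({'created': pd.Series(['None', 'None', 'None'], dtype=object)})

# Printing hides the difference, so inspect the element types instead.
left_type = type(left['created'].iloc[0]).__name__
right_type = type(right['created'].iloc[0]).__name__
print(left_type, right_type)

# Comparing None objects against 'None' strings fails the assertion.
try:
    assert_frame_equal(left, right)
    frames_equal = True
except AssertionError:
    frames_equal = False

# Building the expected frame with real None objects makes the comparison pass.
expected = pd.DataFrame({'created': pd.Series([None, None, None], dtype=object)})
assert_frame_equal(left, expected)
```

In the test from the question, that would mean building the 'created' and 'external_id' columns from None rather than from the string 'None'.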
