I'm trying to write a unit test that deals with CSV files using Python's unittest framework.
I want to test things such as whether the column names match, whether the values in the columns match, etc.
I know there are more convenient libraries for this, like datatest and pytest, but I can only use unittest in my project.
I guess I'm using the wrong unittest.TestCase methods and passing the data in the wrong format.
Please advise on how to do this better.
db.csv example:
    TIMESTAMP   TYPE  VALUE  YEAR    FILE SHEET
0  02-09-2018  Index     45  2018  tq.xls   A01
1  13-05-2018  Index     21  2018  tq.xls   A01
2  22-01-2019  Index      9  2019  aq.xls   B02
Here is a code example:
import pandas as pd
import unittest


class DFTests(unittest.TestCase):

    def setUp(self):
        test_file_name = 'db.csv'
        try:
            data = pd.read_csv(test_file_name,
                               sep=',',
                               header=0)
        except IOError:
            print('cannot open file')
        self.fixture = data

    # Check column names
    def test_columns(self):
        self.assertEqual(
            self.fixture.columns,
            {'TIMESTAMP', 'TYPE', 'VALUE', 'YEAR', 'FILE', 'SHEET'},
        )

    # Check timestamp format
    def test_timestamp(self):
        self.assertRaisesRegex(
            self.fixture['TIMESTAMP'],
            r'\d{2}-\d{2}-\d{4}'
        )

    # Check year values
    def test_year_values(self):
        self.assertIn(
            self.fixture['YEAR'],
            {2018, 2019, 2020},
        )


if __name__ == '__main__':
    unittest.main()
Errors:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
TypeError: assertRaisesRegex() arg 1 must be an exception type or tuple of exception types
TypeError: 'Series' objects are mutable, thus they cannot be hashed
Any help is appreciated.
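
For reference, here is a minimal sketch of how the three checks might be written with unittest alone. The idea is to convert the pandas objects to plain Python values before asserting: comparing an Index with a set gives an element-wise result (hence the ambiguous truth value), assertRaisesRegex is meant for exceptions rather than strings, and assertIn on a whole Series has to hash the Series to look it up in the set. The inline SAMPLE_CSV string is an assumption that stands in for db.csv and assumes the real file is comma-separated with no index column.

```python
import io
import unittest

import pandas as pd

# Assumption: a comma-separated stand-in for db.csv, inlined so the
# sketch runs on its own. Swap in the real file path as needed.
SAMPLE_CSV = """TIMESTAMP,TYPE,VALUE,YEAR,FILE,SHEET
02-09-2018,Index,45,2018,tq.xls,A01
13-05-2018,Index,21,2018,tq.xls,A01
22-01-2019,Index,9,2019,aq.xls,B02
"""


class DFTests(unittest.TestCase):

    def setUp(self):
        # No try/except here: if the data cannot be read, the error
        # propagates and the test fails loudly.
        self.fixture = pd.read_csv(io.StringIO(SAMPLE_CSV), sep=',', header=0)

    def test_columns(self):
        # Compare plain lists instead of a pandas Index against a set.
        self.assertListEqual(
            list(self.fixture.columns),
            ['TIMESTAMP', 'TYPE', 'VALUE', 'YEAR', 'FILE', 'SHEET'],
        )

    def test_timestamp(self):
        # assertRegex checks one string at a time, so loop over the column;
        # subTest reports each failing value separately.
        for value in self.fixture['TIMESTAMP']:
            with self.subTest(value=value):
                self.assertRegex(value, r'^\d{2}-\d{2}-\d{4}$')

    def test_year_values(self):
        # Convert the column to plain Python ints, then check that every
        # year found is one of the allowed values.
        allowed = {2018, 2019, 2020}
        found = set(self.fixture['YEAR'].tolist())
        self.assertTrue(found.issubset(allowed))
```

Saved as a module, this can be run with `python -m unittest`. If column order does not matter, comparing `set(self.fixture.columns)` against a set with assertSetEqual would be one alternative to the list comparison.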