I work with large CSV files and wanted to test whether I can sum a numeric column using Python. I generated a random data set:
id,first_name,last_name,email,gender,money
1,Clifford,Casterou,[email protected],Male,53
2,Ethyl,Millichap,[email protected],Female,58
3,Jessy,Stert,[email protected],Female,
4,Doy,Beviss,[email protected],Male,80
5,Josee,Rust,[email protected],Female,13
6,Hedvige,Ahlf,[email protected],Female,67
On line 3 you will notice that the money value is missing (I removed it on purpose to test).
I wrote this code:
import csv
with open("mock_7.txt","r+",encoding='utf8') as fin:
    headerline = fin.readline()
    amount = 0
    debit = 0
    value = 0
    for row in csv.reader(fin):
        # var = row.rstrip()
        value =row[5].replace('',0)
        value= float(value)
        debit+=value
print (debit)
I got the error:
Traceback (most recent call last):
  File "sum_csv1_v2.py", line 11, in <module>
    value+= float(value)
TypeError: must be str, not float
As I am new to Python, my plan was to replace the empty cells with zero, but I think I am missing something here. Also, my script assumes comma-separated files, and I'm sure it won't work for files with other delimiters. Can you help me improve this code?
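For reference, here is a minimal sketch of the plan described above: treat an empty cell as zero and let the delimiter be chosen by the caller. The function name `sum_column` and its parameters are hypothetical, made up for this illustration. It assumes the file has a header row and that every non-empty cell in the chosen column parses as a number. It is one possible approach, not the only one.

```python
import csv
import io


def sum_column(lines, column="money", delimiter=","):
    """Sum one numeric column, counting empty cells as zero.

    `lines` is any iterable of text lines, such as an open file.
    """
    total = 0.0
    # DictReader takes field names from the header row,
    # so the column is looked up by name instead of by index.
    for row in csv.DictReader(lines, delimiter=delimiter):
        cell = (row.get(column) or "").strip()
        # Skipping an empty cell has the same effect as adding zero.
        if cell:
            total += float(cell)
    return total


# The sample rows from the question, read from memory instead of a file.
sample = """id,first_name,last_name,email,gender,money
1,Clifford,Casterou,[email protected],Male,53
2,Ethyl,Millichap,[email protected],Female,58
3,Jessy,Stert,[email protected],Female,
4,Doy,Beviss,[email protected],Male,80
5,Josee,Rust,[email protected],Female,13
6,Hedvige,Ahlf,[email protected],Female,67
"""

print(sum_column(io.StringIO(sample)))  # 271.0
```

With a real file, the same function could be called as sum_column(open("mock_7.txt", newline="", encoding="utf8")), ideally inside a with block, and a semicolon- or tab-separated file would only need delimiter=";" or delimiter="\t".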