I work with large CSV files and wanted to test whether I can sum a numeric column using Python. I generated a random data set:

id,first_name,last_name,email,gender,money
1,Clifford,Casterou,[email protected],Male,53
2,Ethyl,Millichap,[email protected],Female,58
3,Jessy,Stert,[email protected],Female,    
4,Doy,Beviss,[email protected],Male,80
5,Josee,Rust,[email protected],Female,13
6,Hedvige,Ahlf,[email protected],Female,67

On line 3 you will notice that the value is missing (I removed that data on purpose to test).

I wrote this code:

import csv
with open("mock_7.txt","r+",encoding='utf8') as fin:
    headerline = fin.readline()

    amount = 0
    debit = 0
    value = 0
    for row in csv.reader(fin):
    #     var = row.rstrip()
        value =row[5].replace('',0)
        value= float(value)
        debit+=value
    print (debit)

I got this error:

Traceback (most recent call last):
  File "sum_csv1_v2.py", line 11, in <module>
    value+= float(value)
TypeError: must be str, not float

As I am new to Python, my plan was to replace the empty cells with zero, but I think I am missing something here. Also, my script assumes comma-separated files, and I'm sure it won't work for files with other delimiters. Can you help me improve this code?

  • Oops, sorry, my bad. I have updated the question description. I was working with other files, so I forgot to edit the script. Commented Jun 8, 2018 at 11:26
  • You could have a look at pandas. It can solve your problem in one line or two. Commented Jun 8, 2018 at 11:54

2 Answers

The original exception, now lost in the edit history,

TypeError: replace() argument 2 must be str, not int

is the result of str.replace() expecting string arguments, but you're passing an integer zero. Instead of using replace() you could simply check for an empty string before the conversion:

value = row[5]
value = float(value) if value else 0.0

Another option is to catch the potential ValueError:

try:
    value = float(row[5])

except ValueError:
    value = 0.0

This might hide the fact that the column contains "invalid" values other than just missing values.
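
One way to keep that risk visible is to treat blank cells as zero but record anything else that fails to parse. This is only a minimal sketch: the helper name parse_amount and the sample cells are made up for illustration and are not part of the question's script.

```python
def parse_amount(cell, invalid):
    """Return cell as a float; blank cells count as 0.0.

    Non-blank cells that float() rejects are appended to `invalid`
    (and also counted as 0.0) so they can be reviewed afterwards.
    """
    text = cell.strip()
    if not text:
        return 0.0
    try:
        return float(text)
    except ValueError:
        invalid.append(cell)
        return 0.0


# Hypothetical sample cells: one blank and one genuinely invalid value.
cells = ["53", "58", "    ", "80", "n/a"]
invalid = []
total = sum(parse_amount(cell, invalid) for cell in cells)
print(total)    # 191.0
print(invalid)  # ['n/a']
```

Whether invalid cells should count as zero or abort the run depends on the data; collecting them at least makes the choice explicit.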

Note that had you passed string arguments the end result would probably not have been what you expected:

In [2]: '123'.replace('', '0')
Out[2]: '0102030'

In [3]: float(_)
Out[3]: 102030.0

As you can see, an empty string as the "needle" matches before every character and at the end of the string, so the replacement is inserted at each of those positions.


The latest exception in the question, after fixing the other errors, is the result of the float(value) conversion working and

value += float(value)

being equal to:

value = value + float(value)

and as the exception states, strings and floats don't mix.

3 Comments

Hi, another problem has come up: my script works for comma-separated files but not for '|' delimited files. Is there a solution for that?
Pass delimiter='|' to csv.reader(). And please refrain from adding to the question. A question should be about a specific programming problem: stackoverflow.com/help/on-topic. If you have new questions, ask them as such.
I ran it on actual data and realized the problem was not the delimiter: the values look like '12.4 '. Can this blank be removed using strip, and where do I place the strip call?
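
A sketch of how the two points raised in these comments could fit together, assuming made-up pipe-delimited sample data in place of the real file (the column layout here is hypothetical). The delimiter is passed to csv.reader(), and strip() is applied to the cell just before the conversion. Note that float() already ignores leading and trailing whitespace, so strip() mainly matters for detecting cells that contain only blanks.

```python
import csv
import io

# Made-up sample data standing in for a '|' delimited file.
sample = io.StringIO(
    "id|first_name|money\n"
    "1|Clifford|12.4 \n"
    "2|Ethyl|    \n"
    "3|Jessy| 80\n"
)

reader = csv.reader(sample, delimiter="|")
next(reader)  # skip the header row

debit = 0.0
for row in reader:
    value = row[2].strip()  # remove surrounding blanks from the cell
    debit += float(value) if value else 0.0  # blank cells count as zero

print(debit)  # 92.4
```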

The problem with your code is that you're calling replace() without checking whether row[5] is empty.

Fixed code:

import csv
with open("mock_7.txt", "r", encoding='utf8') as fin:
    headerline = fin.readline()  # skip the header row

    debit = 0
    for row in csv.reader(fin):
        # Treat blank cells as zero before converting
        if row[5].strip() == '':
            row[5] = 0
        value = float(row[5])
        debit += value
    print(debit)

output:

271.0

1 Comment

Sorry, I was adding the value twice. I have removed the extra addition to value.
