I'm trying to sum a column in a CSV file using Python. Here's a sample of the CSV data:

Date,Profit/Losses
Jan-2010,867884
Feb-2010,984655
Mar-2010,322013
Apr-2010,-69417
May-2010,310503
Jun-2010,522857
Jul-2010,1033096
Aug-2010,604885
Sep-2010,-216386

I want to sum the Profit/Losses column. I am using the following code but it's returning a 0. Where could I be going wrong?

import os
import csv

# Path to collect data from the csv file in the Resources folder
pybank_csv = os.path.join("resources", "budget_data.csv")

with open(pybank_csv, 'r') as csvfile:       
   csvreader = csv.reader(csvfile, delimiter=',')
   next(csvfile, None)    
   t = sum(float(row[1]) for row in csvreader)

   #print the results
   print(f"Total: {t}")
  • The code seems to be perfectly fine. Please check that you are opening the right file and that the Profit/Losses column doesn't actually sum to 0 (for example, by leaving just the first few rows in it). Commented Dec 4, 2020 at 22:13
  • @fdermishin thanks for that. I am opening the correct file and the total of the column in the CSV file is not 0. Commented Dec 4, 2020 at 22:17
  • I ran the code on the sample you provided and got output Total: 4360090.0 Commented Dec 4, 2020 at 22:21
  • I've added this code ' row_count = sum(1 for row in csvreader)' and it returns '9' so I know it's reading the correct file but I'm still getting a 0 Commented Dec 4, 2020 at 22:32
  • Ooh yes, that could be where I am going wrong. I was reading the row count and then trying to sum using the same csvreader. Commented Dec 4, 2020 at 22:59
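
The row-count line mentioned in the comments above is enough to explain the 0: a csv.reader is an iterator, so once sum(1 for row in csvreader) has walked through every row, a second pass over the same reader finds nothing left. The sketch below reproduces this under stated assumptions: the io.StringIO buffer and its three inline rows stand in for budget_data.csv and are not part of the original code.

```python
import csv
import io

# Assumption: a small in-memory sample stands in for resources/budget_data.csv
sample = (
    "Date,Profit/Losses\n"
    "Jan-2010,867884\n"
    "Feb-2010,984655\n"
    "Mar-2010,322013\n"
)

csvfile = io.StringIO(sample)
csvreader = csv.reader(csvfile, delimiter=',')
next(csvreader, None)  # skip the header row

# First pass: counting the rows consumes the reader completely
row_count = sum(1 for row in csvreader)

# Second pass over the same reader: no rows remain, so sum() returns its default of 0
t = sum(float(row[1]) for row in csvreader)

print(f"Rows: {row_count}")
print(f"Total: {t}")
```

Removing the row-count line, or computing the sum before the count, would let the sum see the rows; either way, only the first full pass over a reader gets any data.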

1 Answer

The easiest way is to use the pandas library.

Use pip install pandas to install pandas on your machine, and then:

import pandas as pd
df = pd.read_csv('your_filename.csv')
sumcol = df['Profit/Losses'].sum()
print(sumcol)

The sum is now in the sumcol variable. For future reference, if your task is to work with data from a CSV file, pandas is a blessing: the library provides a wide range of operations you can perform on your data. Refer to the pandas website for more info.

If you want to use only the csv module, you can read each row as a dict with csv.DictReader and then sum the Profit/Losses entry of each row:

import csv

total = 0
with open('your_filename.csv', newline='') as csvfile:
    # DictReader uses the header row as keys, so no manual skipping is needed
    data = csv.DictReader(csvfile)
    for row in data:
        total = total + int(row['Profit/Losses'])
print(total)

Or, if you want to use reader instead of DictReader, you need to skip the first (header) row. Something like this:

import csv

total = 0
with open('your_filename.csv', newline='') as csvfile:
    data = csv.reader(csvfile)
    next(data, None)  # skip the header row
    for row in data:
        total = total + int(row[1])
print(total)
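
If both a row count and a total are needed, as in the question's comments, one option is to read the rows into a list once and compute both from that list, since a list can be iterated any number of times. This is only a sketch: the io.StringIO buffer holding the question's sample rows is an assumption standing in for the real file.

```python
import csv
import io

# Assumption: the sample rows from the question, held in memory instead of a file
sample = (
    "Date,Profit/Losses\n"
    "Jan-2010,867884\n"
    "Feb-2010,984655\n"
    "Mar-2010,322013\n"
    "Apr-2010,-69417\n"
    "May-2010,310503\n"
    "Jun-2010,522857\n"
    "Jul-2010,1033096\n"
    "Aug-2010,604885\n"
    "Sep-2010,-216386\n"
)

with io.StringIO(sample) as csvfile:
    reader = csv.reader(csvfile)
    next(reader, None)   # skip the header row
    rows = list(reader)  # read the remaining rows once; the list can be reused

row_count = len(rows)
total = sum(int(row[1]) for row in rows)

print(f"Rows: {row_count}")
print(f"Total: {total}")  # 4360090 for this sample
```

Holding every row in memory is fine for a file of this size; for very large files, keeping a running count and total inside a single loop avoids building the list.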
3 Comments

Thanks for that. This is for a class assignment and the instructor doesn't want us to use pandas just yet. That's for next week. Just wondering why the code is returning a 0.
I tested my code and sumcol prints 4360090 as it should. Also I will edit my answer to accommodate for 'csv library only' code
Thanks for the help. It now works. I'm new to python and also stackoverflow so I really appreciate your help.
