Python - loop through a csv file row values

Question

I have a csv file like the following (test.csv) with two columns.

From column 1, I wanted to read current row and compare it with the value of the previous row. If it is greater OR equal, continue comparing and if the value of the current cell is smaller than the previous row - then i wanted to the value of the second column in the same row.

Next I wanted to divided the larger value we got in column 1 by the value in the same cell of column two. Let me make it clear.

For example in the table given above: the smaller value we will get depending on my requirement from Column 1 is 327 (because 327 is smaller than the previous value 340) - and then we take 500 (which is the corresponding cell value on column 2). Finally we divide 340 by 500 and get the value 0.68. My python script should exit right after we print the value to the console.

Currently, I am using the following script in bash, and it works fine

awk -F, '$1<p && $2!=0{ 
val=$2/p    
if(val>=0.8 && val<=0.9)
    {
        print "A"
    }
else if(val==0.7)
    {
        print "B"
    }
else if(val>=0.5 && val <0.7)
    {
        print "C" 

    }
else if(val==0.5)
    {
        print "E"
    }
else
    {
        print "D" 
    }
exit
}
{ 
    p=$1 
}' test.csv

but I wanted to do it with python and i would appreciate for any help. Here is my approach

import csv

f = open("test.csv", "r+")
ff = csv.reader(f)

previous_line = ff.next()
while(True):
    try:
        current_line = ff.next()
        if previous_line <= current_line:
            print "smaller value"
    except StopIteration:
        break

You should specify which exception you're expecting to get so that if you get a different one the program will still return useful information: for example except StopIteration: — TLOwater
– TLOwater, Commented May 8, 2017 at 12:42

BoarGules · Accepted Answer · 2017-05-16 17:01:38Z

3

I recommend you use csv.Reader's built-in iteration rather than calling .next() directly. And your code should not test normal floats for equality. In any language, that's not just a Python thing. Also, a calculated value of 0.79 will result in D which may not be what you intend.

from __future__ import division
import csv

def category(val):
    if 0.8 < val <= 0.9:
        return "A"
    #Note: don't test val == 0.7: you should never test floats for equality
    if abs(val - 0.7) < 1e-10:
        return "B"
    if 0.5 < val < 0.7:
        return "C"
    if abs(val - 0.5) < 1e-10:
        return "E"
    return "D"

with open(r"E:\...\test.csv", "r") as csvfile:
    ff = csv.reader(csvfile)

    previous_value = 0
    for col1, col2 in ff:
        if not col1.isdigit():
            continue
        value = int(col1)
        if value >= previous_value:
            previous_value = value
            continue
        else:
            result = previous_value / int(col2)
            print category(result)
            break

Edit in response to a change to the OP's request

from __future__ import division
import csv

def category(val):
    if 0.8 < val <= 0.9:
        return "A"
    #Note: don't test val == 0.7: you should never test floats for equality
    if abs(val - 0.7) < 1e-10:
        return "B"
    if 0.5 < val < 0.7:
        return "C"
    if abs(val - 0.5) < 1e-10:
        return "E"
    return "D"

with open(r"E:\...\test.csv", "r") as csvfile:
    ff = csv.reader(csvfile)

    results = []
    previous_value = 0
    for col1, col2 in ff:
        if not col1.isdigit():
            continue
        value = int(col1)
        if value >= previous_value:
            previous_value = value
            continue
        else:
            result = previous_value / int(col2)
            results.append(result)
            print category(result)
            previous_value = value
    print (results)
    print (sum(results))
    print (category(sum(results) / len(results)))

I've had to guess at the logic you want to use for resetting the previous value because your original had the loop break at the first result. And I've no idea how you want end-of-file handled. This revision produces the following output:

C
D
A
A
A
D
[0.68, 0.7604651162790698, 0.86, 0.8820058997050148, 0.8477611940298507, 0.9129032258064517]
4.94313543582
A

As you can see, there are definitely more than two values in results.

edited May 16, 2017 at 17:01

answered May 8, 2017 at 13:07

BoarGules

17.1k3 gold badges29 silver badges46 bronze badges

Sign up to request clarification or add additional context in comments.

9 Comments

Mahsolid Over a year ago

How about if we want to sum or do average of result? I tried it like this: sum_value = sum(result) but i found the following erro: TypeError: 'float' object is not iterable

BoarGules Over a year ago

sum_value = sum(result) won't work because result is not a list of floats, it is a single value. It gets freshly assigned in the for loop on each iteration. To do what you want, store each calculated value of result in a list called results then call sum(results) before break.

BoarGules Over a year ago

Store each calculated value in results. What results=[result] does is store the last calculated value. You need results.append(result), and somewhere earlier in your program, before the loop, you need to set up results as an empty list: results = [], otherwise you will get a NameError.

BoarGules Over a year ago

category() is the function that turns your number into a letter. If you just want the value, then print ("Results:%.2f" %(sum(results) / len(results)))

BoarGules Over a year ago

@Brown Sorry, I don't consider myself sufficiently experienced to answer numpy questions.

|

titiro89 · Accepted Answer · 2017-05-08 13:29:32Z

0

col_1 = []
col_2 = []
with open("test.csv", "r+") as f:
    for elem in f.readlines():
        col_1.append(float(elem.split(",")[0]))
        col_2.append(float(elem.split(",")[1]))

condition = True
i=0
while condition:
    if (col_1[i+1]-col_1[i]<0):
        print col_1[i]/col_2[i+1]
        condition = False
    i+=1

edited May 8, 2017 at 13:29

answered May 8, 2017 at 13:15

titiro89

2,1081 gold badge23 silver badges32 bronze badges

Comments

Mr.Pacman · Accepted Answer · 2017-05-08 16:03:27Z

If it is a .csv file working with pandas could give you more control.

import numpy as np
import pandas as pd

pd.read_csv("filename.csv") # to read a .csv file into a dataframe

However, for this example I am not using pd.read_csv() function. Instead, I am creating a dataframe from a 2D numpy array like so,

dataframe = pd.DataFrame(np.array([[338,800],
    [338,550],
    [339,670],
    [340,600], 
    [327,500],
    [301,430],
    [299,350],
    [284,339],
    [284,338],
    [283,335],
    [283,330],
    [283,310],
    [282,310],
    [282,300],
    [282,300],
    [283,290]]))

Now that I have a dataframe object, I can manipulate it just like other object types in python. I can call pandas specific functions to work on the dataframe for the results I want.

def compare_and_divide(df):
    for i in range(len(df)-1):
        # df[0] for all values in col 0 .iloc[i] for value in row
        if df[0].iloc[i+1] >= df[0].iloc[i]:                                 
            continue     
        else:
            df[0].iloc[i+1] = df[0].iloc[i]

    return df[0].div(df[1]) # .div() function to divide values in col 0 by col 1

compare_and_divide(dataframe)   

0     0.422500
1     0.614545
2     0.505970
3     0.566667
4     0.680000 # 340/500 value mentioned in the question
5     0.790698
6     0.971429
7     1.002950
8     1.005917
9     1.014925
10    1.030303
11    1.096774
12    1.096774
13    1.133333
14    1.133333
15    1.172414
dtype: float64

Collectives™ on Stack Overflow

Python - loop through a csv file row values

3 Answers 3

9 Comments

Comments

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

3 Answers 3

9 Comments

Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Related