comparing columns pandas python

Question

I have a csv file with 5 columns and many rows in the following format:

BAL 27  DEN 49  2013-09-05T20:30:00

I want to compare the two scores and add the name of the winner as a sixth column.

I tried this:

from pandas import read_csv
Games = open("games.csv","rb")
df = read_csv(Games, header=None)
#print df
#print df[0]

if df[3] > df[1]:
    print df[2]
else:
    print df[0]

I am getting a ValueError: The truth value of a Series is ambiguous

Any ideas how I can accomplish my goal?

Noah · Accepted Answer · 2014-02-27 21:17:56Z

Basically, you have to remember that the boolean df["home"] > df["guest"] is a vector -- you can take advantage of this to assign the home team name to each row where the vector is True. You could try something like this:

Simulate some data:

In [22]: df = pandas.DataFrame({"home":[10,13,7,24,17], 
"guest":[13, 7, 7, 30, 17], 
"home_name":list("ABCDE"), 
"guest_name":list("abcde")})

Make a new column, and assign the guest name to each row that has the guest score greater than the home score (note that the other rows in the "winner" column will be NaN after the first assignment, and will get filled in progressively):

In [23]: df.loc[df["guest"]>df["home"], "winner"] = df["guest_name"]

In [24]: df.loc[df["guest"]<df["home"], "winner"] = df["home_name"]

In [25]: df.loc[df["guest"]==df["home"], "winner"] = "tie"

In [26]: df
Out[26]: 
  home_name guest_name  home  guest winner
0         A          a    10     13      a
1         B          b    13      7      B
2         C          c     7      7    tie
3         D          d    24     30      d
4         E          e    17     17    tie
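The three assignments above can also be written as one expression. This is only a sketch of an alternative, not a replacement for the approach shown: it reuses the same simulated data and relies on Series.where and Series.mask, which keep or overwrite values row by row according to a boolean Series.

```python
import pandas as pd

df = pd.DataFrame({
    "home": [10, 13, 7, 24, 17],
    "guest": [13, 7, 7, 30, 17],
    "home_name": list("ABCDE"),
    "guest_name": list("abcde"),
})

# Keep the home name where home scored more, otherwise take the guest name.
winner = df["home_name"].where(df["home"] > df["guest"], df["guest_name"])
# Overwrite rows with equal scores.
df["winner"] = winner.mask(df["home"] == df["guest"], "tie")

print(df["winner"].tolist())
```

Both forms are vectorized; which one reads better is mostly a matter of taste.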

Alvaro Fuentes · Accepted Answer · 2014-02-27 20:32:40Z

3

The problem with your code is that df[3] > df[1] returns a pandas.Series of booleans (one per row), while an if statement needs a single True or False. As the error message says, the truth value of a Series is ambiguous.
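A small sketch of what goes wrong, using two made-up rows of scores under the same integer column labels as in the question:

```python
import pandas as pd

scores = pd.DataFrame({1: [27, 31], 3: [49, 20]})  # made-up scores

mask = scores[3] > scores[1]  # element-wise comparison: one boolean per row
print(mask.tolist())

raised = False
try:
    if mask:  # a whole Series has no single truth value
        pass
except ValueError as err:
    raised = True
    print("ValueError:", err)

# Reduce explicitly when one overall answer is really what is wanted.
print(bool(mask.any()), bool(mask.all()))
```

For a per-row result, the boolean Series is used as a row selector instead of being passed to if.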

Try this:

df[6] = df[0] #sets default value
df.loc[df[3]>df[1],6] = df[2] #change when second wins

Then you can do print df or print df[6].

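You can also simplify the reading step by passing the file name directly to pandas: df = read_csv('games.csv', delim_whitespace=True, header=None). Only use delim_whitespace=True if the file is actually separated by whitespace rather than commas.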
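As a hedged sketch of the whole flow: recent pandas releases deprecate delim_whitespace in favor of sep=r"\s+", so the example below uses that spelling. It reads from an in-memory string instead of games.csv; the first line is the sample row from the question, the second is added for illustration, and the "winner" column label is my own choice.

```python
import io
import pandas as pd

# Whitespace-separated sample in the layout shown in the question.
text = (
    "BAL 27  DEN 49  2013-09-05T20:30:00\n"
    "KC 28  JAX 2  2013-09-08T13:00:00\n"
)

# sep=r"\s+" splits on any run of whitespace.
df = pd.read_csv(io.StringIO(text), sep=r"\s+", header=None)

df["winner"] = df[0]                      # default: the first team
df.loc[df[3] > df[1], "winner"] = df[2]   # overwrite where the second team scored more

print(df["winner"].tolist())
```

If the real file is comma-separated, drop the sep argument; splitting on whitespace would then produce different columns and the comparison on columns 1 and 3 would no longer line up.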

edited Feb 27, 2014 at 20:32

answered Feb 27, 2014 at 20:27


2 Comments

kegewe Over a year ago

This worked, thanks. Note: it stopped working after I tried the easier reading method.

Alvaro Fuentes Over a year ago

Yes, I supposed that. I just wanted you to know that you can read from csv directly using pandas. Hope it helps.

sabbahillel · Accepted Answer · 2014-02-28 04:07:49Z

0

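An example of how I processed a CSV file:

import csv

ifile = open('myinputfile', 'rb')  # 'rb' is the Python 2 convention; on Python 3 use open('myinputfile', newline='')
infile = csv.DictReader(ifile)
for row in infile:
    process_row(row)  # process_row is a placeholder for your own per-row function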

Notice that you have to loop over each row in the infile. Similarly, your df is the set of file rows and you must loop over them to get each row in order to compare the columns.
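Applied to the question, a row-by-row version might look like the sketch below. It assumes comma-separated rows in the order team, score, team, score, kickoff time; the sample text and the winner helper are made up for illustration. With pandas itself an explicit loop is not required, as the vectorized answers show.

```python
import csv
import io

# Made-up comma-separated rows: team, score, team, score, kickoff time.
text = (
    "BAL,27,DEN,49,2013-09-05T20:30:00\n"
    "KC,28,JAX,2,2013-09-08T13:00:00\n"
)

def winner(row):
    """Return the name of the higher-scoring team, or 'tie'."""
    first, first_score = row[0], int(row[1])
    second, second_score = row[2], int(row[3])
    if first_score == second_score:
        return "tie"
    return first if first_score > second_score else second

# Append the winner as an extra column on every row.
rows = [row + [winner(row)] for row in csv.reader(io.StringIO(text))]
for row in rows:
    print(row)
```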

edited Feb 28, 2014 at 4:07

answered Feb 27, 2014 at 20:29


2 Comments

kegewe Over a year ago

Ok, then how would I apply that in my case?

sabbahillel Over a year ago

@kegewe Now that you have the row, you should have a list of column values which can now be compared. print the row and you will see what I mean. The comparison of each column value will follow.
