5

I have a csv file with 5 columns and many rows in the following format:

BAL 27  DEN 49  2013-09-05T20:30:00   

I want to compare the 2 scores and return the name of the winner as a 6th column

I tried this:

from pandas import read_csv
Games = open("games.csv","rb")
df = read_csv(Games, header=None)
#print df
#print df[0]

if df[3] > df[1]:
    print df[2]
else:
    print df[0]

I am getting a ValueError: The truth value of a Series is ambiguous

Any ideas how I can accomplish my goal?

3 Answers

6

Basically, you have to remember that the comparison df["home"] > df["guest"] produces a boolean vector (one True/False value per row), not a single value. You can take advantage of this to assign the home team name to each row where the vector is True. You could try something like this:

Simulate some data:

In [21]: import pandas

In [22]: df = pandas.DataFrame({"home": [10, 13, 7, 24, 17],
   ....:                        "guest": [13, 7, 7, 30, 17],
   ....:                        "home_name": list("ABCDE"),
   ....:                        "guest_name": list("abcde")})

Make a new column, and assign the guest name to each row that has the guest score greater than the home score (note that the other rows in the "winner" column will be NaN after the first assignment, and will get filled in progressively):

In [23]: df.loc[df["guest"]>df["home"], "winner"] = df["guest_name"]

In [24]: df.loc[df["guest"]<df["home"], "winner"] = df["home_name"]

In [25]: df.loc[df["guest"]==df["home"], "winner"] = "tie"

In [26]: df
Out[26]: 
  home_name guest_name  home  guest winner
0         A          a    10     13      a
1         B          b    13      7      B
2         C          c     7      7    tie
3         D          d    24     30      d
4         E          e    17     17    tie
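
The three assignments above can also be collapsed so that no intermediate NaN values appear. This is only a sketch of an alternative, not part of the original answer: it swaps in Series.where, which keeps a value where the condition is True and takes the replacement elsewhere. The data is the same simulated frame as above.

```python
import pandas as pd

df = pd.DataFrame({"home": [10, 13, 7, 24, 17],
                   "guest": [13, 7, 7, 30, 17],
                   "home_name": list("ABCDE"),
                   "guest_name": list("abcde")})

# Keep the home name where the home team scored more, otherwise take the guest name.
winner = df["home_name"].where(df["home"] > df["guest"], df["guest_name"])

# Keep that result where the scores differ, otherwise mark the row as a tie.
df["winner"] = winner.where(df["home"] != df["guest"], "tie")

print(df[["home_name", "guest_name", "home", "guest", "winner"]])
```

Whether this reads better than three .loc assignments is a matter of taste; the .loc version makes each case explicit, while this one avoids the partially filled column.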

3

The problem with your code is that df[3] > df[1] returns a pandas.Series of booleans (one per row) and, as the message says, "The truth value of a Series is ambiguous": an if statement needs a single True or False.
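
To see this concretely, here is a small sketch that reproduces the error. The first row comes from the question; the second row (teams AAA and BBB) is made up for illustration.

```python
import pandas as pd

# Columns 0..3 mimic the question's layout: team, score, team, score.
df = pd.DataFrame([["BAL", 27, "DEN", 49],
                   ["AAA", 30, "BBB", 10]])

mask = df[3] > df[1]          # element-wise comparison: one boolean per row
print(type(mask).__name__)    # the result is a Series, not a bool
print(mask.tolist())

try:
    if mask:                  # asking for a single truth value of a whole Series
        pass
except ValueError as exc:
    message = str(exc)
    print(message)
```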

Try this:

df[6] = df[0] #sets default value
df.loc[df[3]>df[1],6] = df[2] #change when second wins

Then you can do print df or print df[6].

You can also do the reading part more easily by passing the file name straight to pandas: df = read_csv('games.csv', delim_whitespace=True,header=None)
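
Putting the pieces together, a minimal end-to-end sketch might look like the following. Several things here are assumptions: the in-memory text stands in for games.csv, the rows are assumed to be comma-separated (for whitespace-separated data, passing sep=r"\s+" to read_csv would be the equivalent), and the second row is invented. The new column is labelled 5 because the existing columns are numbered from 0; the label 6 used above works equally well.

```python
import io
import pandas as pd

# Hypothetical stand-in for games.csv; the comma delimiter is an assumption.
raw = io.StringIO(
    "BAL,27,DEN,49,2013-09-05T20:30:00\n"
    "AAA,30,BBB,10,2013-09-08T13:00:00\n"
)
df = pd.read_csv(raw, header=None)

df[5] = df[0]                      # default: the first team is the winner
df.loc[df[3] > df[1], 5] = df[2]   # overwrite rows where the second team scored more

print(df[5].tolist())
```

Note that with this default, a tie is reported as a win for the first team; handling ties separately would need one more .loc assignment.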

2 Comments

this worked thanks. Note, it stopped working after I tried the easier reading method
Yes, I supposed that. I just wanted you to know that you can read from csv directly using pandas. Hope it helps.
0

Here is an example of how I processed a CSV file:

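import csv

ifile = open('myinputfile', 'rb')   # 'rb' is the Python 2 convention for csv files
infile = csv.DictReader(ifile)      # each row becomes a dict keyed by the header row
for row in infile:
    process_row(row)                # process_row is a placeholder for your own function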

Notice that you have to loop over each row in infile. Similarly, your df holds all of the file's rows at once, so with a row-by-row approach you loop over them and compare the columns within each row.
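
Applied to the question's data, the row-by-row idea could be sketched as below. This is an illustration under assumptions: the in-memory text stands in for the real file, rows are assumed to be comma-separated, the second row is invented, and csv.reader is used instead of csv.DictReader because the file in the question has no header row.

```python
import csv
import io

# Hypothetical stand-in for the file; the comma delimiter is an assumption.
raw = io.StringIO(
    "BAL,27,DEN,49,2013-09-05T20:30:00\n"
    "AAA,30,BBB,10,2013-09-08T13:00:00\n"
)

winners = []
for row in csv.reader(raw):
    # The csv module yields strings, so scores are converted before comparing.
    team_a, score_a = row[0], int(row[1])
    team_b, score_b = row[2], int(row[3])
    winners.append(team_b if score_b > score_a else team_a)

print(winners)
```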

2 Comments

Ok, then how would I apply that in my case?
@kegewe Now that you have the row, you should have a list of column values which can now be compared. print the row and you will see what I mean. The comparison of each column value will follow.
