2

I want to create a new column in a dataframe based on existing columns, but the calculation should be conditional on another existing column in the dataframe. The following code is not working. Does anyone know why?

if CV['keyword'] == 0:
    CV['left out'] = (CV['Prediction Numerator'] - (CV['Rate'] *10000))/(CV['Prediction Denominator'] - 10000)
else:
    CV['left out'] = (CV['Prediction Numerator'] - (CV['Rate'] *10000 * 10))/(CV['Prediction Denominator'] - (10000 * 10))

I'm getting the following error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\bwei\Downloads\WinPython-64bit-2.7.9.4\python-2.7.9.amd64\lib\site-packages\pandas\core\generic.py", line 709, in __nonzero__
    .format(self.__class__.__name__))
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Here's a snippet of the first 4 columns of my dataframe.

        Zip  keyword  Prediction Numerator  Prediction Denominator  
0     01001        0        7650546.693200            40002.558782   
1     01001        0        7650546.693200            40002.558782   
2     01001        0        7650546.693200            40002.558782   
3     01001        0        7650546.693200            40002.558782   
4     01002        0            157.951741                0.718621   
5     01002        0            157.951741                0.718621   
6     01005        0        3600150.148240            20000.671431   
7     01005        0        3600150.148240            20000.671431   
8     01007        0        6932235.816260            30000.936191   
9     01007        0        6932235.816260            30000.936191   
10    01007        0        6932235.816260            30000.936191   

Thanks, Ben

0

4 Answers

4

This should work:

CV.loc[CV['keyword']==0,'left out']=expression1
CV.loc[CV['keyword']!=0,'left out']=expression2
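
As a minimal sketch of how the two placeholders might be filled in, assuming expression1 and expression2 stand for the two formulas from the question. The sample values and the helper name left_out are made up for illustration and are not from the original data:

```python
import pandas as pd

# Toy data; the numbers (including the 'Rate' column) are invented for illustration.
CV = pd.DataFrame({
    'keyword': [0, 1],
    'Prediction Numerator': [7650546.6932, 3600150.14824],
    'Prediction Denominator': [40002.558782, 200000.671431],
    'Rate': [190.0, 18.0],
})

def left_out(df, scale):
    # Hypothetical helper wrapping the question's formula; scale is 10000 or 10000 * 10.
    return (df['Prediction Numerator'] - df['Rate'] * scale) / (df['Prediction Denominator'] - scale)

is_zero = CV['keyword'] == 0  # boolean mask, one entry per row

# Each assignment only touches the rows selected by its mask;
# the right-hand side is aligned to those rows by index.
CV.loc[is_zero, 'left out'] = left_out(CV[is_zero], 10000)
CV.loc[~is_zero, 'left out'] = left_out(CV[~is_zero], 10000 * 10)
```

Computing the mask once and reusing its negation avoids evaluating the comparison twice.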

2 Comments

This was exactly what I wanted, thanks! Just so I understand what it's doing: it's essentially looking at the rows where keyword == 0 (or != 0) and then setting left out to the corresponding expression for those rows?
Exactly - might not be fastest as you filter 2x but simplest to read/understand
1

Instead of CV['keyword'] == 0, you should use 'keyword' in CV.columns to see if there is a column named "keyword" in CV.

3 Comments

No, the keyword column is always there. I just want to say: if keyword == 0 for a specific row, then perform operation a, and if it's not equal to 0, perform operation b.
You're going to have to describe more precisely what you want. What does it mean to compare an entire column to 0? There are many possible interpretations. Please clarify your question.
I edited my question to give more clarity. For example, at row 0, keyword == 0, so for that row I'd want ['left out'] to be equal to the if branch; if keyword == 1, I'd want ['left out'] to be equal to the else branch.
1

When you write

if CV['keyword'] == 0:

then CV['keyword'] is a column, and comparing it to 0 returns a boolean series. You cannot perform an if on such a series (which value would determine if it's True or False?), and hence the error.
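
A small sketch that reproduces the behaviour described above, using a made-up Series s as a stand-in for CV['keyword']:

```python
import pandas as pd

s = pd.Series([0, 1, 0])   # stand-in for CV['keyword']; values are invented
mask = s == 0              # element-wise comparison gives a boolean Series

try:
    if mask:               # ambiguous: which of the three values should decide?
        pass
    raised = False
except ValueError:
    raised = True          # "The truth value of a Series is ambiguous..."

# Explicit reductions give a single, unambiguous boolean.
any_zero = mask.any()      # True if at least one element equals 0
all_zero = mask.all()      # True only if every element equals 0
```

Neither reduction is what the question needs, though, since the goal there is a per-row choice rather than a single yes/no for the whole column.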

Fortunately, CV.columns works pretty much like a Python list, so you can check membership using it.


0

What you want is

import numpy as np
CV['left out'] = np.where(CV['keyword'] == 0,
    (CV['Prediction Numerator'] - (CV['Rate'] * 10000)) / (CV['Prediction Denominator'] - 10000),  # rows where keyword == 0
    (CV['Prediction Numerator'] - (CV['Rate'] * 10000 * 10)) / (CV['Prediction Denominator'] - (10000 * 10))  # all other rows
)
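
To illustrate that np.where behaves like a per-row if/else, here is a hedged sketch on made-up data that compares it with an explicit loop. The sample values and the use of iterrows for the loop are my own assumptions for illustration:

```python
import numpy as np
import pandas as pd

# Toy data; the numbers (including the 'Rate' column) are invented for illustration.
CV = pd.DataFrame({
    'keyword': [0, 1, 0],
    'Prediction Numerator': [7650546.6932, 3600150.14824, 6932235.81626],
    'Prediction Denominator': [40002.558782, 200000.671431, 30000.936191],
    'Rate': [190.0, 18.0, 230.0],
})

num, den, rate = CV['Prediction Numerator'], CV['Prediction Denominator'], CV['Rate']

# Both formulas are evaluated for every row; np.where then picks one per row.
vectorized = np.where(CV['keyword'] == 0,
                      (num - rate * 10000) / (den - 10000),
                      (num - rate * 10000 * 10) / (den - 10000 * 10))

# Equivalent row-by-row if/else, shown only for comparison (much slower on large frames).
looped = []
for _, row in CV.iterrows():
    scale = 10000 if row['keyword'] == 0 else 10000 * 10
    looped.append((row['Prediction Numerator'] - row['Rate'] * scale)
                  / (row['Prediction Denominator'] - scale))

CV['left out'] = vectorized
```

Because both branches are computed for all rows before the selection, a branch that divides by zero on some rows can still emit a warning even when np.where does not pick it for those rows.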

2 Comments

using the same if else control flow?
You have two formulas, and you want to use the first one for rows where CV['keyword'] == 0 and the other one for rows where that's not the case, is that correct? np.where does the if/else element-wise in a vectorized way. If you wanted to use if/else, you would have to loop through your rows, which would be inefficient.
