I have a dataframe that looks like

df

viz  a1_count  a1_mean     a1_std
n         3        2   0.816497
y         0      NaN        NaN 
n         2       51  50.000000

I want to convert the "viz" column to 0 and 1, based on a conditional. I've tried:

df['viz'] = 0 if df['viz'] == "n" else 1

but I get:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

2 Answers

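The comparison df['viz'] == "n" returns a boolean Series, not a single True or False, so using it as the condition of a Python if/else expression raises the ValueError you saw. A simple fix is to do the comparison element-wise and cast the resulting boolean Series to int: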

In [84]:
df['viz'] = (df['viz'] != 'n').astype(int)
df

Out[84]:
   viz  a1_count  a1_mean     a1_std
0    0         3        2   0.816497
1    1         0      NaN        NaN
2    0         2       51  50.000000
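
For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt by hand from the sample in the question, and the names `error_message` and `mask` are only illustrative. It first triggers the original error, then applies the element-wise comparison and cast:

```python
import numpy as np
import pandas as pd

# Sample data rebuilt by hand from the question.
df = pd.DataFrame({
    "viz": ["n", "y", "n"],
    "a1_count": [3, 0, 2],
    "a1_mean": [2, np.nan, 51],
    "a1_std": [0.816497, np.nan, 50.0],
})

# The original attempt: a Series has no single truth value,
# so using it as an if/else condition raises ValueError.
error_message = None
try:
    df["viz"] = 0 if df["viz"] == "n" else 1
except ValueError as exc:
    error_message = str(exc)

# Element-wise comparison gives a boolean Series (True where viz is not "n"),
# and astype(int) maps False -> 0, True -> 1.
mask = df["viz"] != "n"
df["viz"] = mask.astype(int)
```

Because the failed assignment never completes, `df['viz']` still holds the original strings when the cast is applied.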

You can also use np.where:

In [86]:
df['viz'] = np.where(df['viz'] == 'n', 0, 1)
df

Out[86]:
   viz  a1_count  a1_mean     a1_std
0    0         3        2   0.816497
1    1         0      NaN        NaN
2    0         2       51  50.000000
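
A small standalone sketch of the np.where variant, assuming the column still holds the original 'n'/'y' strings (if the astype cast above has already been applied to the same DataFrame, nothing equals 'n' any more and every row becomes 1). The variable names are illustrative:

```python
import numpy as np
import pandas as pd

# Only the relevant column from the question, as strings.
df = pd.DataFrame({"viz": ["n", "y", "n"]})

# np.where(condition, value_if_true, value_if_false) works element-wise
# and returns a NumPy array with one entry per row.
encoded = np.where(df["viz"] == "n", 0, 1)

# Assigning the array back replaces the string column with integers.
df["viz"] = encoded
```

np.where is handy when the two output values are not simply 0 and 1, since any pair of values can be passed as the second and third arguments.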

To see why the cast works, here is the output of the boolean comparison on the original string column:

In [89]:
df['viz'] != 'n'

Out[89]:
0    False
1     True
2    False
Name: viz, dtype: bool

And then casting to int:

In [90]:
(df['viz'] != 'n').astype(int)

Out[90]:
0    0
1    1
2    0
Name: viz, dtype: int32

3 Comments

Stumbled upon this post while researching something. Two years later, there may be new options. In my code I just used this: pd.to_numeric(myDF['myDFCell'], errors='coerce'). This is probably newer pandas syntax. The coerce flag tells it to convert anything that cannot be parsed as a number to NaN, so it won't throw errors.
@TMWP That may be true, but the OP doesn't want to convert 'n' to a numeric value; to_numeric would turn it into NaN here, so this is a slightly different use case.
Right. Just a useful addition. I came to this post while researching how to convert a string column in pandas to numeric. This was the closest hit to that question that Stack Overflow came up with.

From @TMWP's comment above:

pd.to_numeric(myDF['myDFCell'], errors='coerce')

It works like a charm and is a quick and simple one-liner.
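
As a rough illustration of what errors='coerce' does, here is a short sketch. The `digits` Series is made up for the example and is not from the question. Strings that parse as numbers are converted, and everything else becomes NaN, which is why the question's 'n'/'y' column would come back entirely NaN, as noted in the comments above:

```python
import pandas as pd

# The question's column: none of these strings is a number.
viz = pd.Series(["n", "y", "n"], name="viz")
coerced = pd.to_numeric(viz, errors="coerce")  # every value becomes NaN

# Hypothetical column of numeric strings with one bad entry.
digits = pd.Series(["1", "2", "x"])
parsed = pd.to_numeric(digits, errors="coerce")  # "x" becomes NaN
```

So this one-liner fits columns of numeric strings, while the comparison-based approaches above fit a categorical 'n'/'y' flag.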
