1

My data frame has a column called 'a', which may contain 'apple' and 'orange'. I want to extract those values where they exist and label everything else 'others'.

I could simply loop over the rows and extract them. However, I have seen numpy.where() used for similar purposes, though only with two categories.

result = numpy.where(df['a'].str.contains('apple'), 'apple', 'others')

Is it possible to apply it here for the case of 3 categories? In other words, result should contain entries of 'apple', 'orange', or 'others'.

Is there some better way to do it than simply looping?

2 Answers

3

Simply look for items that are apple or orange with np.in1d to create a boolean mask, which can then be used with np.where to set the rest of them to others. Thus, we would have -

df['b'] = np.where(np.in1d(df.a,['apple','orange']),df.a,'others')
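The same exact-match idea can be sketched with np.isin, which newer NumPy releases recommend in place of np.in1d, or with pandas alone. This is only an illustration: the sample data and the column names 'b' and 'c' are assumptions, not part of the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data for the exact-match case.
df = pd.DataFrame({"a": ["apple", "orange", "mango", "apple", "banana"]})

wanted = ["apple", "orange"]

# NumPy variant: np.isin builds the same boolean mask as np.in1d.
df["b"] = np.where(np.isin(df["a"], wanted), df["a"], "others")

# pandas-only variant: keep values where the mask is True, replace the rest.
df["c"] = df["a"].where(df["a"].isin(wanted), "others")

print(df)
```

Both columns should end up identical; which variant to prefer is mostly a matter of taste.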

For cases when you might be looking to work with strings that have those names as part of bigger strings, you can use str.extract (caught this idea from @jezrael's solution, I hope that's okay!) and then use np.where, like so -

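# expand=False returns a Series (not a one-column DataFrame) so that np.where sees 1-D inputs
strings = df.a.str.extract('(orange|apple)', expand=False)
df['b'] = np.where(np.in1d(strings,['apple','orange']),strings,'others')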

Sample run -

In [294]: df
Out[294]: 
             a
0  apple-shake
1       orange
2  apple-juice
3        apple
4        mango
5       orange
6       banana

In [295]: strings = df.a.str.extract('(orange|apple)', expand=False)

In [296]: df['b'] = np.where(np.in1d(strings,['apple','orange']),strings,'others')

In [297]: df
Out[297]: 
             a       b
0  apple-shake   apple
1       orange  orange
2  apple-juice   apple
3        apple   apple
4        mango  others
5       orange  orange
6       banana  others
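
To stay closer to the np.where style from the question while handling three categories, np.select accepts a list of conditions and a matching list of labels. A minimal sketch, assuming the same sample frame as the run above; the names conditions and choices are only illustrative.

```python
import numpy as np
import pandas as pd

# Same sample data as in the run above.
df = pd.DataFrame({"a": ["apple-shake", "orange", "apple-juice", "apple",
                         "mango", "orange", "banana"]})

# One boolean condition per category; where several match, the first one wins.
conditions = [
    df["a"].str.contains("apple", regex=False, na=False),
    df["a"].str.contains("orange", regex=False, na=False),
]
choices = ["apple", "orange"]

# Rows matching no condition receive the default label.
df["b"] = np.select(conditions, choices, default="others")

print(df)
```

Adding a fourth category then only means appending one more condition and one more label.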

2

Use str.extract with fillna:

df = pd.DataFrame({'a': ['orange','apple','a']})
print (df)
        a
0  orange
1   apple
2       a

df['new'] = df.a.str.extract('(orange|apple)', expand=False).fillna('others')
print (df)
        a     new
0  orange  orange
1   apple   apple
2       a  others
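
If the list of categories is longer or comes from a variable, the pattern can be built from it rather than typed by hand. The following is a sketch under assumptions: the categories list, the extra sample rows, and the case-insensitive matching are illustrative additions, not part of the original question.

```python
import re

import pandas as pd

# Hypothetical data, including mixed case and a substring match.
df = pd.DataFrame({"a": ["orange", "apple", "a", "Apple pie", "orange.juice"]})

categories = ["orange", "apple"]  # hypothetical list of wanted labels

# re.escape guards against regex metacharacters inside category names.
pattern = "(" + "|".join(re.escape(c) for c in categories) + ")"

df["new"] = (
    df["a"]
    .str.extract(pattern, flags=re.IGNORECASE, expand=False)  # Series, NaN where no match
    .str.lower()                                              # normalise 'Apple' to 'apple'
    .fillna("others")                                         # label the non-matching rows
)

print(df)
```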

1 Comment

I want the result to be one of the 3 possibilities: 'apple', 'orange' or 'others'.
