Python - Lambda function on multiple columns

Question

I have a pandas DataFrame, and I would prefer to solve my problem with a lambda function rather than a loop.

The problem is as follows:

df = pd.DataFrame({'my_fruits':['fruit', 'fruit', 'fruit', 'fruit', 'fruit'],
         'fruit_a': ['apple', 'banana', 'vegetable', 'vegetable', 'cherry'],
         'fruit_b': ['vegetable', 'apple', 'vegeatble', 'pineapple', 'pear']})

If I apply the following loop:

for i in np.arange(0,len(df)):
    if df['fruit_a'][i] == 'vegetable' or df['fruit_b'][i] == 'vegetable':
        df['my_fruits'][i] = 'not_fruit'

I get the result that I want: if either the fruit_a or the fruit_b column contains the value vegetable, the my_fruits column should be set to not_fruit.

How can I set this up with a lambda function? I was not able to understand how inputs from two columns can be used to change the values of a different column. Thanks!

I don't get the question. A lambda expression is simply an alternative syntax for defining a function in the special case where the function body consists of only return <expression>. A function is not an alternative to a for loop. The alternative to certain special cases of a for loop is a comprehension, but your loop is not such a special case.
– Terry Jan Reedy, Commented Jan 19, 2017 at 21:07

jezrael · Accepted Answer · 2017-01-19 20:59:47Z

3

You can use Series.mask with a boolean mask:

mask = (df['fruit_a'] == 'vegetable') | (df['fruit_b'] == 'vegetable')
print (mask)
0     True
1    False
2     True
3     True
4    False
dtype: bool


df.my_fruits = df.my_fruits.mask(mask, 'not_fruit')
print (df)
     fruit_a    fruit_b  my_fruits
0      apple  vegetable  not_fruit
1     banana      apple      fruit
2  vegetable  vegetable  not_fruit
3  vegetable  pineapple  not_fruit
4     cherry       pear      fruit

Another way to build the mask is to compare all selected columns with 'vegetable' and then use any to flag the rows where at least one column is True:

print ((df[['fruit_a', 'fruit_b']] == 'vegetable'))
  fruit_a fruit_b
0   False    True
1   False   False
2    True    True
3    True   False
4   False   False

mask = (df[['fruit_a', 'fruit_b']] == 'vegetable').any(axis=1) 
print (mask)
0     True
1    False
2     True
3     True
4    False
dtype: bool
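
Putting the two mask constructions side by side, here is a minimal self-contained sketch. It assumes pandas is imported as pd and uses the DataFrame from the question verbatim (including the 'vegeatble' typo in fruit_b); the names cols, mask_or, mask_any and result are only illustrative.

```python
import pandas as pd

# DataFrame copied from the question (the 'vegeatble' typo is kept as posted)
df = pd.DataFrame({
    'my_fruits': ['fruit', 'fruit', 'fruit', 'fruit', 'fruit'],
    'fruit_a': ['apple', 'banana', 'vegetable', 'vegetable', 'cherry'],
    'fruit_b': ['vegetable', 'apple', 'vegeatble', 'pineapple', 'pear'],
})

cols = ['fruit_a', 'fruit_b']  # illustrative name: the columns to check

# Explicit OR of per-column comparisons
mask_or = (df['fruit_a'] == 'vegetable') | (df['fruit_b'] == 'vegetable')

# Compare all selected columns at once, then reduce row-wise with any
mask_any = (df[cols] == 'vegetable').any(axis=1)

# Replace values where the mask is True, keep the rest unchanged
df['my_fruits'] = df['my_fruits'].mask(mask_any, 'not_fruit')
result = df['my_fruits'].tolist()
```

The any(axis=1) form scales to more columns simply by extending the cols list.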

answered Jan 19, 2017 at 20:59


2 Comments

jim mako Over a year ago

Much appreciated for the alternative method

jezrael Over a year ago

Thank you for accepting. Yes, the other method is better if there are many columns.

roman · Answer · 2017-01-19 21:10:47Z

3

You can do this with the apply method:

>>> df.my_fruits = df.apply(lambda x: 'not_fruit' if x['fruit_a'] == 'vegetable' or x['fruit_b'] == 'vegetable' else x['my_fruits'], axis=1)
>>> df.my_fruits
0    not_fruit
1        fruit
2    not_fruit
3    not_fruit
4        fruit
Name: my_fruits, dtype: object

Or you can assign through a boolean mask with .loc, which avoids chained assignment:

>>> df.loc[(df['fruit_a'] == 'vegetable') | (df['fruit_b'] == 'vegetable'), 'my_fruits'] = 'not_fruit'
>>> df
     fruit_a    fruit_b  my_fruits
0      apple  vegetable  not_fruit
1     banana      apple      fruit
2  vegetable  vegeatble  not_fruit
3  vegetable  pineapple  not_fruit
4     cherry       pear      fruit
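
If more than two columns need checking, the lambda can test a list of columns instead of naming each one. The following is a rough sketch rather than a definitive implementation: it assumes the DataFrame from the question, and cols, by_apply and by_loc are illustrative names. It also checks that the row-wise apply agrees with the vectorized .loc assignment.

```python
import pandas as pd

# DataFrame copied from the question (the 'vegeatble' typo is kept as posted)
df = pd.DataFrame({
    'my_fruits': ['fruit', 'fruit', 'fruit', 'fruit', 'fruit'],
    'fruit_a': ['apple', 'banana', 'vegetable', 'vegetable', 'cherry'],
    'fruit_b': ['vegetable', 'apple', 'vegeatble', 'pineapple', 'pear'],
})

cols = ['fruit_a', 'fruit_b']  # illustrative name: the columns to check

# Row-wise lambda: each row arrives as a Series, so row[cols] selects
# the values to test and .any() reduces them to a single boolean
by_apply = df.apply(
    lambda row: 'not_fruit' if (row[cols] == 'vegetable').any() else row['my_fruits'],
    axis=1,
)

# Vectorized equivalent: boolean mask plus .loc assignment on a copy
by_loc = df['my_fruits'].copy()
by_loc.loc[(df[cols] == 'vegetable').any(axis=1)] = 'not_fruit'

same = by_apply.tolist() == by_loc.tolist()
```

apply with axis=1 calls the lambda once per row in Python, so the vectorized form is usually the faster choice on large frames.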

edited Jan 19, 2017 at 21:10

answered Jan 19, 2017 at 21:08


3 Comments

roman Over a year ago

agreed, just wanted to show how it could be done with lambda function

jezrael Over a year ago

Sure, alternative solution is better.

jim mako Over a year ago

Thanks, this at least shows me how it could be done with apply. Thanks

piRSquared · Answer · 2017-01-19 22:13:04Z

2

This uses pd.Series.where and checks for 'vegetable' in one step, combined with any.
where is the opposite of mask, which is why I use the negation of cond.
Otherwise, this is very similar in spirit to jezrael's answer.

cond = df[['fruit_a', 'fruit_b']].eq('vegetable').any(axis=1)
df.my_fruits = df.my_fruits.where(~cond, 'not_fruit')
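
To make the where/mask relationship concrete, here is a small sketch assuming the DataFrame from the question; via_where and via_mask are illustrative names. where keeps values where the condition is True and replaces the rest, while mask replaces values where the condition is True, so negating the condition makes the two calls interchangeable.

```python
import pandas as pd

# DataFrame copied from the question (the 'vegeatble' typo is kept as posted)
df = pd.DataFrame({
    'my_fruits': ['fruit', 'fruit', 'fruit', 'fruit', 'fruit'],
    'fruit_a': ['apple', 'banana', 'vegetable', 'vegetable', 'cherry'],
    'fruit_b': ['vegetable', 'apple', 'vegeatble', 'pineapple', 'pear'],
})

cond = df[['fruit_a', 'fruit_b']].eq('vegetable').any(axis=1)

# where: keep the original value where ~cond is True, else use 'not_fruit'
via_where = df['my_fruits'].where(~cond, 'not_fruit')

# mask: replace the original value where cond is True
via_mask = df['my_fruits'].mask(cond, 'not_fruit')

identical = via_where.equals(via_mask)
```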

Answered from my phone. Please forgive typos.

edited Jan 19, 2017 at 22:13

answered Jan 19, 2017 at 22:07
