Python apply lambda with multiple columns

Question

Can anyone tell me why this doesn't work and how to fix it?

I'm trying to use a lambda function to choose the value of a column based on a condition on another column.

import pandas as pd

df = pd.DataFrame({'A': [4, 8, 2, 7, 4],
                   'B': [8, 10, 3, 4, 1],
                   'C': [10, 8, 2, 6, 2]})

df

`df.apply(lambda x: x['B'] if x['A'].isin([1,2,3,4,5]) else x['C'])`

KeyError                                  Traceback (most recent call last)
c:\xxxxxx\xxxxxx\xxx Cellule 19 in <cell line: 1>()
----> 1 df.apply(lambda x: x['B'] if x['A'].isin([1,2,3,4,5]) else x['C'])

File c:\Anaconda\envs\xxxxx\xxxx.py:8839, in DataFrame.apply(self, func, axis, raw, result_type, args, **kwargs)
   8828 from pandas.core.apply import frame_apply
   8830 op = frame_apply(
   8831     self,
   8832     func=func,
   (...)
   8837     kwargs=kwargs,
   8838 )
-> 8839 return op.apply().__finalize__(self, method="apply")

File c:\Anaconda\xxxxxlib\site-packages\pandas\core\apply.py:727, in FrameApply.apply(self)
    724 elif self.raw:
    725     return self.apply_raw()
--> 727 return self.apply_standard()

File c:\Anaconda\envs\xxxx\pandas\core\apply.py:851, in FrameApply.apply_standard(self)
    850 def apply_standard(self):
--> 851     results, res_index = self.apply_series_generator()
    853     # wrap results
    854     return self.wrap_results(results, res_index)
...
    388     self._check_indexing_error(key)
--> 389     raise KeyError(key)
    390 return super().get_loc(key, method=method, tolerance=tolerance)

KeyError: 'A'

Naveed · Accepted Answer · 2022-10-25 14:57:48Z

1

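You need to pass the axis=1 argument so that the function is applied to each row rather than to each column. Refer to the DataFrame.apply documentation. Note also that inside the lambda x['A'] is a single value, not a Series, so it has no .isin method; use the in operator instead:

df.apply(lambda x: x['B'] if x['A'] in [1,2,3,4,5] else x['C'], axis=1)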
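To see why the original call raised KeyError: 'A', here is a minimal sketch using the question's df. It assumes the default axis=0 behaviour, where the function receives each column as a Series indexed by row labels, so a label such as 'B' cannot be found. The names row_labels, col_labels, failed and picked are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': [4, 8, 2, 7, 4],
                   'B': [8, 10, 3, 4, 1],
                   'C': [10, 8, 2, 6, 2]})

# With axis=0 (the default) the function sees one column at a time,
# and that column is indexed by the row labels.
row_labels = df.iloc[:, 0].index.tolist()

# With axis=1 the function sees one row at a time,
# and that row is indexed by the column names.
col_labels = df.iloc[0].index.tolist()

# Looking up a column name on a column therefore fails.
failed = False
try:
    df.apply(lambda x: x['B'])
except KeyError:
    failed = True

# Row-wise version; x['A'] is a scalar here, so `in` is used instead of .isin.
picked = df.apply(lambda x: x['B'] if x['A'] in [1, 2, 3, 4, 5] else x['C'], axis=1)
print(picked.tolist())
```

Printing row_labels and col_labels side by side is a quick way to check which axis a function is actually receiving.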


1 Comment

Pierre098 Over a year ago

Thank you very much, it works with: df.apply(lambda x: x['B'] if x['A'] in [1,2,3,4,5] else x['C'], axis=1)

mozway · Accepted Answer · 2022-10-25 15:07:58Z

1

Do not use apply here; it wastes pandas' vectorized capabilities.

Use instead:

df['new'] = df['B'].where(df['A'].isin([1,2,3,4,5]), df['C'])

# or
df['new'] = df['B'].where(df['A'].between(1, 5, inclusive='both'), df['C'])

Or with numpy:

import numpy as np
df['new'] = np.where(df['A'].isin([1,2,3,4,5]), df['B'], df['C'])

output:

   A   B   C  new
0  4   8  10    8
1  8  10   8    8
2  2   3   2    3
3  7   4   6    6
4  4   1   2    1
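As a quick sanity check, the sketch below compares the vectorized variants with a row-wise apply on the same sample data. The variable names (mask, via_where, via_numpy, via_apply) are only illustrative, and the comparison covers this small frame, not a general proof of equivalence.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [4, 8, 2, 7, 4],
                   'B': [8, 10, 3, 4, 1],
                   'C': [10, 8, 2, 6, 2]})

# Boolean mask computed once for the whole column.
mask = df['A'].isin([1, 2, 3, 4, 5])

# Series.where keeps B where the mask is True and takes C elsewhere.
via_where = df['B'].where(mask, df['C'])

# np.where returns a plain array, so it is wrapped in a Series to keep the index.
via_numpy = pd.Series(np.where(mask, df['B'], df['C']), index=df.index)

# Row-wise apply, for comparison only.
via_apply = df.apply(lambda x: x['B'] if x['A'] in [1, 2, 3, 4, 5] else x['C'], axis=1)

print(via_where.tolist() == via_numpy.tolist() == via_apply.tolist())
```

All three give the same values here; the vectorized forms avoid calling a Python function once per row.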


1 Comment

manwong0606 Over a year ago

I like this method of avoiding apply. As I asked in another post you answered: if the new column should hold the column name of the second-largest value in each row (a similar case, where the result depends on multiple columns), is that possible with this where function?
