I am creating a sample dataframe:

tp = pd.DataFrame({'source':['a','s','f'], 
                   'target':['b','n','m'], 
                   'count':[0,8,4]})

I then want to create a column 'col' based on a condition on the 'target' column: where the condition matches, 'col' should take the value of 'source'; otherwise it should take a default value, as below:

tp['col'] = tp.apply(lambda row:row['source'] if row['target'] in ['b','n'] else 'x')

But it's throwing me this error: KeyError: ('target', 'occurred at index count')

How can I make it work, without defining a function?

1 Answer

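You need to pass axis=1 to tell pandas to apply the function to each row. The default is axis=0, which passes each column to the function instead. A column is indexed by row labels (0, 1, 2), not by column names, so row['target'] raises the KeyError.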

tp['col'] = tp.apply(lambda row: row['source'] if row['target'] in ['b', 'n'] else 'x',
                     axis=1)
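To see why axis matters, here is a minimal sketch that inspects what the lambda would receive in each case. It rebuilds the dataframe from the question; the names first_column and first_row are only illustrative.

```python
import pandas as pd

tp = pd.DataFrame({'source': ['a', 's', 'f'],
                   'target': ['b', 'n', 'm'],
                   'count': [0, 8, 4]})

# With axis=0 (the default), apply passes one column at a time.
# A column is a Series indexed by the row labels.
first_column = tp.iloc[:, 0]
print(list(first_column.index))  # [0, 1, 2]

# With axis=1, apply passes one row at a time.
# A row is a Series indexed by the column names, so row['target'] works.
first_row = tp.iloc[0]
print(list(first_row.index))  # ['source', 'target', 'count']

tp['col'] = tp.apply(
    lambda row: row['source'] if row['target'] in ['b', 'n'] else 'x',
    axis=1)
print(tp['col'].tolist())  # ['a', 's', 'x']
```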

However, for this specific task, you should use vectorised operations. For example, using numpy.where:

tp['col'] = np.where(tp['target'].isin(['b', 'n']), tp['source'], 'x')

pd.Series.isin returns a Boolean series which tells numpy.where whether to select the second or third argument.
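As a self-contained sketch of the vectorised approach, the Boolean mask can be built first and then passed to numpy.where. The variable names mask and alt are only illustrative, and the pd.Series.where line is an alternative that is not part of the original answer.

```python
import numpy as np
import pandas as pd

tp = pd.DataFrame({'source': ['a', 's', 'f'],
                   'target': ['b', 'n', 'm'],
                   'count': [0, 8, 4]})

# True where 'target' is one of the accepted values
mask = tp['target'].isin(['b', 'n'])
print(mask.tolist())  # [True, True, False]

# Take 'source' where the mask is True, the default 'x' elsewhere
tp['col'] = np.where(mask, tp['source'], 'x')
print(tp['col'].tolist())  # ['a', 's', 'x']

# Alternative: Series.where keeps values where the mask is True
# and replaces the rest with the given default
alt = tp['source'].where(mask, 'x')
print(alt.tolist())  # ['a', 's', 'x']
```

Both forms avoid calling a Python function once per row, which is the main reason to prefer them over apply on larger dataframes.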
