python dataframe column apply a function [duplicate]

Question

I have a dataframe

import pandas as pd
data = {'A': ['SA01', '0007', 'SA06', '0198', 'SA06'],
        'B': [2012, 2012, 2013, 2014, 2014]}
df = pd.DataFrame(data)

df = A     B
     SA01  2012
     0007  2012
     SA06  2013
     0198  2014
     SA06  2014

I want to use df.apply or another pandas function to add a column df['C'] as follows:

df = A     B     C
     SA01  2012  M
     0007  2012  F
     SA06  2013  M
     0198  2014  F
     SA06  2014  M

If df['A'] contains the substring 'SA', then df['C'] should be 'M', otherwise 'F'. How can I do this?

jezrael · Accepted Answer · 2018-09-12 12:33:57Z

Use numpy.where with a boolean mask created by str.contains or str.startswith:

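import numpy as np

# 'M' where column A contains 'SA' anywhere in the string, otherwise 'F'
df['new'] = np.where(df['A'].str.contains('SA'), 'M', 'F')
# alternative: match 'SA' only at the start of the string
# df['new'] = np.where(df['A'].str.startswith('SA'), 'M', 'F')
print(df)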
      A     B new
0  SA01  2012   M
1  0007  2012   F
2  SA06  2013   M
3  0198  2014   F
4  SA06  2014   M
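
Since the question asks for a column named 'C' and mentions df.apply, here is a self-contained sketch that does both. It is not part of the original answer: passing regex=False (so 'SA' is treated as a literal substring) is my own choice, and the column name 'C_apply' is hypothetical, used only to compare the two approaches side by side.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': ['SA01', '0007', 'SA06', '0198', 'SA06'],
    'B': [2012, 2012, 2013, 2014, 2014],
})

# Vectorized: build a boolean mask with a literal (non-regex) substring test,
# then pick 'M' where the mask is True and 'F' elsewhere.
mask = df['A'].str.contains('SA', regex=False)
df['C'] = np.where(mask, 'M', 'F')

# Element-wise equivalent with Series.apply (hypothetical column 'C_apply').
# It calls a Python function per value, so it is usually slower on large frames.
df['C_apply'] = df['A'].apply(lambda s: 'M' if 'SA' in s else 'F')

print(df)
```

Both columns should come out identical for this data. The np.where version is generally preferable for large frames; the apply version assumes every value in A is a string and would fail on missing values.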
