3

Input:

import pandas as pd 
data = [['tom', 'Delhi', 'Jaipur'], ['nick', 'Delhi', 'Delhi'], ['juli', '', 'Noida'], ['rob', 'Gurugram', ''], ['dan', '', '']] 
df = pd.DataFrame(data, columns = ['Name', 'City1', 'City2']) 

   Name     City1   City2
0   tom     Delhi  Jaipur
1  nick     Delhi   Delhi
2  juli             Noida
3   rob  Gurugram
4   dan

Expected Output: If the values are the same, take either one; if not, take any non-null value if possible

   Name      City
0   tom     Delhi
1  nick     Delhi
2  juli     Noida
3   rob  Gurugram
4   dan

I looked through existing questions on merging columns, but they didn't help in my case.

1
  • Are the empty values missing values? Like data = [['tom', 'Delhi', 'Jaipur'], ['nick', 'Delhi', 'Delhi'], ['juli', np.nan, 'Noida'], ['rob', 'Gurugram', np.nan], ['dan', np.nan, np.nan]] df = pd.DataFrame(data, columns = ['Name', 'City1', 'City2']) ? Commented May 4, 2020 at 7:42

2 Answers

4

If the empty values are empty strings, use numpy.where with DataFrame.pop to extract the columns:

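import numpy as np
# take City2 only where City1 is an empty string; pop removes both source columns
df['City'] = np.where(df['City1'].eq(''), df.pop('City2'), df.pop('City1'))
print (df)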
   Name      City
0   tom     Delhi
1  nick     Delhi
2  juli     Noida
3   rob  Gurugram
4   dan   

If the empty values are NaNs, use DataFrame.pop with Series.fillna:

data = [['tom', 'Delhi', 'Jaipur'],
        ['nick', 'Delhi', 'Delhi'], 
        ['juli', np.nan, 'Noida'], 
        ['rob', 'Gurugram', np.nan],
        ['dan', np.nan, np.nan]] 
df = pd.DataFrame(data, columns = ['Name', 'City1', 'City2'])


df['City'] = df.pop('City1').fillna(df.pop('City2'))
print (df)
   Name      City
0   tom     Delhi
1  nick     Delhi
2  juli     Noida
3   rob  Gurugram
4   dan       NaN 
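
If it is not known in advance whether the blanks are empty strings or NaNs, one option is to normalize them first and then reuse the fillna approach. This is a minimal sketch, not part of the original answer; the mixed sample data is made up for illustration, and it assumes City1 should win whenever both columns have a value:

```python
import numpy as np
import pandas as pd

# Hypothetical sample mixing both kinds of blank: '' and NaN
data = [['tom', 'Delhi', 'Jaipur'],
        ['nick', 'Delhi', 'Delhi'],
        ['juli', '', 'Noida'],
        ['rob', 'Gurugram', np.nan],
        ['dan', np.nan, '']]
df = pd.DataFrame(data, columns=['Name', 'City1', 'City2'])

# Treat empty strings as missing so both kinds of blank behave the same
cities = df[['City1', 'City2']].replace('', np.nan)

# Prefer City1; fall back to City2 only where City1 is missing
df['City'] = cities['City1'].fillna(cities['City2'])
df = df.drop(columns=['City1', 'City2'])
print(df)
```

Rows where both columns are blank end up as NaN in City, the same as in the NaN example above.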

If there may be more than two City columns, replace empty strings with missing values, back fill along each row, and select the first column by position:

df1 = (df.set_index('Name')
         .replace('',np.nan)
         .bfill(axis=1)
         .iloc[:, 0]
         .reset_index(name='City'))
print (df1)
   Name      City
0   tom     Delhi
1  nick     Delhi
2  juli     Noida
3   rob  Gurugram
4   dan       NaN
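
The same back-fill idea can be wrapped in a small helper for any number of City columns. This is only a sketch under assumptions: the helper name first_non_empty, the extra City3 column, and the startswith('City') column selection are all hypothetical and not from the question:

```python
import numpy as np
import pandas as pd

def first_non_empty(frame, columns):
    """Per row, return the first value in `columns` that is neither '' nor NaN."""
    candidates = frame[columns].replace('', np.nan)
    # bfill(axis=1) pulls the next non-missing value leftwards along each row,
    # so the first column ends up holding the first non-missing value
    return candidates.bfill(axis=1).iloc[:, 0]

# Hypothetical data with a third city column
data = [['tom', 'Delhi', 'Jaipur', ''],
        ['juli', '', '', 'Noida'],
        ['dan', '', '', '']]
df = pd.DataFrame(data, columns=['Name', 'City1', 'City2', 'City3'])

# Assumption: every column to merge has a name starting with 'City'
city_cols = [c for c in df.columns if c.startswith('City')]
df['City'] = first_non_empty(df, city_cols)
df = df.drop(columns=city_cols)
print(df)
```

Unlike the set_index chain above, this keeps the original index and leaves any other columns in place.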

0

Try:

df['City'] = df['City1'].replace('',df['City2'])
df.drop(['City1', 'City2'], axis=1, inplace=True)
df
   Name      City
0   tom     Delhi
1  nick     Delhi
2  juli     Noida
3   rob  Gurugram
4   dan
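
A variant that spells out the row-wise condition explicitly is Series.mask, which replaces values where the condition is True with the aligned values from another Series. This is a minimal sketch, assuming the blanks are empty strings as in the question:

```python
import pandas as pd

data = [['tom', 'Delhi', 'Jaipur'], ['nick', 'Delhi', 'Delhi'],
        ['juli', '', 'Noida'], ['rob', 'Gurugram', ''], ['dan', '', '']]
df = pd.DataFrame(data, columns=['Name', 'City1', 'City2'])

# Where City1 is an empty string, take the value from City2 in the same row
df['City'] = df['City1'].mask(df['City1'].eq(''), df['City2'])
df = df.drop(columns=['City1', 'City2'])
print(df)
```

Rows where both columns are empty keep an empty string in City.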
