2

I want to merge two CSV files that share a key column, but the column has a different header name in each file.

a.csv:

id name country
1 Cyrus MY
2 May US

b.csv:

user_id  gender 
1 female
2 male

What I need is, c.csv:

id name country gender
1 Cyrus MY female
2 May US male

But that is not the result I get when I use the code below:

import csv
import pandas as pd

df1 = pd.read_csv('a.csv')
df2 = pd.read_csv('b.csv')

df3 = pd.merge(df1,df2, left_on=['id'],right_on=['user_id'], how='outer')
df3.to_csv('c.csv',index=False)

The result I get:

id name country user_id gender
1 Cyrus MY 1 female
2 May US 2 male
1
  • I guess df3.drop('user_id', inplace=True, axis=1) would solve your problem. Commented Mar 16, 2018 at 3:57

2 Answers

1

You could rename the user_id column in df2 to id. Once the key has the same name in both frames, you can merge with on='id' and the column appears only once in the result.

df2 = pd.read_csv('b.csv').rename(columns={'user_id': 'id'})
df3 = pd.merge(df1, df2, on='id', how='outer')

Otherwise, you can drop the user_id column after the merge.

df3 = df3.drop('user_id', axis=1)
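To make the rename approach concrete, here is a minimal self-contained sketch. It assumes the files are comma-separated and uses in-memory strings as stand-ins for a.csv and b.csv, so the sample data below is only a reconstruction of the tables in the question.

```python
import io
import pandas as pd

# Inline stand-ins for a.csv and b.csv (assumed to be comma-separated)
a_csv = io.StringIO("id,name,country\n1,Cyrus,MY\n2,May,US\n")
b_csv = io.StringIO("user_id,gender\n1,female\n2,male\n")

df1 = pd.read_csv(a_csv)
df2 = pd.read_csv(b_csv)

# Rename the key so both frames share it, then merge on that single column
merged = df1.merge(df2.rename(columns={"user_id": "id"}), on="id", how="outer")

print(merged.columns.tolist())  # ['id', 'name', 'country', 'gender']
```

With real files you would pass the file paths to pd.read_csv and finish with merged.to_csv('c.csv', index=False), as in the question.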

1

You can do this with merge:

df1.merge(df2,left_on='id',right_on='user_id')
Out[35]: 
   id   name country  user_id  gender
0   1  Cyrus      MY        1  female
1   2    May      US        2    male

Or with concat, aligning the two frames on their index:

pd.concat([df1.set_index('id'), df2.set_index('user_id')], axis=1).reset_index()
Out[38]: 
   index   name country  gender
0      1  Cyrus      MY  female
1      2    May      US    male
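Note that the concat output above names the key column index rather than id, because the two indexes had different names. One possible way to restore the name is rename_axis before reset_index. This is a sketch with inline sample data assumed to match the question's files.

```python
import io
import pandas as pd

# Inline stand-ins for a.csv and b.csv (assumed to be comma-separated)
df1 = pd.read_csv(io.StringIO("id,name,country\n1,Cyrus,MY\n2,May,US\n"))
df2 = pd.read_csv(io.StringIO("user_id,gender\n1,female\n2,male\n"))

combined = (
    # Align both frames on their key by making it the index
    pd.concat([df1.set_index("id"), df2.set_index("user_id")], axis=1)
    # Give the shared index an explicit name so reset_index yields an "id" column
    .rename_axis("id")
    .reset_index()
)

print(combined.columns.tolist())  # ['id', 'name', 'country', 'gender']
```

Keep in mind that concat aligns on index values, so this only behaves like a merge when the key is unique in both files.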
