I have two dataframes, one with Site_ID and Town,

Site_ID Town
1235    Fitzroy
2344    Glen Iris

another with site_id and business name.

Site_ID Business Name
1235    BAC
2344    RFG

I would like to keep only the matched records when joining the two dataframes. After running the merge below,

merge_df_rf1 = pd.merge(df1.drop_duplicates(), df2, on='Site_ID', how='inner')

I am getting this output.

Site_ID Business Name   Town
1235    BAC             Fitzroy
1235    BAC             Fitzroy
2344    RFG             Glen Iris
2344    RFG             Glen Iris

Not sure where I am going wrong with my join statement.

Any help on this will be highly appreciated.

Thank you in advance for the support!

  • I cannot reproduce this behaviour with the provided DataFrames. Your code does not produce duplicates on my end. Commented Aug 27, 2021 at 3:01

1 Answer

Try specifying only the on argument and dropping duplicates after the merge:

>>> df1.merge(df2, on='Site_ID').drop_duplicates()
   Site_ID       Town Business Name
0     1235    Fitzroy           BAC
1     2344  Glen Iris           RFG
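
If duplicates still appear, one possible cause is that the join key repeats in df2, since the question only deduplicates df1. The sketch below assumes that situation: the repeated rows in df2 are hypothetical data, not something shown in the question. It lists the repeated keys, deduplicates both frames on Site_ID, and uses the validate argument of merge so pandas raises a MergeError if either side still has repeated keys.

```python
import pandas as pd

# Hypothetical data: df2 repeats each Site_ID, which the question's tables do not show.
df1 = pd.DataFrame({"Site_ID": [1235, 2344], "Town": ["Fitzroy", "Glen Iris"]})
df2 = pd.DataFrame({
    "Site_ID": [1235, 1235, 2344, 2344],
    "Business Name": ["BAC", "BAC", "RFG", "RFG"],
})

# Find the keys that occur more than once in df2; each repeat multiplies rows in the merge.
dup_keys_df2 = df2.loc[df2["Site_ID"].duplicated(keep=False), "Site_ID"].unique()
print("Repeated keys in df2:", list(dup_keys_df2))

# Deduplicate both frames on the join key (keeps the first row per Site_ID).
left = df1.drop_duplicates(subset="Site_ID")
right = df2.drop_duplicates(subset="Site_ID")

# validate="one_to_one" raises pandas.errors.MergeError if a key still repeats on either side.
merged = left.merge(right, on="Site_ID", how="inner", validate="one_to_one")
print(merged)
```

Note that drop_duplicates() without subset only removes rows that are identical in every column, so rows that differ by something like trailing whitespace in a name would survive it. Passing subset="Site_ID" keeps one row per key regardless, which may or may not be what you want if the rows genuinely differ.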

1 Comment

Sorry about the late update on this. I did try the above, but I am still getting duplicates.
