
I have a df1 as:

[image: df1]

SUBJECT_ID contains many duplicate values, as shown in the picture. I have a df2 that I want to merge in, but only on the unique SUBJECT_ID values. So far I only know how to merge on the whole SUBJECT_ID column, with this code:

df1 = pd.merge(df1, df2[['SUBJECT_ID', 'VALUE']], on='SUBJECT_ID', how='left')

But this merges on every row of SUBJECT_ID, duplicates included. I only need the unique SUBJECT_ID values. Please help me with this.

2 Answers

I think you will find your answer in the documentation for merge.

It's not fully clear what you want, but here are some examples that may contain the answer you are looking for:

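import pandas as pd
from IPython.display import display  # built in to Jupyter; import needed elsewhere

# df1 is the frame from the question, with repeated SUBJECT_ID values
df1 = pd.read_csv('temp.csv')
display(df1)

# Small example df2 with one row per SUBJECT_ID
SUBJECT_ID = [31, 32, 33]
something_interesting = ['cat', 'dog', 'fish']
df2 = pd.DataFrame({'SUBJECT_ID': SUBJECT_ID,
                    'something_interesting': something_interesting})
display(df2)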


# Outer join: keep every SUBJECT_ID that appears in either frame
df_keep_all = df1.merge(df2, on='SUBJECT_ID', how='outer')
display(df_keep_all)


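# Inner join: keep only rows whose SUBJECT_ID appears in both frames
# (use how='left' instead to keep every row of df1, matched or not)
df_keep_df1 = df1.merge(df2, on='SUBJECT_ID', how='inner')
display(df_keep_df1)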
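
To make the difference between the how options concrete, here is a small sketch. The frames left and right are made-up stand-ins, not the data from the question, and the IDs are chosen so that each frame has one SUBJECT_ID the other lacks.

```python
import pandas as pd

# Hypothetical frames: 30 exists only in left, 33 only in right, 31 repeats in left
left = pd.DataFrame({'SUBJECT_ID': [30, 31, 31, 32]})
right = pd.DataFrame({'SUBJECT_ID': [31, 32, 33],
                      'something_interesting': ['cat', 'dog', 'fish']})

# Count the rows each join type produces
row_counts = {}
for how in ('left', 'inner', 'outer'):
    merged = left.merge(right, on='SUBJECT_ID', how=how)
    row_counts[how] = len(merged)

print(row_counts)
```

The left join keeps all four rows of left, the inner join drops the unmatched 30, and the outer join adds a row for the unmatched 33. In every case the repeated 31 stays repeated, because merge matches rows and does not deduplicate them.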


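# drop_duplicates() with no arguments removes only fully identical rows;
# pass subset=['SUBJECT_ID'] to keep a single row per SUBJECT_ID
df_thinned = pd.merge(df1.drop_duplicates(), df2, on='SUBJECT_ID', how='inner')
display(df_thinned)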
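
Another possible reading of the question is that every row of df1 should stay, but VALUE should be attached only to the first row of each SUBJECT_ID. That is my guess at the intent, not something the question states. A minimal sketch under that assumption, using made-up frames (the ROW column and all values are hypothetical):

```python
import pandas as pd

# Hypothetical stand-ins for df1 and df2; the real data is not shown in the question
df1 = pd.DataFrame({'SUBJECT_ID': [31, 31, 32, 33, 33, 33],
                    'ROW': [1, 2, 3, 4, 5, 6]})
df2 = pd.DataFrame({'SUBJECT_ID': [31, 32, 33],
                    'VALUE': ['cat', 'dog', 'fish']})

# A left merge keeps every row of df1 and repeats VALUE on each duplicate SUBJECT_ID
merged = df1.merge(df2[['SUBJECT_ID', 'VALUE']], on='SUBJECT_ID', how='left')

# Mark every row after the first occurrence of its SUBJECT_ID, then blank VALUE there
repeat_rows = merged.duplicated(subset='SUBJECT_ID', keep='first')
merged.loc[repeat_rows, 'VALUE'] = pd.NA

print(merged)
```

Here duplicated(keep='first') flags the second and later rows of each SUBJECT_ID, so only the first row of each group keeps its merged VALUE.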



You can use the pandas drop_duplicates function for this. It removes duplicate rows, judged on one column or on several columns.

df2 = df.drop_duplicates(subset=['SUBJECT_ID'])
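
As a rough sketch of how this could fit the merge from the question: deduplicate the lookup frame on SUBJECT_ID first, then left-merge it onto df1. The data below is invented, and keeping the first VALUE per SUBJECT_ID is my assumption about which duplicate should win. The validate argument is optional; it asks merge to raise an error if the right-hand keys are not unique.

```python
import pandas as pd

# Hypothetical data: SUBJECT_ID 31 is repeated in both frames
df1 = pd.DataFrame({'SUBJECT_ID': [31, 31, 32, 33]})
df2 = pd.DataFrame({'SUBJECT_ID': [31, 31, 32, 33],
                    'VALUE': [1.0, 2.0, 3.0, 4.0]})

# Merging without deduplicating multiplies the repeated keys (2 x 2 rows for 31)
naive = df1.merge(df2[['SUBJECT_ID', 'VALUE']], on='SUBJECT_ID', how='left')

# Keep one row per SUBJECT_ID in the lookup frame (assumption: the first one wins)
df2_unique = df2.drop_duplicates(subset=['SUBJECT_ID'], keep='first')

# validate='many_to_one' makes merge check that the right-hand keys are unique
result = df1.merge(df2_unique[['SUBJECT_ID', 'VALUE']],
                   on='SUBJECT_ID', how='left', validate='many_to_one')

print(len(naive), len(result))
```

With the deduplicated lookup, result has exactly as many rows as df1, while the naive merge grows because each repeated key in df1 matches each repeated key in df2.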
