
Assume I have two DataFrames:

DF1: DATA1, DATA1, DATA2, DATA2

DF2: DATA2

I want to exclude every row of DF1 whose value appears in DF2, while keeping the duplicates in DF1. What should I do?

Expected result: DATA1, DATA1

2 Answers


Use a left anti join. When you join two DataFrames with a left anti join (left_anti, also spelled leftanti), the result contains only the columns of the left DataFrame, and only the rows that have no match in the right DataFrame.

df3 = df1.join(df2, df1['id']==df2['id'], how='left_anti')

2 Comments

This is awesome. What if I want to keep all the data in DF1 that matches DF2, i.e. an expected result of DATA2, DATA2? I tried switching the order, but that doesn't seem to work; it gives me an empty result: stackoverflow.com/questions/64311964/…
If you want the matching records, do an inner join.

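df1.except(df2) will give you the rows that are present in df1 but not in df2. That is the Scala name; in PySpark the equivalent method is df1.subtract(df2), because except is a reserved word in Python. Note that it has EXCEPT DISTINCT semantics, so duplicate rows are collapsed in the result: for the data in the question it returns DATA1 once, not twice.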


credits: https://sanori.github.io/2019/08/Compare-Two-Tables-in-SQL/
