I have two data frames df1 and df2 as shown below:
df1:
| | company | occupation |
|---|---|---|
| 0 | A | Administrator |
| 1 | B | Engineer |
| 2 | C | Engineer |
| 3 | D | Account |
| 4 | E | Administrator |
| 5 | F | Engineer |
df2:
| | occupation | description |
|---|---|---|
| 0 | Account | balance |
| 1 | Engineer | database |
| 2 | Administrator | chores |
| 3 | Administrator | calling |
| 4 | Engineer | frontend |
| 5 | Engineer | backendend |
What I want:
| | company | occupation | description |
|---|---|---|---|
| 0 | A | Administrator | chores |
| 1 | B | Engineer | database |
| 2 | C | Engineer | frontend |
| 3 | D | Account | balance |
| 4 | E | Administrator | calling |
| 5 | F | Engineer | backendend |
I tried `pd.merge(df1, df2, how="inner")`, but I always get duplicate rows:
| | company | occupation | description |
|---|---|---|---|
| 0 | A | Administrator | chores |
| 1 | A | Administrator | calling |
| 2 | E | Administrator | chores |
| 3 | E | Administrator | calling |
| 4 | B | Engineer | database |
| 5 | B | Engineer | frontend |
| 6 | B | Engineer | backendend |
| 7 | C | Engineer | database |
| 8 | C | Engineer | frontend |
| 9 | C | Engineer | backendend |
| 10 | F | Engineer | database |
| 11 | F | Engineer | frontend |
| 12 | F | Engineer | backendend |
| 13 | D | Account | balance |
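
As far as I can tell, the 14 rows come from the merge pairing every `df1` row with every `df2` row that shares the same `occupation`, so each occupation contributes (count in `df1`) × (count in `df2`) rows. A small check of that reading, using the same data as in this question (`left_counts`, `right_counts` and `expected_rows` are just illustrative names):

```python
import pandas as pd

df1 = pd.DataFrame({
    "company": ["A", "B", "C", "D", "E", "F"],
    "occupation": ["Administrator", "Engineer", "Engineer",
                   "Account", "Administrator", "Engineer"],
})
df2 = pd.DataFrame({
    "occupation": ["Account", "Engineer", "Administrator",
                   "Administrator", "Engineer", "Engineer"],
    "description": ["balance", "database", "chores",
                    "calling", "frontend", "backendend"],
})

# How many times each occupation appears on each side.
left_counts = df1["occupation"].value_counts()
right_counts = df2["occupation"].value_counts()

# A plain merge on a non-unique key pairs every left row with every
# matching right row, so each occupation contributes
# left_count * right_count rows: 2*2 + 3*3 + 1*1.
expected_rows = int((left_counts * right_counts).sum())

merged = pd.merge(df1, df2, how="inner")
print(expected_rows, len(merged))  # 14 14
```

So the duplicates look like expected many-to-many behaviour rather than a bug; the key `occupation` alone does not say which description belongs to which company.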
Code:
import pandas as pd
df1 = pd.DataFrame({"company":["A","B","C","D","E","F"],"occupation":["Administrator","Engineer","Engineer","Account","Administrator","Engineer"]})
df2 = pd.DataFrame({"occupation":["Account","Engineer","Administrator","Administrator","Engineer","Engineer"],"description":["balance","database","chores","calling","frontend","backendend"]})
df3 = pd.DataFrame({"company":["A","B","C","D","E","F"],"occupation":["Administrator","Engineer","Engineer","Account","Administrator","Engineer"],"description":["chores","database","balance","frontend","calling","backendend"]})
df4 = pd.merge(df1,df2,how="inner")
display(df1)
display(df2)
display(df3)
display(df4)
Do you want to match each occurrence of an occupation in `df1` to the corresponding occurrence in `df2`, e.g. the 1st Engineer is assigned 'database'? If so, your desired output may be inaccurate: `frontend` and `balance` might be swapped.
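
If that occurrence-by-occurrence matching is indeed what is wanted, one possible sketch is to number the rows within each occupation using `groupby(...).cumcount()` and merge on both columns. This assumes the current row order of each frame defines "1st", "2nd", and so on; the helper column name `occurrence` is made up for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({
    "company": ["A", "B", "C", "D", "E", "F"],
    "occupation": ["Administrator", "Engineer", "Engineer",
                   "Account", "Administrator", "Engineer"],
})
df2 = pd.DataFrame({
    "occupation": ["Account", "Engineer", "Administrator",
                   "Administrator", "Engineer", "Engineer"],
    "description": ["balance", "database", "chores",
                    "calling", "frontend", "backendend"],
})

# Hypothetical helper column: 0 for the first row of each occupation,
# 1 for the second, and so on, following each frame's existing row order.
left = df1.assign(occurrence=df1.groupby("occupation").cumcount())
right = df2.assign(occurrence=df2.groupby("occupation").cumcount())

# Merging on (occupation, occurrence) makes the key unique on both sides,
# so every df1 row matches at most one df2 row. how="left" keeps df1's order.
result = (
    left.merge(right, on=["occupation", "occurrence"], how="left")
        .drop(columns="occurrence")
)
print(result)
```

With the data above this gives one row per company, with C getting `frontend` and D getting `balance`. If `df1` has more rows for an occupation than `df2` does, the extra rows would end up with `NaN` in `description`.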