1

If I have two lists or two pandas DataFrames in Python, how do I merge (match / join) them?

For example:

List / DF 1:

Table_Name  Table_Alias
  tab_1          t1
  tab_2          t2
  tab_3          t3

List / DF 2:

Table_Alias   Variable_Name
    t1            Owner
    t1            Owner_Id
    t2            Purchase_date
    t3            Maintenance_cost

Desired Result:

Table_Name   Table_Alias   Variable_Name
   tab_1         t1            Owner
   tab_1         t1            Owner_Id
   tab_2         t2            Purchase_date
   tab_3         t3            Maintenance_cost

NOTE: If I were doing this in R, I'd use something like:

df3 <- merge(df1, df2, by = 'Table_Alias', all.y = T)

What's the best way to do this in Python?

2 Answers

2

You want an 'outer' merge:

In [9]:
df.merge(df1, how='outer')

Out[9]:
  Table_Name Table_Alias     Variable_Name
0      tab_1          t1             Owner
1      tab_1          t1          Owner_Id
2      tab_2          t2     Purchase_date
3      tab_3          t3  Maintenance_cost

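With no on argument, merge joins on the columns that both DataFrames have in common (here Table_Alias). With how='outer' it returns the union of rows from both sides, filling in NaN wherever one side has no match.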
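As a minimal sketch of how the join type changes the result, the frames from the question are rebuilt below under the illustrative names tables and variables, with one extra alias (t4, not in the original data) that has no variables. Note that R's all.y = T keeps every row of the second frame, which corresponds to how='right' in pandas; how='outer' additionally keeps unmatched rows from the first frame.

```python
import pandas as pd

# Frames from the question; tab_4 / t4 is an extra row added for illustration
tables = pd.DataFrame({
    "Table_Name": ["tab_1", "tab_2", "tab_3", "tab_4"],
    "Table_Alias": ["t1", "t2", "t3", "t4"],
})
variables = pd.DataFrame({
    "Table_Alias": ["t1", "t1", "t2", "t3"],
    "Variable_Name": ["Owner", "Owner_Id", "Purchase_date", "Maintenance_cost"],
})

# No on= given, so merge joins on the shared column (Table_Alias).
outer = tables.merge(variables, how="outer")  # keeps t4, Variable_Name is NaN
right = tables.merge(variables, how="right")  # keeps only rows present in variables

print(outer)
print(right)
```

On the original data, where every alias appears in both frames, the two calls return the same four rows, so the choice only matters once one side has unmatched keys.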

-1

I would simply use pd.merge(df1, df2, how='outer', on='talias'):

import pandas as pd

df1 = pd.DataFrame({"table_name": ['tab1', 'tab2', 'tab3'], "talias": ['t1', 't2', 't3']})
df2 = pd.DataFrame({"talias": ['t1', 't1', 't2', 't3'], "vname": ['Owner', 'Owner_Id', 'Purchase_date', 'Maintenance_cost']})

# Join on the shared alias column, keeping rows from both frames
pd.merge(df1, df2, how='outer', on='talias')


Out:
  table_name talias              vname
0       tab1     t1              Owner
1       tab1     t1           Owner_Id
2       tab2     t2      Purchase_date
3       tab3     t3   Maintenance_cost
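The question also mentions plain lists. A rough sketch without pandas, assuming each list holds tuples in the column order shown in the question (the names tables, variables, and alias_to_name are illustrative, not from the original post):

```python
# Each tuple follows the column order shown in the question
tables = [("tab_1", "t1"), ("tab_2", "t2"), ("tab_3", "t3")]
variables = [
    ("t1", "Owner"),
    ("t1", "Owner_Id"),
    ("t2", "Purchase_date"),
    ("t3", "Maintenance_cost"),
]

# Build an alias -> table name lookup from the first list
alias_to_name = {alias: name for name, alias in tables}

# Keep every row of variables; the table name is None if the alias is unknown
merged = [(alias_to_name.get(alias), alias, var) for alias, var in variables]

for row in merged:
    print(row)
```

This behaves like a right join on the alias. For anything beyond a simple lookup, converting the lists to DataFrames and using merge is likely less error-prone.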
