I have two DataFrames that I want to merge on the keys typeA and typeB. The key pair should be treated as unordered: two rows should match whether the values appear in the same order or swapped.

# df_a
  typeA typeB  value
0     b     a      3
1     c     d      4

# df_b
  typeA typeB  value
0     a     b      1
1     c     d      2

pd.merge(df_a,df_b,on=['typeA','typeB'])
  typeA typeB  value_x  value_y
0     c     d        4        2

But the result I want is:

  typeA typeB  value_x  value_y
0     c     d        4        2
1     a     b        3        1

As long as the type pair matches, in either order, I want the rows merged. That is, I want:

   (df_a['typeA'] == df_b['typeA'] and df_a['typeB'] == df_b['typeB']) or (df_a['typeA'] == df_b['typeB'] and df_a['typeB'] == df_b['typeA'])

I thought it could be done by swapping the key column names of df_b, merging a second time, and then combining the two merge results. I am wondering whether there is a more efficient way to solve this problem.
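
For reference, the swap-and-merge-twice idea might look like the sketch below. The frames are rebuilt from the tables in the question; the names `direct`, `df_b_swapped`, `swapped`, and `result` are illustrative only. Rows of df_b whose two keys are equal are excluded from the second pass, on the assumption that they should be matched only once. The matched rows keep df_a's key orientation (b, a), not the (a, b) shown in the desired output.

```python
import pandas as pd

# Frames rebuilt from the tables in the question.
df_a = pd.DataFrame({"typeA": ["b", "c"], "typeB": ["a", "d"], "value": [3, 4]})
df_b = pd.DataFrame({"typeA": ["a", "c"], "typeB": ["b", "d"], "value": [1, 2]})

# First pass: pairs that match in the same order.
direct = pd.merge(df_a, df_b, on=["typeA", "typeB"])

# Second pass: swap the key column names of df_b, then merge again.
df_b_swapped = df_b.rename(columns={"typeA": "typeB", "typeB": "typeA"})
# Skip rows whose keys are equal, since the first pass already matched them.
df_b_swapped = df_b_swapped[df_b_swapped["typeA"] != df_b_swapped["typeB"]]
swapped = pd.merge(df_a, df_b_swapped, on=["typeA", "typeB"])

# Combine both passes into one result.
result = pd.concat([direct, swapped], ignore_index=True)
print(result)
```

This needs two merges and a concatenation, which is the overhead the question hopes to avoid.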

  • What would the output be if instead of c/d in df_a - you had a/d? Commented Nov 25, 2016 at 8:00

1 Answer

One possible solution is to sort the key columns within each row before merging, so that both frames store every pair in the same order:

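# result_type='broadcast' makes apply return a DataFrame with the original
# columns instead of a Series of lists, so the assignment lines up.
df_a[['typeA','typeB']] = df_a[['typeA','typeB']].apply(sorted, axis=1, result_type='broadcast')
df_b[['typeA','typeB']] = df_b[['typeA','typeB']].apply(sorted, axis=1, result_type='broadcast')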
print (df_a)
  typeA typeB  value
0     a     b      3
1     c     d      4

print (df_b)
  typeA typeB  value
0     a     b      1
1     c     d      2

df1 = pd.merge(df_a,df_b,on=['typeA','typeB'])
print (df1)
  typeA typeB  value_x  value_y
0     a     b        3        1
1     c     d        4        2
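
As a self-contained sketch of the same idea, the variant below sorts the key pairs with `numpy.sort` on copies of the frames, so the original df_a and df_b are left untouched. The helper name `merge_unordered` is hypothetical and not part of pandas; the sample frames are rebuilt from the question.

```python
import numpy as np
import pandas as pd


def merge_unordered(left, right, keys=("typeA", "typeB")):
    """Merge on a pair of key columns, ignoring the order within each pair.

    Hypothetical helper for illustration; not part of pandas.
    """
    keys = list(keys)
    # Work on copies so the caller's frames are not modified.
    left, right = left.copy(), right.copy()
    # Sort each row's pair so that (b, a) and (a, b) both become (a, b).
    left[keys] = np.sort(left[keys].to_numpy(), axis=1)
    right[keys] = np.sort(right[keys].to_numpy(), axis=1)
    return pd.merge(left, right, on=keys)


# Frames rebuilt from the tables in the question.
df_a = pd.DataFrame({"typeA": ["b", "c"], "typeB": ["a", "d"], "value": [3, 4]})
df_b = pd.DataFrame({"typeA": ["a", "c"], "typeB": ["b", "d"], "value": [1, 2]})

result = merge_unordered(df_a, df_b)
print(result)
```

A single merge is enough here because both frames end up with each pair in the same canonical order.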

1 Comment

Thank you so much!
