1

I have two dataframes:

>>> df1
[Output]: col1   col2   col3   col4
           a     abc     10    str1
           b     abc     20    str2
           c     def     20    str2
           d     abc     30    str2

>>> df2
[Output]: col1   col2   col3   col5   col6
           d     abc     30    str6    47
           b     abc     20    str5    66
           c     def     20    str7    53
           a     abc     10    str5    21

Below is what I want to generate:

>>> df_merged
[Output]: col1   col2   col5
           a     abc    str5
           b     abc    str5 
           c     def    str7
           d     abc    str6

The result should have exactly 4 rows, but my attempts at merging the dataframes usually produce more. Thanks for the tips!

2
  • I don't quite understand what you're trying to merge on. Just col1? col1 and col2? In your example it actually wouldn't matter. Commented Jul 23, 2019 at 23:32
  • Why merge? I see only a sorted df2 with subsetted columns? Commented Jul 23, 2019 at 23:44

2 Answers

1

Use .merge on only the columns you need from each dataframe, with col1 and col2 as the key columns:

df1[['col1', 'col2']].merge(df2[['col1', 'col2', 'col5']], on=['col1', 'col2'])

  col1 col2  col5
0    a  abc  str5
1    b  abc  str5
2    c  def  str7
3    d  abc  str6
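
The extra rows mentioned in the question typically appear when the merge key is not unique, for example when merging on col2 alone. Below is a minimal, self-contained sketch that rebuilds the two example dataframes from the question and contrasts the two cases. The `how="left"` and `validate="one_to_one"` arguments are additions that are not part of the answer above: `validate` makes pandas raise a `MergeError` instead of silently duplicating rows, assuming the key pair really is unique in both frames.

```python
import pandas as pd

# Example data copied from the question
df1 = pd.DataFrame({
    "col1": ["a", "b", "c", "d"],
    "col2": ["abc", "abc", "def", "abc"],
    "col3": [10, 20, 20, 30],
    "col4": ["str1", "str2", "str2", "str2"],
})
df2 = pd.DataFrame({
    "col1": ["d", "b", "c", "a"],
    "col2": ["abc", "abc", "def", "abc"],
    "col3": [30, 20, 20, 10],
    "col5": ["str6", "str5", "str7", "str5"],
    "col6": [47, 66, 53, 21],
})

# Merging on a non-unique key: every "abc" row in df1 pairs with
# every "abc" row in df2, so the result has more rows than either input.
dup = df1[["col1", "col2"]].merge(df2[["col2", "col5"]], on="col2")
print(len(dup))

# Merging on a key that is unique in both frames keeps one row per key.
# validate="one_to_one" raises pandas.errors.MergeError if that assumption fails.
df_merged = df1[["col1", "col2"]].merge(
    df2[["col1", "col2", "col5"]],
    on=["col1", "col2"],
    how="left",
    validate="one_to_one",
)
print(df_merged)
```

With `how="left"`, the result keeps the row order of df1, so df2 does not need to be sorted first.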

1
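import pandas as pd

df_merged = pd.DataFrame()
df_merged['col1'] = df1['col1']
df_merged['col2'] = df1['col2']
# df2's rows are in a different order, so look up col5 by col1 rather than by position
# (this assumes col1 is unique in df2)
df_merged['col5'] = df1['col1'].map(df2.set_index('col1')['col5'])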

Does that help with what you're looking for?

1 Comment

My actual dataframes are much bigger, which is why I wanted to use `merge()`.
