1

I have Pandas Dataframe df1 as:

ID | c1 | c2 | c3
-----------------
1  | A  | B  | 32
2  | C  | D  | 34
3  | A  | B  | 11
4  | E  | F  | 3

And df2:

ID | c1 | c2
------------
1  | A  | B
2  | C  | D
3  | E  | F

There is foreign key between df1 and df2 on columns (c1, c2). Join look like:

pd.merge(df1, df2, left_on=['c1','c2'], right_on = ['c1','c2'])

Result is:

ID_x| c1 | c2 | c3 | ID_y
-------------------------
1   | A  | B  | 32 | 1
2   | C  | D  | 34 | 2
3   | A  | B  | 11 | 1
4   | E  | F  | 3  | 3

I want to replace (c1,c2) in df1 with df2.id. Expected FINAL df1 is:

ID| c3 | df2_id
---------------
1 | 32 | 1
2 | 34 | 2
3 | 11 | 1
4 | 3  | 3

In other words I want to add column 'df2_id' in df1(filled with df2.id value for this row) and drop columns (c1,c2)(they are not necessary anymore).

I have idea to do that by:

  1. save result from merge in df1
  2. drop unnecessary columns (c1,c2)
  3. rename 'ID_y' to 'df2_id' and 'ID_x' to 'ID'

Is there any better solution?

1 Answer 1

2

We could make a one liner out of your steps by making use of suffixes argument and on instead of left_on, right_on plus using method chaining with drop:

df1.merge(df2, on=['c1','c2'], suffixes=['_1', '_2']).drop(['c1', 'c2'], axis=1)

Output

   ID_1  c3  ID_2
0     1  32     1
1     3  11     1
2     2  34     2
3     4   3     3

To make it exactly like OP's output:

df1.merge(df2, on=['c1','c2'], suffixes=['', '_2']).drop(['c1', 'c2'], axis=1).rename(columns={"id_2": "df2_id"})
Sign up to request clarification or add additional context in comments.

2 Comments

Wow, cool! :) To be perfect I would only change suffixes=['_1', '_2'] to suffixes=['', '_2'] and add in the end .rename(columns={"id_2": "df2_id"})

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.