1

I have a first pandas.DataFrame:

        first_key  second_key
0               0           1
1               0           1
2               0           2
3               0           3
4               0           3

and also the second pandas.DataFrame

              key      status
0               1       'good'
1               2        'bad'
2               3       'good'

And I want to get the following pandas.DataFrame

        first_key  second_key  status
0               0           1  'good'
1               0           1  'good'
2               0           2   'bad'
3               0           3  'good'
4               0           3  'good'

How can I do this?

4 Answers

3

Use map with a Series created from the second DataFrame (here df is the first DataFrame and df1 is the second):

df['status'] = df['second_key'].map(df1.set_index('key')['status'])
print(df)
   first_key  second_key  status
0          0           1  'good'
1          0           1  'good'
2          0           2   'bad'
3          0           3  'good'
4          0           3  'good'
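
For reference, a self-contained sketch of the same map approach. The names first, second and lookup are chosen here for illustration, and the sketch assumes that key is unique in the second frame:

```python
import pandas as pd

# Rebuild the two frames from the question (variable names are illustrative).
first = pd.DataFrame({"first_key": [0, 0, 0, 0, 0],
                      "second_key": [1, 1, 2, 3, 3]})
second = pd.DataFrame({"key": [1, 2, 3],
                       "status": ["good", "bad", "good"]})

# Build a lookup Series indexed by key, then translate each second_key through it.
lookup = second.set_index("key")["status"]
first["status"] = first["second_key"].map(lookup)

print(first)
```

Any second_key value that has no match in lookup ends up as NaN in the new column.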

3

pd.DataFrame.join

df1.join(df2.set_index('key'), on='second_key')

   first_key  second_key  status
0          0           1  'good'
1          0           1  'good'
2          0           2   'bad'
3          0           3  'good'
4          0           3  'good'
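
A runnable sketch of the join call above, assuming df1 and df2 hold the data from the question. Note that join returns a new frame and leaves df1 unchanged; the name result is introduced here for illustration:

```python
import pandas as pd

df1 = pd.DataFrame({"first_key": [0, 0, 0, 0, 0],
                    "second_key": [1, 1, 2, 3, 3]})
df2 = pd.DataFrame({"key": [1, 2, 3],
                    "status": ["good", "bad", "good"]})

# Match df1's second_key column against df2's index (set to key).
result = df1.join(df2.set_index("key"), on="second_key")

print(result)
```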


2

You can also use the merge() method:

In [75]: d1.merge(d2.rename(columns={'key':'second_key'}))
Out[75]:
   first_key  second_key  status
0          0           1  'good'
1          0           1  'good'
2          0           2   'bad'
3          0           3  'good'
4          0           3  'good'

The .map() method shown in @jezrael's answer is preferable (and more efficient) if you want to add a single column. Use the .merge() method if you need to add multiple columns.
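
To illustrate the multiple-column case, here is a minimal sketch. The score column is hypothetical (it is not in the question's data), and how="left" is an added choice so that every row of the first frame is kept:

```python
import pandas as pd

d1 = pd.DataFrame({"first_key": [0, 0, 0, 0, 0],
                   "second_key": [1, 1, 2, 3, 3]})
# "score" is a made-up extra column, added only to show several columns coming across.
d2 = pd.DataFrame({"key": [1, 2, 3],
                   "status": ["good", "bad", "good"],
                   "score": [0.9, 0.1, 0.8]})

# Rename key so both frames share the join column, then bring over all other columns.
merged = d1.merge(d2.rename(columns={"key": "second_key"}),
                  on="second_key", how="left")

print(merged)
```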


0

Here is another example using the merge function (rather than the merge method).

In [16]: merged = pd.merge(df1, df2, how="inner", left_on="second_key", right_on="key")

In [17]: merged.drop("key", axis=1, inplace=True)

In [18]: merged
Out[18]: 
   first_key  second_key status
0          0           1   good
1          0           1   good
2          0           2    bad
3          0           3   good
4          0           3   good


IMHO, using the function makes the transformation slightly more explicit. But this is obviously a matter of taste...
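
One thing worth checking when choosing how: with how="inner", rows whose second_key has no match are dropped, while how="left" keeps them with NaN. A small sketch, using a modified first frame (the unmatched value 4 is an assumption added for the demonstration) and the optional validate argument to guard against duplicate keys in the second frame:

```python
import pandas as pd

# second_key 4 has no counterpart in df2; it is added here only for the demonstration.
df1 = pd.DataFrame({"first_key": [0, 0, 0],
                    "second_key": [1, 2, 4]})
df2 = pd.DataFrame({"key": [1, 2, 3],
                    "status": ["good", "bad", "good"]})

# Inner join: the unmatched row disappears.
inner = pd.merge(df1, df2, how="inner",
                 left_on="second_key", right_on="key").drop(columns="key")

# Left join: every row of df1 is kept; validate raises if key is duplicated in df2.
left = pd.merge(df1, df2, how="left",
                left_on="second_key", right_on="key",
                validate="many_to_one").drop(columns="key")

print(inner)
print(left)
```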
