I have two dataframes with names df1 and df2.

df1=

   col1   col2  count
0   1      36   200
1   12     15   200
2   13     17   100

df2=

    product_id  product_name
0      1            abc
1      2            xyz
2      3            aaaa
3      12           qwert 
4      13           sed
5      15           qase
6      36           asdf
7      17           zxcv

The entries in col1 and col2 are product_id values from df2.

I want to make a new dataframe 'df3', which has the following columns and entries.

df3=

   col1 | col1_name | col2 | col2_name | count
0   1   |   abc     |   36 |    asdf   |  200
1   12  |   qwert   |   15 |    qase   |  200
2   13  |   sed     |   17 |    zxcv   |  100

That is, add col1_name and col2_name columns that hold the product_name from df2 whose product_id equals the col1 and col2 values.

Is it possible to do so with:

df3 = pd.concat([df1, df2], axis=1)

My knowledge of Pandas df and Python is beginner level. Is there a way to do so? Thanks in advance.

1 Answer

I think you can use map with a dict generated from df2, and then sort the column names with sort_index:

# Build a product_id -> product_name lookup dict from df2
d = df2.set_index('product_id')['product_name'].to_dict()
print(d)
{1: 'abc', 2: 'xyz', 3: 'aaaa', 36: 'asdf', 17: 'zxcv', 12: 'qwert', 13: 'sed', 15: 'qase'}

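# Look up the name for each id; ids that are not keys of d become NaN
df1['col1_name'] = df1['col1'].map(d)
df1['col2_name'] = df1['col2'].map(d)
# Sort the columns alphabetically so each name column sits next to its id column
df1 = df1.sort_index(axis=1)
print(df1)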
   col1 col1_name  col2 col2_name  count
0     1       abc    36      asdf    200
1    12     qwert    15      qase    200
2    13       sed    17      zxcv    100

# Optionally drop the id columns to keep only the names and count
df1 = df1.drop(['col1', 'col2'], axis=1)
print(df1)
  col1_name col2_name  count
0       abc      asdf    200
1     qwert      qase    200
2       sed      zxcv    100
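
For reference, the steps above can be put together as one self-contained sketch. The sample frames are rebuilt from the question; the names `names` and `df3` are only illustrative. It assumes product_id is unique in df2 (a Series lookup with duplicate index values would fail) and passes a Series to map instead of a dict, which pandas also accepts. Building the result from just the wanted columns is one way to avoid the drop step.

```python
import pandas as pd

# Sample data rebuilt from the question
df1 = pd.DataFrame({
    "col1": [1, 12, 13],
    "col2": [36, 15, 17],
    "count": [200, 200, 100],
})
df2 = pd.DataFrame({
    "product_id": [1, 2, 3, 12, 13, 15, 36, 17],
    "product_name": ["abc", "xyz", "aaaa", "qwert", "sed", "qase", "asdf", "zxcv"],
})

# Lookup Series indexed by product_id (assumes product_id is unique)
names = df2.set_index("product_id")["product_name"]

# Build the result from only the columns we want, so no drop is needed
df3 = pd.DataFrame({
    "col1_name": df1["col1"].map(names),
    "col2_name": df1["col2"].map(names),
    "count": df1["count"],
})
print(df3)
```

If the id columns should stay, assigning the two mapped columns onto df1 as in the answer works the same way.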

16 Comments

Yes, this works. Apart from that, if I only want to display col1_name | col2_name | count in my final result, is there a better way than df1.drop(['col1', 'col2'], axis=1)?
Appreciated, a nice solution using map().
I tried the same logic on a bigger data set, and my col1_name and col2_name columns come out as NaN.
You get NaN if some values are not in df2.
You can test it with the sample data and d = {1: 'abc', 2: 'xyz', 3: 'aaaa'}.
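
A minimal sketch of that test, assuming the sample df1 from the question. The names `mapped`, `missing_ids` and `filled` are illustrative, and the 'unknown' placeholder is an arbitrary choice. On real data, NaN can also appear when the id column and the dict keys have different types (for example strings versus integers), so that may be worth checking as well.

```python
import pandas as pd

df1 = pd.DataFrame({
    "col1": [1, 12, 13],
    "col2": [36, 15, 17],
    "count": [200, 200, 100],
})

# Deliberately incomplete lookup: ids 12 and 13 are not keys
d = {1: "abc", 2: "xyz", 3: "aaaa"}

mapped = df1["col1"].map(d)

# Ids that found no match in d show up as NaN after map
missing_ids = df1.loc[mapped.isna(), "col1"].tolist()
print(missing_ids)

# One option: replace the NaN values with a placeholder
filled = mapped.fillna("unknown")
print(filled.tolist())
```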