I have a dataframe df_in like so:

import pandas as pd
import numpy as np
dic_in = {'A':['A1','A1','A1','L3','A3','A3','B1','B1','B1','B2','A2'],
       'B':['xxx','ttt','qqq','nnn','lll','nnn','eee','xxx','qqq','bbb','sss'],
       'C':['fas','efe','pfo','scs','grj','rpo','cbb','asf','asc','wq3','mls']}
df_in = pd.DataFrame(dic_in)

I also have another dataframe called df_map:

dic_map = {'X':['A1' ,'A1' ,'A1' ,'A2' ,'A3' ,'B1' ,'B1' ,'B1' ,'B1' ,'B2' ,'B3' ,'B3' ,'L1', 'L3' ,'L3'],
           'Y':['qqq','ttt','xxx','sss','lll','eee','qqq','xxx','zzz','bbb','mmm','ooo','kkk','nnn','ttt']}
df_map = pd.DataFrame(dic_map)

My goal is to check every row of df_in[['A','B']]: if the pair of values is found in df_map (columns X and Y), I want to take the index of the matching row in df_map and store it in a new column of df_in.

Ex: the pair A1 - xxx is found in df_map at index 2; therefore I will place a 2 next to the pair A1 - xxx. If a pair is not found, I will place NaN.

The result should look like this:

    Idx   A    B    C
0     2  A1  xxx  fas
1     1  A1  ttt  efe
2     0  A1  qqq  pfo
3    13  L3  nnn  scs
4     4  A3  lll  grj
5   NaN  A3  nnn  rpo
6     5  B1  eee  cbb
7     7  B1  xxx  asf
8     6  B1  qqq  asc
9     9  B2  bbb  wq3
10    3  A2  sss  mls

Can you suggest a smart and efficient way to do this?

1 Answer

You can use merge with reset_index, then remove the helper columns X and Y with drop:

print (pd.merge(df_in, 
                df_map.reset_index(), 
                left_on=['A','B'], 
                right_on=['X','Y'], 
                how='left').drop(['X','Y'], axis=1))

     A    B    C  index
0   A1  xxx  fas    2.0
1   A1  ttt  efe    1.0
2   A1  qqq  pfo    0.0
3   L3  nnn  scs   13.0
4   A3  lll  grj    4.0
5   A3  nnn  rpo    NaN
6   B1  eee  cbb    5.0
7   B1  xxx  asf    7.0
8   B1  qqq  asc    6.0
9   B2  bbb  wq3    9.0
10  A2  sss  mls    3.0
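
One caveat with the merge approach: it relies on each (X, Y) pair appearing only once in df_map, which is true for the sample data. The sketch below uses a deliberately broken map (dup_map is a made-up name, not from the question) to show that a duplicated pair silently adds rows to the result, and that the validate argument of merge can turn this into an error instead. Treat it as an illustration, not as part of the original answer.

```python
import pandas as pd

# Tiny input and a hypothetical map with a duplicated ('A1', 'xxx') pair.
df_small = pd.DataFrame({'A': ['A1', 'A3'],
                         'B': ['xxx', 'nnn'],
                         'C': ['fas', 'rpo']})
dup_map = pd.DataFrame({'X': ['A1', 'A1'],
                        'Y': ['xxx', 'xxx']})

# Without validation the duplicated pair matches twice, so the
# 2-row input becomes a 3-row result.
unchecked = df_small.merge(dup_map.reset_index(),
                           left_on=['A', 'B'],
                           right_on=['X', 'Y'],
                           how='left')
print(len(unchecked))

# validate='many_to_one' checks that the right-hand keys are unique
# and raises MergeError when they are not.
try:
    df_small.merge(dup_map.reset_index(),
                   left_on=['A', 'B'],
                   right_on=['X', 'Y'],
                   how='left',
                   validate='many_to_one')
    raised = False
except pd.errors.MergeError:
    raised = True
print(raised)
```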

Another solution (thanks to Julien Marrec):

df_in.merge(df_map.reset_index()
                  .set_index(['X','Y']), 
            left_on=['A','B'], 
            right_index=True, 
            how='left')

Finally, if you want to change the order of the columns:

df = pd.merge(df_in, 
              df_map.reset_index(), 
              left_on=['A','B'], 
              right_on=['X','Y'], 
              how='left').drop(['X','Y'], axis=1)
cols = df.columns[-1:].tolist() + df.columns[:-1].tolist()
print (cols)
['index', 'A', 'B', 'C']

df = df[cols]
print (df)
    index   A    B    C
0     2.0  A1  xxx  fas
1     1.0  A1  ttt  efe
2     0.0  A1  qqq  pfo
3    13.0  L3  nnn  scs
4     4.0  A3  lll  grj
5     NaN  A3  nnn  rpo
6     5.0  B1  eee  cbb
7     7.0  B1  xxx  asf
8     6.0  B1  qqq  asc
9     9.0  B2  bbb  wq3
10    3.0  A2  sss  mls
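
The index column comes back as float because of the NaN in the unmatched row, while the desired output in the question shows whole numbers in a column named Idx. A possible way to get closer to that, assuming a pandas version that supports the nullable 'Int64' dtype (0.24 or later), is sketched below. The variable names merged and result are just for illustration.

```python
import pandas as pd

df_in = pd.DataFrame({
    'A': ['A1', 'A1', 'A1', 'L3', 'A3', 'A3', 'B1', 'B1', 'B1', 'B2', 'A2'],
    'B': ['xxx', 'ttt', 'qqq', 'nnn', 'lll', 'nnn', 'eee', 'xxx', 'qqq', 'bbb', 'sss'],
    'C': ['fas', 'efe', 'pfo', 'scs', 'grj', 'rpo', 'cbb', 'asf', 'asc', 'wq3', 'mls'],
})
df_map = pd.DataFrame({
    'X': ['A1', 'A1', 'A1', 'A2', 'A3', 'B1', 'B1', 'B1', 'B1', 'B2', 'B3', 'B3', 'L1', 'L3', 'L3'],
    'Y': ['qqq', 'ttt', 'xxx', 'sss', 'lll', 'eee', 'qqq', 'xxx', 'zzz', 'bbb', 'mmm', 'ooo', 'kkk', 'nnn', 'ttt'],
})

# Same left merge as above, then rename the helper column to Idx.
merged = (df_in.merge(df_map.reset_index(),
                      left_on=['A', 'B'],
                      right_on=['X', 'Y'],
                      how='left')
               .drop(['X', 'Y'], axis=1)
               .rename(columns={'index': 'Idx'}))

# Nullable integer dtype: matched rows show as integers,
# the unmatched row shows as <NA> instead of forcing floats.
merged['Idx'] = merged['Idx'].astype('Int64')

# Put Idx first, as in the desired output.
result = merged[['Idx', 'A', 'B', 'C']]
print(result)
```

Note that the missing value is displayed as <NA> rather than NaN with this dtype.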
6 Comments

You beat me to it, I came up with df_in.merge(df_map.reset_index().set_index(['X','Y']), left_on=['A','B'], right_index=True, how='left')
What if I want to shift the index by 1 during the operation in such a way that the mapped indexes values are all +1?
By shift the index, do you mean after the merge, df['index'] += 1?
Or before the merge? Like shifted = df_map.reset_index(); shifted['index'] = shifted['index'].shift() and then df = pd.merge(df_in, shifted, left_on=['A','B'], right_on=['X','Y'], how='left').drop(['X','Y'], axis=1) ?
Well, I was thinking of doing it during the merge if possible... if that is not possible, then I believe doing it before is better.
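
For reference, a minimal sketch of the +1 variant discussed in these comments. As far as I know merge has no option to transform values on the fly, so the addition has to happen before or after it; both give the same result here. Note that Series.shift() moves values down by one row rather than adding 1 to them, so plain addition is used instead. The names shifted_map, before and after are made up for this example.

```python
import pandas as pd

df_in = pd.DataFrame({
    'A': ['A1', 'A1', 'A1', 'L3', 'A3', 'A3', 'B1', 'B1', 'B1', 'B2', 'A2'],
    'B': ['xxx', 'ttt', 'qqq', 'nnn', 'lll', 'nnn', 'eee', 'xxx', 'qqq', 'bbb', 'sss'],
    'C': ['fas', 'efe', 'pfo', 'scs', 'grj', 'rpo', 'cbb', 'asf', 'asc', 'wq3', 'mls'],
})
df_map = pd.DataFrame({
    'X': ['A1', 'A1', 'A1', 'A2', 'A3', 'B1', 'B1', 'B1', 'B1', 'B2', 'B3', 'B3', 'L1', 'L3', 'L3'],
    'Y': ['qqq', 'ttt', 'xxx', 'sss', 'lll', 'eee', 'qqq', 'xxx', 'zzz', 'bbb', 'mmm', 'ooo', 'kkk', 'nnn', 'ttt'],
})

# Option 1: add 1 to the map's index before merging.
shifted_map = df_map.reset_index()
shifted_map['index'] = shifted_map['index'] + 1
before = (df_in.merge(shifted_map,
                      left_on=['A', 'B'],
                      right_on=['X', 'Y'],
                      how='left')
               .drop(['X', 'Y'], axis=1))

# Option 2: merge first, then add 1 (NaN stays NaN).
after = (df_in.merge(df_map.reset_index(),
                     left_on=['A', 'B'],
                     right_on=['X', 'Y'],
                     how='left')
              .drop(['X', 'Y'], axis=1))
after['index'] += 1

print(before)
```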