I have two Excel files, say, wb1.xlsx and wb2.xlsx.

wb1.xlsx

adsl    svc_no    port_stat    adsl.1    Comparison result
2/17
2/24
2/27
2/33
2/37
3/12

wb2.xlsx

caller_id    status    adsl    Comparison result
n/a          SP        2/37    Not Match
n/a          RE        2/24    Not Match
n/a          SP        2/27    Match
n/a          SP        2/33    Not Match
n/a          SP        2/17    Match

What I want to do is match the adsl values of wb2.xlsx against those of wb1.xlsx and copy the other values from the matching rows into the corresponding columns of wb1.xlsx.

My expected output is wb1.xlsx updated with the values from wb2.xlsx:

adsl    svc_no    port_stat    adsl.1    Comparison result
2/17    n/a       SP           2/17      Match
2/24    n/a       RE           2/24      Not Match
2/27    n/a       SP           2/27      Match
2/33    n/a       SP           2/33      Not Match
2/37    n/a       SP           2/37      Not Match
3/12 

After some searching, I found that pd.merge() can do the matching.

I tried it this way:

result = pd.merge(df2, pri_df, on=['adsl', 'adsl'])

Unfortunately, it creates new columns instead of updating the existing ones. It also keeps only the rows it was able to match and discards the others.

I also tried getting the indices of the columns in wb2.xlsx and assigning them to the columns of wb1.xlsx, but that just copied the values over as they were, without matching on adsl.

Any reference that would help will do.

2 Answers

I suggest using columns.intersection with combine_first:

print (df1)
   adsl  svc_no  port_stat  adsl.1  Comparison result
0  2/17     NaN        NaN     NaN                NaN
1  2/24     NaN        NaN     NaN                NaN
2  2/27     NaN        NaN     NaN                NaN
3  2/33     NaN        NaN     NaN                NaN
4  2/37     NaN        NaN     NaN                NaN
5  3/12     NaN        NaN     NaN                NaN

print (df2)
   caller_id status  adsl Comparison result
0        NaN     SP  2/37         Not Match
1        NaN     RE  2/24         Not Match
2        NaN     SP  2/27             Match
3        NaN     SP  2/33         Not Match
4        NaN     SP  2/17             Match

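# Give df2's 'status' column the name it has in df1.
df2 = df2.rename(columns={'status':'port_stat'})
# 'adsl.1' is not a valid keyword argument name, so pass it to assign() via a dict.
d = {'adsl.1': lambda x: x['adsl']}
df2 = df2.assign(**d)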
print (df2)
   caller_id port_stat  adsl Comparison result adsl.1
0        NaN        SP  2/37         Not Match   2/37
1        NaN        RE  2/24         Not Match   2/24
2        NaN        SP  2/27             Match   2/27
3        NaN        SP  2/33         Not Match   2/33
4        NaN        SP  2/17             Match   2/17

df22 = df2[df2.columns.intersection(df1.columns)]
print (df22)
  port_stat  adsl Comparison result adsl.1
0        SP  2/37         Not Match   2/37
1        RE  2/24         Not Match   2/24
2        SP  2/27             Match   2/27
3        SP  2/33         Not Match   2/33
4        SP  2/17             Match   2/17

result = (df22.set_index('adsl')
              .combine_first(df1.set_index('adsl'))
              .reset_index()
              .reindex(columns=df1.columns))
print (result)
   adsl  svc_no port_stat adsl.1 Comparison result
0  2/17     NaN        SP   2/17             Match
1  2/24     NaN        RE   2/24         Not Match
2  2/27     NaN        SP   2/27             Match
3  2/33     NaN        SP   2/33         Not Match
4  2/37     NaN        SP   2/37         Not Match
5  3/12     NaN       NaN    NaN               NaN
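
For reuse, the steps above can be wrapped in a small helper. This is only a sketch: the name update_from and its key parameter are hypothetical and not part of pandas, and the frames are rebuilt by hand from the sample data rather than read from the Excel files.

```python
import pandas as pd

def update_from(df1, df2, key="adsl"):
    """Fill df1's columns from the df2 rows that share the same key value.

    Hypothetical helper: it only repackages the intersection/combine_first
    steps shown above. df2 must already use df1's column names.
    """
    # Keep only the df2 columns that also exist in df1.
    shared = df2[df2.columns.intersection(df1.columns)]
    return (shared.set_index(key)
                  .combine_first(df1.set_index(key))   # df2 values win, df1 fills the rest
                  .reset_index()
                  .reindex(columns=df1.columns))       # restore df1's column order

cols = ["adsl", "svc_no", "port_stat", "adsl.1", "Comparison result"]
df1 = pd.DataFrame(
    {"adsl": ["2/17", "2/24", "2/27", "2/33", "2/37", "3/12"]}
).reindex(columns=cols)  # the other columns start out empty (NaN)

df2 = pd.DataFrame({
    "caller_id": [None] * 5,
    "status": ["SP", "RE", "SP", "SP", "SP"],
    "adsl": ["2/37", "2/24", "2/27", "2/33", "2/17"],
    "Comparison result": ["Not Match", "Not Match", "Match", "Not Match", "Match"],
})
df2 = (df2.rename(columns={"status": "port_stat"})
          .assign(**{"adsl.1": lambda x: x["adsl"]}))

result = update_from(df1, df2)
print(result)
```

With the real files, df1 and df2 would presumably come from pd.read_excel("wb1.xlsx") and pd.read_excel("wb2.xlsx"), and the result could be saved with result.to_excel(..., index=False). Note that combine_first returns the rows sorted by the adsl index, so the original row order of df1 is kept only if it was already sorted.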

11 Comments

How do you suggest I update wb1.xlsx with the values of wb2.xlsx?
@RickyAguilar - I changed the answer, can you check it?
Yes. It's like their key.
@RickyAguilar - Super, so the solution should work nicely.
Nope. The final format is that of df1. The values for its empty columns should come from the df2 rows whose df2['adsl'] matches df1['adsl'].

You can use the pandas isin function:

result = df2.loc[df2['adsl'].isin(pri_df['adsl'])]

Hope this will work for you.
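
Note that isin only filters rows; it does not carry any columns across. As a rough sketch of one way to bring the other columns of df2 along, a left merge keeps every row of pri_df and leaves NaN where df2 has no match. The frames below are rebuilt by hand from the sample data in the question, and the variable names matched and result are only for illustration.

```python
import pandas as pd

pri_df = pd.DataFrame({"adsl": ["2/17", "2/24", "2/27", "2/33", "2/37", "3/12"]})
df2 = pd.DataFrame({
    "caller_id": [None] * 5,
    "status": ["SP", "RE", "SP", "SP", "SP"],
    "adsl": ["2/37", "2/24", "2/27", "2/33", "2/17"],
    "Comparison result": ["Not Match", "Not Match", "Match", "Not Match", "Match"],
})

# isin gives a boolean mask: which pri_df rows have a counterpart in df2.
matched = pri_df["adsl"].isin(df2["adsl"])

# how="left" keeps all pri_df rows in their original order;
# the default (how="inner") would drop the unmatched 3/12 row.
result = pri_df.merge(df2, on="adsl", how="left")
print(result)
```

The merged frame still uses df2's column names (caller_id, status), so they would need to be renamed to match wb1.xlsx before writing the result out.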

3 Comments

I see, it matches the dataframes from the two Excel files. How do you suggest I get the values from wb2.xlsx and use them to update wb1.xlsx?
You can create a new Excel file instead of updating the existing one, and then discard the previous one.
Yes, that is the more likely route, but how would I bring the other columns of wb2.xlsx into the new Excel file?
