
I've got two Excel spreadsheets. The first has multiple columns, three of which (named differently) all contain company IDs. My second spreadsheet contains a key along with the data I want to bring into the first spreadsheet. For example:

Table 1:

Target Company ID  Subsidiary ID  Parent Company ID
1                  2              3

Table 2:

Company ID  Company Size
1           Small
2           Medium
3           Large

With Excel I would do a series of VLOOKUPs, but Table 1 is huge and takes a long time to calculate. I would like to generate a table such as:

Target Company ID  Target Size  Subsidiary ID  Sub Size  Parent Company ID  Parent Size
1                  Small        2              Medium    3                  Large

I've tried pd.merge, but it is not as targeted as a VLOOKUP would be in assigning a value to a specific column.
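
(For reference, a minimal sketch of the example data as pandas DataFrames; the names df1 and df2 are assumptions that match the answers below.)

import pandas as pd

# Minimal sketch of the example tables above; df1 and df2 are assumed names
df1 = pd.DataFrame({'Target Company ID': [1],
                    'Subsidiary ID': [2],
                    'Parent Company ID': [3]})

df2 = pd.DataFrame({'Company ID': [1, 2, 3],
                    'Company Size': ['Small', 'Medium', 'Large']})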

2 Answers


You can build a mapping of Company ID to Company Size from df2 by calling .set_index() and selecting the column.

Then, use .map() to look up (VLOOKUP-style) each ID column of df1 against that mapping, as follows:

# mapping Series: Company ID -> Company Size (drop duplicate IDs so the index is unique)
mapping = df2.drop_duplicates('Company ID').set_index('Company ID')['Company Size']

# look up each ID column of df1 and store the result in a new '<prefix> Size' column
for col in df1.columns:
    df1[f'{col.split()[0]} Size'] = df1[col].map(mapping)

Result:

print(df1)

   Target Company ID  Subsidiary ID  Parent Company ID Target Size Subsidiary Size Parent Size
0                  1              2                  3       Small          Medium       Large

In case you have other columns in the dataframes, you can select the related columns with .filter(), picking all columns whose names end with 'ID', as follows:

mapping = df2.drop_duplicates('Company ID').set_index('Company ID')['Company Size']

# select the ID columns automatically with .filter(), or list them manually
cols = df1.filter(regex=r'ID$').columns

for col in cols:
    df1[f'{col.split()[0]} Size'] = df1[col].map(mapping)

If you want to group the related columns together, you can further sort the columns by name with .sort_index(axis=1), as follows:

df1 = df1.sort_index(axis=1)

Result:

print(df1)

   Parent Company ID Parent Size  Subsidiary ID Subsidiary Size  Target Company ID Target Size
0                  3       Large              2          Medium                  1       Small

Comments

I get an error: InvalidIndexError: Reindexing only valid with uniquely valued Index objects
Try: mapping = df2.drop_duplicates('Company ID').set_index('Company ID')['Company Size']
@premiumcopypaper Does your Table 2 have duplicated entries of Company ID? If so, the code provided by Pascal can be used.
@PascalG.Bernard Thanks for sharing the code to handle the issue :-)
My pleasure @SeaBean. Congrats on the code you have provided; I was working on it, but you clearly outperformed my proposal.
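
The error discussed in these comments comes from duplicated Company ID values in Table 2, which make the mapping's index non-unique. A minimal sketch of the failure and the fix (the df2_dup name is hypothetical):

import pandas as pd

# Hypothetical Table 2 with a duplicated Company ID
df2_dup = pd.DataFrame({'Company ID': [1, 1, 2, 3],
                        'Company Size': ['Small', 'Small', 'Medium', 'Large']})

ids = pd.Series([1, 2, 3])

# Non-unique index -> InvalidIndexError when used with .map():
# ids.map(df2_dup.set_index('Company ID')['Company Size'])

# Dropping duplicates first keeps the index unique, so the lookup works:
mapping = df2_dup.drop_duplicates('Company ID').set_index('Company ID')['Company Size']
print(ids.map(mapping))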

Basically, you want to merge the tables with a left join three times, taking a different key column each time; in other words, treat the lookup table (Table 2) as three tables, like the following:

import pandas as pd

table_1 = ...  # read Table 1 (Target Company ID, Subsidiary ID, Parent Company ID)
table_2 = ...  # read Table 2 (Company ID, Company Size)

# build one lookup table per key, renaming its columns to match table_1
targets = table_2.rename(columns={'Company ID': 'Target Company ID', 'Company Size': 'Target Size'})
subsidiaries = table_2.rename(columns={'Company ID': 'Subsidiary ID', 'Company Size': 'Sub Size'})
parents = table_2.rename(columns={'Company ID': 'Parent Company ID', 'Company Size': 'Parent Size'})

comb_table = pd.merge(table_1, targets, on='Target Company ID', how='left')
comb_table = pd.merge(comb_table, subsidiaries, on='Subsidiary ID', how='left')
comb_table = pd.merge(comb_table, parents, on='Parent Company ID', how='left')
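
With the example data, the combined frame should reproduce the desired output (the table_1 / table_2 names here correspond to df1 / df2 in the sketch in the question):

print(comb_table)
# expected columns: Target Company ID, Subsidiary ID, Parent Company ID,
#                   Target Size, Sub Size, Parent Size
# expected row:     1, 2, 3, Small, Medium, Large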

