Merge multiindex with multiple column levels and dataframe

Question

Say I have a DataFrame mi with MultiIndex columns, as follows:

        Serial No.               Date          
        A       B         A         B
0  816292  934609  27/01/17  27/01/17
1  983803  683858  25/01/17  26/01/17
2  596573  493741  27/01/17  28/01/17
3  199203  803515  28/01/17  28/01/17

A and B are two parts; mi holds the serial number and build date for multiple instances of each part.

I have a dataframe df containing test information for part A, as follows:

        A    Test 1    Test 2    Test 3      
0  816292  0.934609  0.475035  0.822712
1  983803  0.683858  0.025861  0.691112
2  596573  0.493741  0.397398  0.489101
3  199203  0.803515  0.679537  0.308588

I would like to merge these two to get something like:

        Serial No.               Date                         Tests
        A       B         A         B    Test 1    Test 2    Test 3
0  816292  934609  27/01/17  27/01/17  0.934609  0.475035  0.822712
1  983803  683858  25/01/17  26/01/17  0.683858  0.025861  0.691112
2  596573  493741  27/01/17  28/01/17  0.493741  0.397398  0.489101
3  199203  803515  28/01/17  28/01/17  0.803515  0.679537  0.308588

My initial attempt was

mi = mi.merge(df,left_on=('Serial No.','A'),right_on='A',how='inner')

but that yields ValueError: len(right_on) must equal len(left_on). I then tried adding an additional column level 'Tests' to df and doing

mi = mi.merge(df,left_on=('Serial No.','A'),right_on=('Tests','A'),how='inner')

but that yields KeyError: 'A'.

Andy Hayden · Accepted Answer · 2017-11-08 05:46:21Z

2

The easiest way is to fix df's columns to match mi, so that the key column has the same two-level label, ('Serial No.', 'A'), in both frames:

In [11]: df
Out[11]:
        A    Test 1    Test 2    Test 3
0  816292  0.934609  0.475035  0.822712
1  983803  0.683858  0.025861  0.691112
2  596573  0.493741  0.397398  0.489101
3  199203  0.803515  0.679537  0.308588

In [12]: df.columns = pd.MultiIndex.from_arrays([["Serial No.", "Test", "Test", "Test"], df.columns])

In [13]: df
Out[13]:
  Serial No.      Test
           A    Test 1    Test 2    Test 3
0     816292  0.934609  0.475035  0.822712
1     983803  0.683858  0.025861  0.691112
2     596573  0.493741  0.397398  0.489101
3     199203  0.803515  0.679537  0.308588

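Then a merge will "just work", because with no on argument merge joins on the columns common to both frames, here ('Serial No.', 'A'):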

In [14]: df.merge(mi)
Out[14]:
  Serial No.      Test                     Serial No.      Date
           A    Test 1    Test 2    Test 3          B         A         B
0     816292  0.934609  0.475035  0.822712     934609  27/01/17  27/01/17
1     983803  0.683858  0.025861  0.691112     683858  25/01/17  26/01/17
2     596573  0.493741  0.397398  0.489101     493741  27/01/17  28/01/17
3     199203  0.803515  0.679537  0.308588     803515  28/01/17  28/01/17
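
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds mi and df from the tables in the question, relabels df's columns as in In [12], merges, and then reorders the top-level groups to match the layout the question asks for. The construction of the two frames and the final column selection are assumptions added for illustration, not part of the original post.

```python
import pandas as pd

# Rebuild the two frames from the tables in the question (assumed construction).
mi = pd.DataFrame(
    [
        [816292, 934609, "27/01/17", "27/01/17"],
        [983803, 683858, "25/01/17", "26/01/17"],
        [596573, 493741, "27/01/17", "28/01/17"],
        [199203, 803515, "28/01/17", "28/01/17"],
    ],
    columns=pd.MultiIndex.from_product([["Serial No.", "Date"], ["A", "B"]]),
)
df = pd.DataFrame(
    {
        "A": [816292, 983803, 596573, 199203],
        "Test 1": [0.934609, 0.683858, 0.493741, 0.803515],
        "Test 2": [0.475035, 0.025861, 0.397398, 0.679537],
        "Test 3": [0.822712, 0.691112, 0.489101, 0.308588],
    }
)

# Give df the same two-level column structure as mi.
df.columns = pd.MultiIndex.from_arrays(
    [["Serial No.", "Test", "Test", "Test"], df.columns]
)

# With no `on`, merge joins on the columns both frames share:
# here that is ("Serial No.", "A").
merged = df.merge(mi)

# Selecting by top-level label reorders the column groups (assumed final step).
result = merged[["Serial No.", "Date", "Test"]]
print(result)
```

The last step only changes column order. Drop it if the order shown in Out[14] is acceptable.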

There are a bunch of ways to create the top level of the MultiIndex. Here I just wrote the list

["Serial No.", "Test", "Test", "Test"]

by hand, but you can generate it: it's just a list.
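
As one possible way to generate it, the sketch below labels the key column "Serial No." and every other column "Test" using a list comprehension. The names key and top and the two-row frame are illustrative assumptions, not from the original post.

```python
import pandas as pd

# Illustrative two-row frame with the same column names as df in the question.
df = pd.DataFrame(
    {
        "A": [816292, 983803],
        "Test 1": [0.934609, 0.683858],
        "Test 2": [0.475035, 0.025861],
        "Test 3": [0.822712, 0.691112],
    }
)

key = "A"  # hypothetical name for the column holding the serial number

# Label the key column "Serial No." and every other column "Test".
top = ["Serial No." if col == key else "Test" for col in df.columns]

df.columns = pd.MultiIndex.from_arrays([top, df.columns])
print(df.columns.tolist())
```

This keeps working if more test columns are added later, since the list is derived from df.columns rather than typed out.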


2 Comments

piRSquared Over a year ago

I like this as well.

Archie Over a year ago

Thanks for this, and to @piRSquared for the solution below. This one got me to where I needed to be.

piRSquared · Answer · 2017-11-08 05:45:55Z

1

mi.set_index(('Serial No.', 'A')).join(
    pd.concat([df.set_index('A')], axis=1, keys=['Tests'])
).reset_index()

  Serial No.              Date               Tests                    
           A       B         A         B    Test 1    Test 2    Test 3
0     816292  934609  27/01/17  27/01/17  0.934609  0.475035  0.822712
1     983803  683858  25/01/17  26/01/17  0.683858  0.025861  0.691112
2     596573  493741  27/01/17  28/01/17  0.493741  0.397398  0.489101
3     199203  803515  28/01/17  28/01/17  0.803515  0.679537  0.308588
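
The one-liner above can be unpacked into steps. The following is a sketch on a two-row subset of the question's data; the frame construction and the intermediate name tests are assumptions added for illustration. pd.concat with keys=['Tests'] places df's test columns under a new top level, set_index moves serial number A into the index on both sides, and join then aligns the two frames on that index.

```python
import pandas as pd

# Two-row subset of the question's data (assumed construction).
mi = pd.DataFrame(
    [
        [816292, 934609, "27/01/17", "27/01/17"],
        [983803, 683858, "25/01/17", "26/01/17"],
    ],
    columns=pd.MultiIndex.from_product([["Serial No.", "Date"], ["A", "B"]]),
)
df = pd.DataFrame(
    {
        "A": [816292, 983803],
        "Test 1": [0.934609, 0.683858],
        "Test 2": [0.475035, 0.025861],
    }
)

# Index df by serial number A and wrap its remaining columns
# under a new top level called "Tests".
tests = pd.concat([df.set_index("A")], axis=1, keys=["Tests"])

# Index mi by the ("Serial No.", "A") column so both frames share an index,
# join on that index, then turn the index back into a column.
result = mi.set_index(("Serial No.", "A")).join(tests).reset_index()
print(result)
```

Unlike a plain merge on matching column labels, this keeps df's key column name ('A') untouched and only requires that both frames have the same number of column levels at join time.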
