Say I have a DataFrame mi with MultiIndex columns, as follows:

        Serial No.               Date          
        A       B         A         B
0  816292  934609  27/01/17  27/01/17
1  983803  683858  25/01/17  26/01/17
2  596573  493741  27/01/17  28/01/17
3  199203  803515  28/01/17  28/01/17

A and B are two parts; mi holds the serial number and build date for multiple instances of each part.

I have a dataframe df containing test information for part A, as follows:

        A    Test 1    Test 2    Test 3      
0  816292  0.934609  0.475035  0.822712
1  983803  0.683858  0.025861  0.691112
2  596573  0.493741  0.397398  0.489101
3  199203  0.803515  0.679537  0.308588

I would like to merge the two and get something like:

        Serial No.               Date                         Tests
        A       B         A         B    Test 1    Test 2    Test 3
0  816292  934609  27/01/17  27/01/17  0.934609  0.475035  0.822712
1  983803  683858  25/01/17  26/01/17  0.683858  0.025861  0.691112
2  596573  493741  27/01/17  28/01/17  0.493741  0.397398  0.489101
3  199203  803515  28/01/17  28/01/17  0.803515  0.679537  0.308588

My initial attempt was

mi = mi.merge(df,left_on=('Serial No.','A'),right_on='A',how='inner')

but that yields ValueError: len(right_on) must equal len(left_on). I have also tried adding an extra column level 'Tests' to df and then doing

mi = mi.merge(df,left_on=('Serial No.','A'),right_on=('Tests','A'),how='inner')

but that yields KeyError: 'A'.

2 Answers

The easiest way is to fix df's columns to match mi:

In [11]: df
Out[11]:
        A    Test 1    Test 2    Test 3
0  816292  0.934609  0.475035  0.822712
1  983803  0.683858  0.025861  0.691112
2  596573  0.493741  0.397398  0.489101
3  199203  0.803515  0.679537  0.308588

In [12]: df.columns = pd.MultiIndex.from_arrays([["Serial No.", "Test", "Test", "Test"], df.columns])

In [13]: df
Out[13]:
  Serial No.      Test
           A    Test 1    Test 2    Test 3
0     816292  0.934609  0.475035  0.822712
1     983803  0.683858  0.025861  0.691112
2     596573  0.493741  0.397398  0.489101
3     199203  0.803515  0.679537  0.308588

Then a merge will "just work":

In [14]: df.merge(mi)
Out[14]:
  Serial No.      Test                     Serial No.      Date
           A    Test 1    Test 2    Test 3          B         A         B
0     816292  0.934609  0.475035  0.822712     934609  27/01/17  27/01/17
1     983803  0.683858  0.025861  0.691112     683858  25/01/17  26/01/17
2     596573  0.493741  0.397398  0.489101     493741  27/01/17  28/01/17
3     199203  0.803515  0.679537  0.308588     803515  28/01/17  28/01/17

There are several ways to create the top level of the MultiIndex. Here I just wrote the list

["Serial No.", "Test", "Test", "Test"]

by hand, but you can generate it: it's just a list.
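
As a minimal sketch of generating that list rather than typing it, the snippet below rebuilds the question's two frames and derives the top level from df's own column names. It assumes the key column is the one named "A" and that every other column belongs under "Test"; the names top and merged are illustrative only.

```python
import pandas as pd

# Rebuild the frames from the question.
mi = pd.DataFrame(
    [[816292, 934609, "27/01/17", "27/01/17"],
     [983803, 683858, "25/01/17", "26/01/17"],
     [596573, 493741, "27/01/17", "28/01/17"],
     [199203, 803515, "28/01/17", "28/01/17"]],
    columns=pd.MultiIndex.from_product([["Serial No.", "Date"], ["A", "B"]]),
)
df = pd.DataFrame({
    "A": [816292, 983803, 596573, 199203],
    "Test 1": [0.934609, 0.683858, 0.493741, 0.803515],
    "Test 2": [0.475035, 0.025861, 0.397398, 0.679537],
    "Test 3": [0.822712, 0.691112, 0.489101, 0.308588],
})

# Assumption: "A" is the key column; everything else is a test result.
top = ["Serial No." if col == "A" else "Test" for col in df.columns]
df.columns = pd.MultiIndex.from_arrays([top, df.columns])

# With no `on` given, merge joins on the columns both frames share,
# which here is only ("Serial No.", "A").
merged = mi.merge(df)
print(merged)
```

Calling mi.merge(df) instead of df.merge(mi) puts mi's columns first, which is closer to the layout asked for in the question.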

2 Comments

I like this as well.
Thanks for this and to @piRSquared for the solution below. This one got me to where I needed to be
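Alternatively, move the part A serial number into the index on both frames, nest df's remaining columns under a 'Tests' level with pd.concat, and join:
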
mi.set_index(('Serial No.', 'A')).join(
    pd.concat([df.set_index('A')], axis=1, keys=['Tests'])
).reset_index()

  Serial No.              Date               Tests                    
           A       B         A         B    Test 1    Test 2    Test 3
0     816292  934609  27/01/17  27/01/17  0.934609  0.475035  0.822712
1     983803  683858  25/01/17  26/01/17  0.683858  0.025861  0.691112
2     596573  493741  27/01/17  28/01/17  0.493741  0.397398  0.489101
3     199203  803515  28/01/17  28/01/17  0.803515  0.679537  0.308588
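
For anyone who wants to run the join approach end to end, here is a self-contained sketch that rebuilds the question's frames first. The intermediate names left, right and result are illustrative and not part of the original answer.

```python
import pandas as pd

# Rebuild the frames from the question.
mi = pd.DataFrame(
    [[816292, 934609, "27/01/17", "27/01/17"],
     [983803, 683858, "25/01/17", "26/01/17"],
     [596573, 493741, "27/01/17", "28/01/17"],
     [199203, 803515, "28/01/17", "28/01/17"]],
    columns=pd.MultiIndex.from_product([["Serial No.", "Date"], ["A", "B"]]),
)
df = pd.DataFrame({
    "A": [816292, 983803, 596573, 199203],
    "Test 1": [0.934609, 0.683858, 0.493741, 0.803515],
    "Test 2": [0.475035, 0.025861, 0.397398, 0.679537],
    "Test 3": [0.822712, 0.691112, 0.489101, 0.308588],
})

# Left side: the part A serial number becomes the index.
left = mi.set_index(("Serial No.", "A"))

# Right side: index on "A", then wrap the test columns under a "Tests"
# level so both frames have two column levels.
right = pd.concat([df.set_index("A")], axis=1, keys=["Tests"])

# Join on the index, then turn the serial number back into a column.
result = left.join(right).reset_index()
print(result)
```

Note that DataFrame.join defaults to how='left', whereas the question's attempt used how='inner'. For this data the two give the same rows, since every part A serial number appears in both frames.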
