1

Suppose I have 3 dataframes,

df_1 representing revenues

Year        TSLA    MSFT     AVY
2019         851     200   112.8
2018         725     150    92.6

df_2 representing some other revenues

Year        TSLA    MSFT     AVY
2019          10      13      17
2018          11      14      18

and df_3 representing expenses

Year        TSLA    MSFT     AVY
2019         110     213     317
2018         111     214     418

what might the code be to obtain the following df?

       TSLA                             MSFT                             AVY
Year   revenues other_revenues expenses revenues other_revenues expenses revenues other_revenues expenses
2019        851             10      110      200             13      213    112.8             17      317
2018        725             11      111      150             14      214     92.6             18      418

where the columns are in multi-index form?

Thanks

1 Answer 1

2

Use concat with DataFrame.swaplevel and DataFrame.sort_index for sorting MultiIndex:

#if not Year is index first create it
L = [x.set_index('Year') for x in [df_1, df_2, df_3]]
df = (pd.concat(L, 
               axis=1, 
               keys=('revenues', 'other_revenues', 'expenses'))
        .swaplevel(1, 0, axis=1)
        .sort_index(axis=1))
print (df)
         AVY                            MSFT                            TSLA  \
     revenues expenses other_revenues revenues expenses other_revenues revenues   
Year                                                                           
2019   112.8      317             17     200      213             13     851   
2018    92.6      418             18     150      214             14     725   

                              
     expenses other_revenues  
Year                          
2019      110             10  
2018      111             11  

EDIT: For order like in original add DataFrame.reindex by MultiIndex.from_product from unique values of first level of MultiIndex:

sub = ['revenues', 'other_revenues', 'expenses']
L = [x.set_index('Year') for x in [df_1, df_2, df_3]]
df = (pd.concat(L, 
               axis=1, 
               keys=sub)
        .swaplevel(1, 0, axis=1))

mux = pd.MultiIndex.from_product([df.columns.levels[0], sub])
df = df.reindex(mux, axis=1)
        
print (df)
         TSLA                             MSFT                          \
     revenues other_revenues expenses revenues other_revenues expenses   
Year                                                                     
2019      851             10      110      200             13      213   
2018      725             11      111      150             14      214   

          AVY                          
     revenues other_revenues expenses  
Year                                   
2019    112.8             17      317  
2018     92.6             18      418  
Sign up to request clarification or add additional context in comments.

4 Comments

Is there a mistake with swaplevel and sort index, after keys?
@finmathstudent - ya, it was missing one ', sorry.
What would the code be if I want to specify the order ('revenues', 'other_revenues', expenses') for the inner level column index? Rather than the sort you chose?
@finmathstudent - Give me a sec.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.