
I have a DataFrame df1:

import pandas as pd

df1 = pd.DataFrame({('sw1', '2020-01-01 00:00:00'): {'A1': 5.496714153011234,
  'B1': 4.536582307187538,
  'C1': 6.465648768921554},
 ('sw1', '2020-01-01 00:15:00'): {'A1': 5.417291254384371,
  'B1': 5.089825801985299,
  'C1': 5.32977925506902},
 ('sw2', '2020-01-01 00:00:00'): {'A1': 5.593791702359273,
  'B1': 3.1212115651371235,
  'C1': 4.546877553622513},
 ('sw2', '2020-01-01 00:15:00'): {'A1': 6.385936244917259,
  'B1': 4.66918047921994,
  'C1': 5.303265379619803},
 ('clust', ''): {'A1': 1, 'B1': 2, 'C1': 3}})
df1.columns.names = ['None', 'dtime']
df1.index.names = ['dev']
df1

>>> df1
None                  sw1                                     sw2                     clust
dtime 2020-01-01 00:00:00 2020-01-01 00:15:00 2020-01-01 00:00:00 2020-01-01 00:15:00
dev
A1               5.496714            5.417291            5.593792            6.385936     1
B1               4.536582            5.089826            3.121212            4.669180     2
C1               6.465649            5.329779            4.546878            5.303265     3

I would like to transform it to this format:

>>> df2
clust                       1                   2                   3
dev                        A1                  B1                  C1
sw                        sw1       sw2       sw1       sw2       sw1       sw2
dtime
2020-01-01 00:00:00  5.496714  5.593792  4.536582  3.121212  6.465649  4.546878
2020-01-01 00:15:00  5.417291  6.385936  5.089826  4.669180  5.329779  5.303265

How can I do that?


1 Answer

First, move the MultiIndex column clust into the index: select it by its tuple key and call DataFrame.set_index with append=True so the dev values are not lost. Then reshape with DataFrame.stack and DataFrame.unstack. Finally, change the order of the column levels with DataFrame.reorder_levels and sort the columns with DataFrame.sort_index:

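df = (df1.set_index(('clust',''), append=True)   # move clust into the index, keep dev
         .rename_axis(index=('dev','clust'), columns=('sw','dtime'))  # name every level
         .stack()                                 # dtime: columns -> index
         .unstack([0,1])                          # dev, clust: index -> columns
         .reorder_levels((2,1,0), axis=1)         # column levels: clust, dev, sw
         .sort_index(axis=1)                      # sort columns by clust, dev, sw
        )
print(df)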
clust                       1                   2                   3
dev                        A1                  B1                  C1
sw                        sw1       sw2       sw1       sw2       sw1       sw2
dtime
2020-01-01 00:00:00  5.496714  5.593792  4.536582  3.121212  6.465649  4.546878
2020-01-01 00:15:00  5.417291  6.385936  5.089826  4.669180  5.329779  5.303265
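
To see what each step of the chain does, here is a minimal sketch on a smaller frame. The frame `demo`, its values, and the shortened time labels are made up for illustration; only the column layout is meant to match df1. The intermediate names (`indexed`, `stacked`, `wide`, `out`) are also just for this example.

```python
import pandas as pd

# Hypothetical two-device frame with the same column layout as df1
demo = pd.DataFrame({
    ("sw1", "00:00"): {"A1": 1.0, "B1": 2.0},
    ("sw1", "00:15"): {"A1": 3.0, "B1": 4.0},
    ("sw2", "00:00"): {"A1": 5.0, "B1": 6.0},
    ("sw2", "00:15"): {"A1": 7.0, "B1": 8.0},
    ("clust", ""): {"A1": 1, "B1": 2},
})

# Step 1: clust joins dev in the row index, then every level gets a name
indexed = (demo.set_index(("clust", ""), append=True)
               .rename_axis(index=["dev", "clust"], columns=["sw", "dtime"]))

# Step 2: the last column level (dtime) moves into the row index
stacked = indexed.stack()

# Step 3: dev and clust move from the rows to the columns
wide = stacked.unstack([0, 1])

# Step 4: put the column levels in the order clust, dev, sw and sort them
out = wide.reorder_levels([2, 1, 0], axis=1).sort_index(axis=1)

print(list(stacked.index.names))  # row levels after stack
print(list(wide.columns.names))   # column levels after unstack
print(out)
```

Printing the level names after each step makes it easier to work out which positions to pass to unstack and reorder_levels.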

A similar solution reshapes with DataFrame.stack and a transpose:

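df = (df1.set_index(('clust',''), append=True)   # move clust into the index, keep dev
         .rename_axis(index=('dev','clust'), columns=('sw','dtime'))  # name every level
         .stack(0)                                # sw: columns -> index
         .T                                       # dtime becomes the row index
         .reorder_levels((1,0,2), axis=1)         # column levels: clust, dev, sw
        )
print(df)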
clust                       1                   2                   3
dev                        A1                  B1                  C1
sw                        sw1       sw2       sw1       sw2       sw1       sw2
dtime
2020-01-01 00:00:00  5.496714  5.593792  4.536582  3.121212  6.465649  4.546878
2020-01-01 00:15:00  5.417291  6.385936  5.089826  4.669180  5.329779  5.303265
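
The stack-and-transpose variant can be traced the same way. This is a sketch on a made-up frame (`demo`, with invented values and shortened time labels), not the original data; the intermediate names are hypothetical.

```python
import pandas as pd

# Hypothetical two-device frame with the same column layout as df1
demo = pd.DataFrame({
    ("sw1", "00:00"): {"A1": 1.0, "B1": 2.0},
    ("sw1", "00:15"): {"A1": 3.0, "B1": 4.0},
    ("sw2", "00:00"): {"A1": 5.0, "B1": 6.0},
    ("sw2", "00:15"): {"A1": 7.0, "B1": 8.0},
    ("clust", ""): {"A1": 1, "B1": 2},
})

indexed = (demo.set_index(("clust", ""), append=True)
               .rename_axis(index=["dev", "clust"], columns=["sw", "dtime"]))

# stack(0) moves the first column level (sw) into the row index,
# leaving dtime as the only column level
stacked = indexed.stack(0)

# Transposing swaps rows and columns, so dtime becomes the row index
flipped = stacked.T

# Only the level order still needs fixing: (dev, clust, sw) -> (clust, dev, sw)
out = flipped.reorder_levels([1, 0, 2], axis=1)

print(list(stacked.index.names))  # row levels after stack(0)
print(list(out.columns.names))    # final column levels
print(out)
```

Compared with the first chain, the transpose replaces the unstack call, because after stack(0) the rows already hold every level that should end up in the columns.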

4 Comments

The second one is nice and clean.
Both solutions are excellent. I don't understand why you needed a tuple with an empty string as the second element. I changed your first solution to: (df1.set_index(('clust'), append=True).rename_axis(columns=('sw','dtime')).stack().unstack([0,1]).reorder_levels((2,1,0), axis=1).sort_index(axis=1)) and the second to: (df1.set_index('clust', append=True).rename_axis(columns=('sw','dtime')).stack(0).T.reorder_levels((1,0,2), axis=1)). Thank you very much, jezrael.
@user3225309 - The reason is that if you check print(df.columns.tolist()), every column label is a tuple, because the columns are a MultiIndex.
Now I see why. Thanks for the explanation.
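
The point about tuple labels can be checked directly. This is a small sketch on a made-up frame (`demo` and its values are hypothetical); it only shows that the full key of the clust column is the tuple ('clust', '').

```python
import pandas as pd

# Hypothetical frame with the same kind of two-level columns as df1
demo = pd.DataFrame({
    ("sw1", "2020-01-01 00:00:00"): {"A1": 1.0, "B1": 2.0},
    ("sw2", "2020-01-01 00:00:00"): {"A1": 3.0, "B1": 4.0},
    ("clust", ""): {"A1": 1, "B1": 2},
})

# Every column label is a tuple, including the one for clust
labels = demo.columns.tolist()
print(labels)

# The full two-level key selects the column as a Series
clust = demo[("clust", "")]
print(clust.tolist())
```

As the comment above reports, the shorter key 'clust' also worked for set_index on this data; the tuple is simply the complete label of that column.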
