pandas multiIndex restructuring DataFrame

Question

After a groupby aggregation, my DataFrame is structured like this:

                sum     count
endAt       id      
2021-01-02  628 100.0   1
2021-01-03  628 300.0   1
2021-01-06  32  100.0   1
2021-01-07  629 50.0    1
2021-01-08  619 150.0   2
... ... ... ...
2021-04-22  860 100.0   2
            861 150.0   2
            869 350.0   6
            876 100.0   1
            883 200.0   4

I tried to pivot the table but got stuck re-indexing the structure. Is there a way to restructure the DataFrame so that it uses the following indexes:

columns = pd.MultiIndex.from_product([['2021-01-01', '2021-01-02', ....], ['sum', 'count']], names=['date', 'types'])
index = pd.MultiIndex.from_product([[32, 619, 628, ....]], names=['id'])

and get a DataFrame structured like this:

id      endAt       2021-01-02  2021-01-02
                    sum count   sum count
32                  100 1       200 2
619                 0   0       100 1
628                 300 3       0   0
...........
883                 100 1       200 2

jezrael · Accepted Answer · 2021-04-23 11:43:10Z

I think you need DataFrame.unstack on the first index level, then swap the levels of the resulting column MultiIndex and sort the columns:

df = df.unstack(level=0, fill_value=0).swaplevel(1,0, axis=1).sort_index(axis=1)
print (df)
endAt 2021-01-02        2021-01-03        2021-01-06        2021-01-07        \
           count    sum      count    sum      count    sum      count   sum   
id                                                                             
32             0    0.0          0    0.0          1  100.0          0   0.0   
619            0    0.0          0    0.0          0    0.0          0   0.0   
628            1  100.0          1  300.0          0    0.0          0   0.0   
629            0    0.0          0    0.0          0    0.0          1  50.0   

endAt 2021-01-08         
           count    sum  
id                       
32             0    0.0  
619            2  150.0  
628            0    0.0  
629            0    0.0
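
The result above lists the sub-columns as count, sum and only contains dates that occur in the data. If the exact layout from the question is wanted (every date in a range, with sum before count), one possible follow-up is to reindex the columns against a MultiIndex.from_product grid. This is only a sketch: it assumes endAt holds 'YYYY-MM-DD' strings, and the date range used here is an illustrative choice, not something taken from the question.

```python
import pandas as pd

# Sample data from this answer
d = {
    "sum": {("2021-01-02", 628): 100.0, ("2021-01-03", 628): 300.0,
            ("2021-01-06", 32): 100.0, ("2021-01-07", 629): 50.0,
            ("2021-01-08", 619): 150.0},
    "count": {("2021-01-02", 628): 1, ("2021-01-03", 628): 1,
              ("2021-01-06", 32): 1, ("2021-01-07", 629): 1,
              ("2021-01-08", 619): 2},
}
df = pd.DataFrame(d).rename_axis(["endAt", "id"])

wide = (
    df.unstack(level=0, fill_value=0)
      .swaplevel(1, 0, axis=1)
      .sort_index(axis=1)
)

# Assumed date range; formatted as strings to match the endAt labels
dates = pd.date_range("2021-01-01", "2021-01-08").strftime("%Y-%m-%d")

# Target grid: every date crossed with ('sum', 'count'), in that order
target = pd.MultiIndex.from_product(
    [dates, ["sum", "count"]], names=["date", "types"]
)

# Columns missing from `wide` (dates with no rows) are filled with 0
full = wide.reindex(columns=target, fill_value=0)
print(full)
```

If endAt contains Timestamps rather than strings, the strftime call would have to be dropped so that the labels still match.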

Sample data:

d = {'sum': {('2021-01-02', 628): 100.0, ('2021-01-03', 628): 300.0, ('2021-01-06', 32): 100.0, ('2021-01-07', 629): 50.0, ('2021-01-08', 619): 150.0}, 'count': {('2021-01-02', 628): 1, ('2021-01-03', 628): 1, ('2021-01-06', 32): 1, ('2021-01-07', 629): 1, ('2021-01-08', 619): 2}}

df = pd.DataFrame(d).rename_axis(['endAt','id'])
print (df)
                  sum  count
endAt      id               
2021-01-02 628  100.0      1
2021-01-03 628  300.0      1
2021-01-06 32   100.0      1
2021-01-07 629   50.0      1
2021-01-08 619  150.0      2
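
To see what each part of the one-liner contributes, the chain can be split into separate steps on the sample data. This is just a step-by-step sketch of the same calls; the intermediate names (unstacked, swapped, wide) are made up for illustration.

```python
import pandas as pd

d = {
    "sum": {("2021-01-02", 628): 100.0, ("2021-01-03", 628): 300.0,
            ("2021-01-06", 32): 100.0, ("2021-01-07", 629): 50.0,
            ("2021-01-08", 619): 150.0},
    "count": {("2021-01-02", 628): 1, ("2021-01-03", 628): 1,
              ("2021-01-06", 32): 1, ("2021-01-07", 629): 1,
              ("2021-01-08", 619): 2},
}
df = pd.DataFrame(d).rename_axis(["endAt", "id"])

# Step 1: move the first index level (endAt) into the columns.
# Columns become (metric, endAt) pairs; missing combinations get 0.
unstacked = df.unstack(level=0, fill_value=0)

# Step 2: swap the two column levels so the date is the outer level.
# The column order is unchanged, so the dates are not grouped yet.
swapped = unstacked.swaplevel(1, 0, axis=1)

# Step 3: sort the columns so each date's count/sum sit next to each other.
wide = swapped.sort_index(axis=1)

print(swapped.columns.is_monotonic_increasing)
print(wide.columns.is_monotonic_increasing)
print(wide["2021-01-08"])  # selecting one date returns its count/sum columns
```

Skipping the final sort_index leaves an unsorted column MultiIndex, which is harder to read and can be slower or more awkward to slice.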

edited Apr 23, 2021 at 11:43

answered Apr 23, 2021 at 11:19

jezrael


10 Comments

OcMaRUS Over a year ago

Thank you. Yes, I am playing with unstack. Unfortunately, swaplevel returned the error "unexpected param axis" (which is strange to me). Can you show me how you initialized your df?

jezrael Over a year ago

@OcMaRUS - Added to answer.

OcMaRUS Over a year ago

Thank you! Yes, this is exactly what I need. Now I have to find out what is wrong with my df, because your procedure is not working for it.

jezrael Over a year ago

@OcMaRUS - Is "unexpected param axis" the only error? Maybe try upgrading pandas.

OcMaRUS Over a year ago

I much appreciate your time and ideas! Yes, I got the result! The problem was incorrect indexes. It was important to set up the right MultiIndex on the df, and the final .sort_index is also very important!
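
The thread does not show what the incorrect index looked like. As a hedged illustration of one way it can go wrong: if endAt and id end up as ordinary columns (for example after a reset_index), there is no endAt level for unstack to move into the columns. The sketch below uses hypothetical raw records (the "amount" column is an assumption) and restores the (endAt, id) MultiIndex before applying the answer's chain.

```python
import pandas as pd

# Hypothetical raw records; the "amount" column name is an assumption
raw = pd.DataFrame({
    "endAt": ["2021-01-02", "2021-01-03", "2021-01-08", "2021-01-08"],
    "id": [628, 628, 619, 619],
    "amount": [100.0, 300.0, 50.0, 100.0],
})

# Grouping by both keys yields a frame indexed by (endAt, id)
grouped = raw.groupby(["endAt", "id"])["amount"].agg(["sum", "count"])

# A flat frame like this one has no endAt/id index levels to unstack
flat = grouped.reset_index()
print(flat.index.nlevels, list(flat.columns))

# Restore the two-level index, then apply the chain from the answer
fixed = flat.set_index(["endAt", "id"])
wide = (
    fixed.unstack(level=0, fill_value=0)
         .swaplevel(1, 0, axis=1)
         .sort_index(axis=1)
)
print(wide)
```

Checking df.index.names and df.index.nlevels before unstacking is a quick way to confirm the frame has the expected shape.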
