4

I want to pivot a dataframe like:

       dim1   Value_V     Value_y   instance
0      A_1     50.000000        0   instance200
1      A_2   6500.000000        1   instance200
2      A_3     50.000000        0   instance200
3      A_4   4305.922313        1   instance200

Into a dataframe with hierarchical columns like that:

              A_1               A_2               A_3                .....
              Value_V  Value_y  Value_V  Value_y  Value_V  Value_y
instance200   50       0        6500     1        50       0

I tried df = df.pivot(index = "instance", columns = "dim1"), but it will only give me a frame like that:

              Value_V               Value_y                              
              A_1   A_2   A_3 ....  A_1  A_2  A_3 ....
instance200   50    6500  50        0    1    0

How can i change the hierarchy of the columns?

4 Answers 4

5

I figured it out by myself:

df = df.swaplevel(0,1,axis = 1).sort(axis = 1)

will do

Sign up to request clarification or add additional context in comments.

2 Comments

as a side note, swaplevel is good for swapping between 2, whereas reorder_levels can perform multiple at once, if that helps :)
You are reading my mind: I was just wondering what the difference between swaplevel and reorder_levels might be :)
3

What you need is reorder_levels and then sort the columns, like this:

import pandas as pd

df = pd.read_clipboard()

df
Out[8]:
dim1    Value_V Value_y instance
0   A_1 50.000000   0   instance200
1   A_2 6500.000000 1   instance200
2   A_3 50.000000   0   instance200
3   A_4 4305.922313 1   instance200
In [9]:

df.pivot('instance', 'dim1').reorder_levels([1, 0], axis=1).sort(axis=1)
Out[9]:
dim1        A_1             A_2             A_3             A_4
            Value_V Value_y Value_V Value_y Value_V Value_y Value_V Value_y
instance                                
instance200 50      0       6500    1       50      0       4305.922313 1

1 Comment

In case someone runs into: "'DataFrame' object has no attribute 'sort'" have a look at this question: stackoverflow.com/questions/44123874/…
0

I have been battling this problem for a long time. My job requires me to handle large pivot_tables, where there are several dozen indexes and a bit more values. The last, most convenient solution in terms of versatility is this:

def pivot_fix(df):
    df = (df.reset_index().T.reset_index(level=0).T.reset_index(drop=True).
          reset_index(drop=True).reset_index(drop=True).T.reset_index().T)
    df.iloc[0, :df.iloc[0, :].isna().sum()] = df.iloc[1, :df.iloc[0, :].isna().sum()]
    df.columns = df.iloc[0]
    df.drop(df.index[0:2], inplace=True)
    return(df)

using it like that: df = (df.pivot_table(index=['location_id', 'place_name', 'address'], columns='day', values='sum')

Comments

0

I'd like to add that the prior answers are outdated.

df = df.swaplevel(0, 1, axis=1) # Swaps level as desired
df.columns = df.columns.sortlevel(0)[0] # Orders level 0, slices to the cols

You can find more documentation on this method here: https://pandas.pydata.org/docs/reference/api/pandas.MultiIndex.sortlevel.html. It allows a user to directly sort individual levels of a pandas.MultiIndex.

2 Comments

Adding to this, I just noticed that .sortlevel() is causing some value changes in my case. I have no idea why, but it seems like a bug. Values are, in come cases, converted from int to float AND somehow obtain a decimal value... quite weird. Watch out for that.
Resolved the above issue by moving the .swaplevel(0, 1, axis=1) into the line where I manipulate column order. Still not sure why it was messing up in the first place.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.