1

I've got a dataframe here

    Row dist1   dist2   dist3   variable    value   Smallest    Group   
0   0       40  101 dist2   40  Smallest    SmallestGroup   
1   0       40  101 dist3   101 SecondSmallest  SecondSmallestGroup 
2   1       30  100 dist2   30  Smallest    SmallestGroup   
3   1       30  100 dist3   100 SecondSmallest  SecondSmallestGroup 
4   2   30  20  98  dist2   20  Smallest    SmallestGroup   
5   2   30  20  98  dist1   30  SecondSmallest  SecondSmallestGroup 
6   3   20  15  72  dist2   15  Smallest    SmallestGroup   
7   3   20  15  72  dist1   20  SecondSmallest  SecondSmallestGroup 
8   4   15  16  11  dist3   11  Smallest    SmallestGroup   
9   4   15  16  11  dist1   15  SecondSmallest  SecondSmallestGroup 

I'd like to pivot the data so that I have four new columns: Smallest and SecondSmallest (which contains the values from the value column) and SmallestGroup and SecondSmallestGroup (which contains the valus from the variable column).

My desired output is as follows:

Row dist1   dist2   dist3   Smallest    SecondSmallest  SmallestGroup   SecondSmallestGroup
0   NaN      40      101    40                101            dist2         dist3
1   NaN      30      100    30                100            dist2         dist3

I'm certain I can achieve this with a pivot, but am unsure about how to structure the command.

2 Answers 2

2

Since, I don't think the raw data specifies that 'Group' and 'variable' must be asssociated and 'Smallest' and 'value' together, then we have to do this two steps and concatenate.

Let's try this:

(pd.concat([df.set_index(['Row','dist1','dist2','dist3','Group'])['variable'].unstack(),
            df.set_index(['Row','dist1','dist2','dist3','Smallest'])['value'].unstack()],
           axis=1)
   .reset_index())

Output:

   Row  dist1  dist2  dist3 SecondSmallestGroup SmallestGroup  SecondSmallest  Smallest
0    0    NaN     40    101               dist3         dist2             101        40
1    1    NaN     30    100               dist3         dist2             100        30
2    2   30.0     20     98               dist1         dist2              30        20
3    3   20.0     15     72               dist1         dist2              20        15
4    4   15.0     16     11               dist1         dist3              15        11
Sign up to request clarification or add additional context in comments.

Comments

0

df[['Row','dist1','dist2','dist3']].groupby('Row').first().join(
    df.pivot_table(values = 'value',index= 'Row',columns = ['Smallest'],aggfunc='first')).join(
    df.pivot_table(values = 'variable',index= 'Row',columns = ['Group'],aggfunc='first'))   

Output:

     dist1  dist2  dist3  SecondSmallest  Smallest SecondSmallestGroup  \
Row                                                                      
0      NaN     40    101             101        40               dist3   
1      NaN     30    100             100        30               dist3   
2     30.0     20     98              30        20               dist1   
3     20.0     15     72              20        15               dist1   
4     15.0     16     11              15        11               dist1   
SmallestGroup  
Row                
0           dist2  
1           dist2  
2           dist2  
3           dist2  
4           dist3  

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.