Pandas Dataframe reshaping with stack and unstack

Question

I am experimenting with pandas stack and unstack, and I was wondering whether it is possible to reshape my data in the following way.

This is the sample data I am practicing with.

ID,Value1,Value2
1,3,12
1,4,13
1,5,14
1,6,15
1,7,16
2,8,17
2,9,18
2,10,19
2,11,20

And I want to reshape it like this:

ID  Index (extra column)  Value1  Value2
1   1                     3       12
    2                     4       13
    3                     5       14
    4                     6       15
    5                     7       16
2   1                     8       17
    2                     9       18
    3                     10      19
    4                     11      20

I tried this

df1 = pd.DataFrame(df[['Value1', 'Value2']], index= df['ID']).stack()

or

df1 = df.set_index(['ID']).stack()

This moves Value1 and Value2 from columns into rows, which I don't want.

Any ideas?

cs95 · Accepted Answer · 2018-06-09 20:47:42Z

4

I propose set_index + cumcount here:

df.set_index(['ID', df.groupby('ID').cumcount() + 1])

      Value1  Value2
ID                  
1  1       3      12
   2       4      13
   3       5      14
   4       6      15
   5       7      16
2  1       8      17
   2       9      18
   3      10      19
   4      11      20
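For a self-contained version of the above, here is a minimal sketch that rebuilds the sample frame from the question and gives the new level a name. The level name `Index` and the variable names `position` and `result` are assumptions for illustration (the name is borrowed from the desired output in the question); they are not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID":     [1, 1, 1, 1, 1, 2, 2, 2, 2],
    "Value1": [3, 4, 5, 6, 7, 8, 9, 10, 11],
    "Value2": [12, 13, 14, 15, 16, 17, 18, 19, 20],
})

# cumcount numbers the rows within each ID group starting at 0;
# adding 1 makes the numbering 1-based, as in the desired output.
# "Index" is a hypothetical level name chosen for this sketch.
position = (df.groupby("ID").cumcount() + 1).rename("Index")

# Passing a column label together with a Series builds a two-level MultiIndex.
result = df.set_index(["ID", position])

print(result)
print(result.loc[2])  # only the rows for ID == 2
```

Naming the level is optional, but it makes the later `reset_index()` produce a readable column name instead of a generated one.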

Another option is using concat:

pd.concat({k: g.reset_index(drop=True) for k, g in df.drop(columns='ID').groupby(df.ID)})

     Value1  Value2
1 0       3      12
  1       4      13
  2       5      14
  3       6      15
  4       7      16
2 0       8      17
  1       9      18
  2      10      19
  3      11      20

2 Comments

user96564 Over a year ago

One more question: do you know what format would be best for saving this kind of output? CSV, Excel, and txt don't seem to save it properly.

cs95 Over a year ago

@user3280146 That's a problem: none of these flat file formats support a MultiIndex natively. My advice would be to do df = df.reset_index() and then save to CSV (passing index=False to to_csv, so the leftover default index is not written as an extra column). After that, when loading, specify df = pd.read_csv(..., index_col=[0, 1]), which will read those two columns back as a MultiIndex.
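The round trip described in this comment could look roughly like the sketch below. It writes to an in-memory `io.StringIO` buffer as a stand-in for a real file, and the names `indexed`, `buffer`, `restored` and the level name `Index` are assumptions made for the example.

```python
import io

import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID":     [1, 1, 1, 1, 1, 2, 2, 2, 2],
    "Value1": [3, 4, 5, 6, 7, 8, 9, 10, 11],
    "Value2": [12, 13, 14, 15, 16, 17, 18, 19, 20],
})

# Build the two-level index; "Index" is a hypothetical level name.
position = (df.groupby("ID").cumcount() + 1).rename("Index")
indexed = df.set_index(["ID", position])

# Flatten the MultiIndex into ordinary columns before saving.
# index=False keeps the new default integer index out of the file.
buffer = io.StringIO()  # stand-in for a path such as a .csv file on disk
indexed.reset_index().to_csv(buffer, index=False)

# Read it back, telling pandas that the first two columns form the index.
buffer.seek(0)
restored = pd.read_csv(buffer, index_col=[0, 1])

print(restored)
```

Without `index=False`, the file would gain an extra leading column, and `index_col=[0, 1]` would then pick up that column instead of the intended pair.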

BENY · Accepted Answer · 2018-06-09 20:50:21Z

3

One way is groupby + apply:

df.groupby('ID')[['Value1','Value2']].apply(lambda x : x.reset_index(drop=True))

      Value1  Value2
ID                  
1  0       3      12
   1       4      13
   2       5      14
   3       6      15
   4       7      16
2  0       8      17
   1       9      18
   2      10      19
   3      11      20

piRSquared · Accepted Answer · 2018-06-10 04:04:48Z

2

`defaultdict` and `count`

from itertools import count
from collections import defaultdict

import numpy as np

# one independent counter per distinct ID
d = defaultdict(count)

df.set_index(['ID', np.array([next(d[x]) for x in df.ID])])

      Value1  Value2
ID                  
1  0       3      12
   1       4      13
   2       5      14
   3       6      15
   4       7      16
2  0       8      17
   1       9      18
   2      10      19
   3      11      20
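To see why this works: `defaultdict(count)` creates a fresh `itertools.count()` iterator the first time each ID is looked up, and each call to `next` on it yields 0, 1, 2, ... for that ID only. Here is a plain-Python sketch of just that step, with the ID list copied from the question's data; the names `ids`, `counters` and `positions` are made up for the illustration.

```python
from collections import defaultdict
from itertools import count

# The ID column from the question, as a plain list.
ids = [1, 1, 1, 1, 1, 2, 2, 2, 2]

# A missing key triggers count(), so every distinct ID gets its own counter.
counters = defaultdict(count)

# Advance the counter that belongs to each row's ID.
positions = [next(counters[i]) for i in ids]

print(positions)  # [0, 1, 2, 3, 4, 0, 1, 2, 3]
```

These are the same within-group positions that `df.groupby('ID').cumcount()` produces, computed in a single Python-level pass over the column.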

edited Jun 10, 2018 at 4:04

answered Jun 9, 2018 at 21:21
