3

I am experimenting with pandas stack and unstack, and I was wondering whether it is possible to reshape my data in the following way.

This is the sample data I am practicing with.

ID,Value1,Value2
1,3,12
1,4,13
1,5,14
1,6,15
1,7,16
2,8,17
2,9,18
2,10,19
2,11,20

And I want to reshape it like this, with an extra per-ID counter column:

ID 
1   Index(Extra Column) Value1, value2
    1                      3    12
    2                      4    13
    3                      5    14
    4                      6    15
    5                      7    16

2
    1                      8    17
    2                      9    18
    3                      10   19
    4                      11   20

I tried this

df1 = pd.DataFrame(df[['Value1', 'Value2']], index= df['ID']).stack()

or

df1 = df.set_index(['ID']).stack()

Both of these move Value1 and Value2 from columns into rows, which I don't want.

Any ideas?

3 Answers

4

I propose set_index + cumcount here:

df.set_index(['ID', df.groupby('ID').cumcount() + 1])

      Value1  Value2
ID                  
1  1       3      12
   2       4      13
   3       5      14
   4       6      15
   5       7      16
2  1       8      17
   2       9      18
   3      10      19
   4      11      20
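
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample data from the question and applies the same call. The names `counter`, `reshaped`, `row` and `group` are only illustrative. `cumcount()` numbers the rows within each ID starting from 0, which is why 1 is added to get the 1-based counter the question asks for.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "ID": [1, 1, 1, 1, 1, 2, 2, 2, 2],
    "Value1": [3, 4, 5, 6, 7, 8, 9, 10, 11],
    "Value2": [12, 13, 14, 15, 16, 17, 18, 19, 20],
})

# cumcount() numbers the rows within each ID from 0; add 1 for a 1-based counter
counter = df.groupby("ID").cumcount() + 1

# A column label and a Series passed together build a two-level index
reshaped = df.set_index(["ID", counter])

# Select a single row by (ID, counter), or a whole group by ID
row = reshaped.loc[(2, 3)]
group = reshaped.loc[1]
```

Value1 and Value2 stay as columns here because nothing is stacked; only the index changes.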

Another option is using concat:

pd.concat({k: g.reset_index(drop=True) for k, g in df.drop(columns='ID').groupby(df['ID'])})

     Value1  Value2
1 0       3      12
  1       4      13
  2       5      14
  3       6      15
  4       7      16
2 0       8      17
  1       9      18
  2      10      19
  3      11      20
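
The concat result above has two unnamed index levels. If labelled levels are wanted, `pd.concat` accepts a `names` argument. The sketch below assumes the sample data from the question; `"row"` is a made-up label for the inner counter, and `pieces` and `stacked` are illustrative names.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "ID": [1, 1, 1, 1, 1, 2, 2, 2, 2],
    "Value1": [3, 4, 5, 6, 7, 8, 9, 10, 11],
    "Value2": [12, 13, 14, 15, 16, 17, 18, 19, 20],
})

# One frame per ID, each renumbered from 0; the dict keys become the outer level
pieces = {
    k: g.reset_index(drop=True)
    for k, g in df.drop(columns="ID").groupby(df["ID"])
}

# names= labels both index levels ("row" is an assumed label, not from the answer)
stacked = pd.concat(pieces, names=["ID", "row"])
```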

2 Comments

One more question: do you know what format would be best for saving this kind of output? CSV, Excel and txt don't seem to save it properly.
@user3280146 That's a problem: none of these flat file formats supports a MultiIndex. My advice would be to do df = df.reset_index() and then save to CSV (passing index=False so the default index is not written). After that, when loading, specify df = pd.read_csv(..., index_col=[0, 1]), which will read those two columns as a MultiIndex.
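
A rough sketch of the round trip described in the comment above, assuming the sample data from the question. An in-memory `io.StringIO` buffer stands in for a real file path, and `"row"` is an assumed name for the counter level. Note that `index=False` is passed to `to_csv`; without it the default 0..n-1 index would be written as an extra first column, and `index_col=[0, 1]` would then pick up the wrong columns.

```python
import io

import pandas as pd

# Sample data from the question, reshaped with set_index + cumcount
df = pd.DataFrame({
    "ID": [1, 1, 1, 1, 1, 2, 2, 2, 2],
    "Value1": [3, 4, 5, 6, 7, 8, 9, 10, 11],
    "Value2": [12, 13, 14, 15, 16, 17, 18, 19, 20],
})
reshaped = df.set_index(["ID", df.groupby("ID").cumcount() + 1])

# Name the counter level so it gets a real column header, then flatten
flat = reshaped.rename_axis(["ID", "row"]).reset_index()

# Write without the default index; the buffer stands in for a file path
buffer = io.StringIO()
flat.to_csv(buffer, index=False)

# Read back, turning the first two columns into the MultiIndex again
buffer.seek(0)
restored = pd.read_csv(buffer, index_col=[0, 1])
```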
3

One way, using apply:

df.groupby('ID')[['Value1','Value2']].apply(lambda x : x.reset_index(drop=True))
Out[662]: 
      Value1  Value2
ID                  
1  0       3      12
   1       4      13
   2       5      14
   3       6      15
   4       7      16
2  0       8      17
   1       9      18
   2      10      19
   3      11      20

2

Using defaultdict and count:

from itertools import count
from collections import defaultdict

import numpy as np

# each ID gets its own counter, created the first time that ID is seen
d = defaultdict(count)

df.set_index(['ID', np.array([next(d[x]) for x in df.ID])])

      Value1  Value2
ID                  
1  0       3      12
   1       4      13
   2       5      14
   3       6      15
   4       7      16
2  0       8      17
   1       9      18
   2      10      19
   3      11      20
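
Why this works: `defaultdict(count)` creates a fresh `itertools.count()` iterator the first time an ID is looked up, and each `next(...)` call advances that ID's iterator, so every row receives its 0-based position within its ID. The sketch below, which assumes the sample data from the question, checks that the result matches `groupby(...).cumcount()`; `manual`, `builtin` and `same` are illustrative names.

```python
from collections import defaultdict
from itertools import count

import numpy as np
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "ID": [1, 1, 1, 1, 1, 2, 2, 2, 2],
    "Value1": [3, 4, 5, 6, 7, 8, 9, 10, 11],
    "Value2": [12, 13, 14, 15, 16, 17, 18, 19, 20],
})

# One independent counter per ID, created lazily on first lookup
d = defaultdict(count)
manual = np.array([next(d[x]) for x in df["ID"]])

# The built-in equivalent: position of each row within its group
builtin = df.groupby("ID").cumcount().to_numpy()

same = bool((manual == builtin).all())
```

Unlike a sort-based approach, the loop does not require rows with the same ID to be adjacent, since each ID keeps its own counter.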
