stack columns values from same dataframe pandas

Question

I would like to know whether it is possible to stack column values from the same data frame when the columns have almost the same name. I have the following data frame:

import pandas as pd

data = {'text': ['hello', 'hi'],
        'a': [1, 2],
        'b': [2, 1],
        'a.1': [3, 4],
        'b.1': [4, 3]
        }
df = pd.DataFrame(data)

There are multiple a. and b. columns, going up to a.N and b.N, and the end result has to look like the data frame below.

data2 ={'text':['hello','hi','hello','hi'],'identifier':[0,0,1,1],
        'a':[1,2,3,4],
        'b':[2,1,4,3],
        }

The identifier column just records how the values were stacked: for instance, the first two values (0, 0) came from the original columns a and b, and 1, 1 came from a.1 and b.1. I hope it all makes sense.

Are you manipulating dataframes or dictionaries?

– piRSquared, Commented Mar 17, 2021 at 18:30

Quang Hoang · Accepted Answer · 2021-03-17 18:31:19Z

1

This is similar to pd.wide_to_long, except that the first set of columns (a, b) has no numeric suffix.

Try a custom rename function that turns the columns into a two-level index, then stack the second level:

def rename_col(x):
    out = x.split('.')
    return (x,'0') if len(out)==1 else tuple(out)

df = df.set_index('text')
df.columns=df.columns.map(rename_col)

df.stack(level=1).reset_index()

Output:

    text level_1  a  b
0  hello       0  1  2
1  hello       1  3  4
2     hi       0  2  1
3     hi       1  4  3
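For readers who want to run the first approach end to end, here is a minimal self-contained sketch. It assumes the sample frame from the question; the helper name split_col and the choice to name the stacked level identifier (instead of the default level_1) are illustrative additions, not part of the answer above. The final stable sort only reorders rows to match the layout requested in the question.

```python
import pandas as pd

# Sample frame from the question.
df = pd.DataFrame({
    'text': ['hello', 'hi'],
    'a': [1, 2],
    'b': [2, 1],
    'a.1': [3, 4],
    'b.1': [4, 3],
})

def split_col(name):
    # Hypothetical helper: 'a' -> ('a', 0), 'a.1' -> ('a', 1).
    stub, _, suffix = name.partition('.')
    return (stub, int(suffix) if suffix else 0)

wide = df.set_index('text')
wide.columns = pd.MultiIndex.from_tuples(
    [split_col(c) for c in wide.columns],
    names=[None, 'identifier'],
)

# Move the identifier level from the columns into the rows.
stacked = (
    wide.stack(level='identifier')
        .reset_index()
        .sort_values('identifier', kind='stable')  # group rows by identifier
        .reset_index(drop=True)
)
print(stacked)
```

Recent pandas releases may emit a FutureWarning about the stack implementation here; the result for this data should be the same either way.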

Update: Alternatively, you can use pd.wide_to_long with another rename function:

def rename_col(x): return x if x=='text' or '.' in x else x+'.0'

pd.wide_to_long(df.rename(columns=rename_col),
                i='text', j='identifier',
                stubnames=['a','b'],
                sep='.'
               )

Output:

                  a  b
text  identifier      
hello 0           1  2
hi    0           2  1
hello 1           3  4
hi    1           4  3
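A self-contained sketch of the wide_to_long variant, assuming it is run on the original frame from the question (with text still a regular column and a, b not yet renamed). The helper name add_suffix is illustrative. Note that pd.wide_to_long raises "stubname can't be identical to a column name" whenever the frame passed in still has a column literally named a or b, so one plausible cause of that error is passing a frame to which the rename was not applied. The i column (text here) must also identify each row uniquely.

```python
import pandas as pd

# Sample frame from the question, unmodified.
df = pd.DataFrame({
    'text': ['hello', 'hi'],
    'a': [1, 2],
    'b': [2, 1],
    'a.1': [3, 4],
    'b.1': [4, 3],
})

def add_suffix(name):
    # Hypothetical helper: leave 'text' and already suffixed columns alone,
    # turn 'a' into 'a.0' so every stub column has a numeric suffix.
    return name if name == 'text' or '.' in name else name + '.0'

long_df = (
    pd.wide_to_long(
        df.rename(columns=add_suffix),
        stubnames=['a', 'b'],
        i='text',
        j='identifier',
        sep='.',
    )
    .reset_index()  # turn (text, identifier) back into ordinary columns
)
print(long_df)
```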

edited Mar 17, 2021 at 18:31

answered Mar 17, 2021 at 18:16


Seph77 Over a year ago

Hello! The first option worked flawlessly. With the second option I got the error "stubname can't be identical to a column name", but thank you, it worked as I wanted.

anky · Accepted Answer · 2021-03-17 18:23:15Z

1

This does not build the identifier column (you can create it separately), but here is a way using groupby on axis=1:

u = df.set_index("text")
out = pd.concat([g.stack().droplevel(-1) for _,g in 
                 u.groupby(u.columns.str.split('.').str[0],axis=1)],axis=1,keys=u)

print(out)

       a  b
text       
hello  1  2
hello  3  4
hi     2  1
hi     4  3
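DataFrame.groupby with axis=1 has been deprecated since pandas 2.1, so the snippet above may warn or fail on newer versions. As a hedged alternative (a different technique from the groupby above, not the answerer's method), the sketch below slices the frame once per suffix and concatenates the pieces, which also yields the identifier column. The variable names are illustrative, and the sample frame is the one from the question.

```python
import pandas as pd

# Sample frame from the question.
df = pd.DataFrame({
    'text': ['hello', 'hi'],
    'a': [1, 2],
    'b': [2, 1],
    'a.1': [3, 4],
    'b.1': [4, 3],
})

wide = df.set_index('text')

# Split each column name into its stub and numeric suffix (no suffix -> 0).
parts = [c.partition('.') for c in wide.columns]
stubs = [p[0] for p in parts]
idents = [int(p[2]) if p[2] else 0 for p in parts]

# Build one block per identifier, with columns renamed to the bare stubs.
pieces = {}
for ident in sorted(set(idents)):
    cols = [c for c, i in zip(wide.columns, idents) if i == ident]
    names = [s for s, i in zip(stubs, idents) if i == ident]
    pieces[ident] = wide[cols].set_axis(names, axis=1)

# The dict keys become an outer index level named 'identifier'.
out = pd.concat(pieces, names=['identifier']).reset_index()
print(out)
```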

answered Mar 17, 2021 at 18:23

