Unpivot multiple columns with same name in pandas dataframe

Question

I have the following dataframe:

pp  b          pp   b
5   0.001464    6   0.001853
5   0.001459    6   0.001843

Is there a way to unpivot columns with the same name into multiple rows?

This is the required output:

pp  b         
5   0.001464    
5   0.001459    
6   0.001853
6   0.001843
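
For anyone who wants to reproduce this, the frame above can be built roughly as follows. This is a minimal sketch: the name `df` is an assumption, chosen because the answers below use it, and the values are the ones shown in the question.

```python
import pandas as pd

# Duplicate column labels are allowed when they are passed via `columns`.
df = pd.DataFrame(
    [[5, 0.001464, 6, 0.001853],
     [5, 0.001459, 6, 0.001843]],
    columns=["pp", "b", "pp", "b"],
)

print(df)

# Selecting a repeated label returns a DataFrame (one column per
# occurrence), not a Series.
print(type(df["pp"]))
```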

BENY · Accepted Answer · 2018-04-29 03:19:39Z

10

Try groupby with axis=1

df.groupby(df.columns.values, axis=1).agg(lambda x: x.values.tolist()).sum().apply(pd.Series).T.sort_values('pp')
Out[320]: 
          b   pp
0  0.001464  5.0
2  0.001459  5.0
1  0.001853  6.0
3  0.001843  6.0

A fun way with wide_to_long

s=pd.Series(df.columns)
df.columns=df.columns+s.groupby(s).cumcount().astype(str)

pd.wide_to_long(df.reset_index(),stubnames=['pp','b'],i='index',j='drop',suffix=r'\d+')
Out[342]: 
            pp         b
index drop              
0     0      5  0.001464
1     0      5  0.001459
0     1      6  0.001853
1     1      6  0.001843

edited Apr 29, 2018 at 3:19

answered Apr 29, 2018 at 3:06

BENY

2 Comments

user308827 Over a year ago

Thanks @Wen, your solution works. Can you tell me what the groupby and agg part is doing? Thanks!

BENY Over a year ago

@user308827 That part groups by the column names. For columns that share a name, we concatenate their values into a list, and then we just need to flatten the list to get the result.
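
The idea in that comment can be spelled out step by step. The sketch below is not the one-liner from the answer; it is an equivalent written with plain label selection, which also avoids `groupby(..., axis=1)` (deprecated in recent pandas releases). The frame is rebuilt from the question's values, and the names `flat` and `out` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    [[5, 0.001464, 6, 0.001853],
     [5, 0.001459, 6, 0.001843]],
    columns=["pp", "b", "pp", "b"],
)

# Steps 1 and 2: for each distinct label, take the 2-D block of all columns
# carrying that label and flatten it row by row into one list.
flat = {
    name: df[name].to_numpy().ravel().tolist()
    for name in df.columns.unique()
}
print(flat["pp"])  # row 0 of the block first, then row 1

# Step 3: rebuild a frame and sort by 'pp', as the answer's sort_values does.
out = pd.DataFrame(flat).sort_values("pp", kind="stable")
print(out)
```

Because the blocks are flattened row by row, the index after sorting comes out as 0, 2, 1, 3, matching the output shown in the answer.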

jpp · Accepted Answer · 2018-04-29 03:05:30Z

4

This is possible using numpy:

res = pd.DataFrame({'pp': df['pp'].values.T.ravel(),
                    'b': df['b'].values.T.ravel()})

print(res)

          b  pp
0  0.001464   5
1  0.001459   5
2  0.001853   6
3  0.001843   6

Or without referencing specific columns explicitly:

res = pd.DataFrame({i: df[i].values.T.ravel() for i in set(df.columns)})
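
To see why the transpose matters, it may help to look at the intermediate array. The sketch below assumes the question's frame and uses `df.columns.unique()` instead of `set(...)` so that the column order is preserved. That substitution is my own choice, not part of the answer above.

```python
import pandas as pd

df = pd.DataFrame(
    [[5, 0.001464, 6, 0.001853],
     [5, 0.001459, 6, 0.001843]],
    columns=["pp", "b", "pp", "b"],
)

# A repeated label selects a 2-D block: one row per original row,
# one column per occurrence of the label.
block = df["pp"].to_numpy()
print(block.shape)

# ravel() walks the block row by row, which interleaves the occurrences.
print(block.ravel())

# Transposing first walks it occurrence by occurrence instead,
# which is the order the question asks for.
print(block.T.ravel())

res = pd.DataFrame(
    {name: df[name].to_numpy().T.ravel() for name in df.columns.unique()}
)
print(res)
```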

edited Apr 29, 2018 at 3:05

answered Apr 29, 2018 at 2:41

jpp

Scott Boston · Accepted Answer · 2018-04-29 02:45:38Z

3

Let's use melt, cumcount and unstack:

dm = df.melt()
dm.set_index(['variable',dm.groupby('variable').cumcount()])\
  .sort_index()['value'].unstack(0)

Output:

variable         b   pp
0         0.001464  5.0
1         0.001459  5.0
2         0.001853  6.0
3         0.001843  6.0
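
The intermediate steps can be inspected one at a time. This is a sketch under the assumption that `df` is the question's frame, rebuilt here. The names `pos` and `out` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    [[5, 0.001464, 6, 0.001853],
     [5, 0.001459, 6, 0.001843]],
    columns=["pp", "b", "pp", "b"],
)

# melt stacks the columns on top of each other: one row per cell,
# with the original label in 'variable' and the cell in 'value'.
dm = df.melt()
print(dm)

# cumcount numbers the rows within each label; that number becomes
# the row position in the reshaped result.
pos = dm.groupby("variable").cumcount()
print(pos.tolist())

# Index by (label, position), then move the label level into the columns.
out = dm.set_index(["variable", pos]).sort_index()["value"].unstack(0)
print(out)
```

Since `melt` puts every cell into a single `value` column, the integers in `pp` are upcast to floats, which is why the output shows `5.0` rather than `5`.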

answered Apr 29, 2018 at 2:45

Scott Boston

1 Comment

user308827 Over a year ago

thanks! I get this error: *** TypeError: '<' not supported between instances of 'str' and 'int', not sure yet if this is because the sample dataframe is different from my actual dataframe or something else

Gustavo Mirapalheta · Accepted Answer · 2019-03-10 20:58:43Z

2

I'm a little surprised that nobody has mentioned pd.concat so far. Take a look below:

df1 = pd.DataFrame({'Col1':[1,2,3,4], 'Col2':[5,6,7,8]})
df1
      Col1  Col2
   0     1     5
   1     2     6
   2     3     7
   3     4     8

Now if you run:

   df2 = pd.concat([df1,df1])

you get:

   Col1  Col2
0     1     5
1     2     6
2     3     7
3     4     8
0     1     5
1     2     6
2     3     7
3     4     8

This is what you wanted, isn't it?
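
Applied to the frame in the question, where the repeated labels sit in one wide frame rather than in separate frames, the same idea might look like the sketch below. It assumes the question's values, and the helper names `labels`, `occurrence` and `parts` are hypothetical.

```python
import pandas as pd

df = pd.DataFrame(
    [[5, 0.001464, 6, 0.001853],
     [5, 0.001459, 6, 0.001843]],
    columns=["pp", "b", "pp", "b"],
)

# Number each column by how many times its label has already appeared:
# pp -> 0, b -> 0, pp -> 1, b -> 1.
labels = pd.Series(df.columns)
occurrence = labels.groupby(labels).cumcount().to_numpy()

# One sub-frame per occurrence; within each, the labels are unique.
parts = [df.loc[:, occurrence == k] for k in range(occurrence.max() + 1)]

# Stack the sub-frames vertically and renumber the rows.
res = pd.concat(parts, ignore_index=True)
print(res)
```

Selecting by occurrence number rather than by fixed positions means the columns do not have to be laid out in neat adjacent groups.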

answered Mar 10, 2019 at 20:58

Gustavo Mirapalheta

lisrael1 · Accepted Answer · 2021-08-23 13:13:23Z

0

If you know the number of repetitions ahead of time, it's very easy using numpy:

import numpy as np
import pandas as pd

repetitions=5
rows=2
original_columns=list('ab')

df=pd.DataFrame(np.random.randint(0,10,[rows,len(original_columns)*repetitions]), columns=original_columns*repetitions)
display(df)

    a   b   a   b   a   b   a   b   a   b
0   6   4   7   5   2   5   3   1   4   3
1   1   5   4   9   6   2   9   5   3   6

# now the interesting part:
df=pd.concat(np.hsplit(df, repetitions))
display(df)


    a   b
0   6   4
1   1   5
0   7   5
1   4   9
0   2   5
1   6   2
0   3   1
1   9   5
0   4   3
1   3   6

answered Aug 23, 2021 at 13:13

lisrael1

sammywemmy · Accepted Answer · 2023-02-15 22:07:41Z

0

One option is with pivot_longer from pyjanitor - in this case we take advantage of the fact that pp is followed by b - we can safely pair them and reshape into two columns.

# pip install pyjanitor
import pandas as pd
import janitor

arr = ['pp', 'b']
df.pivot_longer(index = None, names_to = arr, names_pattern = arr)
   pp         b
0   5  0.001464
1   5  0.001459
2   6  0.001853
3   6  0.001843

answered Feb 15, 2023 at 22:07

sammywemmy
