Concatenate rows column-wise in Python Pandas with groupby

Question

Instead of, e.g., calculating the sum with groupby, I would like to concatenate all rows within the same group. In place of sum(), the code below should just combine (concatenate) the rows. If there were 5 rows per group, the new data frame would have 5 times as many columns (each column repeated 5 times).

Example: This is the data frame I have right now.

Index    Pool   B          C         D           E
70       Pool1  8.717402   7.873173  16.029238   8.533174   
71       Pool1  7.376365   6.228181  9.272679    7.498993   
72       Pool2  8.854857   10.340896 9.218947    8.670379   
73       Pool2  11.509130  8.571492  19.363829   14.605199   
74       Pool3  14.780578  7.405982  9.279374    13.551686   
75       Pool3  7.448860   11.952275 8.239564    12.264440

I want to have it like this:

Index    Pool   B1         C1        D1          E1        B2         C2        D2          E2
70       Pool1  8.717402   7.873173  16.029238   8.533174  7.376365   6.228181  9.272679    7.498993  
71       Pool2  8.854857   10.340896 9.218947    8.670379  11.509130  8.571492  19.363829   14.605199  
72       Pool3  14.780578  7.405982  9.279374    13.551686 7.448860   11.952275 8.239564    12.264440

I would provide sample code, but I have no idea how to start. If I just wanted to sum the rows, I would use:

t.groupby(['Pool']).sum()

But I do not want to aggregate the rows while keeping the column structure; I want to concatenate the rows that belong to the same group.

I couldn't provide sample code, but I added an example that I hope is helpful.
– Jamona, Commented Jan 6, 2016 at 14:07
@Jamona In your desired output, df['B'] would be ambiguous. Such non-unique column names seem odd to me.
– Nelewout, Commented Jan 6, 2016 at 14:10
The column names don't need to be the same; B1 and B2, or something else, would also be fine.
– Jamona, Commented Jan 6, 2016 at 14:12

2342G456DI8 · Accepted Answer · 2016-01-07 03:52:16Z

You could try:

import pandas as pd

df1 = pd.DataFrame({'Pool': ['a', 'a', 'b', 'b', 'c'], 'B': [1, 2, 3, 4, 5], 'C': [1, 2, 3, 4, 5]})
# Select the value columns so the grouping key is not passed to comb2.
gd = df1.groupby('Pool')[['B', 'C']]

def comb2(x):
    # Collapse each column of the group into a single list.
    rslt = dict()
    for col in x.columns:
        rslt[col] = x[col].tolist()
    return pd.Series(rslt)

rslt = gd.apply(comb2)

# Expand each column of lists into numbered columns (B1, B2, ...).
parts = []
for col in rslt.columns:
    tempdf = rslt[col].apply(lambda x: pd.Series(x))
    tempdf.columns = [col + str(i + 1) for i in range(len(tempdf.columns))]
    parts.append(tempdf)
finaldf = pd.concat(parts, axis=1)

print(finaldf)

Output:
      B1  B2  C1  C2
Pool                
a      1   2   1   2
b      3   4   3   4
c      5 NaN   5 NaN
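
A possible alternative, not part of the original answer: number the rows within each group with cumcount() and move that position into the columns with unstack(). This is only a sketch under the assumption that the grouping column is named Pool, as in the example above; the helper name widen is made up for illustration. It orders the result as B1, C1, B2, C2, which matches the layout asked for in the question.

```python
import pandas as pd

df = pd.DataFrame({
    "Pool": ["a", "a", "b", "b", "c"],
    "B": [1, 2, 3, 4, 5],
    "C": [1, 2, 3, 4, 5],
})

def widen(frame, key):
    """Hypothetical helper: spread each group's rows across numbered columns."""
    value_cols = [c for c in frame.columns if c != key]
    # Position of each row inside its group: 1, 2, ... restarting per group.
    pos = frame.groupby(key).cumcount() + 1
    # Move the position from the index into the columns (MultiIndex columns).
    wide = frame.set_index([key, pos]).unstack()
    # Flatten (column, position) pairs into names such as "B1".
    wide.columns = [f"{col}{i}" for col, i in wide.columns]
    # Reorder so that all columns of the first row come first, then the second, ...
    order = [f"{col}{i}" for i in range(1, int(pos.max()) + 1) for col in value_cols]
    return wide[order]

wide = widen(df, "Pool")
print(wide)
```

Groups with fewer rows than the largest group get NaN in the missing positions, just as in the output above, so the affected columns may come back as floats.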
