Python Pandas - Reshape Dataframe

Question

Given the following data frame:

df = pd.DataFrame({"A":[1,2,3,11,12,13],"B":[4,5,6,14,15,16],"C":[6,7,8,16,17,18]})

   A   B   C
0  1   4   6
1  2   5   7
2  3   6   8
3  11  14  16
4  12  15  17
5  13  16  18

I would like to reshape it so that it looks like this:

   A   B   C   A_1   B_1   C_1   A_2   B_2   C_2
0  1   4   6     2     5     7     3     6     8
1  11  14  16    12    15    17    13    16    18

So every 3 rows are grouped into 1 row.

How can I achieve this with pandas?

jezrael · Accepted Answer · 2020-06-08 05:06:08Z

12

One idea is to create a MultiIndex from the row positions using integer division and modulo, then reshape with DataFrame.unstack:

import numpy as np

a = np.arange(len(df))
# level 0: target row (block of 3 rows), level 1: position inside the block
df.index = [a // 3, a % 3]
df = df.unstack().sort_index(axis=1, level=1)
df.columns = [f'{a}_{b}' for a, b in df.columns]
print(df)
   A_0  B_0  C_0  A_1  B_1  C_1  A_2  B_2  C_2
0    1    4    6    2    5    7    3    6    8
1   11   14   16   12   15   17   13   16   18

The reverse operation is possible with str.split and DataFrame.stack (starting again from the original df):

a = np.arange(len(df))
df1 = (df.set_index(pd.MultiIndex.from_arrays([a // 3, a % 3]))
         .unstack().sort_index(axis=1, level=1))
df1.columns = [f'{a}_{b}' for a, b in df1.columns]
print (df1)
   A_0  B_0  C_0  A_1  B_1  C_1  A_2  B_2  C_2
0    1    4    6    2    5    7    3    6    8
1   11   14   16   12   15   17   13   16   18

df1.columns = df1.columns.str.split('_', expand=True)
df2 = df1.stack().reset_index(drop=True)
print (df2)
    A   B   C
0   1   4   6
1   2   5   7
2   3   6   8
3  11  14  16
4  12  15  17
5  13  16  18
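
One caveat with the reverse step: str.split('_', expand=True) splits at every underscore, so it assumes the base column names contain none. A minimal sketch, assuming hypothetical column names such as unit_price_0 (they are not from the question), that splits only at the last underscore with str.rsplit instead:

```python
import pandas as pd

# Hypothetical wide frame whose base names contain underscores themselves.
wide = pd.DataFrame(
    [[1, 4, 2, 5], [11, 14, 12, 15]],
    columns=["unit_price_0", "qty_0", "unit_price_1", "qty_1"],
)

# n=1 splits once, from the right, so "unit_price" stays in one piece.
wide.columns = wide.columns.str.rsplit("_", n=1, expand=True)

# Move the numeric suffix level back into the rows, then drop the helper index.
long = wide.stack().reset_index(drop=True)
print(long)
```

Recent pandas releases may emit a FutureWarning about the stack implementation here; the values should be unaffected.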

edited Jun 8, 2020 at 5:06

answered Jun 7, 2020 at 12:29

jezrael


1 Comment

Shlomi Schwartz Over a year ago

Thanks for your answer, how would you revert the operation?

warped · Accepted Answer · 2020-06-07 12:34:12Z

10

new = pd.concat([df[a::3].reset_index(drop=True) for a in range(3)], axis=1)
new.columns = ['{}_{}'.format(a,b) for b in range(3) for a in 'ABC']
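
The snippet above works because a slice with step 3 picks the a-th row of every block of three. A self-contained sketch of the same idea, assuming the six-row frame from the question, using the explicit positional form df.iloc[a::n] and deriving the labels from df.columns rather than hard-coding 'ABC':

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 11, 12, 13],
                   "B": [4, 5, 6, 14, 15, 16],
                   "C": [6, 7, 8, 16, 17, 18]})
n = 3  # number of input rows folded into each output row

# df.iloc[a::n] keeps rows a, a + n, a + 2n, ...: the a-th row of every block.
pieces = [df.iloc[a::n].reset_index(drop=True) for a in range(n)]

# Put the pieces side by side; reset_index above makes their row labels line up.
wide = pd.concat(pieces, axis=1)
wide.columns = [f"{col}_{i}" for i in range(n) for col in df.columns]
print(wide)
```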

answered Jun 7, 2020 at 12:34

warped


Abhinav Goyal · Accepted Answer · 2020-06-07 12:28:54Z

3

You can try this:

pd.DataFrame(
    data=df.values.reshape([-1, df.values.shape[1]*3]),
    columns=list(df.columns) + sum([[c+'_'+str(i) for c in df.columns] for i in range(1, 3)], [])
)

Output for your input dataframe:


    A   B   C  A_1  B_1  C_1  A_2  B_2  C_2
0   1   4   6    2    5    7    3    6    8
1  11  14  16   12   15   17   13   16   18
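
The reshape only succeeds when the number of rows is a multiple of the group size. A hedged sketch of the same approach with the group size pulled out into a variable n and an explicit length check (the check and the variable names are additions for illustration, not part of the answer above):

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 11, 12, 13],
                   "B": [4, 5, 6, 14, 15, 16],
                   "C": [6, 7, 8, 16, 17, 18]})
n = 3  # rows per group

# Fail early with a clear message instead of a reshape error.
if len(df) % n != 0:
    raise ValueError(f"{len(df)} rows cannot be split into groups of {n}")

# Row-major reshape: each output row holds n consecutive input rows.
values = df.to_numpy().reshape(-1, df.shape[1] * n)

# The first block keeps the plain names, later blocks get a numeric suffix.
columns = [col if i == 0 else f"{col}_{i}" for i in range(n) for col in df.columns]
wide = pd.DataFrame(values, columns=columns)
print(wide)
```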

answered Jun 7, 2020 at 12:28

Abhinav Goyal


sammywemmy · Accepted Answer · 2020-06-07 12:59:11Z

3

The data is folded every three rows, so we can use NumPy's reshape to transform the values and build the new column labels from the Cartesian product of range(1, 3) and the existing columns:

import numpy as np
import pandas as pd
from itertools import product

row = len(df)//3

#create new columns; iterate over the suffix first so the labels follow the reshaped data
new_columns = df.columns.union(["_".join((letter,str(num)))
                                for num,letter in product(range(1,3),df.columns)],
                               sort=False)

#create new dataframe
new_df = pd.DataFrame(np.reshape(df.to_numpy(),(row,-1)),
                      columns=new_columns)
new_df

    A   B   C  A_1  B_1  C_1  A_2  B_2  C_2
0   1   4   6    2    5    7    3    6    8
1  11  14  16   12   15   17   13   16   18

answered Jun 7, 2020 at 12:59

sammywemmy


Erfan · Accepted Answer · 2020-06-07 12:40:14Z

2

We can group your data by position within each repeating block of n rows (in this case 3) and then use pd.concat to concatenate the groups along the column axis:

n = 3
grps = df.groupby(df.index // n).cumcount()
dfn = pd.concat([d.reset_index(drop=True) for _, d in df.groupby(grps)], axis=1)
# number the suffix per block of original columns (df.shape[1]), not per n rows
dfn.columns = [f'{col}_{i}' for col, i in zip(dfn.columns, np.arange(dfn.shape[1]) // df.shape[1])]

   A_0  B_0  C_0  A_1  B_1  C_1  A_2  B_2  C_2
0    1    4    6    2    5    7    3    6    8
1   11   14   16   12   15   17   13   16   18
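
Note that df.index // n relies on the default 0..len-1 index. A minimal sketch, assuming a frame with arbitrary index labels (the labels 10..60 are made up for illustration), that groups by row position instead:

```python
import numpy as np
import pandas as pd

# Same values as the question, but with a non-default index.
df = pd.DataFrame({"A": [1, 2, 3, 11, 12, 13],
                   "B": [4, 5, 6, 14, 15, 16],
                   "C": [6, 7, 8, 16, 17, 18]},
                  index=[10, 20, 30, 40, 50, 60])
n = 3

# Positions 0..len-1 do not depend on the index labels.
pos = np.arange(len(df))

# pos % n is the position inside each block of n rows: 0, 1, 2, 0, 1, 2.
pieces = [d.reset_index(drop=True) for _, d in df.groupby(pos % n)]
dfn = pd.concat(pieces, axis=1)
dfn.columns = [f"{col}_{i}" for i in range(n) for col in df.columns]
print(dfn)
```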

answered Jun 7, 2020 at 12:40

Erfan
