How to convert DataFrame column to Rows in Python?

Question

I have the following dataset in df_1, which I want to convert into the format of df_2. In df_2, the columns of df_1 (excluding UserId and Date) have been converted to rows. I looked for similar answers, but the solutions they give are somewhat complex. Is there a simple way to do this?

df_1

   UserId       Date                   -7  -6  -5  -4  -3  -2  -1   0   1   2   3   4   5   6   7
    87      2011-05-10 18:38:55.030     0   0   0   0   0   0   1   0   0   0   0   0   0   0   0
    487     2011-11-29 14:46:12.080     0   0   1   0   0   0   0   0   0   0   0   0   0   0   0
    21      2012-03-02 14:35:06.867     0   1   0   1   2   0   2   2   0   1   2   2   1   3   1

df_2

day | count
-7   0
-7   0
-7   0
-6   0
-6   0
-6   1
-5   0
-5   1
-5   0 
.    .
.    .(Similarly for other columns in between)
.    .
6   0    
6   0
6   3
7   0
7   0
7   1

Bill Huang · Accepted Answer · 2020-10-12 17:33:13Z

Pandas provides a built-in method, df.melt(), for exactly this purpose; it is the reverse operation of df.pivot() or df.pivot_table(). (I'm not sure why it isn't named the more intuitive unpivot.)
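To make the "reverse of pivot" relationship concrete, here is a minimal round-trip sketch on a toy frame. The frame and the names wide, long_df, back, key and val are made up for illustration and are not part of the question's data.

```python
import pandas as pd

# Hypothetical toy data: one id column and two value columns.
wide = pd.DataFrame({"id": [1, 2], "a": [10, 20], "b": [30, 40]})

# Wide -> long: every (id, column) pair becomes one row.
long_df = wide.melt(id_vars="id", var_name="key", value_name="val")

# Long -> wide: pivot undoes the melt.
back = long_df.pivot(index="id", columns="key", values="val").reset_index()
back.columns.name = None  # drop the leftover "key" label on the column axis

print(long_df)
print(back)
```

After the melt, long_df has one row per original cell (four rows here), and pivoting it restores the original id, a, b layout.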

The advantages of this solution:

No reinventing the wheel: an easily understandable and generally applicable df.transpose() -> df.melt() logic.
No concatenating columns or appending datasets.

Code

# 1. preparation: get the "day" column in place.
# Note: The column names were strings ('-7', '-6', ...) as copy-pasted.
col_names = [str(i) for i in range(-7, 8)]
df_tr = df_1[col_names].transpose().reset_index()
df_tr.rename(columns={"index": "day"}, inplace=True)
df_tr["day"] = df_tr["day"].astype(int)  # str to int

# 2. unpivoting (melting)
df_2_unpivot = df_tr.melt(id_vars="day", var_name="col", value_name="count")
df_2 = df_2_unpivot.sort_values(by=["day", "col"])

# 3. cleanup
del df_2["col"]
df_2.reset_index(drop=True, inplace=True)

Result

df_2
Out[134]: 
    day  count
0    -7      0
1    -7      0
2    -7      0
3    -6      0
4    -6      0
5    -6      1
6    -5      0
7    -5      1
8    -5      0
9    -4      0
10   -4      0
11   -4      1
12   -3      0
13   -3      0
14   -3      2
15   -2      0
16   -2      0
17   -2      0
18   -1      1
19   -1      0
20   -1      2
21    0      0
22    0      0
23    0      2
24    1      0
25    1      0
26    1      0
27    2      0
28    2      0
29    2      1
30    3      0
31    3      0
32    3      2
33    4      0
34    4      0
35    4      2
36    5      0
37    5      0
38    5      1
39    6      0
40    6      0
41    6      3
42    7      0
43    7      0
44    7      1

Also check out the intermediate datasets and play with the options yourself.
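As a possible shortcut, melt can also be applied to df_1 directly, without the transpose step. The sketch below rebuilds df_1 from the question by hand and assumes, as above, that the day columns are strings; value_vars restricts the melt to those columns so UserId and Date are left out.

```python
import pandas as pd

# Rebuild df_1 from the question (day column names assumed to be strings).
day_cols = [str(i) for i in range(-7, 8)]
counts = [
    [0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 1, 2, 0, 2, 2, 0, 1, 2, 2, 1, 3, 1],
]
df_1 = pd.DataFrame(counts, columns=day_cols)
df_1.insert(0, "UserId", [87, 487, 21])
df_1.insert(1, "Date", ["2011-05-10 18:38:55.030",
                        "2011-11-29 14:46:12.080",
                        "2012-03-02 14:35:06.867"])

# Melt only the day columns; with no id_vars the result has just two columns.
df_2 = df_1.melt(value_vars=day_cols, var_name="day", value_name="count")
df_2["day"] = df_2["day"].astype(int)  # str to int

# A stable sort keeps the original row order within each day.
df_2 = df_2.sort_values("day", kind="mergesort").reset_index(drop=True)
print(df_2)
```

Whether this reads better than the transpose-based version is a matter of taste; both produce a 45-row frame with the columns day and count.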

sai · Accepted Answer · 2020-10-12 16:20:49Z

You could transpose the frame, concatenate its columns into one Series, and sort by index:

import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.random((3, 10)), columns=range(10))
df = df.T  # the original column labels become the index

# Collect each column and concatenate once. This replaces the original
# global + Series.append accumulation: Series.append was removed in pandas 2.0.
new_df = pd.concat([df[col] for col in df.columns])
new_df.sort_index(inplace=True)
print(new_df)

0    0.020673
0    0.710004
0    0.590984
1    0.643964
1    0.719694
1    0.105075
2    0.270417
2    0.537349
2    0.610228
3    0.391562
3    0.760375
3    0.105794
4    0.726044
4    0.676487
4    0.851921
5    0.447779
5    0.798975
5    0.877853
6    0.807380
6    0.639440
6    0.435890
7    0.263091
7    0.722340
7    0.586944
8    0.142973
8    0.928533
8    0.438123
9    0.076326
9    0.385373
9    0.662350
dtype: float64
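A similar long Series can probably be obtained without collecting the pieces by hand. The sketch below uses DataFrame.unstack, which on a frame with a plain index returns a Series keyed by (column, row); dropping the row level leaves only the column label. The seed and the names rng and long_s are illustrative assumptions.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # seeded only to make the sketch repeatable
df = pd.DataFrame(rng.random((3, 10)), columns=range(10))

# unstack() yields a Series with a (column, row) MultiIndex, ordered by column;
# dropping the inner (row) level keeps just the column label as the index.
long_s = df.unstack().droplevel(1)
print(long_s)
```

Here the values already come out grouped by column label, so no separate sort is needed.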

D-E-N · Accepted Answer · 2020-10-12 16:47:55Z

Is this what you want (transpose())?

import pandas as pd
from io import StringIO

# Prework to generate your data
data = """UserId       Date                   -7  -6  -5  -4  -3  -2  -1   0   1   2   3   4   5   6   7
    87      2011-05-10 18:38:55.030     0   0   0   0   0   0   1   0   0   0   0   0   0   0   0
    487     2011-11-29 14:46:12.080     0   0   1   0   0   0   0   0   0   0   0   0   0   0   0
    21      2012-03-02 14:35:06.867     0   1   0   1   2   0   2   2   0   1   2   2   1   3   1"""

input_data = StringIO(data)
df_1 = pd.read_table(input_data, sep=r"\s{2,}", engine="python")

# remove unused columns
df_1.drop(["Date", "UserId"], axis=1, inplace=True)

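# and transpose
df_2 = df_1.transpose()

# concatenate all columns into one Series
# (Series.append was removed in pandas 2.0, so use pd.concat)
df_2 = pd.concat([df_2[0], df_2[1], df_2[2]])
df_2.index = df_2.index.astype(int)  # labels are parsed as strings; sort numerically
df_2.sort_index(inplace=True)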

print(df_2)

Output:

2 Comments

Ishan Dutta Over a year ago

No, I don't want the Transpose. The obtained dataframe must have only 2 columns.

D-E-N Over a year ago

I changed it to a two-column table after transposing.
