Convert columns into multiple rows in pandas dataframe

Question

I have a DataFrame that looks something like this:

   Deal  Year  Quarter_1  Quarter_2  Quarter_3  Financial_Data
h     1  1991          1          2          3             120
i     2  1992          4          5          6              80
j     3  1993          7          8          9             100

I want to combine all the quarters into one new column and copy the deal number, year and financial data. The end result should then look like this:

   Deal  Year  Quarter  Financial_Data
h     1  1991        1             120
i     1  1991        2             120
j     1  1991        3             120
k     2  1992        4              80
l     2  1992        5              80
m     2  1992        6              80
n     3  1993        7             100
o     3  1993        8             100
p     3  1993        9             100

What have you tried so far, and how is it not working as expected? – Faibbus, Commented Apr 30, 2018 at 10:01
I haven't really tried anything; I'm new to Python and don't know how I would even approach this problem. – Elias K., Commented Apr 30, 2018 at 10:04

avgJoe · Accepted Answer · 2020-03-30 04:50:21Z

8

You can use the melt method.

import pandas as pd
# d is the original DataFrame from the question
df = pd.melt(d, id_vars=["Deal", "Year", "Financial_Data"],
             value_name="Quarter").drop(['variable'],axis=1).sort_values('Quarter')

Output

   Deal  Year  Financial_Data  Quarter
0     1  1991             120        1
3     1  1991             120        2
6     1  1991             120        3
1     2  1992              80        4
4     2  1992              80        5
7     2  1992              80        6
2     3  1993             100        7
5     3  1993             100        8
8     3  1993             100        9
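For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the frame from the question and applies the same melt. The names `d` and `result` and the final column reordering are my own choices, and `reset_index(drop=True)` is added only to replace the shuffled 0, 3, 6, ... labels with a clean range.

```python
import pandas as pd

# Rebuild the frame from the question (index labels h, i, j).
d = pd.DataFrame(
    {
        "Deal": [1, 2, 3],
        "Year": [1991, 1992, 1993],
        "Quarter_1": [1, 4, 7],
        "Quarter_2": [2, 5, 8],
        "Quarter_3": [3, 6, 9],
        "Financial_Data": [120, 80, 100],
    },
    index=["h", "i", "j"],
)

result = (
    pd.melt(d, id_vars=["Deal", "Year", "Financial_Data"], value_name="Quarter")
    .drop(columns="variable")   # the old column names are not needed here
    .sort_values("Quarter")
    .reset_index(drop=True)     # clean 0..8 index instead of 0, 3, 6, ...
)

# Put the columns in the order shown in the question.
result = result[["Deal", "Year", "Quarter", "Financial_Data"]]
print(result)
```

Note that melt discards the original h, i, j labels, so the result cannot reproduce the h to p index from the question without assigning a new index by hand.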

If you have many columns, you can build the id_vars list from d.columns.tolist() instead of typing the names out. Here it takes the first two columns and the last one.

column_list = d.columns.tolist()
id_vars_list = column_list[:2] + column_list[-1:]

The statement will become

df = pd.melt(d, id_vars=id_vars_list, 
             value_name="Quarter").drop(['variable'],axis=1).sort_values('Quarter')
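Slicing by position depends on the column order. As a hedged alternative, the sketch below picks the columns by name instead, assuming every column to be unpivoted starts with the prefix "Quarter_" (that prefix rule is an assumption about the real 200-column dataset, not something stated in the question).

```python
import pandas as pd

d = pd.DataFrame(
    {
        "Deal": [1, 2, 3],
        "Year": [1991, 1992, 1993],
        "Quarter_1": [1, 4, 7],
        "Quarter_2": [2, 5, 8],
        "Quarter_3": [3, 6, 9],
        "Financial_Data": [120, 80, 100],
    },
    index=["h", "i", "j"],
)

# Assumption: the columns to unpivot are exactly those named "Quarter_<something>".
value_vars = [c for c in d.columns if c.startswith("Quarter_")]
# Everything else is carried along unchanged as an identifier column.
id_vars = [c for c in d.columns if c not in value_vars]

result = (
    pd.melt(d, id_vars=id_vars, value_vars=value_vars, value_name="Quarter")
    .drop(columns="variable")
    .sort_values("Quarter")
    .reset_index(drop=True)
)
print(result)
```

This way neither list has to be typed out, and adding more identifier columns to the frame does not require touching the code.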

edited Mar 30, 2020 at 4:50

avgJoe


answered Apr 30, 2018 at 10:12

Mihai Alexandru-Ionut



2 Comments

Elias K. Over a year ago

Thanks for your answer! One more question: I have a rather large dataset with over 200 columns. Is there any shortcut so I don't have to enter all the headings into id_vars?

Mihai Alexandru-Ionut Over a year ago

@Dan, my thought was to get the first two columns and the last one.

Dan · Accepted Answer · 2018-04-30 10:13:01Z

3

This is done using melt:

>>> pd.melt(df, id_vars=['Deal','Year','Financial_Data'], value_vars=['Quarter_1','Quarter_2','Quarter_3'])
   Deal  Year  Financial_Data   variable  value
0     1  1991             120  Quarter_1      1
1     2  1992              80  Quarter_1      4
2     3  1993             100  Quarter_1      7
3     1  1991             120  Quarter_2      2
4     2  1992              80  Quarter_2      5
5     3  1993             100  Quarter_2      8
6     1  1991             120  Quarter_3      3
7     2  1992              80  Quarter_3      6
8     3  1993             100  Quarter_3      9

Cleaning it up a little:

>>> pd.melt(df, id_vars=['Deal','Year','Financial_Data'], value_vars=['Quarter_1','Quarter_2','Quarter_3']).drop('variable',axis=1).sort_values('value')
   Deal  Year  Financial_Data  value
0     1  1991             120      1
3     1  1991             120      2
6     1  1991             120      3
1     2  1992              80      4
4     2  1992              80      5
7     2  1992              80      6
2     3  1993             100      7
5     3  1993             100      8
8     3  1993             100      9
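Sorting by the melted values gives the desired order here only because the sample quarter values happen to increase from row to row. A sketch of a more robust ordering, assuming the real data should be grouped by Deal and ordered by the quarter's position: keep the old column name and sort on it. The names `Source`, `Quarter_No` and `long_df` are hypothetical choices for this illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Deal": [1, 2, 3],
        "Year": [1991, 1992, 1993],
        "Quarter_1": [1, 4, 7],
        "Quarter_2": [2, 5, 8],
        "Quarter_3": [3, 6, 9],
        "Financial_Data": [120, 80, 100],
    },
    index=["h", "i", "j"],
)

long_df = pd.melt(
    df,
    id_vars=["Deal", "Year", "Financial_Data"],
    value_vars=["Quarter_1", "Quarter_2", "Quarter_3"],
    var_name="Source",      # holds the original column name, e.g. "Quarter_2"
    value_name="Quarter",   # holds the melted value
)

# Turn "Quarter_2" into the integer 2 so rows can be ordered by quarter position.
long_df["Quarter_No"] = long_df["Source"].str.split("_").str[-1].astype(int)

# Order by deal first, then by quarter position, independent of the values.
long_df = long_df.sort_values(["Deal", "Quarter_No"]).reset_index(drop=True)
print(long_df)
```

The helper columns can be dropped afterwards with `long_df.drop(columns=["Source", "Quarter_No"])` if only the four columns from the question are wanted.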

answered Apr 30, 2018 at 10:13

Dan


1 Comment

Dan Over a year ago

Yes, just extract a list of the columns you want from df.columns

jpp · Accepted Answer · 2018-04-30 10:17:16Z

1

One way is to combine your Quarter_X values into one tuple per row, then expand that series with numpy and itertools into a new DataFrame.

This is usually more efficient than stack or groupby-based methods. Note that each row of the result keeps the index label of its parent row, so you will need to reset or reassign the index as required.

from itertools import chain
import numpy as np
import pandas as pd

# One tuple of quarter values per row
df['Quarters'] = list(zip(df.Quarter_1, df.Quarter_2, df.Quarter_3))

# Number of quarter values in each row (3 here)
lens = list(map(len, df.Quarters))

# Repeat the other columns once per quarter value and flatten the tuples
res = pd.DataFrame({'Deal': np.repeat(df.Deal, lens),
                    'Year': np.repeat(df.Year, lens),
                    'Quarter': list(chain.from_iterable(df.Quarters)),
                    'Financial_Data': np.repeat(df.Financial_Data, lens)})

print(res)

   Deal  Year  Quarter  Financial_Data
h     1  1991        1             120
h     1  1991        2             120
h     1  1991        3             120
i     2  1992        4              80
i     2  1992        5              80
i     2  1992        6              80
j     3  1993        7             100
j     3  1993        8             100
j     3  1993        9             100

For multiple columns, the above method may be expensive, but you could do:

res = pd.DataFrame({**{'Quarter': list(chain.from_iterable(df.Quarters))},
                    **{k: np.repeat(df[k], lens) for k in df if 'Quarter' not in k}})
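On pandas 0.25 or later, DataFrame.explode performs the same list expansion without numpy or itertools. The following is a sketch of that swapped-in technique, not the method shown above; it assumes the quarter columns all start with "Quarter_", and the names `quarter_cols` and `res` are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Deal": [1, 2, 3],
        "Year": [1991, 1992, 1993],
        "Quarter_1": [1, 4, 7],
        "Quarter_2": [2, 5, 8],
        "Quarter_3": [3, 6, 9],
        "Financial_Data": [120, 80, 100],
    },
    index=["h", "i", "j"],
)

# Assumption: the columns to combine are those named "Quarter_<something>".
quarter_cols = [c for c in df.columns if c.startswith("Quarter_")]

# One list of quarter values per row, aligned on the original index.
quarters = pd.Series(df[quarter_cols].values.tolist(), index=df.index)

res = (
    df.drop(columns=quarter_cols)
    .assign(Quarter=quarters)
    .explode("Quarter")          # one output row per list element
    .astype({"Quarter": int})    # explode leaves an object-dtype column
)
print(res)
```

As with the numpy version, each output row keeps the label of its parent row (h, h, h, i, ...); call `res.reset_index(drop=True)` if a fresh index is preferred. All non-quarter columns are carried along automatically, so nothing has to be typed out per heading.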

edited Apr 30, 2018 at 10:17

answered Apr 30, 2018 at 10:04

jpp


1 Comment

Elias K. Over a year ago

I have to do this for a rather large dataset, around 200 different columns, is there any way that I do not have to type out 'Deal': np.repeat(df.Deal, lens) for every single heading?
