1

I already got answer to this question in R, wondering how this can be implemented in Python.

Let's say we have a pandas DataFrame like this:

import pandas as pd
d = pd.DataFrame({'2019Q1':[1], '2019Q2':[2], '2019Q3':[3]})

which displays like this:

   2019Q1  2019Q2  2019Q3
0       1       2       3

How can I transform it to looks like this:

Year    Quarter    Value
2019    1          1
2019    2          2
2019    3          3

2 Answers 2

2

Use Series.str.split for MultiIndex with expand=True and then reshape by DataFrame.unstack, last data cleaning with with Series.reset_index and Series.rename_axis:

d = pd.DataFrame({'2019Q1':[1], '2019Q2':[2], '2019Q3':[3]})

d.columns = d.columns.str.split('Q', expand=True)
df = (d.unstack(0)
       .reset_index(level=2, drop=True)
       .rename_axis(('Year','Quarter'))
       .reset_index(name='Value'))
print (df)
   Year Quarter  Value
0  2019       1      1
1  2019       2      2
2  2019       3      3

Thank you @Jon Clements for another solution:

df = (d.melt()
       .variable
       .str.extract('(?P<Year>\d{4})Q(?P<Quarter>\d)')
       .assign(Value=d.T.values.flatten()))
print (df)
   Year Quarter  Value
0  2019       1      1
1  2019       2      2
2  2019       3      3

Alternative with split:

df = (d.melt()
       .variable
       .str.split('Q', expand=True)
       .rename(columns={0:'Year',1:'Quarter'})
       .assign(Value=d.T.values.flatten()))
print (df)
   Year Quarter  Value
0  2019       1      1
1  2019       2      2
2  2019       3      3
Sign up to request clarification or add additional context in comments.

4 Comments

Not entirely sure as it'd need testing, but something like: df.melt().variable.str.extract(r'(?P<Year>\d{4})Q(?P<Quarter>\d)').assign(Value=df.values.T) should do the trick...
Okay... had a quick play with the above, and I reckon: df.melt().variable.str.extract('(?P<Year>\d{4})Q(?P<Quarter>\d)').assign(Value=df.values.flatten()) should do it :)
(that way it scales to > 1 row)
@JonClements - Thank you, also is necessary transpose before flatten - d = pd.DataFrame({'2019Q1':[1,8], '2019Q2':[2,9], '2019Q3':[3,0]})
1

Using DataFrame.stack with DataFrame.pop and Series.str.split:

df = d.stack().reset_index(level=1).rename(columns={0:'Value'})
df[['Year', 'Quarter']] = df.pop('level_1').str.split('Q', expand=True)
   Value  Year Quarter
0      1  2019       1
0      2  2019       2
0      3  2019       3

If you care about the order of columns, use reindex:

df = df.reindex(['Year', 'Quarter', 'Value'], axis=1)
   Year Quarter  Value
0  2019       1      1
0  2019       2      2
0  2019       3      3

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.