4

Following up on my previous question, I have this DataFrame:

import pandas as pd
d = pd.DataFrame({'value':['a', 'b'],'2019Q1':[1, 5], '2019Q2':[2, 6], '2019Q3':[3, 7]})

which displays like this:

   value    2019Q1  2019Q2  2019Q3
0  a        1       2       3
1  b        5       6       7

How can I transform it into this shape:

Year    measure    Quarter    Value
2019    a          1          1
2019    a          2          2
2019    a          3          3
2019    b          1          5
2019    b          2          6
2019    b          3          7
  • And there can be more years, right? Like 2020Q1? Commented Jan 17, 2020 at 17:54
  • Yes, that should be dynamic Commented Jan 17, 2020 at 17:55
  • Have you tried anything? Any leads? Commented Jan 17, 2020 at 18:22

2 Answers

5

Use pd.wide_to_long with DataFrame.melt:

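# d is the DataFrame from the question
df2 = d.copy()
# '2019Q1' -> '1_2019', so the quarter becomes the stub and the year the suffix
df2.columns = d.columns.str.split('Q').str[::-1].str.join('_')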

new_df = (pd.wide_to_long(df2.rename(columns = {'value':'Measure'}),
                          ['1','2','3'],
                          j="Year",
                          i = 'Measure',
                          sep='_')
            .reset_index()
            .melt(['Measure','Year'],var_name = 'Quarter',value_name = 'Value')
            .loc[:,['Year','Measure','Quarter','Value']]
            .sort_values(['Year','Measure','Quarter']))

print(new_df)
   Year Measure Quarter  Value
0  2019       a       1      1
2  2019       a       2      2
4  2019       a       3      3
1  2019       b       1      5
3  2019       b       2      6
5  2019       b       3      7
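Since the comments note that more years (such as 2020Q1) may appear, here is a minimal sketch of one alternative that does not hardcode the quarter stubs. It is not the wide_to_long method above: it melts first and then parses the labels with str.extract. The 2020Q1 column and its values are assumed purely for illustration, and the regex assumes every period label looks like four digits, 'Q', one digit.

```python
import pandas as pd

# Hypothetical sample: the question's frame plus an assumed 2020Q1 column
d = pd.DataFrame({'value': ['a', 'b'],
                  '2019Q1': [1, 5], '2019Q2': [2, 6], '2019Q3': [3, 7],
                  '2020Q1': [4, 8]})

# Turn every period column into rows: one row per (value, period) pair
long_df = d.melt(id_vars='value', var_name='period', value_name='Value')

# Split labels such as '2019Q1' into integer Year and Quarter columns
parts = (long_df['period']
         .str.extract(r'(?P<Year>\d{4})Q(?P<Quarter>\d)')
         .astype(int))

result = (pd.concat([long_df, parts], axis=1)
            .rename(columns={'value': 'Measure'})
            .loc[:, ['Year', 'Measure', 'Quarter', 'Value']]
            .sort_values(['Year', 'Measure', 'Quarter'])
            .reset_index(drop=True))

print(result)
```

Because the year and quarter are read from the column labels, new period columns are picked up without changing the code.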

3 Comments

Not OP, but that is SO MUCH prettier than what I was working on. Nice solution!
Nice one @ansev
Thanks a lot, @ansev. Nice answer
0

This is just an addition for future visitors: when you split the column names with expand=True, you get a MultiIndex. This allows reshaping with the stack method.

# set the value column as the index
d = d.set_index('value')

# split the column names into a (year, quarter) MultiIndex
d.columns = d.columns.str.split('Q', expand=True)

# stack both column levels into rows and name the resulting index levels
d.stack([0, 1]).rename_axis(['measure', 'year', 'quarter']).reset_index(name='Value')


  measure   year    quarter Value
0   a       2019       1    1
1   a       2019       2    2
2   a       2019       3    3
3   b       2019       1    5
4   b       2019       2    6
5   b       2019       3    7
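One thing to watch with this approach: because the year and quarter come from splitting string column labels, they end up as strings in the result. A minimal sketch of the same steps with an added integer conversion follows; the names wide and out are just illustrative, and the astype step is an addition, not part of the answer above.

```python
import pandas as pd

d = pd.DataFrame({'value': ['a', 'b'],
                  '2019Q1': [1, 5], '2019Q2': [2, 6], '2019Q3': [3, 7]})

# Same steps as above, on a separate frame so d is left untouched
wide = d.set_index('value')
wide.columns = wide.columns.str.split('Q', expand=True)

out = (wide.stack([0, 1])
           .rename_axis(['measure', 'year', 'quarter'])
           .reset_index(name='Value')
           # year and quarter are strings after the split; cast them to int
           .astype({'year': int, 'quarter': int}))

print(out.dtypes)
```

With integer columns, numeric sorting and filtering such as out[out['year'] >= 2019] behave as expected.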
