I have financial performance indicators for different companies, one row per company per year. I would like to reshape this so that each company has a single row containing all of its indicators over a specific range of years.

My data currently looks similar to this:

import numpy as np
import pandas as pd


startyear = 2014
endyear = 2015

df = pd.DataFrame(np.array([
['AAPL',  2014,  0.2,  0.4,  1.5],
['AAPL',  2015,  0.3,  0.4,  2.0],
['AAPL',  2016,  0.2,  0.3,  1.5],
['GOGL',  2014,  0.4,  0.5,  0.5],
['GOGL',  2015,  0.6,  0.8,  1.0],
['GOGL',  2016,  0.3,  0.5,  2.0]]), 
columns=['Name',  'Year',  'ROE',  'ROA',  'DE'])

newcolumns = (df.columns + [str(startyear)]).append(df.columns + [str(endyear)])

dfnew=pd.DataFrame(columns=newcolumns)

What I would like to have is (e.g. only years 2014 & 2015):

Name  ROE2014 ROA2014 DE2014 ROE2015 ROA2015 DE2015
AAPL  0.2     0.4     1.5    0.3     0.4     2.0
GOGL  0.4     0.5     0.5    0.6     0.8     1.0

So far I only managed to get the new column names, but somehow I can't get my head around how to fill this new DataFrame.

1 Answer

Probably easier to create the new DataFrame, then adjust the column names:

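# limit to the years you want (the values are strings here because
# np.array coerced the mixed-type rows to a single string dtype)
dfnew = df[df.Year.isin(['2014', '2015'])]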

# set index to 'Name' and pivot 'Year's into the columns 
dfnew = dfnew.set_index(['Name', 'Year']).unstack()

# sort the columns by year
dfnew = dfnew.sort_index(axis=1, level=1)

# rename columns
dfnew.columns = ["".join(a) for a in dfnew.columns.values]

# put 'Name' back into columns
dfnew = dfnew.reset_index()
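
For reference, here is a self-contained sketch of the same idea that produces the exact column order asked for in the question. It makes a few assumptions that are not in the original post: the frame is built from a list of lists rather than np.array so that Year and the indicators stay numeric, and the names `indicators`, `order` and `wide` are only illustrative.

```python
import pandas as pd

startyear, endyear = 2014, 2015

# Assumption: built from a list of lists (not np.array) so numbers stay numeric
df = pd.DataFrame(
    [
        ["AAPL", 2014, 0.2, 0.4, 1.5],
        ["AAPL", 2015, 0.3, 0.4, 2.0],
        ["AAPL", 2016, 0.2, 0.3, 1.5],
        ["GOGL", 2014, 0.4, 0.5, 0.5],
        ["GOGL", 2015, 0.6, 0.8, 1.0],
        ["GOGL", 2016, 0.3, 0.5, 2.0],
    ],
    columns=["Name", "Year", "ROE", "ROA", "DE"],
)

indicators = ["ROE", "ROA", "DE"]  # illustrative name, not from the question

# keep only the requested years, then pivot 'Year' into the columns
wide = (
    df[df["Year"].between(startyear, endyear)]
    .set_index(["Name", "Year"])[indicators]
    .unstack("Year")
)

# order the (indicator, year) columns year by year, indicators in original order
order = [(ind, yr) for yr in range(startyear, endyear + 1) for ind in indicators]
wide = wide.reindex(columns=pd.MultiIndex.from_tuples(order))

# flatten the two column levels into names like 'ROE2014'
wide.columns = [f"{ind}{yr}" for ind, yr in wide.columns]

# put 'Name' back into the columns
wide = wide.reset_index()
print(wide)
```

Reindexing on an explicit list of (indicator, year) pairs is used here instead of sorting because a plain sort would put the indicators in alphabetical order within each year (DE, ROA, ROE).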