0

I have the following dataframe, I wish to add a new column called open_next_year.

This column will be selected by comparing two columns; fiscalYear + 1 and ticker. Then using the value from column open.

Original dataframe:

   fiscalYear ticker     open  
         2017   FINL  17.4880  
         2017   AAPL  17.4880  
...
         2016   FINL  16.4880  
         2016   AAPL  16.4880  
         2015   FINL  15.4880  
         2015   AAPL  15.4880  

Desired dataframe:

   fiscalYear ticker     open  open_next_year
         2017   FINL  17.4880  
         2017   AAPL  17.4880  
         2016   FINL  16.4880  17.4880 
         2016   AAPL  16.4880  17.4880
         2015   FINL  15.4880  16.4880 
         2015   AAPL  15.4880  16.4880

What is the pandas way to achieve this please?

2 Answers 2

2

I believe need for each group shift all values by DataFrameGroupBy.shift:

df['open_next_year'] = df.groupby('ticker')['open'].shift()
print (df)
   fiscalYear ticker    open  open_next_year
0        2017   FINL  17.488             NaN
1        2017   AAPL  17.488             NaN
2        2016   FINL  16.488          17.488
3        2016   AAPL  16.488          17.488
4        2015   FINL  15.488          16.488
5        2015   AAPL  15.488          16.488

Changed sample for unique open values:

print (df)
   fiscalYear ticker     open
0        2017   FINL  17.4881
1        2017   AAPL  17.4882
2        2016   FINL  16.4883
3        2016   AAPL  16.4884
4        2015   FINL  15.4885
5        2015   AAPL  15.4886

df['open_next_year'] = df.groupby('ticker')['open'].shift()
print (df)
   fiscalYear ticker     open  open_next_year
0        2017   FINL  17.4881             NaN
1        2017   AAPL  17.4882             NaN
2        2016   FINL  16.4883         17.4881
3        2016   AAPL  16.4884         17.4882
4        2015   FINL  15.4885         16.4883
5        2015   AAPL  15.4886         16.4884
Sign up to request clarification or add additional context in comments.

5 Comments

This doesn't seem to take into account the 'fiscalYear'
OK, can you add more data to question?
@gibo - Can you add more data to sample? fiscalYear values are unique? Or not? There are multiple tickers? Maybe 8 rows should be perfect with 2 groups if necessary.
Added. fiscalYear will be unique to the ticker, there are mulitple tickers
@gibo - thank you. But if there are unique years per groups, my solution working nice.
0

Here is another approach creating a map first.

m = dict(zip(tuple(zip(df.fiscalYear - 1, df.ticker)),df.open))
df['open_next_year'] = df[['fiscalYear','ticker']].apply(tuple, 1).map(m)

The map/dictionary looks like this and is obtained by zipping together year - 1, ticker and open value:

{(2014, 'AAPL'): 15.488,
 (2014, 'FINL'): 15.488,
 (2015, 'AAPL'): 16.488,
 (2015, 'FINL'): 16.488,
 (2016, 'AAPL'): 17.488,
 (2016, 'FINL'): 17.488}

Full example:

data = '''\
fiscalYear ticker    open
2017   FINL  17.488
2017   AAPL  17.488
2016   FINL  16.488
2016   AAPL  16.488
2015   FINL  15.488
2015   AAPL  15.488'''

fileobj = pd.compat.StringIO(data)
df = pd.read_csv(fileobj, sep='\s+')

m = dict(zip(tuple(zip(df.fiscalYear - 1, df.ticker)),df.open))
df['open_next_year'] = df[['fiscalYear','ticker']].apply(tuple, 1).map(m)

print(df)

Returns:

   fiscalYear ticker    open  open_next_year
0        2017   FINL  17.488             NaN
1        2017   AAPL  17.488             NaN
2        2016   FINL  16.488          17.488
3        2016   AAPL  16.488          17.488
4        2015   FINL  15.488          16.488
5        2015   AAPL  15.488          16.488

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.