I have an unbalanced pandas DataFrame with a MultiIndex, where each row stores a firm-year observation. The sample period (variable year) runs from 2013 to 2017. The dataset includes the variable event, which is set to 1 if an event happens in a given year.
Sample dataset:
# Create the dataset
import pandas as pd
df = pd.DataFrame({'id': [1,1,1,1,1,2,2,2,2,3,3,4,4,4,5,5,5,5],
                   'year': [2013,2014,2015,2016,2017,2014,2015,2016,2017,
                            2016,2017,2013,2014,2015,2014,2015,2016,2017],
                   'event': [1,0,0,0,0,0,0,1,0,1,0,0,1,0,0,0,0,1]})
df.set_index(['id', 'year'], inplace=True)
df.sort_index(inplace=True)
I would like to create a new column, status, based on the existing column event, as follows: within each id, status should be 0 until the event happens for the first time, and 1 from that year onward (including the year in which the event happens).
DataFrame with expected variable status:
         event  status
id year
1  2013      1       1
   2014      0       1
   2015      0       1
   2016      0       1
   2017      0       1
2  2014      0       0
   2015      0       0
   2016      1       1
   2017      0       1
3  2016      1       1
   2017      0       1
4  2013      0       0
   2014      1       1
   2015      0       1
5  2014      0       0
   2015      0       0
   2016      0       0
   2017      1       1
I haven't found any useful solutions so far, so any advice would be much appreciated. Thanks!
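For reference, the rule described above (status switches to 1 at the first event within each id and stays there) looks like a group-wise cumulative maximum. The following is only a minimal sketch under that reading, not necessarily the most idiomatic approach. It assumes event contains only 0 and 1 and that rows are sorted by year within each id, which sort_index() takes care of here.

```python
import pandas as pd

# Rebuild the sample dataset from the question.
df = pd.DataFrame({'id': [1,1,1,1,1,2,2,2,2,3,3,4,4,4,5,5,5,5],
                   'year': [2013,2014,2015,2016,2017,2014,2015,2016,2017,
                            2016,2017,2013,2014,2015,2014,2015,2016,2017],
                   'event': [1,0,0,0,0,0,0,1,0,1,0,0,1,0,0,0,0,1]})
df = df.set_index(['id', 'year']).sort_index()

# Within each id, the running maximum of a 0/1 column is 0 until the
# first 1 appears and stays at 1 for every later row of that id.
# Assumption: event holds only 0/1 and rows are ordered by year per id.
df['status'] = df.groupby(level='id')['event'].cummax()

print(df)
```

Because cummax follows row order, the result depends on the index being sorted by year within each id. If event could take other values or contain missing data, it would probably need to be converted to a clean 0/1 indicator first.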