I have a dataframe with a column that randomly starts a "count" back at 1. My goal is to produce a new_col that divides my current column by the the last value in a count. See below for an example.
This is my current DataFrame:
col
0 1.0
1 2.0
2 3.0
3 1.0
4 2.0
5 1.0
6 2.0
7 3.0
8 4.0
9 5.0
10 1.0
11 2.0
12 3.0
Trying to get an output like so:
col new_col
0 1.0 0.333
1 2.0 0.667
2 3.0 1.000
3 1.0 0.500
4 2.0 1.000
5 1.0 0.200
6 2.0 0.400
7 3.0 0.600
8 4.0 0.800
9 5.0 1.000
10 1.0 0.333
11 2.0 0.667
12 3.0 1.000
This is what I have tried so far:
df['col_bool'] = pd.DataFrame(df['col'] == 1.0)
idx_lst = [x - 2 for x in df.index[df['col_bool']].tolist()]
idx_lst = idx_lst[1:]
mask = (df['col'] != 1.0)
df_valid = df[mask]
for i in idx_lst:
df['new_col'] = 1.0 / df_valid.iloc[i]['col']
df.loc[mask, 'new_col'] = df_valid['col'] / df_valid.iloc[i]['col']
This understandably results in an index error. Maybe I need to make a copy of a DataFrame each time and concat. I believe this would work but I want to ask if I am missing any shortcuts here?