I am fairly new to pandas and need help modifying my dataframe by comparing consecutive rows within each group, where a group is a pair of column values:
Example dataframe:
idData  idStation  idCast  Pressure
1       1          1       1505
2       1          1       1506
3       1          1       1507
4       1          1       1508
5       1          2       1505
6       1          2       1506
7       1          2       1503
8       1          2       1504
9       2          1       1505
10      2          1       1506
11      2          1       1507
etc
I want to delete any row whose Pressure value is less than a value above it within the same idStation and idCast pair (i.e. the rows where idData = 7 and 8 need to be deleted). Different idStation and idCast pairs must not be compared with each other: the first iteration of the loop would look through the Pressure record for idStation = 1, idCast = 1, the second iteration through the record for idStation = 1, idCast = 2, and so on.
I have tried grouping by idStation and idCast, then looping over the groups and comparing row by row, but this only modifies a copy, not the original dataframe, so the changes are lost:
Stn_Cast_Group = Dataframe.groupby(['idStation','idCast'])
for name, group in Stn_Cast_Group:
    j = 0
    for i in range(1,len(group['Pressure'])):
        if group['Pressure'].iloc[i] < j:
            group['Pressure'].iloc[i] = np.nan
        else:
            j = group['Pressure'].iloc[i]
This sets the right Pressure values to NaN (I am unsure how to delete the rows themselves), but only in the copy of each group, not in the original dataframe.
How would you create a copy of the dataframe (so as to have access to original and modified versions) and then delete the rows as mentioned above?
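One possible approach, sketched here rather than offered as a definitive answer: compute a running maximum of Pressure within each idStation and idCast pair with `groupby(...).cummax()`, then keep only the rows whose Pressure equals that running maximum. The names `df`, `running_max`, `keep`, `filtered` and `marked` are assumptions made for illustration, and the sample data is copied from the table above. Rows that tie with an earlier value are kept under this rule.

```python
import pandas as pd

# Sample data copied from the question; `df` stands in for the original dataframe.
df = pd.DataFrame({
    "idData":    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11],
    "idStation": [1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2],
    "idCast":    [1, 1, 1, 1, 2, 2, 2, 2, 1, 1, 1],
    "Pressure":  [1505, 1506, 1507, 1508, 1505, 1506, 1503, 1504, 1505, 1506, 1507],
})

# Running maximum of Pressure within each (idStation, idCast) pair.
# The result is aligned with df's index, so no explicit loop is needed.
running_max = df.groupby(["idStation", "idCast"])["Pressure"].cummax()

# True where the row is at least as large as every earlier row in its group.
keep = df["Pressure"] >= running_max

# Modified version with the offending rows removed; df itself is left untouched.
filtered = df[keep].copy()

# Alternative: keep every row but blank out the offending Pressure values as NaN.
marked = df.assign(Pressure=df["Pressure"].where(keep))

print(filtered)
```

Because the boolean mask is built from the original dataframe and applied to it in a single step, nothing is assigned into a per-group copy, which is where the changes were being lost in the loop version. `df` still holds the original data and `filtered` holds the modified version.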