I have the following data set:
import pandas as pd
d = {'person': [1, 1, 1, 1, 1, 1], 'id': ['-8', '-5', '-4', '-3', '-3', '-2'], 'obs': ['A', 'B', 'C', 'D', 'E', 'F']}
df_start = pd.DataFrame(data=d)
I need to create an output dataset like:
d_end = {'id':[-8,-8,-5,-8,-5,-4,-5,-4,-3,-3,-5,-4,-3,-3],
'obs':['A','A','B','A','B','C','B','C','D','E','B','C','D','E'],
'id_group':[-8,-5,-5,-4,-4,-4,-3,-3,-3,-3,-2,-2,-2,-2]}
df_end = pd.DataFrame(data=d_end)
I am trying to group the rows using a new column called id_group, which is created by comparing id values across rows. Every id belongs to its own id_group. An id also belongs to another row's id_group if (id + 4) is greater than or equal to the id on that other row.
I have not been able to get very far trying to do this with a for loop, so I am very open to suggestions.
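For reference, here is a rough sketch of the rule as I understand it, written as a self-merge instead of a for loop. It rests on a few assumptions that are not spelled out above: the string ids are converted to integers, the comparison only happens between rows of the same person, and a row only joins the groups of ids that are not smaller than its own (that is, id <= id_group <= id + 4, which is what the expected output suggests). The names WINDOW, groups, pairs and df_out are just placeholders.

```python
import pandas as pd

d = {'person': [1, 1, 1, 1, 1, 1],
     'id': ['-8', '-5', '-4', '-3', '-3', '-2'],
     'obs': ['A', 'B', 'C', 'D', 'E', 'F']}
df_start = pd.DataFrame(data=d)

WINDOW = 4  # placeholder name for the "+4" in the rule

# Assumption: ids are meant to be compared as numbers, not strings.
df = df_start.assign(id=df_start['id'].astype(int))

# One candidate id_group per distinct id (per person, by assumption).
groups = (df[['person', 'id']]
          .drop_duplicates()
          .rename(columns={'id': 'id_group'}))

# Pair every row with every candidate group of the same person.
pairs = df.merge(groups, on='person')

# Assumed rule: keep a pair when id <= id_group <= id + WINDOW.
mask = (pairs['id'] <= pairs['id_group']) & (pairs['id'] + WINDOW >= pairs['id_group'])

df_out = (pairs.loc[mask, ['id', 'obs', 'id_group']]
          .sort_values(['id_group', 'id', 'obs'])
          .reset_index(drop=True))
print(df_out)
```

On the sample data this gives the 14 rows of df_end plus one extra row (id -2, obs 'F', id_group -2), because under the "every id belongs to its own id_group" rule the last row lands in its own group too. The merge builds every row/group pair per person before filtering, so it may get heavy on large inputs.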