From a pandas dataframe, I want to remove the "rois" where half or more of the rows have a value below 50 in any of the columns s, b1 or b2.
Here is an example dataframe:
roi s b1 b2
4 40 60 70
4 60 40 80
4 80 70 60
5 60 40 60
5 60 60 60
5 60 60 60
Only the three rows corresponding to roi 5 should be left over (roi 4 has 2 out of 3 rows where at least one of the values of s, b1, b2 is below 50).
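For reproducibility, the example above could be built and checked roughly as follows. This is only a sketch: the names bad_row and bad_fraction are made up for illustration, and the real data may have more columns than the four shown here.

```python
import pandas as pd

# The example dataframe from above
data = pd.DataFrame({
    "roi": [4, 4, 4, 5, 5, 5],
    "s":   [40, 60, 80, 60, 60, 60],
    "b1":  [60, 40, 70, 40, 60, 60],
    "b2":  [70, 80, 60, 60, 60, 60],
})

# True for rows where at least one of s, b1, b2 is below 50
bad_row = (data[["s", "b1", "b2"]] < 50).any(axis=1)

# Fraction of such rows per roi (the mean of a boolean Series is the share of True)
bad_fraction = bad_row.groupby(data["roi"]).mean()
print(bad_fraction)
```

Here roi 4 has 2 of its 3 rows flagged and roi 5 has 1 of 3, so only roi 4 reaches the "half or more" limit.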
I have already implemented this, but I wonder if there is a shorter (i.e. faster and cleaner) way to do it:
for roi in data.roi.unique():
    subdata = data[data['roi'] == roi]
    subdatas = subdata[subdata['s'] >= 50]
    subdatab1 = subdatas[subdatas['b1'] >= 50]
    subdatab2 = subdatab1[subdatab1['b2'] >= 50]
    if subdatab2.size / subdata.size < 0.5:
        data = data[data['roi'] != roi]
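One possible shorter version, as a sketch rather than a definitive answer, is to flag the rows that fall below the threshold, compute the share of flagged rows per roi with groupby and transform, and keep only the rows whose roi stays under one half. The function name drop_bad_rois and its parameters cols and threshold are hypothetical and chosen only for this illustration.

```python
import pandas as pd

def drop_bad_rois(data, cols=("s", "b1", "b2"), threshold=50):
    """Drop every roi where half or more of its rows have a value below threshold."""
    # True for rows where at least one of the checked columns is below the threshold
    bad_row = (data[list(cols)] < threshold).any(axis=1)
    # Share of flagged rows within each roi, broadcast back to every row
    bad_fraction = bad_row.groupby(data["roi"]).transform("mean")
    # Keep rows whose roi has fewer than half of its rows flagged
    return data[bad_fraction < 0.5]

data = pd.DataFrame({
    "roi": [4, 4, 4, 5, 5, 5],
    "s":   [40, 60, 80, 60, 60, 60],
    "b1":  [60, 40, 70, 40, 60, 60],
    "b2":  [70, 80, 60, 60, 60, 60],
})
result = drop_bad_rois(data)
print(result)
```

This avoids the Python-level loop and the repeated re-slicing of data. Note that it follows the wording of the requirement ("half or more"), so a roi with exactly half of its rows flagged is dropped, whereas the loop above would keep it.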