I have three columns: ['date'], which holds the date; ['id'], which holds product IDs; and ['rating'], which holds the rating of each product on each date. I want to create a dummy variable ['threshold'] that equals 1 when, within the same value of ['id'], the rating went from anywhere above 5 to anywhere below 6. My code uses a for loop as follows:

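df['threshold'] = np.zeros(df.shape[0])
# Start at 1: with i = 0, iloc[i-1] would wrap around to the last row.
for i in range(1, df.shape[0]):
    if df.iloc[i]['id'] == df.iloc[i-1]['id'] and df.iloc[i-1]['rating'] > 5 and df.iloc[i]['rating'] < 6:
        # Assign by position in one step; df.iloc[i]['threshold'] = 1 only modifies a copy.
        df.iloc[i, df.columns.get_loc('threshold')] = 1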

Is there a way to perform this without using a for loop?

1 Answer

Use Series.shift to get the previous row's values, compare the ids with Series.eq, and convert the resulting boolean mask to integers 0 and 1 with Series.astype (Series.view is deprecated in recent pandas versions):

df['threshold'] = (df['id'].eq(df['id'].shift()) &
                   df['rating'].shift().gt(5) &
                   df['rating'].lt(6)).astype(int)
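To see the shift-based mask at work, here is a minimal sketch on made-up data. Only the column names come from the question; the values and the sorting step are assumptions, added because shift compares each row with the one directly above it, so the rows need to be ordered by id and then by date.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "date": pd.to_datetime(["2021-01-01", "2021-01-02", "2021-01-03",
                            "2021-01-01", "2021-01-02", "2021-01-03"]),
    "id": ["a", "a", "a", "b", "b", "b"],
    "rating": [7, 4, 9, 5, 8, 5],
})

# Assumption: rows must be ordered by id, then date, before shifting.
df = df.sort_values(["id", "date"]).reset_index(drop=True)

# True where the previous row belongs to the same product.
same_id = df["id"].eq(df["id"].shift())

# True where the previous rating was above 5 and the current one is below 6.
dropped = df["rating"].shift().gt(5) & df["rating"].lt(6)

# Combine both conditions and convert the boolean mask to 0/1.
df["threshold"] = (same_id & dropped).astype(int)

print(df)
```

The first row of product "b" has a rating of 5 and follows a rating of 9, but the 9 belongs to product "a", so the same_id check keeps that row at 0.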