Pandas dataframe if else condition based on previous rows

Question

I have a pandas dataframe as below:

df = pd.DataFrame({'X':[1,1,1, 0, 0]})
df

    X
0   1
1   1
2   1
3   0
4   0

Now I want to create another column 'Y', whose values should be based on the conditions below:

If X = 1, Y = 1
If X = 0 and previous X = 1, Y = 2
If X = 0 and previous X = 0, Y = 0

So my final output should look like this:

    X  Y
0   1  1
1   1  1
2   1  1
3   0  2
4   0  0

This can be achieved by iterating over the rows, tracking the current and previous row with iloc, but I am looking for a faster, more efficient way to do it.

Celius Stingher · Accepted Answer · 2019-10-10 23:32:04Z


You can try using np.where and shift:

import pandas as pd
import numpy as np

df = pd.DataFrame({'X': [1, 1, 1, 0, 0]})
# Y = 1 where X == 1; otherwise 2 if the previous X was 1, else 0
df['Y'] = np.where(df['X'] == 1, 1, np.where(df['X'].shift(periods=1) == 1, 2, 0))
print(df)

Output:

   X  Y
0  1  1
1  1  1
2  1  1
3  0  2
4  0  0
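To see why this works, it may help to look at what shift produces on its own. The following is a minimal sketch on the same data; the helper column prev_X is a hypothetical name used only for illustration and is not part of the answer above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'X': [1, 1, 1, 0, 0]})

# shift(1) moves every value down one row, so each row sees the previous X.
# The first row has no predecessor and gets NaN.
df['prev_X'] = df['X'].shift(1)  # prev_X is a hypothetical helper column

# The outer np.where handles X == 1; the inner one only decides the
# remaining rows, based on the previous value.
df['Y'] = np.where(df['X'] == 1, 1,
                   np.where(df['prev_X'] == 1, 2, 0))
print(df)
```

Note that if the very first row had X = 0, it would get Y = 0 here, because comparing NaN with 1 evaluates to False.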


PMende · Accepted Answer · 2019-10-10 23:49:23Z


Celius provided an answer with nested calls to np.where. This can become unwieldy as the number of conditions grows. You can use np.select instead to achieve the same result:

import numpy as np
import pandas as pd


df = pd.DataFrame({
    'X': [1, 1, 1, 0, 0]
})
conditions = [
    df["X"] == 1,
    (df["X"] == 0) & (df["X"].shift() == 1),
    (df["X"] == 0) & (df["X"].shift() == 0)
]
values = [1, 2, 0]
df['Y'] = np.select(conditions, values, default=np.nan)
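One thing to keep in mind with default=np.nan: the resulting column is float, and a first row with X = 0 matches none of the conditions, since shift() yields NaN there. Below is a sketch of one way to handle this, assuming a missing previous row should be treated as previous X = 0. That choice is an assumption, not something the question specifies, and the sample data is illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'X': [0, 1, 0, 0]})  # illustrative data starting with 0

# fill_value=0 treats the missing predecessor of the first row as X = 0
# (an assumption about the desired behavior).
prev = df['X'].shift(fill_value=0)

conditions = [
    df['X'] == 1,
    (df['X'] == 0) & (prev == 1),
]
# np.select checks the conditions in order; rows matching none of them
# fall through to the default, which covers "X = 0 and previous X = 0".
df['Y'] = np.select(conditions, [1, 2], default=0)
print(df)
```

With an integer default, Y stays an integer column, and the third condition from the list above is no longer needed because the default covers it.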
