Pandas dataframe numpy where multiple conditions

Question

Using pandas and numpy. How may I achieve the following:

df['thecol'] = np.where(
    (df["a"] >= df["a"].shift(1)) &
    (df["a"] >= df["a"].shift(2)) &
    (df["a"] >= df["a"].shift(3)) &
    (df["a"] >= df["a"].shift(4)) &
    (df["a"] >= df["a"].shift(5)) &
    (df["a"] >= df["a"].shift(6)) &
    (df["a"] >= df["a"].shift(7)) &
    (df["a"] >= df["a"].shift(8)) &
    (df["a"] >= df["a"].shift(9)) &
    (df["a"] >= df["a"].shift(10)),
    'istrue', 'isnottrue')

How can I avoid this ugly repetition when only the number changes? I would like the same code to work for any number I provide, without typing every condition out manually.

It is meant to compare the current value in column "a" with the value in the same column one row above, two rows above, and so on, and to give "istrue" if all of these conditions are true.

I tried shifting the dataframe in a for loop, appending each value to a list, and taking the maximum of that list so that I would need (df["a"] >= maxvalue) only once, but I could not get that to work either. I am a novice at Python and will likely ask more silly questions in the near future.

The code above works, but I would like it to work without this much repetitive code so I can learn to code properly. I also tried examples with a yield generator but could not get them working.

@Edit: Answered by Wen. I needed rolling.

In the end I came up with this terrible terrible approach:

def whereconditions(n):
    s1 = 'df["thecol"] = np.where('
    L = []
    while n > 0:
        s2 = '(df["a"] >= df["a"].shift('+str(n)+')) &'
        L.append(s2)
        n = n -1
    s3 = ",'istrue','isnottrue')"
    r = s1+str([x for x in L]).replace("'","").replace(",","").replace("&]","")+s3
    return str(r.replace("([(","(("))
call = whereconditions(10)
exec(call)

Is there perhaps a way to use a slice in the shift, so that I need only one condition, something like df["a"] >= df["a"].shift(:100), to check whether the current value in column "a" is >= the value in each of the 100 rows above?
– Cactus, Commented Dec 1, 2017 at 20:50
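
For reference, the repeated conditions can be built in a loop and combined without exec. This is only a minimal sketch: the helper name at_least_previous and the sample data are made up for illustration and are not from the original post.

```python
from functools import reduce
import operator

import numpy as np
import pandas as pd


def at_least_previous(s: pd.Series, n: int) -> pd.Series:
    """True where s is >= each of the n values directly above it (hypothetical helper)."""
    # One boolean Series per lag, built in a loop instead of typed out by hand.
    conditions = [s >= s.shift(k) for k in range(1, n + 1)]
    # Combine them with element-wise AND, exactly like chaining them with &.
    return reduce(operator.and_, conditions)


# Made-up sample data for illustration.
df = pd.DataFrame({"a": [1, 3, 2, 5, 4, 6]})
df["thecol"] = np.where(at_least_previous(df["a"], 2), "istrue", "isnottrue")
print(df)
```

shift() does not accept a slice, so a loop (or a rolling window) is needed. Rows with fewer than n rows above them come out as "isnottrue", because a comparison against the NaN produced by shift is False.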

BENY · Accepted Answer · 2017-12-01 20:57:12Z

Sounds like you need rolling:

np.where(df['a']==df['a'].rolling(10).max(),'istrue','isnottrue')
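
One detail worth checking: a rolling window includes the current row, so rolling(10) covers the current value and the 9 rows above it. To reproduce the question's shift(1) through shift(10) exactly, the window would need to be 11. The sketch below uses made-up data and a smaller n to cross-check the two forms. The variable names are illustrative only.

```python
import numpy as np
import pandas as pd

n = 3  # number of previous rows to compare against (10 in the question)
df = pd.DataFrame({"a": [1, 3, 2, 5, 4, 6, 6, 2]})  # made-up sample data

# The window includes the current row, so n previous rows need a window of n + 1.
# The first n rows have an incomplete window, so max() is NaN and the test is False,
# which matches the shift-based version (comparisons against NaN are False).
is_max = df["a"] == df["a"].rolling(n + 1).max()
df["thecol"] = np.where(is_max, "istrue", "isnottrue")

# Cross-check against the explicit shift conditions from the question.
explicit = pd.Series(True, index=df.index)
for k in range(1, n + 1):
    explicit &= df["a"] >= df["a"].shift(k)

print(is_max.tolist() == explicit.tolist())
```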

4 Comments

Cactus Over a year ago

Thank you. Rolling is what I needed indeed. I shall look up documentation more before posting next time.

BENY Over a year ago

@Cactus yw, happy coding

Cactus Over a year ago

And do you know how to look into the future instead, that is, to compare with the rows below the current value (a negative rolling window)? The data is a time series where the first index is the oldest. I get the error: ValueError("window must be non-negative")

BENY Over a year ago

@Cactus make your df1 = df[::-1], then do df['NEW'] = np.where(df1['a'] == df1['a'].rolling(10).max(), 'istrue', 'isnottrue')[::-1]
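
A small sketch of that reversal trick, using made-up data and a hypothetical window size n. It reverses only the column, takes the rolling max, and reverses back, so each result lands on its original row.

```python
import numpy as np
import pandas as pd

n = 3  # hypothetical window size: the current row plus the n - 1 rows below it
df = pd.DataFrame({"a": [6, 2, 5, 4, 1, 3]})  # made-up sample data

# Reverse the column so that "rows below" become "rows above", take the rolling max,
# then reverse back so the index is in its original order again.
forward_max = df["a"][::-1].rolling(n).max()[::-1]
df["NEW"] = np.where(df["a"] == forward_max, "istrue", "isnottrue")
print(df)
```

The last n - 1 rows have no complete forward window, so their max is NaN and they come out as "isnottrue".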
