Using pandas and numpy. How may I achieve the following:

df['thecol'] = np.where(
(df["a"] >= df["a"].shift(1)) &
(df["a"] >= df["a"].shift(2)) &
(df["a"] >= df["a"].shift(3)) &
(df["a"] >= df["a"].shift(4)) &
(df["a"] >= df["a"].shift(5)) &
(df["a"] >= df["a"].shift(6)) &
(df["a"] >= df["a"].shift(7)) &
(df["a"] >= df["a"].shift(8)) &
(df["a"] >= df["a"].shift(9)) &
(df["a"] >= df["a"].shift(10))
,'istrue','isnottrue')

How can I do this without such ugly repetition, given that only the shift number changes? I would like the same logic to work for any number I provide, without typing it all out manually.

It is meant to compare the current value in column "a" to the value in the same column one row above, two rows above, and so on, and result in "istrue" if all of these conditions are true.

I tried shifting the dataframe in a for loop, appending each value to a list, and taking the maximum so that (df["a"] >= maxvalue) is needed only once, but I could not get it to work. I am a novice at Python and will likely ask more silly questions in the near future.

The code above works, but I would like to achieve the same result without this much repetitive code so I can learn to code properly. I also tried examples with a yield generator but could not get them working either.

@Edit: Answered by Wen. I needed rolling.

In the end I came up with this terrible terrible approach:

def whereconditions(n):
    s1 = 'df["thecol"] = np.where('
    L = []
    while n > 0:
        s2 = '(df["a"] >= df["a"].shift('+str(n)+')) &'
        L.append(s2)
        n = n -1
    s3 = ",'istrue','isnottrue')"
    r = s1+str([x for x in L]).replace("'","").replace(",","").replace("&]","")+s3
    return str(r.replace("([(","(("))
call = whereconditions(10)
exec(call)
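For comparison, here is a minimal sketch of the same idea without building a string and calling exec: the conditions are collected in a list and combined with np.logical_and.reduce. The helper name ge_all_previous and the sample data are made up for illustration; they are not part of the original code.

```python
import numpy as np
import pandas as pd


def ge_all_previous(s, n):
    """Return a boolean array: True where s >= each of the n values directly above it."""
    # One boolean Series per shift; comparisons against the NaN rows
    # introduced by shift() evaluate to False.
    conditions = [s >= s.shift(k) for k in range(1, n + 1)]
    # Element-wise AND across all conditions.
    return np.logical_and.reduce(conditions)


# Hypothetical sample data
df = pd.DataFrame({"a": [1, 3, 2, 5, 4, 6]})
df["thecol"] = np.where(ge_all_previous(df["a"], 2), "istrue", "isnottrue")
print(df)
```

With n = 2, only the rows holding 5 and 6 are marked "istrue", since each is at least as large as the two values above it. The first n rows are always "isnottrue" because they have fewer than n rows above them.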
Is there a way to use a slice in the shift, perhaps, so that I need only one condition, something like df["a"] >= df["a"].shift(:100), to check whether the current value in column "a" is >= each of the values in the 100 rows above? Commented Dec 1, 2017 at 20:50

1 Answer

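Sounds like you need rolling. The window counts the current row too, so comparing against the 10 rows above takes a window of 11:

np.where(df['a']==df['a'].rolling(11).max(),'istrue','isnottrue')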
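A quick way to check that the rolling form matches the chained shift comparisons is to compute both on a small frame. The sketch below assumes n previous rows and a rolling window of n + 1, because the window includes the current row. The sample data and the variable names explicit and rolled are illustrative only.

```python
import numpy as np
import pandas as pd

n = 3  # number of rows above to compare against (hypothetical value)
df = pd.DataFrame({"a": [5, 1, 2, 3, 3, 7, 6, 8]})  # hypothetical data

# Explicit form: current value >= each of the n rows above it.
explicit = np.logical_and.reduce(
    [df["a"] >= df["a"].shift(k) for k in range(1, n + 1)]
)

# Rolling form: the current value equals the maximum of a window
# holding the current row plus the n rows above it.
rolled = (df["a"] == df["a"].rolling(n + 1).max()).to_numpy()

df["thecol"] = np.where(rolled, "istrue", "isnottrue")
print((explicit == rolled).all())
print(df)
```

Both forms mark the first n rows as "isnottrue": the shifted comparisons hit NaN there, and the rolling maximum is NaN until the window is full. This assumes column "a" itself contains no NaN values.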

4 Comments

Thank you. Rolling is what I needed indeed. I shall look up documentation more before posting next time.
@Cactus You're welcome, happy coding.
And do you know how to look into the future instead, that is, to compare with the rows below the current value (a negative rolling window)? The data is a time series where the first index is the oldest. I get the error: ValueError("window must be non-negative")
@Cactus Reverse the frame with df1 = df[::-1], then do df['NEW'] = np.where(df1['a'] == df1['a'].rolling(10).max(), 'istrue', 'isnottrue')[::-1]
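The reverse-then-roll idea from the last comment can be sketched as follows. This is one possible arrangement, not necessarily the only one: the column is reversed, the rolling maximum is taken, and the result is reversed back so that each row sees itself plus the n rows below it. The value of n, the sample data, and the names reversed_a and forward_max are assumptions for illustration.

```python
import numpy as np
import pandas as pd

n = 2  # number of rows below to compare against (hypothetical value)
df = pd.DataFrame({"a": [4, 2, 5, 1, 3, 3]})  # hypothetical data

# Reverse the column so that "rows below" become "rows above".
reversed_a = df["a"][::-1]

# Rolling max over the current row plus n neighbours, then restore the
# original order so the index lines up with df again.
forward_max = reversed_a.rolling(n + 1).max()[::-1]

df["NEW"] = np.where(df["a"] == forward_max, "istrue", "isnottrue")
print(df)
```

Here only the row holding 5 is marked "istrue". The last n rows come out as "isnottrue" because there are not enough rows below them to fill the window.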
