I have the following array:

a = np.array([6,5,4,3,4,5,6])

Now I want to get all elements which are greater than 4 and also have an index of 2 or greater. The way that I have found to do that is the following:

a[2:][a[2:]>4]

Is there a better or more readable way to accomplish this?

UPDATE: This is a simplified version. In reality the indexing is done with arithmetic operation over several variables like this:

a[len(trainPredict)+(look_back*2)+1:][a[len(trainPredict)+(look_back*2)+1:]>4]

trainPredict is a numpy array and look_back is an integer.
I wanted to see if there is an established way or how others do that.

  • Are you looking for the elements, the indices of the elements (in the original array, presumably), or a mask for the elements? Commented Oct 9, 2019 at 18:29
  • @MadPhysicist I am looking for the elements on part of the array as shown in the sample: a[2:][a[2:]>4] Commented Oct 9, 2019 at 23:08
  • You should select the posted answer. It's about as concise and accurate as you can be. Commented Oct 10, 2019 at 0:18
  • @MadPhysicist it is the same as that I have written in the question: a[2:][a[2:]>4], just in three lines instead of one. If there is no other way, then I will have my answer and will select it. Commented Oct 10, 2019 at 0:35
  • The other ways I can think of are all much less efficient. I'll write an answer to prove it. The existing answer is a much cleaner way than the one-liner because it avoids redundant temp arrays. Commented Oct 10, 2019 at 2:37

2 Answers

2

If you're worried about the complexity of the slice and/or the number of conditions, you can always separate them:

a = np.array([6,5,4,3,4,5,6])

a_slice = a[2:]

cond_1 = a_slice > 4

res = a_slice[cond_1]

Is your example very simplified? There might be better solutions for more complex manipulations.
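For the updated expression in the question, the same idea applies: compute the start index once, give it a name, and slice only once. The sketch below is only an illustration. The values of train_predict and look_back are small stand-ins (the real ones are not shown in the question), and the names start, tail, result and narrowed are made up here.

```python
import numpy as np

a = np.array([6, 5, 4, 3, 4, 5, 6])

# Stand-in values, assumed for illustration only.
train_predict = np.zeros(1)
look_back = 0

# Compute the start index once instead of repeating the arithmetic.
start = len(train_predict) + (look_back * 2) + 1

# Basic slicing returns a view, so this does not copy the data.
tail = a[start:]

# A single condition on the sliced part.
result = tail[tail > 4]

# Several conditions can be combined with & (note the parentheses).
narrowed = tail[(tail > 4) & (tail < 6)]

print(start, result, narrowed)
```

Naming the start index also makes it easier to check the arithmetic separately from the filtering.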

1

@AlexanderCécile's answer is not only more legible than the one-liner you posted, but also removes the redundant computation of a temp array. Despite that, it does not appear to be any faster than your original approach.

The timings below are all run with a preliminary setup of

import numpy as np
np.random.seed(0xDEADBEEF)
a = np.random.randint(8, size=N)

N varies from 1e3 to 1e8 in factors of 10. I tried four variants of the code:

  1. CodePope: result = a[2:][a[2:] > 4]
  2. AlexanderCécile: s = a[2:]; result = s[s > 4]
  3. MadPhysicist1: result = a[np.flatnonzero(a[2:] > 4) + 2]
  4. MadPhysicist2: result = a[(a > 4) & (np.arange(a.size) >= 2)]
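As a quick sanity check, the four variants can be run on the small array from the question to confirm they select the same elements. This is a minimal sketch; the names r1 to r4 are made up here, and the flatnonzero variant is written with the > 4 comparison inside the call, which is an assumption about the intended expression.

```python
import numpy as np

a = np.array([6, 5, 4, 3, 4, 5, 6])

# 1. Slice twice, mask once.
r1 = a[2:][a[2:] > 4]

# 2. Slice once, reuse the view for both the mask and the selection.
s = a[2:]
r2 = s[s > 4]

# 3. Integer (fancy) indexing: positions within the slice, shifted back by 2.
r3 = a[np.flatnonzero(a[2:] > 4) + 2]

# 4. One full-length mask combining the value and index conditions.
r4 = a[(a > 4) & (np.arange(a.size) >= 2)]

print(r1, r2, r3, r4)
```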

In all cases, the timing was obtained on the command line by running

python -m timeit -s 'import numpy as np; np.random.seed(0xDEADBEEF); a = np.random.randint(8, size=N)' '<X>'

Here, N was a power of 10 from 1e3 to 1e8, and <X> was one of the expressions above. Timings are as follows:

[Figure: timings of the four methods as a function of N]

Methods #1 and #2 are virtually indistinguishable. What is surprising is that in the range between ~5e3 and ~1e6 elements, method #3 seems to be slightly but noticeably faster. I would not normally expect that from fancy indexing. Method #4 is, as expected, the slowest, since it builds several full-length temporary arrays.

Here is the data, for completeness:

           CodePope  AlexanderCécile  MadPhysicist1  MadPhysicist2
1000       3.77e-06         3.69e-06       5.48e-06       6.52e-06
10000       4.6e-05         4.59e-05       3.97e-05       5.93e-05
100000     0.000484         0.000483         0.0004       0.000592
1000000     0.00513          0.00515        0.00503        0.00675
10000000     0.0529           0.0525         0.0617          0.102
100000000     0.657            0.658          0.782           1.09
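The measurements can also be reproduced from within Python using the standard timeit module rather than the command line. The helper below is a rough sketch: the name time_expression and the repeat and number defaults are choices made here, not part of the original benchmark, so absolute numbers will differ from the table.

```python
import timeit

import numpy as np


def time_expression(stmt, n, repeat=3, number=100):
    """Return the best per-call time in seconds for stmt on an array of size n."""
    # Same setup as the command-line benchmark.
    np.random.seed(0xDEADBEEF)
    a = np.random.randint(8, size=n)
    # Run the statement `number` times per trial and keep the fastest trial.
    timings = timeit.repeat(
        stmt, globals={"np": np, "a": a}, repeat=repeat, number=number
    )
    return min(timings) / number


for stmt in ("a[2:][a[2:] > 4]", "a[np.flatnonzero(a[2:] > 4) + 2]"):
    print(stmt, time_expression(stmt, 1000))
```

Taking the minimum over the trials reduces the influence of other processes running on the machine.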

2 Comments

Indeed, my answer only improves legibility. Since numpy array slices are views, the overhead of creating a new variable probably outweighs the small performance gain from not slicing twice.
Edit: In the asker's updated code, however, separating the parts as in my answer may lead to an increase in performance since the numerical operations create new arrays.
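The point about views can be checked directly with np.shares_memory. In this small sketch (variable names are made up here), the slice shares its buffer with the original array, while the boolean-mask selection produces a new array.

```python
import numpy as np

a = np.array([6, 5, 4, 3, 4, 5, 6])

# Basic slicing returns a view onto the same memory.
s = a[2:]
slice_is_view = np.shares_memory(a, s)

# Boolean indexing always copies the selected elements.
picked = s[s > 4]
mask_result_is_copy = not np.shares_memory(a, picked)

print(slice_is_view, mask_result_is_copy)
```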
