
In Python I have a numpy.ndarray called a and a list of indices called b. I want a list of all the values of a that are not within 10 positions of any index in b. My current code is below; it takes a long time to run because of repeated data allocations (a is very big):

    aa=a
    # Remove all ranges backwards
    for bb in b[::-1]:
        aa=np.delete(aa, range(bb-10,bb+10))

Is there a way to do it more efficiently? Preferably with few memory allocations.

  • Remember that ranges do not include the "to" value, so your code will delete indices bb-10, bb-9, ..., bb+9. Is this what you intended? Commented Mar 3, 2012 at 12:42
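The point about the half-open range can be checked with a small sketch. The value of bb here is a hypothetical index chosen only for illustration; it does not come from the question.

```python
bb = 50  # hypothetical index, for illustration only

half_open = list(range(bb - 10, bb + 10))   # stops before bb + 10
inclusive = list(range(bb - 10, bb + 11))   # also covers bb + 10

# Print length, first and last element of each window.
print(len(half_open), half_open[0], half_open[-1])
print(len(inclusive), inclusive[0], inclusive[-1])
```

If the intent is a symmetric window of 10 positions on each side, the second form is probably the one wanted.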

2 Answers


np.delete accepts an array of indices of any size. You can build the full array of indices up front and perform the delete once, so the data is deallocated and reallocated only once. (Not tested; possible typos.)

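import numpy as np

b = np.asarray(b)  # b is a list in the question; .size needs an array
bb = np.empty((b.size, 21), dtype=int)
for i, v in enumerate(b):
    bb[i] = v + np.arange(-10, 11)  # indices v-10 .. v+10 inclusive

aa = np.delete(a, bb.flat)  # returns a new array; looks like .flat is optional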

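Note that if your ranges overlap, this gives a different result from your algorithm. Yours deletes one range at a time, so the remaining elements shift between deletions and it ends up removing more items than those originally within 10 indices of an entry in b.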
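As a rough sketch of the single-pass idea, the index windows can be built with broadcasting and applied through a boolean mask. The helper names drop_windows and sequential_delete, the radius parameter, and the sample data are all assumptions made for this example; a is assumed to be one-dimensional. The second helper reproduces the question's loop only to show how the two differ when windows overlap.

```python
import numpy as np

def drop_windows(a, b, radius=10):
    """Return a copy of 1-D `a` without elements within `radius` of any index in `b`."""
    a = np.asarray(a)
    b = np.asarray(b, dtype=np.intp)
    # Broadcasting: one row of window indices per entry of b.
    idx = b[:, None] + np.arange(-radius, radius + 1)
    # Discard out-of-range positions so they neither wrap around nor raise.
    idx = idx[(idx >= 0) & (idx < a.size)]
    keep = np.ones(a.size, dtype=bool)
    keep[idx] = False
    return a[keep]

def sequential_delete(a, b, radius=10):
    """The question's loop, kept only for comparison."""
    aa = a
    for bb in b[::-1]:
        aa = np.delete(aa, range(bb - radius, bb + radius))
    return aa

a = np.arange(100)                 # stand-in data
b = [50, 55]                       # stand-in indices whose windows overlap

once = drop_windows(a, b)          # removes original positions 40..65 only
looped = sequential_delete(a, b)   # removes extra items because positions shift
print(once.size, looped.size)
```

The mask allocates one boolean per element of a plus the result array, which is usually far cheaper than reallocating a once per index.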


1 Comment

Code execution time reduced from 1100 seconds to 9. :)

Could you find a certain number that you're sure will not be in a, and then set all indices around the b indices to that number, so that you can remove it afterwards?

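import numpy as np
b = np.asarray(b)  # b is a list in the question; b + i needs an array
for i in range(-10, 11):
    a[b + i] = number_not_in_a  # overwrites a in place
values = set(np.unique(a)) - {number_not_in_a}  # distinct remaining values only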

This code will not allocate new memory for a at all, needs only one range object created, and does the job in exactly 22 C-optimized NumPy operations (well, 43 if you count the b + i operations), plus the cost of turning the unique return array into a set.

Beware, if b includes indices which are less than 10, the number_not_in_a "zone" around these indices will wrap around to the other end of the array. If b includes indices larger than len(a) - 11, the operation will fail with an IndexError at some point.
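The wrap-around can be seen on a small array, along with one possible guard. The sentinel value, the array size, and the variable names wrapped and guarded are assumptions for this sketch; the guard simply discards positions that fall outside the array before assigning.

```python
import numpy as np

sentinel = -1                      # hypothetical: assumed absent from the data
b = np.array([3])                  # an index closer than 10 to the start

# Naive marking: negative indices wrap around to the end of the array.
wrapped = np.arange(30)
for i in range(-10, 11):
    wrapped[b + i] = sentinel

# Guarded marking: drop out-of-range positions before assigning.
guarded = np.arange(30)
for i in range(-10, 11):
    idx = b + i
    idx = idx[(idx >= 0) & (idx < guarded.size)]
    guarded[idx] = sentinel

print(wrapped[-7:])   # tail overwritten by the wrap-around
print(guarded[-7:])   # tail left intact
```

The same bounds check also avoids the IndexError for indices near the end of the array.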

2 Comments

So how do I get a copy of a without the number_not_in_a values?
@Uri np.delete(a, np.where(a == number_not_in_a)) should do the trick. It seems to me Paul has the better idea though.
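A boolean mask is another way to get the filtered copy, sketched here on made-up data. The array contents and the choice of -1 as number_not_in_a are assumptions for illustration only.

```python
import numpy as np

number_not_in_a = -1                       # hypothetical sentinel value
a = np.array([4, -1, 7, -1, 9])            # stand-in data after marking

# Boolean indexing returns a new array holding only the unmarked values.
kept = a[a != number_not_in_a]

# Equivalent result through np.delete with explicit integer positions.
also_kept = np.delete(a, np.flatnonzero(a == number_not_in_a))

print(kept)
```

Both forms allocate one new array for the result and leave a itself unchanged.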
