
I have an array looking like this:

testdata = [-2, -1, 0, 1, 2, 3, 10, 3, 2, 1, 0, -1, -2]

So it has a maximum value, and to the left and right of it the values gradually go down to zero and can then drop below 0.

The aim of my code is to find the maximum value of the array and sum all the values to its left and right until a 0 is reached (including the maximum value itself). For the testdata above, that would be 1 + 2 + 3 + 10 + 3 + 2 + 1 = 22.

To test my code I generated a larger array like this (ignoring possible values smaller than 0):

data1 = [x for x in range(0, 100000, 1)]
data2 = [x for x in range(100000, -1, -1)]

data3 = data1 + data2

The fastest code I can come up with looks like this:

import numpy as np

j = 1
k = 0

# Index of the (first occurrence of the) maximum value.
max_index = np.where(data3 == np.amax(data3))[0][0]

# Walk right, then left, from the maximum until a non-positive value is hit.
while data3[max_index + j] > 0:
    j += 1

while data3[max_index - k] > 0:
    k += 1

summ1 = np.sum(data3[max_index:(max_index + j)])
summ2 = np.sum(data3[(max_index - k):max_index])
total = summ1 + summ2

print(total)

Any suggestions on how to make this even faster?

  • This looks like a simple pencil-and-paper problem. If the decrease is gradual, no loops are needed, only the formula for an arithmetic progression. Commented Oct 27, 2020 at 22:49
  • It was just for illustration purposes. The values go down to zero gradually but don't follow an arithmetic progression; it's more like declining random noise. Sorry for the confusion. Commented Oct 27, 2020 at 22:58
  • Assuming there is only one peak and the values are weakly decreasing on both sides, you could just use np.maximum(data3, 0).sum(). In other words, replace all negative values with zeros and sum over the entire array. It helps performance if data3 is a numpy array to begin with. Commented Oct 28, 2020 at 4:07
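
A minimal sketch of the one-liner suggested in the last comment, assuming data3 has a single peak with values weakly decreasing to zero on both sides (so the positive values form one contiguous block around the maximum):

import numpy as np

arr = np.asarray(data3)           # works on a plain list too, after conversion
# Clamp negatives to zero and sum everything; only the positive block
# around the peak contributes, so no index search is needed.
total = np.maximum(arr, 0).sum()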

2 Answers


It depends a lot on the data. It seems like you're trying to find an efficient way to return the first index of something in an array. There isn't an efficient one in numpy, because numpy can only scan the whole array (there is no short-circuiting search), but you can use numba instead to outperform numpy.

In case you need to sum a large part of your list, numpy is a good choice:

data3 = np.asarray(data3)                              # the comparison below works elementwise only on an array
zero_idx = np.where(data3 == 0)[0]                     # positions of all zeros
max_loc = np.searchsorted(zero_idx, np.argmax(data3))  # where the peak falls among them
start, end = zero_idx[max_loc - 1], zero_idx[max_loc]  # zeros just left and right of the peak
total_sum = np.sum(data3[start:end])

Otherwise, use the pythonic index method (or numba; a sketch of the numba version follows below):

k = np.argmax(data3)                  # data3 is assumed to be a numpy array here
right_list = data3[k:].tolist()       # values from the peak to the right
left_list = data3[k::-1].tolist()     # values from the peak to the left (reversed)
# list.index(0) short-circuits at the first zero on each side.
s1 = np.sum(data3[k: k + right_list.index(0)])
s2 = np.sum(data3[k - left_list.index(0): k])
total_sum = s1 + s2
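
The numba route mentioned above could look roughly like this. This is only a minimal sketch, assuming numba is installed; the function name peak_sum is illustrative, not part of any library:

import numpy as np
from numba import njit

@njit
def peak_sum(a):
    m = np.argmax(a)
    total = a[m]
    i = m + 1
    while i < a.size and a[i] > 0:    # walk right until the first non-positive value
        total += a[i]
        i += 1
    i = m - 1
    while i >= 0 and a[i] > 0:        # walk left until the first non-positive value
        total += a[i]
        i -= 1
    return total

total_sum = peak_sum(np.asarray(data3))

The first call includes JIT compilation time; subsequent calls run the loops at native speed and stop as soon as a non-positive value is reached on each side.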

Benchmarks: the first method turned out to be about 20x faster, measured with the %timeit magic in a Jupyter Notebook:

512 µs ± 34.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
10.2 ms ± 146 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)



You can use masking instead of using loops.

The masks [data3[max_index:] > 0] and [data3[:max_index] > 0] are equivalent to the slices [max_index:(max_index+j)] and [(max_index-k):max_index], except you don't have to bother finding j and k yourself.

from contextlib import contextmanager
import numpy as np
import time

@contextmanager
def time_this_scope(name):
    """Handy context manager to time a portion of code."""
    t0 = time.perf_counter()
    yield
    print(f"{name} took {time.perf_counter() - t0}s.")

# Preparing the data.
data1 = [x for x in range(0, 100000, 1)]
data2 = [x for x in range(100000, -1, -1)]
data3 = data1 + data2
max_index = np.where(data3 == np.amax(data3))[0][0]
    
# Comparing the performance of both methods.
with time_this_scope("method 1"):
    j = 1
    k = 0
    while data3[max_index + j] > 0:
        j += 1    
    while data3[max_index - k] > 0:
        k += 1
    summ1 = np.sum(data3[max_index:(max_index+j)]) 
    summ2 = np.sum(data3[(max_index-k):max_index])
    total_m1 = summ1 + summ2

with time_this_scope("method 2"):    
    data3 = np.array(data3)
    summ1 = np.sum(data3[max_index:][data3[max_index:] > 0])
    summ2 = np.sum(data3[:max_index][data3[:max_index] > 0])      
    total_m2 = summ1 + summ2
    
# Checking they do yield the same result.
print(total_m1 == total_m2)

>>> method 1 took 0.08157979999998588s.
>>> method 2 took 0.011274500000013177s.
>>> True

4 Comments

Thanks! That looks neat, did not know of that functionality before! I'm wondering, if the array gets larger, would it be sensible to parallelize the summation of the left and right branch using threading/multiprocessing?
@Jailbone It appears that numba is possibly a good choice for parallelizing the index method, but it depends a lot on the data you expect. Another option is to use the numexpr package for further improvement of np.sum once you find which part of data3 to sum.
Thanks for the great info! I am curious about the numexpr package. Could you maybe provide an example of how to use it in this case? I don't quite understand how it works.
@Jailbone. It is quite easy. For example, a substitute for a*b+b*c+c*a with large arrays a, b, c = (np.random.randint(100, size=1000000),) * 3 is: import numexpr as ne; ne.evaluate('a*b+b*c+c*a', {'a': a, 'b': b, 'c': c}). However, I've just noticed it doesn't help speed up np.sum, because np.sum is already quite fast itself.
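
For reference, a runnable version of the numexpr call from the last comment (the arrays are purely illustrative):

import numpy as np
import numexpr as ne

# Three large (here identical) arrays, as in the comment above.
a, b, c = (np.random.randint(100, size=1_000_000),) * 3

# numexpr evaluates the whole expression in one multi-threaded pass,
# avoiding the temporaries numpy would allocate for a*b, b*c and c*a.
result = ne.evaluate('a*b+b*c+c*a', local_dict={'a': a, 'b': b, 'c': c})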
