
Consider a 1-D NumPy input array and a sorted index array. The goal is to compute sums over the input array a, split at the indices defined in the index array.

Below are two approaches, but both require slow Python for loops. Is there a pure NumPy version that avoids Python for loops?

Example:

import numpy as np

a = np.arange(20) # Input array
idxs = np.array([7, 15, 16]) # Index array

# Goal: Split a at index 7, 15 and 16 and
# compute sum for each partition

# Solution 1:
idxs_ext = np.concatenate(([0], idxs, [a.size]))
results = np.empty(idxs.size + 1)
for i in range(results.size):
    results[i] = a[idxs_ext[i]:idxs_ext[i+1]].sum()

# Solution 2:
result = np.array(
    [a_.sum() for a_ in np.split(a, idxs)]
)

# Result: array([21., 84., 15., 70.])
  • check out add.reduceat (Commented Dec 12, 2021 at 12:33)
  • would it be ok to use numba? (Commented Dec 12, 2021 at 18:06)
  • Of course, Numba is totally fine (Commented Dec 13, 2021 at 12:05)

1 Answer


First, you can split the a array at the positions in your idxs array with np.split and then apply your function to each piece:

np.stack(np.vectorize(np.sum)(np.array(np.split(a, idxs), dtype=object)))
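For reference, here is the same one-liner run on the example data from the question (an illustrative check only; the object-array step is needed because the partitions have unequal lengths):

import numpy as np

a = np.arange(20)
idxs = np.array([7, 15, 16])

# Ragged partitions stored as a 1-D object array, then summed element-wise
parts = np.array(np.split(a, idxs), dtype=object)
result = np.stack(np.vectorize(np.sum)(parts))
print(result)  # [21 84 15 70]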

Another approach uses np.add.reduceat, as mentioned by @hpaulj in the comments, and is faster:

np.add.reduceat(a, np.insert(idxs, 0, 0), axis=0)

Update:
Using np.concatenate instead of np.insert reduced the runtime by about 5x for an input of 1000 elements with 7 slices; it was the fastest method I examined:

np.add.reduceat(a, np.concatenate(([0], idxs)), axis=0)
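As a quick sanity check (illustrative only, not reproducing the timing above), the reduceat variant can be compared against the loop-based Solution 1 from the question on the example data:

import numpy as np

a = np.arange(20)             # input array from the question
idxs = np.array([7, 15, 16])  # split points

# Loop-based reference (Solution 1 from the question)
bounds = np.concatenate(([0], idxs, [a.size]))
reference = np.array([a[bounds[i]:bounds[i + 1]].sum() for i in range(bounds.size - 1)])

# reduceat variant from this answer
fast = np.add.reduceat(a, np.concatenate(([0], idxs)))

print(reference)                        # [21 84 15 70]
print(fast)                             # [21 84 15 70]
print(np.array_equal(reference, fast))  # True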

5 Comments

You iterate over the split arrays in Python. The only difference from my example is that you use map instead of a list comprehension, which is, as far as I know, not faster for single-process applications.
@Yannic Now it is pure NumPy. For large data, Numba and … could make your Python loops very efficient. See this useful link on SO.
I have answered your question as you asked for pure NumPy. But in terms of speed, as mentioned in the link in my previous comment, loops may be faster in some code. I tested your code and mine on the Google Colab TPU runtime; your first solution was the fastest of them when decorated with just @nb.jit(), which made it about 5 times faster than without Numba on this small sample case: 100 loops, best of 5: 5.5 µs per loop.
Adding the reduceat saves your answer (in my eyes), though I'd use the OP's idxs_ext[:-1] instead of the insert. np.vectorize is nearly always slower than a list comprehension. It's worth noting that np.split is itself a Python-level iteration.
@hpaulj Yes, you are right. In terms of speed, np.add.reduceat is the correct answer. It did not seem that you wanted to write the answer yourself, just to hint at it in a comment; I apologize if I used it too quickly in my answer (although I credited you). I checked the speed with np.concatenate(([0], idxs)) instead of np.insert(idxs, 0, 0), for an input of 1000 elements and 7 slices, and it was about 5 times faster than insert (4.7 µs per loop). I did not know that about np.split until now; thank you for this valuable point. I would appreciate it if you could point me to more reading on it.
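Since Numba came up in the comments, here is a minimal sketch of what the @nb.jit variant of the OP's first solution might look like (assumes numba is installed; the function name partition_sums is illustrative, not from the original post):

import numba as nb
import numpy as np

@nb.njit
def partition_sums(a, idxs):
    # Boundaries: 0, the split indices, and the array length
    bounds = np.empty(idxs.size + 2, dtype=np.int64)
    bounds[0] = 0
    bounds[1:-1] = idxs
    bounds[-1] = a.size
    # Sum each slice between consecutive boundaries
    out = np.empty(idxs.size + 1, dtype=np.int64)
    for i in range(out.size):
        out[i] = a[bounds[i]:bounds[i + 1]].sum()
    return out

a = np.arange(20)
idxs = np.array([7, 15, 16])
print(partition_sums(a, idxs))  # [21 84 15 70]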
