I have a numpy array, a list of start/end indexes that define ranges within the array, and a list of values, where the number of values is the same as the number of ranges. Doing this assignment in a loop is currently very slow, so I'd like to assign the values to the corresponding ranges in the array in a vectorized way. Is this possible to do?

Here's a concrete, simplified example:

import numpy as np
a = np.zeros(10)

Here are a list of start indexes and a list of end indexes that define ranges within a:

starts = [0, 2, 4, 6]
ends = [2, 4, 6, 8]

And here's a list of values I'd like to assign to each range:

values = [1, 2, 3, 4]

I have two problems. The first is that I can't figure out how to index into the array using multiple slices at the same time, since the list of ranges is constructed dynamically in the actual code. The second is that, once I can extract the ranges, I'm not sure how to assign multiple values at once, one value per range.

Here's how I've tried creating a list of slices and the problems I've run into when using that list to index into the array:

slices = [slice(start, end) for start, end in zip(starts, ends)]


In [97]: a[slices]
...
IndexError: too many indices for array

In [98]: a[np.r_[slices]]
...
IndexError: arrays used as indices must be of integer (or boolean) type
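
For reference, the second error arises because np.r_[slices] receives the list as a single item. Writing np.r_[0:2, 2:4] passes a tuple of slices instead, so converting the dynamically built list with tuple(...) should behave the same way. A minimal sketch, assuming the starts, ends and values from the example (the name lengths is introduced here for illustration):

```python
import numpy as np

starts = [0, 2, 4, 6]
ends = [2, 4, 6, 8]
values = [1, 2, 3, 4]

slices = [slice(s, e) for s, e in zip(starts, ends)]

# np.r_[0:2, 2:4] hands np.r_ a tuple of slices, so tuple(slices)
# is equivalent to spelling the slices out by hand.
idx = np.r_[tuple(slices)]

# One value per index: repeat each value by the length of its range.
lengths = np.subtract(ends, starts)
a = np.zeros(10)
a[idx] = np.repeat(values, lengths)
print(a)
```

Note that np.r_ still processes the slices one by one internally, so this is more of a convenience than a guaranteed speed-up over an explicit loop.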

If I use a static list, I can extract multiple slices at once, but then assignment doesn't work the way I want:

In [106]: a[np.r_[0:2, 2:4, 4:6, 6:8]] = [1, 2, 3]
/usr/local/bin/ipython:1: DeprecationWarning: assignment will raise an error in the future, most likely because your index result shape does not match the value array shape. You can use `arr.flat[index] = values` to keep the old behaviour.
  #!/usr/local/opt/python/bin/python2.7

In [107]: a
Out[107]: array([ 1.,  2.,  3.,  1.,  2.,  3.,  1.,  2.,  0.,  0.])

What I actually want is this:

np.array([1., 1., 2., 2., 3., 3., 4., 4., 0., 0.])

  • Is each slice guaranteed to start where the previous slice ended? Commented Aug 12, 2016 at 17:49
  • No, there may be gaps between slices. The only guarantee is that they don't overlap. Commented Aug 12, 2016 at 18:00

2 Answers

4

This will do the trick in a fully vectorized manner:

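import numpy as np

# starts and ends are lists in the question; convert them so they can be subtracted
starts, ends = np.asarray(starts), np.asarray(ends)
counts = ends - starts  # length of each range (each assumed non-empty)
idx = np.ones(counts.sum(), dtype=int)
# at the first element of every range after the first, step back by the previous range's length
idx[np.cumsum(counts)[:-1]] -= counts[:-1]
# the running sum gives the offset within each range; add each range's start
idx = np.cumsum(idx) - 1 + np.repeat(starts, counts)

a[idx] = np.repeat(values, counts)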
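
To see that the index construction above also handles gaps and ranges of different lengths, here is a small sketch that wraps it in a function and compares it with the plain loop. The helper name range_indices and the sample ranges are made up for illustration, and the helper assumes every range is non-empty (an empty range would produce duplicate positions in the in-place subtraction).

```python
import numpy as np

def range_indices(starts, ends):
    """Concatenate arange(s, e) for each (s, e) pair without a Python loop.

    Hypothetical helper; assumes every range is non-empty (e > s).
    """
    starts = np.asarray(starts)
    ends = np.asarray(ends)
    counts = ends - starts
    # Begin with steps of +1; at the first element of each range after the
    # first, step back by the previous range's length so the running
    # position resets to 1.
    steps = np.ones(counts.sum(), dtype=int)
    steps[np.cumsum(counts)[:-1]] -= counts[:-1]
    # cumsum(steps) - 1 is the offset within each range (0, 1, 2, ...).
    return np.cumsum(steps) - 1 + np.repeat(starts, counts), counts

# Sample ranges with gaps and unequal lengths (illustrative data only).
starts = [1, 5, 9]
ends = [3, 6, 12]
values = [7, 8, 9]

a = np.zeros(13)
idx, counts = range_indices(starts, ends)
a[idx] = np.repeat(values, counts)

# Reference result from the straightforward loop.
expected = np.zeros(13)
for s, e, v in zip(starts, ends, values):
    expected[s:e] = v

assert np.array_equal(a, expected)
```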

2

One possibility is to zip the start, end index with the values and broadcast the index and values manually:

import numpy as np

starts = [0, 2, 4, 6]
ends = [2, 4, 6, 8]
values = [1, 2, 3, 4]
a = np.zeros(10)

# build one index list and one value list per range by zipping starts, ends and values
idx, val = zip(*[(list(range(s, e)), [v] * (e-s)) for s, e, v in zip(starts, ends, values)])

# assign values; concatenate (rather than flatten) so ranges of different lengths also work
a[np.concatenate(idx)] = np.concatenate(val)

a
# array([ 1.,  1.,  2.,  2.,  3.,  3.,  4.,  4.,  0.,  0.])

Or write a for loop to assign the values one range at a time:

for s, e, v in zip(starts, ends, values):
    a[s:e] = v

a
# array([ 1.,  1.,  2.,  2.,  3.,  3.,  4.,  4.,  0.,  0.])

2 Comments

It's hard to beat your last loop for simplicity. And I suspect it will be as fast as any alternative, especially when starting with these 3 lists.
If there are many short ranges as per the example, I suspect my answer is faster, even though it involves something like 6 passes over the data, versus only a single one with this method. But indeed, this is certainly simpler and more readable.
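
One way to check these suspicions on your own data is to time both approaches. The sketch below is only a rough harness: the test data (10,000 short ranges with small random gaps) and the function names with_loop and with_index_array are assumptions made up for illustration, and the timings will depend on the machine, the NumPy version, and the range lengths.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
n_ranges = 10_000
# Hypothetical test data: short, non-overlapping ranges with gaps.
lengths = rng.integers(1, 4, size=n_ranges)  # range lengths 1..3
gaps = rng.integers(0, 3, size=n_ranges)     # gap before each range, 0..2
ends = np.cumsum(lengths + gaps)
starts = ends - lengths
values = rng.random(n_ranges)
size = int(ends[-1])

def with_loop():
    # One slice assignment per range.
    a = np.zeros(size)
    for s, e, v in zip(starts, ends, values):
        a[s:e] = v
    return a

def with_index_array():
    # Build one flat index array, then assign everything at once.
    a = np.zeros(size)
    counts = ends - starts
    idx = np.ones(counts.sum(), dtype=int)
    idx[np.cumsum(counts)[:-1]] -= counts[:-1]
    idx = np.cumsum(idx) - 1 + np.repeat(starts, counts)
    a[idx] = np.repeat(values, counts)
    return a

# Both approaches should fill the array identically.
assert np.array_equal(with_loop(), with_index_array())

for fn in (with_loop, with_index_array):
    print(fn.__name__, timeit.timeit(fn, number=10))
```

Changing the bounds passed to rng.integers for lengths lets you see how the comparison shifts between many short ranges and a few long ones.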
