Python - slice array at different position on every row

Question

I have a 2D NumPy array that I want to slice in an odd way: I want a constant-width slice starting at a different position on every row. I would like to do this in a vectorised way if possible.

e.g. I have the array A=np.array([range(5), range(5)]) which looks like

array([[0, 1, 2, 3, 4],
       [0, 1, 2, 3, 4]])

I would like to slice this as follows: 2 elements from each row, starting at positions 0 and 3. The starting positions are stored in b=np.array([0,3]). The desired output is thus np.array([[0,1],[3,4]]), i.e.

array([[0, 1],
       [3, 4]])

The obvious thing I tried to get this result was A[:,b:b+2] but that doesn't work, and I can't find anything that will.

Speed is important as this will operate on a largish array in a loop, and I don't want to bottleneck other parts of my code.

There's something in numpy.lib.stride_tricks... not to mention a dupe somewhere...
– cs95, Commented Sep 7, 2017 at 8:09
Please provide a Minimal, Complete, and Verifiable example to make it easier for us to answer your question without having to do a lot of extra work ourselves :)
– jwpfox, Commented Sep 7, 2017 at 8:10

Kasravnd · Answer · 2017-09-07 08:10:42Z

4

You can use np.take():

In [21]: slices = np.dstack([b, b+1])

In [22]: np.take(A, slices)
Out[22]: 
array([[[0, 1],
        [3, 4]]])
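
Note that np.take without an axis argument indexes the flattened array, so the result above matches the desired output only because both rows of the question's A are identical. A minimal sketch of a row-wise alternative, assuming NumPy 1.15 or later (where np.take_along_axis is available); the names width and cols are illustrative and not from the original answer:

```python
import numpy as np

A = np.array([[0, 1, 2, 3, 4],
              [5, 6, 7, 8, 9]])   # rows differ on purpose, unlike the question's A
b = np.array([0, 3])              # start column for each row
width = 2                         # slice width (illustrative name)

# Column indices per row, shape (n_rows, width): [[0, 1], [3, 4]]
cols = b[:, None] + np.arange(width)

# For each row i this gathers A[i, cols[i, j]], giving a 2D result
out = np.take_along_axis(A, cols, axis=1)
print(out)
# [[0 1]
#  [8 9]]
```

The result is already 2D, so there is no extra dimension to squeeze away.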

answered Sep 7, 2017 at 8:10

Kasravnd


8 Comments

ShakesBeer Over a year ago

Would this be slow for larger slices and arrays? I'm interested in slicing about 200 elements per row from a matrix of size approximately 2000x4000

Kasravnd Over a year ago

@Shakespeare Size 2000x4000 is not that large. But still there might be some ways to enhance the performance, like using broadcasting and direct slicing instead of using take.

ShakesBeer Over a year ago

Ok I will look into it, it's important for it to run quickly since it feeds data to another part of my program in a loop

Daniel F Over a year ago

I think you need an axis = 1 keyword if the rows of A are not equal

Daniel F Over a year ago

And you'll end up with an extra dimension you'll have to deal with in that case.

Divakar · Accepted Answer · 2017-09-07 09:05:14Z

3

Approach #1 : Here's one approach with broadcasting to get all indices and then using advanced-indexing to extract those -

def take_per_row(A, indx, num_elem=2):
    all_indx = indx[:,None] + np.arange(num_elem)
    return A[np.arange(all_indx.shape[0])[:,None], all_indx]

Sample run -

In [340]: A
Out[340]: 
array([[0, 5, 2, 6, 3, 7, 0, 0],
       [3, 2, 3, 1, 3, 1, 3, 7],
       [1, 7, 4, 0, 5, 1, 5, 4],
       [0, 8, 8, 6, 8, 6, 3, 1],
       [2, 5, 2, 5, 6, 7, 4, 3]])

In [341]: indx = np.array([0,3,1,5,2])

In [342]: take_per_row(A, indx)
Out[342]: 
array([[0, 5],
       [1, 3],
       [7, 4],
       [6, 3],
       [2, 5]])
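
To make the broadcasting step concrete, here is a small sketch that prints the intermediate index array for the sample run above and checks the result against a plain Python loop. The loop is only a reference for comparison, not part of the answer's method:

```python
import numpy as np

A = np.array([[0, 5, 2, 6, 3, 7, 0, 0],
              [3, 2, 3, 1, 3, 1, 3, 7],
              [1, 7, 4, 0, 5, 1, 5, 4],
              [0, 8, 8, 6, 8, 6, 3, 1],
              [2, 5, 2, 5, 6, 7, 4, 3]])
indx = np.array([0, 3, 1, 5, 2])   # start column per row, as in the sample run
num_elem = 2

# Shapes (5, 1) + (2,) broadcast to (5, 2): one row of column indices per input row
all_indx = indx[:, None] + np.arange(num_elem)
print(all_indx)
# [[0 1]
#  [3 4]
#  [1 2]
#  [5 6]
#  [2 3]]

# Row selector of shape (5, 1); it broadcasts against all_indx during indexing
rows = np.arange(len(indx))[:, None]
vectorised = A[rows, all_indx]

# Reference: slice each row separately and stack the pieces
looped = np.stack([A[i, s:s + num_elem] for i, s in enumerate(indx)])
print(np.array_equal(vectorised, looped))
# True
```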

Approach #2 : Using np.lib.stride_tricks.as_strided -

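import numpy as np
from numpy.lib.stride_tricks import as_strided

def take_per_row_strided(A, indx, num_elem=2):
    m,n = A.shape
    A.shape = (-1)    # temporarily flatten in place (raises if this would need a copy)
    s0 = A.strides[0]
    # start position of each row's slice within the flattened array
    l_indx = indx + n*np.arange(len(indx))
    # overlapping windows of length num_elem over the flat array; pick one per row
    out = as_strided(A, (len(A)-num_elem+1, num_elem), (s0,s0))[l_indx]
    A.shape = m,n     # restore the original 2D shape
    return out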
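
The same windowing idea can also be sketched with sliding_window_view, assuming NumPy 1.20 or later. It avoids reshaping A in place and computes the strides for you. The function name take_per_row_windows is hypothetical, and this variant is not part of the timings in this answer:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def take_per_row_windows(A, indx, num_elem=2):
    # windows[i, j] is a read-only view of A[i, j:j+num_elem];
    # overall shape is (m, n - num_elem + 1, num_elem)
    windows = sliding_window_view(A, num_elem, axis=1)
    # Advanced indexing selects one window per row and returns a copy
    return windows[np.arange(A.shape[0]), indx]

A = np.array([[0, 5, 2, 6, 3, 7, 0, 0],
              [3, 2, 3, 1, 3, 1, 3, 7],
              [1, 7, 4, 0, 5, 1, 5, 4],
              [0, 8, 8, 6, 8, 6, 3, 1],
              [2, 5, 2, 5, 6, 7, 4, 3]])
indx = np.array([0, 3, 1, 5, 2])
result = take_per_row_windows(A, indx)
print(result)
# [[0 5]
#  [1 3]
#  [7 4]
#  [6 3]
#  [2 5]]
```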

Runtime test for taking 200 per row from a 2000x4000 matrix

In [447]: A = np.random.randint(0,9,(2000,4000))

In [448]: indx = np.random.randint(0,4000-200,(2000))

In [449]: out1 = take_per_row(A, indx, 200)

In [450]: out2 = take_per_row_strided(A, indx, 200)

In [451]: np.allclose(out1, out2)
Out[451]: True

In [452]: %timeit take_per_row(A, indx, 200)
100 loops, best of 3: 2.14 ms per loop

In [453]: %timeit take_per_row_strided(A, indx, 200)
1000 loops, best of 3: 435 µs per loop

edited Sep 7, 2017 at 9:05

answered Sep 7, 2017 at 8:31

Divakar


17 Comments

ShakesBeer Over a year ago

Would I be correct in assuming this should be the fastest method?

Daniel F Over a year ago

That's usually a good assumption. It is @Divakar after all. That said, this is just my answer wrapped in a function and made a bit more general

ShakesBeer Over a year ago

I got 20 nanoseconds for taking 200 per row from a 2000x4000 matrix, so I'm gonna stick with this. Thanks

ShakesBeer Over a year ago

@Divakar 1.5ms for approach 2!

ShakesBeer Over a year ago

milliseconds, i.e. it's faster

Daniel F · Answer · 2017-09-07 08:34:18Z

1

You can set up a fancy indexing method to find the correct elements:

A = np.arange(10).reshape(2,-1)

x = np.stack([np.arange(A.shape[0])]* 2).T
y = np.stack([b, b+1]).T
A[x, y]

array([[0, 1],
       [8, 9]])

Compare to @Kasramvd's np.take answer:

slices = np.dstack([b, b+1])
np.take(A, slices)

array([[[0, 1],
        [3, 4]]])

np.take by default takes from the flattened array, not row-wise. With an axis = 1 parameter you get all the slices of all the rows:

np.take(A, slices, axis = 1)

array([[[[0, 1],
         [3, 4]]],


       [[[5, 6],
         [8, 9]]]])

Which would need more processing.
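
As a rough sketch of what that extra processing could look like (one possible approach, not necessarily the most efficient), you can keep only the slice that belongs to each row by indexing the first and third axes with the same row numbers:

```python
import numpy as np

A = np.arange(10).reshape(2, -1)
b = np.array([0, 3])
slices = np.dstack([b, b + 1])        # shape (1, 2, 2)

out = np.take(A, slices, axis=1)      # shape (2, 1, 2, 2): every slice for every row
rows = np.arange(A.shape[0])

# Keep out[i, 0, i, :] for each row i, i.e. row i paired with its own slice
result = out[rows, 0, rows]
print(result)
# [[0 1]
#  [8 9]]
```

This still builds every slice for every row before discarding most of them, so the intermediate grows with the square of the number of rows.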

edited Sep 7, 2017 at 8:34

answered Sep 7, 2017 at 8:28

Daniel F


1 Comment

ShakesBeer Over a year ago

Thanks for this answer, went with Divakar's as it's a bit more polished
