Numpy indexing: first (varying) number of elements from each row in 2d array

Question

(Short version of my question: is there an elegant way to emulate TensorFlow's tf.sequence_mask in NumPy?)

I have a 2D array a, where each row holds a sequence of a different length, and a 1D array b holding the sequence lengths. Is there an elegant way to get a flattened array containing only the elements of a that belong to each sequence, i.e. the first b[i] elements of row i?

a = np.array([
    [1, 2, 3, 2, 1],  # I want just [:3] from this row
    [4, 5, 5, 5, 1],  # [:2] from this row
    [6, 7, 8, 9, 0]   # [:4] from this row
])
b = np.array([3, 2, 4])  # 3 elements from the 1st row, 2 from the 2nd, 4 from the 3rd

The desired result:

[1, 2, 3, 4, 5, 6, 7, 8, 9]

By "elegant" I mean something that avoids explicit Python loops.

Divakar · Accepted Answer · 2018-06-26 14:29:46Z


Use broadcasting to build a boolean mask with the same shape as the 2D array, then index a with that mask to extract the valid elements:

a[b[:,None] > np.arange(a.shape[1])]
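To see why the one-liner works, it may help to look at the intermediate mask. Here is a minimal sketch using the arrays from the question; the names cols, mask and result are only illustrative and not part of the original answer.

```python
import numpy as np

a = np.array([
    [1, 2, 3, 2, 1],
    [4, 5, 5, 5, 1],
    [6, 7, 8, 9, 0],
])
b = np.array([3, 2, 4])

# Column indices 0..4, shape (5,)
cols = np.arange(a.shape[1])

# b[:, None] has shape (3, 1). Comparing it with cols (shape (5,))
# broadcasts to a (3, 5) boolean array that is True wherever the
# column index is smaller than that row's length.
mask = b[:, None] > cols

# Boolean indexing returns the selected elements as a 1D array,
# in row-major order.
result = a[mask]
```

Each row of mask is True for its first b[i] columns and False afterwards, so indexing with it keeps exactly the leading elements of every row.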

Sample run -

In [360]: a
Out[360]: 
array([[1, 2, 3, 2, 1],
       [4, 5, 5, 5, 1],
       [6, 7, 8, 9, 0]])

In [361]: b
Out[361]: array([3, 2, 4])

In [362]: a[b[:,None] > np.arange(a.shape[1])]
Out[362]: array([1, 2, 3, 4, 5, 6, 7, 8, 9])
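Since the question asks about emulating tf.sequence_mask, the same comparison can be wrapped in a small helper. The function below is a hypothetical sketch, not a NumPy API; it assumes the default maxlen should be the largest length, and it does not handle an empty lengths array.

```python
import numpy as np

def sequence_mask(lengths, maxlen=None):
    """Hypothetical helper, loosely modeled on tf.sequence_mask.

    Returns a boolean array of shape lengths.shape + (maxlen,) whose
    last axis is True for the first lengths[...] positions.
    """
    lengths = np.asarray(lengths)
    if maxlen is None:
        # Assumption: default to the longest sequence.
        maxlen = int(lengths.max())
    # Same broadcasting comparison as in the answer above.
    return np.arange(maxlen) < lengths[..., None]

a = np.array([
    [1, 2, 3, 2, 1],
    [4, 5, 5, 5, 1],
    [6, 7, 8, 9, 0],
])
b = np.array([3, 2, 4])

# Pass the row width explicitly so the mask matches a's shape.
flat = a[sequence_mask(b, maxlen=a.shape[1])]
```

Passing maxlen=a.shape[1] matters here: with the default, the mask would have only max(b) columns and would no longer match the shape of a.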


1 Comment

SheepPerplexed Over a year ago

This was extremely fast. Thanks! Actually, it was so fast I can't even accept the answer yet...
