Find the n smallest items in a numpy array of arrays

Question

There are plenty of questions on here where one wants to find the nth smallest element in a numpy array. However, what if you have an array of arrays? Like so:

>>> print matrix
[[ 1.          0.28958002  0.09972488 ...,  0.46999924  0.64723113
   0.60217694]
 [ 0.28958002  1.          0.58005657 ...,  0.37668355  0.48852272
   0.3860152 ]
 [ 0.09972488  0.58005657  1.         ...,  0.13151364  0.29539992
   0.03686381]
 ..., 
 [ 0.46999924  0.37668355  0.13151364 ...,  1.          0.50250212
   0.73128971]
 [ 0.64723113  0.48852272  0.29539992 ...,  0.50250212  1.          0.71249226]
 [ 0.60217694  0.3860152   0.03686381 ...,  0.73128971  0.71249226  1.        ]]

How can I get the n smallest items out of this array of arrays?

>>> print type(matrix)
<type 'numpy.ndarray'>

This is how I have been doing it to find the coordinates of the smallest item:

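import numpy

min_coordinates = []
smallest = numpy.amin(matrix)  # compute the global minimum once, not per row
for row in matrix:
    hits = numpy.where(row == smallest)[0]
    if hits.size:  # a size check also catches a match in column 0
        min_coordinates.append(int(hits[0]) + 1)  # 1-based column index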

Now I would like to find, for example, the 10 smallest items.

qwr · Accepted Answer · 2015-07-26 22:16:52Z

6

Flatten the matrix, sort and then select the first 10.

print(numpy.sort(matrix.flatten())[:10])

edited Jul 26, 2015 at 22:16

answered Jul 26, 2015 at 22:11


1 Comment

Warren Weckesser Over a year ago

Instead of calling matrix.flatten(), you could use numpy.sort(matrix, axis=None)[:10]

Warren Weckesser · Answer · 2015-07-27 12:57:59Z

If your array is not large, the accepted answer is fine. For large arrays, np.partition will accomplish this much more efficiently. Here's an example where the array has 10000 elements, and you want the 10 smallest values:

In [56]: np.random.seed(123)

In [57]: a = 10*np.random.rand(100, 100)

Use np.partition to get the 10 smallest values:

In [58]: np.partition(a, 10, axis=None)[:10]
Out[58]: 
array([ 0.00067838,  0.00081888,  0.00124711,  0.00120101,  0.00135942,
        0.00271129,  0.00297489,  0.00489126,  0.00556923,  0.00594738])

Note that the values are not in increasing order. np.partition does not guarantee that the first 10 values will be sorted. If you need them in increasing order, you can sort the selected values afterwards. This will still be faster than sorting the entire array.
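The "partition first, then sort only the selected values" idea described above can be sketched as follows. This is a minimal illustration, not part of the original answer: the helper name `n_smallest` and the seeded test array are assumptions made for the example.

```python
import numpy as np

def n_smallest(values, n):
    """Return the n smallest entries of an array, in increasing order.

    Hypothetical helper for illustration only.
    """
    flat = np.ravel(values)
    if n >= flat.size:
        # Nothing to select: sorting everything is the only option.
        return np.sort(flat)
    # kth = n - 1 puts the n smallest values in the first n slots, unordered.
    candidates = np.partition(flat, n - 1)[:n]
    # Sorting just n values is cheap compared with sorting the whole array.
    return np.sort(candidates)

rng = np.random.default_rng(0)
a = 10 * rng.random((100, 100))
result = n_smallest(a, 10)
print(result)
```

The result should match `np.sort(a, axis=None)[:10]` while only fully sorting 10 values.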

Here's the result using np.sort:

In [59]: np.sort(a, axis=None)[:10]
Out[59]: 
array([ 0.00067838,  0.00081888,  0.00120101,  0.00124711,  0.00135942,
        0.00271129,  0.00297489,  0.00489126,  0.00556923,  0.00594738])

Now compare the timing:

In [60]: %timeit np.partition(a, 10, axis=None)[:10]
10000 loops, best of 3: 75.1 µs per loop

In [61]: %timeit np.sort(a, axis=None)[:10]
1000 loops, best of 3: 465 µs per loop

In this case, using np.partition is more than six times faster.
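The question also asks for the coordinates of the smallest items, not only their values. One possible way to get them, sketched here as an extension and not taken from the answer itself, is to combine np.argpartition with np.unravel_index. The helper name `n_smallest_coordinates` and the small demo array are assumptions, and the returned indices are zero-based (the loop in the question adds 1).

```python
import numpy as np

def n_smallest_coordinates(matrix, n):
    """Return (row, column) pairs of the n smallest entries, smallest first.

    Hypothetical helper; assumes 1 <= n <= matrix.size.
    """
    flat = matrix.ravel()
    # argpartition returns flat indices; the first n point at the n smallest values.
    idx = np.argpartition(flat, n - 1)[:n]
    # Order those n indices by the values they point at.
    idx = idx[np.argsort(flat[idx])]
    # Convert flat indices back into (row, column) positions.
    rows, cols = np.unravel_index(idx, matrix.shape)
    return list(zip(rows.tolist(), cols.tolist()))

demo = np.array([[0.9, 0.2, 0.5],
                 [0.1, 0.7, 0.3]])
coords = n_smallest_coordinates(demo, 3)
print(coords)  # [(1, 0), (0, 1), (1, 2)]
```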

Sede · Answer · 2015-07-26 22:22:16Z

3

You can use the heapq.nsmallest function to return the list of the 10 smallest elements.

In [84]: import heapq

In [85]: heapq.nsmallest(10, matrix.flatten())
Out[85]: 
[-1.7009047695355393,
 -1.4737632239971061,
 -1.1246243781838825,
 -0.7862983016935523,
 -0.5080863016259798,
 -0.43802651199959347,
 -0.22125698200832566,
 0.034938408281615596,
 0.13610084041121048,
 0.15876389111565958]
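If the positions are needed as well, heapq.nsmallest accepts a key function, so it can be fed (index, value) pairs from numpy.ndenumerate. This is only a sketch of that idea with a made-up demo array; for large arrays the pure-Python iteration is likely to be slower than the NumPy-based approaches.

```python
import heapq
import numpy as np

demo = np.array([[0.9, 0.2, 0.5],
                 [0.1, 0.7, 0.3]])

# np.ndenumerate yields ((row, col), value) pairs; the key compares by value.
pairs = heapq.nsmallest(3, np.ndenumerate(demo), key=lambda item: item[1])

coords = [index for index, _ in pairs]        # zero-based (row, col) tuples
values = [float(value) for _, value in pairs]  # plain Python floats
print(coords)  # [(1, 0), (0, 1), (1, 2)]
print(values)  # [0.1, 0.2, 0.3]
```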

edited Jul 26, 2015 at 22:22

answered Jul 26, 2015 at 22:14
