Returning minimum X numbers of a numpy array and keeping the order

Question

I have the following NumPy array X. I want to create a new array containing the rows X[i] with the three smallest X[i][3] values.

array([[  2, 356,   1, 0.7],
       [  3, 356,   1, 5],
       [  3, 357,   1, 3],
       [  4, 355,   1, 0.1],
       [  4, 356,   1, 16],
       [  4, 357,   1, 2]])

The result should look like:

array([[  2, 356,   1, 0.7],
       [  4, 355,   1, 0.1],
       [  4, 357,   1, 2]])

Divakar · Accepted Answer · 2017-04-27 12:48:27Z

Here's one approach -

X[np.sort(X[:,3].argsort()[:3])]

Basically, we use argsort to get the indices that would sort the fourth column, and take the first three of them, which correspond to the three lowest values. Indexing the array with these indices gives the output. To keep the rows in the same order as in the input array, sort those indices before indexing.

Sample run -

In [148]: X
Out[148]: 
array([[  2.00e+00,   3.56e+02,   1.00e+00,   7.00e-01],
       [  3.00e+00,   3.56e+02,   1.00e+00,   5.00e+00],
       [  3.00e+00,   3.57e+02,   1.00e+00,   3.00e+00],
       [  4.00e+00,   3.55e+02,   1.00e+00,   1.00e-01],
       [  4.00e+00,   3.56e+02,   1.00e+00,   1.60e+01],
       [  4.00e+00,   3.57e+02,   1.00e+00,   2.00e+00]])

In [149]: X[np.sort(X[:,3].argsort()[:3])]
Out[149]: 
array([[  2.00e+00,   3.56e+02,   1.00e+00,   7.00e-01],
       [  4.00e+00,   3.55e+02,   1.00e+00,   1.00e-01],
       [  4.00e+00,   3.57e+02,   1.00e+00,   2.00e+00]])
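
As a minimal sketch, the one-liner above can be broken into its individual steps. The intermediate names (order, lowest3, keep) are only illustrative and are not part of the original answer.

```python
import numpy as np

X = np.array([[2, 356, 1, 0.7],
              [3, 356, 1, 5],
              [3, 357, 1, 3],
              [4, 355, 1, 0.1],
              [4, 356, 1, 16],
              [4, 357, 1, 2]])

order = X[:, 3].argsort()   # row indices, from smallest to largest X[i][3]
lowest3 = order[:3]         # indices of the three smallest values
keep = np.sort(lowest3)     # sort the indices to restore input row order
result = X[keep]            # rows 0, 3 and 5 of the sample array

print(keep)
print(result)
```

For the sample array, keep comes out as the row indices 0, 3 and 5, which matches the expected result in the question.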

For performance, we can use np.argpartition: X[:,3].argsort()[:3] can be replaced by np.argpartition(X[:,3],3)[:3]. Because of the way it's implemented, argpartition gives us the indices of the lowest 3 elements, just not necessarily ordered from lowest to second-lowest to third-lowest. That's okay, since we sort those indices afterwards anyway to keep the order as in the input array (discussed earlier).
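
One possible way to package the argpartition variant is a small helper. The function name k_smallest_rows and its col parameter are hypothetical, and the guard for k >= number of rows is an added assumption, since np.argpartition raises an error when the partition index is out of bounds.

```python
import numpy as np

def k_smallest_rows(X, k, col=3):
    """Return the k rows with the smallest X[:, col], in input row order."""
    values = X[:, col]
    if k >= len(values):
        # Every row qualifies; argpartition would reject an out-of-bounds k.
        return X.copy()
    # The first k entries index the k smallest values, in arbitrary order.
    idx = np.argpartition(values, k)[:k]
    # Sorting the indices restores the order of the input array.
    return X[np.sort(idx)]

X = np.array([[2, 356, 1, 0.7],
              [3, 356, 1, 5],
              [3, 357, 1, 3],
              [4, 355, 1, 0.1],
              [4, 356, 1, 16],
              [4, 357, 1, 2]])

print(k_smallest_rows(X, 3))
```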

Timings on the performance-boost suggestion -

In [164]: X = np.random.rand(100000,4)

In [165]: np.sort(X[:,3].argsort()[:3])
Out[165]: array([ 9950, 69008, 76552])

In [166]: np.sort(np.argpartition(X[:,3],3)[:3])
Out[166]: array([ 9950, 69008, 76552])

In [167]: %timeit np.sort(X[:,3].argsort()[:3])
100 loops, best of 3: 7.59 ms per loop

In [168]: %timeit np.sort(np.argpartition(X[:,3],3)[:3])
1000 loops, best of 3: 290 µs per loop
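
Outside IPython, a rough equivalent of this comparison could use the standard timeit module. This is only a sketch: the seed and the repetition count are arbitrary choices, and the absolute numbers will differ by machine and NumPy version.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)   # seed chosen arbitrarily for repeatability
X = rng.random((100000, 4))

def with_argsort():
    # Full sort of the column, then keep the first three indices.
    return np.sort(X[:, 3].argsort()[:3])

def with_argpartition():
    # Partial partition around position 3, then keep the first three indices.
    return np.sort(np.argpartition(X[:, 3], 3)[:3])

# Both variants should select the same rows.
assert np.array_equal(with_argsort(), with_argpartition())

n = 20  # repetitions per measurement
t_sort = timeit.timeit(with_argsort, number=n) / n
t_part = timeit.timeit(with_argpartition, number=n) / n
print(f"argsort:      {t_sort * 1e3:.3f} ms per call")
print(f"argpartition: {t_part * 1e3:.3f} ms per call")
```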

edited Apr 27, 2017 at 12:48

answered Apr 27, 2017 at 12:26


5 Comments

EdChum Over a year ago

congrats on 100k ♪(*^ ・^)ノ⌒ｵﾒﾃﾞﾄ☆

Moses Koledoye Over a year ago

@EdChum Congrats on your 100k too. Epic! :)

Norhan Ahmad Over a year ago

@Divakar Thanks. Is sorting necessary? I don't care if they are sorted, it was done by pure chance.

Divakar Over a year ago

@NorhanAhmad If you don't care, skip the np.sort part, but I guess it's good to know that we could have the order kept by just adding a sort there.
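
To illustrate what skipping np.sort does, here is a small sketch on the sample array from the question. With the argsort version, the rows then come out ordered by ascending X[i][3] instead of in input order.

```python
import numpy as np

X = np.array([[2, 356, 1, 0.7],
              [3, 356, 1, 5],
              [3, 357, 1, 3],
              [4, 355, 1, 0.1],
              [4, 356, 1, 16],
              [4, 357, 1, 2]])

# Without np.sort: rows ordered by their value in column 3.
by_value = X[X[:, 3].argsort()[:3]]

# With np.sort: the same rows, in their original input order.
by_position = X[np.sort(X[:, 3].argsort()[:3])]

print(by_value[:, 3])
print(by_position[:, 3])
```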

Divakar Over a year ago

@NorhanAhmad Looks like you haven't accepted any solution to any of your questions yet. So, let me introduce you to accepting solutions. If your question has been answered/solved, consider accepting one of the solutions by clicking on the green tick next to the solution. Read about it in more detail here - meta.stackexchange.com/questions/5234/…
