Let's assume I have the following arrays:

distance = np.array([2, 3, 5, 4, 8, 2, 3])
idx = np.array([0, 0, 1, 1, 1, 2, 2 ])

Now I want the smallest distance within each index group. So my goal would be:

result = [2, 4, 2]

My only idea right now would be something like this:

result = []
for i in np.unique(idx):
    result.append(np.amin(distance[idx == i]))

But is there a faster way without a loop?

2 Answers

You can convert idx to a boolean vector to use indexing within the distance vector:

distance = np.array([2, 3, 5, 4, 8])
idx = np.array([0, 0, 1, 1, 1]).astype(bool)  # np.bool was removed in NumPy 1.24; use the builtin bool

result = [np.min(distance[~idx]), np.min(distance[idx])]

2 Comments

Oh, in my example I have only two idx values, but that bool option would not work for more than two, right?
Exactly. You could always rewrite your loop as a longer expression, but with many indices this method is not helpful.
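
For more than two index values, one loop-free option is the unbuffered ufunc method `np.minimum.at`. The following is only a sketch: it assumes `idx` holds non-negative integer labels and that `distance` has an integer dtype (for floats, `np.inf` would be the natural starting value). The names `n_groups` and `result` are illustrative.

```python
import numpy as np

distance = np.array([2, 3, 5, 4, 8, 2, 3])
idx = np.array([0, 0, 1, 1, 1, 2, 2])

# Assumption: labels are non-negative integers, so they can index the output.
n_groups = idx.max() + 1

# Start every slot at the largest representable integer, then fold in each
# distance with an element-wise minimum at the position given by its label.
result = np.full(n_groups, np.iinfo(distance.dtype).max, dtype=distance.dtype)
np.minimum.at(result, idx, distance)

print(result.tolist())  # [2, 4, 2]
```

Unlike the boolean-mask approach, this does not require `idx` to be sorted or limited to two values, although `ufunc.at` is not always faster than a short loop for small inputs.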

Although not truly free from loops, here is one way to do that:

import numpy as np

distance = np.array([2, 3, 5, 4, 8, 2, 3])
idx = np.array([0, 0, 1, 1, 1, 2, 2 ])

t = np.split(distance, np.where(idx[:-1] != idx[1:])[0] + 1)
print([np.min(x) for x in t])
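
The same group boundaries can also be passed to `np.minimum.reduceat`, which avoids the Python-level list comprehension. This is a minimal sketch under the same assumption as the split approach, namely that `idx` is sorted so that each group is contiguous; the name `starts` is illustrative.

```python
import numpy as np

distance = np.array([2, 3, 5, 4, 8, 2, 3])
idx = np.array([0, 0, 1, 1, 1, 2, 2])

# Assumption: idx is sorted, so each group occupies one contiguous slice.
# A new group starts at position 0 and wherever the label changes.
starts = np.r_[0, np.flatnonzero(idx[:-1] != idx[1:]) + 1]

# reduceat takes the minimum over distance[starts[i]:starts[i + 1]],
# with the last slice running to the end of the array.
result = np.minimum.reduceat(distance, starts)

print(result.tolist())  # [2, 4, 2]
```

If `idx` is not sorted, both arrays could first be reordered with `order = np.argsort(idx, kind="stable")`.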

Actually, this appears to provide no improvement, as the OP's solution and this one show the same runtime:

res1 = []
def soln1():
    for i in np.unique(idx):
        res1.append(np.amin(distance[idx == i]))

def soln2():
    t = np.split(distance, np.where(idx[:-1] != idx[1:])[0] + 1)
    res2 = [np.min(x) for x in t]

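Timeit gives the following. Note that `%timeit soln1` without parentheses times only the lookup of the name, not a call to the function, so these figures do not actually compare the two solutions: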

%timeit soln1
#10000000 loops, best of 5: 24.3 ns per loop
%timeit soln2
#10000000 loops, best of 5: 24.3 ns per loop
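
To time actual calls rather than the bare function names, the standard `timeit` module can be given the callables directly. This is a rough sketch, not a benchmark: the function names `loop_min` and `split_min` and the repetition count are illustrative, and results on a seven-element array will be dominated by call overhead.

```python
import timeit

import numpy as np

distance = np.array([2, 3, 5, 4, 8, 2, 3])
idx = np.array([0, 0, 1, 1, 1, 2, 2])

def loop_min():
    # One boolean mask per distinct label.
    return [int(np.amin(distance[idx == i])) for i in np.unique(idx)]

def split_min():
    # Split at label changes (assumes idx is sorted), then reduce each piece.
    pieces = np.split(distance, np.where(idx[:-1] != idx[1:])[0] + 1)
    return [int(np.min(p)) for p in pieces]

# Passing the function object makes timeit call it `number` times.
t_loop = timeit.timeit(loop_min, number=1000)
t_split = timeit.timeit(split_min, number=1000)

print(f"loop:  {t_loop / 1000 * 1e6:.2f} us per call")
print(f"split: {t_split / 1000 * 1e6:.2f} us per call")
```

In IPython the equivalent is `%timeit soln1()` and `%timeit soln2()`, with the parentheses included.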
