I have a 2d numpy array Z and I want to randomly choose an index of Z where the chance of an index being chosen is proportional to the value of Z at that index.

Right now, I'm doing the following:

yar = list(np.ndenumerate(Z))
x,y = yar[np.random.choice(len(yar), p=Z.ravel()/Z.sum())][0]

Which does the job but feels hideous (and is extremely slow besides). Is there a better way?

Check out Raymond Hettinger's tweet about weighted random selection; it might be helpful. (Commented Aug 24, 2017 at 13:58)

1 Answer

We can optimize by avoiding the creation of yar altogether. Simply draw a linear (flattened) index with np.random.choice, then convert it to per-dimension indices with np.unravel_index to get x and y.

So, the implementation would be -

linear_idx = np.random.choice(Z.size, p=Z.ravel()/float(Z.sum()))
x, y = np.unravel_index(linear_idx, Z.shape)
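
For reference, here is a self-contained sketch of the same idea wrapped in a function. The name sample_weighted_index, the input checks, and the use of np.random.default_rng (available in NumPy 1.17+) are my own assumptions for illustration and not part of the snippet above; the core is still a weighted draw over Z.size followed by np.unravel_index.

```python
import numpy as np


def sample_weighted_index(Z, rng=None):
    """Draw one (row, col) index with probability proportional to Z[row, col].

    Hypothetical helper for illustration; assumes Z is non-negative
    with a positive sum.
    """
    rng = np.random.default_rng() if rng is None else rng
    Z = np.asarray(Z, dtype=float)
    total = Z.sum()
    if (Z < 0).any() or total <= 0:
        raise ValueError("Z must be non-negative with a positive sum")
    # Weighted draw over the flattened array...
    linear_idx = rng.choice(Z.size, p=Z.ravel() / total)
    # ...then map the flat index back to 2D coordinates.
    row, col = np.unravel_index(linear_idx, Z.shape)
    return int(row), int(col)


# Small sanity check: only the nonzero cells can ever be drawn.
Z = np.array([[0, 1], [3, 0]])
rng = np.random.default_rng(0)
samples = [sample_weighted_index(Z, rng) for _ in range(2000)]
```

With these weights, (1, 0) should come up roughly three times as often as (0, 1), and the zero-weight cells should never appear.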

To give a sense of how much the creation of yar was bottlenecking that setup, here's a sample timing test -

In [402]: Z = np.random.randint(0,9,(300,400))

In [403]: yar = list(np.ndenumerate(Z))

In [404]: %timeit list(np.ndenumerate(Z))
10 loops, best of 3: 46.3 ms per loop

In [405]: %timeit yar[np.random.choice(len(yar), p=Z.ravel()/float(Z.sum()))][0]
1000 loops, best of 3: 1.34 ms per loop

In [406]: 46.3/(46.3+1.34)
Out[406]: 0.971872376154492

So, creating yar was eating up 97% of the runtime there.
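
If you want to reproduce the comparison outside IPython, something like the following should work as a rough sketch. The function names with_yar and with_unravel and the repeat count are my own choices, and the absolute numbers will differ by machine and NumPy version, so no expected timings are shown.

```python
import timeit

import numpy as np

Z = np.random.randint(0, 9, (300, 400))


def with_yar():
    # Original approach: materialize every (index, value) pair first.
    yar = list(np.ndenumerate(Z))
    return yar[np.random.choice(len(yar), p=Z.ravel() / float(Z.sum()))][0]


def with_unravel():
    # Proposed approach: draw a flat index, then unravel it.
    linear_idx = np.random.choice(Z.size, p=Z.ravel() / float(Z.sum()))
    return np.unravel_index(linear_idx, Z.shape)


n = 5  # hypothetical repeat count; raise it for steadier numbers
t_yar = timeit.timeit(with_yar, number=n) / n
t_unravel = timeit.timeit(with_unravel, number=n) / n
print("with yar:     %.3f ms per call" % (t_yar * 1e3))
print("with unravel: %.3f ms per call" % (t_unravel * 1e3))
```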
