Weighted Random Sampling from 2d numpy array

Question

I have a 2d numpy array Z and I want to randomly choose an index of Z where the chance of an index being chosen is proportional to the value of Z at that index.

Right now, I'm doing the following:

yar = list(np.ndenumerate(Z))
x,y = yar[np.random.choice(len(yar), p=Z.ravel()/Z.sum())][0]

Which does the job but feels hideous (and is extremely slow besides). Is there a better way?

Check out Raymond Hettinger's tweet about weighted random sampling; it might be helpful.
– 0xdead, Commented Aug 24, 2017 at 13:58

Divakar · Accepted Answer · 2017-08-24 14:12:48Z


We can avoid creating yar altogether. np.random.choice gives us a linear index into the flattened array, and np.unravel_index converts that linear index back into the row and column indices x and y.

So, the implementation would be -

linear_idx = np.random.choice(Z.size, p=Z.ravel()/float(Z.sum()))
x, y = np.unravel_index(linear_idx, Z.shape)
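If several indices are needed at once, the same idea extends naturally. The following is a minimal sketch, not part of the original answer: the helper name weighted_index_sample is hypothetical, and it assumes a NumPy version that provides np.random.default_rng (1.17 or later) and that Z holds non-negative weights with a positive sum.

```python
import numpy as np

def weighted_index_sample(Z, size=None, rng=None):
    """Draw (row, col) indices of a 2-D array with probability proportional to its values.

    Hypothetical helper for illustration; assumes Z is non-negative with a positive sum.
    """
    Z = np.asarray(Z, dtype=float)
    if Z.ndim != 2:
        raise ValueError("Z must be 2-D")
    total = Z.sum()
    if (Z < 0).any() or total <= 0:
        raise ValueError("Z must be non-negative with a positive sum")
    rng = np.random.default_rng() if rng is None else rng
    # Sample linear indices into the flattened array, weighted by Z.
    flat = rng.choice(Z.size, size=size, p=Z.ravel() / total)
    # Convert linear indices back to (row, col) indices.
    return np.unravel_index(flat, Z.shape)

# Usage: draw many samples and count how often each cell is hit.
Z = np.array([[0, 1, 0],
              [3, 0, 6]])
rows, cols = weighted_index_sample(Z, size=10_000, rng=np.random.default_rng(0))
counts = np.zeros(Z.shape, dtype=int)
np.add.at(counts, (rows, cols), 1)  # accumulate hits per cell
```

Cells with zero weight should never be drawn, and the hit counts should come out roughly proportional to the entries of Z.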

To put numbers on how much of a bottleneck creating yar was in the original setup, here's a sample timing test -

In [402]: Z = np.random.randint(0,9,(300,400))

In [403]: yar = list(np.ndenumerate(Z))

In [404]: %timeit list(np.ndenumerate(Z))
10 loops, best of 3: 46.3 ms per loop

In [405]: %timeit yar[np.random.choice(len(yar), p=Z.ravel()/float(Z.sum()))][0]
1000 loops, best of 3: 1.34 ms per loop

In [406]: 46.3/(46.3+1.34)
Out[406]: 0.971872376154492

So, creating yar was eating up 97% of the runtime there.
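For anyone who wants to rerun the comparison outside IPython, here is a rough sketch using the standard timeit module. The function names with_list and with_unravel are made up for illustration, the repeat counts are arbitrary, and the absolute timings will differ by machine and NumPy version.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
Z = rng.integers(0, 9, size=(300, 400))

def with_list():
    # Original approach: build the full list of (index, value) pairs first.
    yar = list(np.ndenumerate(Z))
    return yar[rng.choice(len(yar), p=Z.ravel() / Z.sum())][0]

def with_unravel():
    # Sample a linear index and convert it; no intermediate list.
    linear_idx = rng.choice(Z.size, p=Z.ravel() / Z.sum())
    return np.unravel_index(linear_idx, Z.shape)

n = 3  # calls per timing run (arbitrary, kept small)
t_list = min(timeit.repeat(with_list, number=n, repeat=3)) / n
t_unravel = min(timeit.repeat(with_unravel, number=n, repeat=3)) / n
print("list-based: %.3f ms per call" % (t_list * 1e3))
print("unravel_index: %.3f ms per call" % (t_unravel * 1e3))
```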

edited Aug 24, 2017 at 14:12

answered Aug 24, 2017 at 13:50

