numpy randomly sample boolean array

Question

I have a NumPy array as follows:

data = np.array([True, True, True, True, False, True, True, False, True, True, False])

From the positions that are True, I need to randomly sample 3 and keep them True, setting every other position to False.

I tried:

indx = np.random.choice(len(data),3,replace=False)        
data[~indx] = False

How can I do this better in terms of (1) ease, (2) performance, and (3) elegance?

print (data)

Also, how do I sample only from the True positions? My code samples from all positions, which is incorrect.

Divakar · Accepted Answer · 2020-01-03 18:36:43Z


For elegance, here's one approach:

n = 3
idx = np.flatnonzero(data)
r = np.random.choice(idx, n, replace=False)
data[idx[~np.isin(idx,r)]] = False
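
As a self-contained sketch of the snippet above, the same steps can be wrapped in a function and checked on the array from the question. The function name `keep_random_true` is hypothetical, and working on a copy instead of modifying `data` in place is an assumption made here for easier testing, not something the answer requires.

```python
import numpy as np

def keep_random_true(data, n):
    """Return a copy of `data` in which only `n` randomly chosen True entries stay True.

    `keep_random_true` is a hypothetical helper name used for illustration.
    """
    idx = np.flatnonzero(data)                     # positions that are currently True
    r = np.random.choice(idx, n, replace=False)    # n distinct positions to keep
    out = data.copy()                              # assumption: leave the input untouched
    out[idx[~np.isin(idx, r)]] = False             # clear every True position not chosen
    return out

data = np.array([True, True, True, True, False, True, True, False, True, True, False])
result = keep_random_true(data, 3)
print(result)
```

Exactly 3 entries of `result` are True, and each of them was already True in `data`.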

For performance:

s = data.sum()
t_mask = np.zeros(s, dtype=bool)
t_mask[np.random.choice(s, n, replace=False)] = True
data[data] = t_mask
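
Since the comments ask about a random seed, here is a minimal sketch of the mask-based version with a seedable generator. It assumes `np.random.default_rng` (NumPy 1.17+) in place of the legacy `np.random.choice` used above; the function name `keep_random_true_mask` and the `seed` parameter are hypothetical, and the result is written to a copy as an assumption. Stratification is not covered here.

```python
import numpy as np

def keep_random_true_mask(data, n, seed=None):
    """Mask-based variant: build a length-s mask with n True values and
    scatter it onto the True positions of `data`.

    `keep_random_true_mask` and `seed` are hypothetical names for illustration.
    """
    rng = np.random.default_rng(seed)              # assumption: Generator API, not the answer's np.random.choice
    s = int(data.sum())                            # number of True entries
    t_mask = np.zeros(s, dtype=bool)
    t_mask[rng.choice(s, n, replace=False)] = True # pick n of the s True slots
    out = data.copy()                              # assumption: do not modify the input
    out[data] = t_mask                             # write the mask onto the True positions
    return out

data = np.array([True, True, True, True, False, True, True, False, True, True, False])
a = keep_random_true_mask(data, 3, seed=0)
b = keep_random_true_mask(data, 3, seed=0)
print(np.array_equal(a, b))  # True: the same seed gives the same sample
```

Passing the same `seed` reproduces the same selection; leaving it as `None` draws a fresh sample on each call.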

answered Jan 3, 2020 at 18:30

8 Comments

@mk1 What do you mean by difficult?

Isn't there a simpler approach?

Is there an alternative approach that doesn't use np.setdiff1d?

@mk1 Added in the post.

Okay. Although it isn't part of the question, I'd like to know whether there is a way to use a random seed and stratification.
