21

First I create my array

myarray = np.random.random_integers(0,10, size=20)

Then, I want to set 20% of the elements in the array to 0 (or some other number). How should I do this? Apply a mask?

5 Answers

25

You can calculate the indices with np.random.choice, limiting the number of chosen indices to the percentage:

indices = np.random.choice(np.arange(myarray.size), replace=False,
                           size=int(myarray.size * 0.2))
myarray[indices] = 0
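
The same idea can also be written against NumPy's newer Generator interface. This is only a minimal sketch and not part of the original answer: the helper name zero_fraction, its parameters, and the seed are assumptions made for illustration.

```python
import numpy as np

def zero_fraction(arr, frac=0.2, value=0, seed=None):
    """Return a copy of 1-D `arr` with int(arr.size * frac) random elements set to `value`."""
    rng = np.random.default_rng(seed)
    out = arr.copy()
    n_pick = int(out.size * frac)
    # replace=False guarantees distinct indices, so exactly n_pick elements change
    indices = rng.choice(out.size, size=n_pick, replace=False)
    out[indices] = value
    return out

myarray = np.arange(1, 21)   # 20 non-zero values, so the zeros are easy to count
masked = zero_fraction(myarray, frac=0.2, seed=0)
print(np.count_nonzero(masked == 0))   # 4
```

Working on a copy leaves the input untouched; assign into arr directly if in-place modification is preferred.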

4 Comments

Cool. I should also set replace=False for np.random.choice right? If I do not, some of the indices generated will be repeated
Yes. Updated my answer.
For arbitrarily shaped arrays we can use unravel_index: myarray[np.unravel_index(indices, myarray.shape)] = 0
@Holi 's comment is such a useful addition it should either be incorporated into this answer or made its own separate answer, IMO
4

For others looking for the answer in case of nd-array, as proposed by user holi:

my_array = np.random.rand(8, 50)
indices = np.random.choice(my_array.shape[1]*my_array.shape[0], replace=False, size=int(my_array.shape[1]*my_array.shape[0]*0.2))

We multiply the dimensions to draw indices into a flattened array of length dim1*dim2, then convert these flat indices back to the array's shape and apply them:

my_array[np.unravel_index(indices, my_array.shape)] = 0 

The array is now masked.
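
As a quick check of the n-d approach, the sketch below counts the zeros after masking. The seed, the shift by 1.0 (so that no element starts at zero), and the variable names are assumptions added for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)          # seed chosen arbitrarily for repeatability
my_array = rng.random((8, 50)) + 1.0    # shifted so no element starts at zero

n_zero = int(my_array.size * 0.2)       # my_array.size == 8 * 50
indices = rng.choice(my_array.size, size=n_zero, replace=False)

# unravel_index converts flat positions into a tuple of (row, column) index arrays
rows, cols = np.unravel_index(indices, my_array.shape)
my_array[rows, cols] = 0

print(np.count_nonzero(my_array == 0))  # 80
```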

1

Use np.random.permutation as a random index generator, and take the first 20% of the indices.

myarray = np.random.random_integers(0,10, size=20)
n = len(myarray)
random_idx = np.random.permutation(n)

frac = 20 # [%]
zero_idx = random_idx[:round(n*frac/100)]
myarray[zero_idx] = 0

1 Comment

DeprecationWarning: This function is deprecated. Please call randint(0, 10 + 1) instead. Because the upper bound of randint is exclusive, the equivalent call is now: myarray = np.random.randint(0, 11, size=20)
0

If you want the 20% to be random:

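import random

# The function name is illustrative; the original snippet was a bare function body.
def zero_random_fraction(myarray):
    random_list = []
    array_len = len(myarray)

    # Collect distinct positions until 20% of the array is covered
    while len(random_list) < (array_len/5):
        # randrange excludes array_len, so every result is a valid index
        random_int = random.randrange(array_len)
        if random_int not in random_list:
            random_list.append(random_int)

    for position in random_list:
        myarray[position] = 0

    return myarray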

This guarantees that 20% of the positions are chosen (rounded up when the length is not a multiple of 5): because duplicates are skipped, the RNG rolling the same number many times cannot result in fewer than 20% of the values being set to 0.
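
The same guarantee of distinct positions is also available from the standard library's random.sample, which draws without replacement. A small sketch, assuming the input is any mutable sequence; the helper name zero_positions is hypothetical, and it rounds to the nearest count rather than rounding up.

```python
import random

def zero_positions(seq, frac=0.2):
    """Set round(len(seq) * frac) distinct random positions of `seq` to 0, in place."""
    n_pick = round(len(seq) * frac)
    # random.sample never repeats a position, so no retry loop is needed
    for position in random.sample(range(len(seq)), n_pick):
        seq[position] = 0
    return seq

values = list(range(1, 21))
zero_positions(values)
print(values.count(0))   # 4
```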

0

Assume your input numpy array is A and p=0.2. The following are a couple of ways to achieve this.

Exact Masking

ones = np.ones(A.size, dtype=A.dtype)  # match A's dtype so the in-place multiply also works for integer arrays
idx = int(min(p*A.size, A.size))
ones[:idx] = 0
A *= np.reshape(np.random.permutation(ones), A.shape)

Approximate Masking

This is commonly done in several denoising objectives, most notably Masked Language Modeling in Transformer pre-training. Each element is zeroed independently with probability p, so the proportion of zeros is 20% only on average rather than exactly. A concise way to do this:

A *= np.random.binomial(size=A.shape, n=1, p=0.8)

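Another alternative, using one uniform draw per element (note that np.random.randint(0, 2, A.shape) would zero about 50% of the elements, not 20%):

A *= np.random.random(A.shape) >= p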
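
To see the difference between exact and approximate masking, a short experiment can count the zeros each one produces. This is an illustrative sketch only: the array shape, the seed, and the variable names are assumptions, and it uses the Generator interface rather than the legacy np.random functions.

```python
import numpy as np

rng = np.random.default_rng(0)
p = 0.2
A = np.ones((100, 100), dtype=np.int64)

# Exact masking: shuffle a vector containing exactly int(p * A.size) zeros
keep = np.ones(A.size, dtype=A.dtype)
keep[:int(p * A.size)] = 0
exact = A * rng.permutation(keep).reshape(A.shape)

# Approximate masking: each element is kept independently with probability 1 - p
approx = A * rng.binomial(n=1, p=1 - p, size=A.shape)

print((exact == 0).sum())    # always int(p * A.size), here 2000
print((approx == 0).mean())  # close to p, but varies with the seed
```

The exact variant suits cases where the zero count must be fixed; the approximate variant is cheaper to write and is usually acceptable for large arrays.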
