93

What is an efficient (ideally vectorized, to borrow MATLAB terminology) way to generate an array of random zeros and ones with a specific proportion, especially with NumPy?

My case is specifically the proportion 1/3, and my code is:

import numpy as np
# randint(0, 3, size) draws from {0, 1, 2}; mod 2 leaves a 1 with probability 1/3
a = np.mod(np.random.randint(0, 3, size), 2)

But is there a built-in function that handles this more efficiently, at least for the case K/N where K and N are natural numbers?

12
  • 2
    Do you need the proportion to be exactly the given value, or is that just the expected proportion of the sample? Commented Oct 25, 2013 at 19:11
  • Also, what should happen for the 1/3 case when size is not divisible by 3? Exception? Round/floor/trunc? Weighted random round (so 10 has a 2/3 chance of 3 and a 1/3 chance of 4)? Commented Oct 25, 2013 at 19:15
  • @WarrenWeckesser, it's the expected proportion in my case. I wish you hadn't deleted your answer, so I could have accepted it. Commented Oct 25, 2013 at 19:16
  • 1
    @Naji: I restored my answer. If you had needed the exact proportion, that method wouldn't work. Commented Oct 25, 2013 at 19:27
  • 1
    @Naji: Whatever you want? I wanted it to generate a trillion dollars, and all it gave me was an array. I suppose I'm not believing hard enough. ;) Commented Oct 25, 2013 at 20:15

7 Answers

130

Yet another approach, using np.random.choice:

>>> np.random.choice([0, 1], size=(10,), p=[1./3, 2./3])
array([0, 1, 1, 1, 1, 0, 0, 0, 0, 0])
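For the general K/N case the question asks about, the same call can be wrapped in a small helper. This is only a sketch, assuming NumPy 1.17 or later (for np.random.default_rng); the helper name random_bits and its parameters are hypothetical, introduced purely for illustration. The proportion holds in expectation, not exactly.

```python
import numpy as np

def random_bits(size, k, n, seed=None):
    """Return `size` zeros and ones where each entry is 1 with probability k/n.

    Hypothetical helper: the name and signature are illustrative only.
    """
    rng = np.random.default_rng(seed)
    p_one = k / n
    # p lists the probability of each candidate value, in the same order.
    return rng.choice([0, 1], size=size, p=[1 - p_one, p_one])

sample = random_bits(10_000, 1, 3, seed=0)
print(sample.mean())  # close to 1/3, but not exactly 1/3
```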

4 Comments

note that this approach will not give you the exact proportion of zeros and ones you request . . . the answer by @mdml below will.
true, and since it is accepted, I think Cupitor might have added a bug to his program
@JFFIGK, dbliss: this was discussed in the comments to the question. Those comments are still there, so take a look.
Since the mentioned link is broken, see: numpy.random.choice.
52

A simple way to do this would be to first generate an ndarray with the proportion of zeros and ones you want:

>>> import numpy as np
>>> N = 100
>>> K = 30 # K zeros, N-K ones
>>> arr = np.array([0] * K + [1] * (N-K))
>>> arr
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1])

Then you can just shuffle the array, making the distribution random:

>>> np.random.shuffle(arr)
>>> arr
array([1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 1, 0,
       1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1,
       1, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1,
       0, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 1,
       1, 1, 1, 0, 1, 1, 1, 1])

Note that this approach will give you the exact proportion of zeros/ones you request, unlike say the binomial approach. If you don't need the exact proportion, then the binomial approach will work just fine.
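The difference between the exact and the expected proportion can be checked directly. A minimal sketch, assuming NumPy 1.17 or later (for np.random.default_rng); the variable names exact and approx are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
N, K = 100, 30  # K zeros, N - K ones, as above

# Exact: fix the counts first, then randomize only the order.
exact = rng.permutation(np.repeat([0, 1], [K, N - K]))

# Expected: each element is drawn independently, so the count of zeros varies.
approx = rng.binomial(1, (N - K) / N, size=N)

print((exact == 0).sum())   # always K
print((approx == 0).sum())  # near K, depends on the draw
```

Here rng.permutation returns a shuffled copy, so it replaces the separate build-then-shuffle steps without changing the result's distribution.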

2 Comments

How stupid of me! Right, I forgot about the binomial distribution. Actually, somebody posted the binomial approach right before you, but he deleted his answer (don't know why!).
This is quite clever
23

If I understand your problem correctly, numpy.random.shuffle might help:

>>> def rand_bin_array(K, N):
...     arr = np.zeros(N)
...     arr[:K] = 1
...     np.random.shuffle(arr)
...     return arr
...

>>> rand_bin_array(5,15)
array([ 0.,  1.,  0.,  1.,  1.,  1.,  0.,  0.,  0.,  1.,  0.,  0.,  0.,
        0.,  0.])


21

You can use numpy.random.binomial. E.g. suppose frac is the proportion of ones:

In [50]: frac = 0.15

In [51]: sample = np.random.binomial(1, frac, size=10000)

In [52]: sample.sum()
Out[52]: 1567
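How far the count can drift from the target follows from the binomial distribution: with n independent draws at probability p, the number of ones has mean n*p and standard deviation sqrt(n*p*(1-p)). The following is a short arithmetic check of the figures above, using only the values from the example:

```python
import math

n, frac = 10000, 0.15
mean = n * frac                          # expected number of ones
std = math.sqrt(n * frac * (1 - frac))   # spread of that count

observed = 1567                          # the sum reported above
z = (observed - mean) / std              # distance from the mean in standard deviations
print(mean, round(std, 1), round(z, 2))
```

A sum of 1567 therefore sits a little under two standard deviations from the expected 1500, which is unremarkable for a random sample.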

3 Comments

This doesn't guarantee the correct proportion of ones like mdml's answer does.
@John, this was discussed in the comments to the question. Take a look.
I see now! Of course, the question needs editing then, as it asks for a specific proportion.
2

Another way of getting the exact number of ones and zeroes is to sample indices without replacement using np.random.choice:

arr_len = 30
num_ones = 8

arr = np.zeros(arr_len, dtype=int)
idx = np.random.choice(range(arr_len), num_ones, replace=False)
arr[idx] = 1

Out:

arr

array([0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1,
       0, 0, 0, 0, 0, 1, 0, 0])


1

Simple one-liner: you can avoid using lists of integers and probability distributions, which are unintuitive and overkill for this problem in my opinion, by simply working with bools first and then casting to int if necessary (though leaving it as a bool array should work in most cases).

>>> import numpy as np
>>> np.random.random(9) < 1/3.
array([False,  True,  True,  True,  True, False, False, False, False])   
>>> (np.random.random(9) < 1/3.).astype(int)
array([0, 0, 0, 0, 0, 1, 0, 0, 1])    

2 Comments

This doesn't guarantee the correct proportion of ones like mdml's answer does.
The OP said they wanted 1/3 to be the expected proportion of 1s, not the exact proportion.
0

You can generate an ndarray with random binary members (0 and 1) in one line as follows. np.random.random() can be used in place of np.random.uniform().

>>> import numpy as np
>>> np.array([[round(np.random.uniform()) for i in range(3)] for j in range(3)])
array([[1, 0, 0],
       [1, 1, 1],
       [0, 1, 0]])
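The nested list comprehension calls the random generator once per element from Python, and rounding a uniform draw always gives a 50/50 split. A vectorized sketch of the same idea, assuming NumPy 1.17 or later and an illustrative probability p_one that is not part of the answer above, draws the whole array in a single call:

```python
import numpy as np

rng = np.random.default_rng()
shape = (3, 3)

# Fair coin: integers drawn uniformly from {0, 1} (upper bound is exclusive).
fair = rng.integers(0, 2, size=shape)

# Biased coin: compare uniform draws on [0, 1) against the target probability.
p_one = 1 / 3  # illustrative value, matching the 1/3 case in the question
biased = (rng.random(shape) < p_one).astype(int)

print(fair)
print(biased)
```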

