I'd like to confirm that

a = [random.choices([0,1],weights=[0.2,0.8],k=1) for i in range(0,10)] 

does probabilistically the same thing as

a = random.choices([0,1],weights=[0.2,0.8],k=10) 

In particular, I expect both to make 10 independent draws from the set {0,1} with probability 0.2 on 0 and 0.8 on 1. Is this right?

Thanks!

  • Is there any particular reason you thought it might be otherwise? The documentation seems pretty clear. Commented Oct 17, 2019 at 1:32
  • If you have a specific reason you thought the behavior might be different, we can give a more useful answer addressing that reason, instead of just saying "yes" and expecting you and future readers to trust us. Commented Oct 17, 2019 at 1:33
  • Thanks all. I ask because I took ~360 draws using this method for initial production of an app and observed a very unlikely outcome: about 1/3000 under the hypothesis that the data is generated as intended. So I just wanted to make sure this method doesn't somehow mess with independence across draws. Commented Oct 17, 2019 at 10:53

2 Answers

The documentation indicates that the two are probabilistically the same. To check, I ran the following experiment, which tallies the 0s and 1s produced by each method over 10,000 repetitions:

from collections import defaultdict
import random

results1 = defaultdict(int)
results2 = defaultdict(int)

for _ in range(10000):
    a = [random.choices([0,1],weights=[0.2,0.8],k=1) for i in range(0,10)]
    for sublist in a:
        for n in sublist:
            results1[n] += 1

for _ in range(10000):
    a = random.choices([0,1],weights=[0.2,0.8],k=10)
    for n in a:
        results2[n] += 1


print('first way 0s: {}'.format(results1[0]))
print('second way 0s: {}'.format(results2[0]))
print('first way 1s: {}'.format(results1[1]))
print('second way 1s: {}'.format(results2[1]))

The two methods give very similar counts of 0s and 1s.
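Matching totals only speak to the marginal probabilities, not to independence across draws. As a rough additional check, one could estimate how often a 0 follows a 0 versus how often a 0 follows a 1; if the draws are independent, both rates should sit near 0.2. This is only a sketch: `next_zero_rates` is a hypothetical helper, the seed and sample size are arbitrary, and a lag-1 check like this cannot rule out every kind of dependence.

```python
import random


def next_zero_rates(draws):
    """Estimate P(next == 0 | prev == 0) and P(next == 0 | prev == 1)."""
    # counts[prev] = [times prev was seen, times it was followed by a 0]
    counts = {0: [0, 0], 1: [0, 0]}
    for prev, nxt in zip(draws, draws[1:]):
        counts[prev][0] += 1
        if nxt == 0:
            counts[prev][1] += 1
    return tuple(counts[p][1] / counts[p][0] for p in (0, 1))


random.seed(0)  # arbitrary seed, only to make the run repeatable
n = 100_000     # arbitrary sample size

# Method 1: one draw per call, unwrapped from its single-element list.
looped = [random.choices([0, 1], weights=[0.2, 0.8], k=1)[0] for _ in range(n)]
# Method 2: all draws in a single call.
batched = random.choices([0, 1], weights=[0.2, 0.8], k=n)

looped_rates = next_zero_rates(looped)
batched_rates = next_zero_rates(batched)
print('looped  (after 0, after 1):', looped_rates)
print('batched (after 0, after 1):', batched_rates)
```

If either method introduced dependence between consecutive draws, the two rates in a pair would be expected to drift apart from each other and from 0.2.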


As others have mentioned, the documentation is clear on this point. You can verify it further by setting the seed before each call, for example:

import random

random.seed(42)
print([random.choices([0, 1], weights=[0.2, 0.8], k=1)[0] for i in range(0, 10)])

random.seed(42)
print(random.choices([0, 1], weights=[0.2, 0.8], k=10))

Output

[1, 0, 1, 1, 1, 1, 1, 0, 1, 0]
[1, 0, 1, 1, 1, 1, 1, 0, 1, 0]

Furthermore, setting the seed only once leads to different results, as one might expect, because the second call continues from the generator state left by the first:

random.seed(42)
print([random.choices([0, 1], weights=[0.2, 0.8], k=1)[0] for i in range(0, 10)])
print(random.choices([0, 1], weights=[0.2, 0.8], k=10))

Output

[1, 0, 1, 1, 1, 1, 1, 0, 1, 0]
[1, 1, 0, 0, 1, 1, 1, 1, 1, 0]
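One plausible explanation for the matching seeded outputs is that each weighted draw consumes a single `random.random()` value and maps it through the cumulative weights, so ten `k=1` calls and one `k=10` call walk through the same stream of random numbers. The sketch below rests on that assumption about CPython's behaviour and is not a copy of the library source; `manual_choices` is a hypothetical helper written for illustration.

```python
import bisect
import itertools
import random


def manual_choices(population, weights, k):
    """Hypothetical stand-in for random.choices: one random.random() per draw."""
    cum_weights = list(itertools.accumulate(weights))
    total = cum_weights[-1]
    hi = len(population) - 1  # cap the index so rounding cannot overshoot
    return [
        population[bisect.bisect(cum_weights, random.random() * total, 0, hi)]
        for _ in range(k)
    ]


random.seed(42)
manual = manual_choices([0, 1], weights=[0.2, 0.8], k=10)

random.seed(42)
builtin = random.choices([0, 1], weights=[0.2, 0.8], k=10)

# Under the assumption above, the two lists should be identical.
print(manual)
print(builtin)
print(manual == builtin)
```

If the comparison holds on your interpreter, it supports the view that the value of `k` only controls how many values are pulled from the generator, not how each individual draw is made.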
