Python - Random Values generation based on distribution

Question

I need to create a simulation of case assignment in Python:

For each item in a list, a value needs to be assigned from one of the locations below, based on the percentage of cases that should go to each country.

Country	Case Load
US	30%
UK	30%
India	30%
Singapore	10%

For example, if there are 100 items in a Python list, each needs to be assigned to one of the countries in the list. Once the count of cases assigned to UK reaches 30, it needs to stop assigning to UK.

import random

distro = {'US': 0.3, 'UK': 0.3, 'India': 0.3, 'Singapore': 0.1}

locations = []
for key in distro.keys():
    locations.append(key)

loc_assign = []
cases = 100

for i in range(cases):
    a = random.choice(locations)
    if loc_assign.count(a) < distro.get(a):
        loc_assign.append(a)
    else:
        a = random.choice(locations)
        loc_assign.append(a)

But the output I am getting, shown below, is not correct:

US: 0.27
UK: 0.3
India: 0.22
Singapore: 0.21

How do I get this to arrive at the target distribution percentages?

I am fairly new to Python and can't figure this out. Any help would be appreciated.

For any solution to work, the number of cases must be divisible by the sum of case distributions. Otherwise, you will inevitably have an inexact case distribution. Also, use integer percentages (30, 30, 30, 10) instead of decimal fractions (0.3, 0.3, 0.3, 0.1), because numbers stored in decimal point notation may be inexact.
– Peter O., Commented Apr 25, 2021 at 17:59
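One way to read the comment above is sketched here: with integer percentages, the quota for each country can be computed exactly, and the assignment becomes "build the list with the exact counts, then shuffle it". The function name `assign_exact` and the `seed` parameter are illustrative assumptions, not part of the original post.

```python
import random

def assign_exact(percentages, cases, seed=None):
    """Return a shuffled list holding exactly the requested share per country.

    `percentages` maps country -> integer percentage (hypothetical helper).
    """
    if sum(percentages.values()) != 100:
        raise ValueError("percentages must sum to 100")
    assignments = []
    for country, pct in percentages.items():
        # Integer arithmetic avoids floating-point rounding in the quota.
        quota, remainder = divmod(cases * pct, 100)
        if remainder:
            raise ValueError(
                f"{cases} cases cannot be split exactly at {pct}% for {country}"
            )
        assignments.extend([country] * quota)
    # Shuffle so the order of assignments is random while the counts stay exact.
    random.Random(seed).shuffle(assignments)
    return assignments

percentages = {'US': 30, 'UK': 30, 'India': 30, 'Singapore': 10}
result = assign_exact(percentages, cases=100, seed=42)
print({country: result.count(country) for country in percentages})
# {'US': 30, 'UK': 30, 'India': 30, 'Singapore': 10}
```

If the number of cases does not split evenly (for example 15 cases at 30%), this sketch raises an error rather than silently rounding; how to distribute the leftover cases is a separate design decision.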

accdias · Accepted Answer · 2022-02-18 12:11:57Z

1

Perhaps you can do it with random.sample():

from random import sample
from collections import Counter # Just to have a nice counter

population = ['US', 'UK', 'IN', 'SI']
weights = [3, 3, 3, 1]

c = Counter(sample(population, k=1, counts=weights)[0] for _ in range(1000))

print(c)

Which will give you something like this:

Counter({'UK': 308, 'IN': 302, 'US': 289, 'SI': 101})

As you can see, the distribution of values is very close to what you need in your post.

edited Feb 18, 2022 at 12:11

answered Apr 25, 2021 at 17:56

accdias


6 Comments

excelman Over a year ago

Thanks for your solution. However, blorgon's solution gives closer results.

accdias Over a year ago

No worries. Just tried to present you with an alternative. :-)

excelman Over a year ago

A little late but I am trying to run a simulation and this actually provides the right variation. This was not what I wanted but is exactly what I need. :)

accdias Over a year ago

I'm glad it helped.

excelman Over a year ago

I think something has changed in random.sample. Getting an error: TypeError: sample() got an unexpected keyword argument 'counts'
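The `counts` keyword of random.sample() was added in Python 3.9, so older interpreters raise this TypeError. As a possible workaround, random.choices() (available since Python 3.6) accepts `weights` and draws with replacement, which should give the same kind of approximate distribution as the answer above. This is a sketch, not the answerer's code; the variable name `draws` is an assumption.

```python
from collections import Counter
from random import choices

population = ['US', 'UK', 'IN', 'SI']
weights = [3, 3, 3, 1]

# Each of the 1000 draws is independent, so the counts fluctuate
# around 300/300/300/100 rather than matching them exactly.
draws = choices(population, weights=weights, k=1000)
c = Counter(draws)

print(c)
```

Because the draws are independent, the printed counts differ from run to run; this suits a simulation that needs natural variation, but not one that needs exact quotas.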


pakpe · Accepted Answer · 2021-04-25 18:08:23Z

0

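Here is a solution that more closely replicates your original approach. The main problem is that your if statement compares the number of cases already assigned with the fraction (for example 0.3) rather than with the target count, so you need to multiply the distribution by the number of cases. Your else branch also appends a second random choice without checking it. A while loop is a better fit here, because the counter advances only when an assignment is accepted: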

import random

distro = {'US': 0.3, 'UK': 0.3, 'India': 0.3, 'Singapore': 0.1}

loc_assign = []
cases = 100

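i = 0
while i < cases:
    # Pick a country uniformly at random.
    a = random.choice(list(distro.keys()))
    # Accept the pick only while that country is below its target count;
    # otherwise loop again without advancing i.
    if loc_assign.count(a) < distro.get(a) * cases:
        loc_assign.append(a)
        i += 1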

print(loc_assign.count('US')) #30
print(loc_assign.count('UK')) #30
print(loc_assign.count('India')) #30
print(loc_assign.count('Singapore')) #10
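A variant of the same idea is sketched below, under the assumption that the targets are given as integer quotas rather than fractions. Instead of retrying when a full country is picked, it chooses only among countries that still have capacity, and it tracks remaining capacity in a dict instead of calling list.count() on every iteration. The names `assign_with_quotas`, `remaining`, and `seed` are hypothetical.

```python
import random

def assign_with_quotas(quotas, seed=None):
    """Assign cases one at a time, never exceeding a country's quota."""
    rng = random.Random(seed)
    remaining = dict(quotas)  # copy so the caller's dict is not modified
    assignments = []
    while True:
        # Countries that can still take at least one more case.
        open_countries = [c for c, left in remaining.items() if left > 0]
        if not open_countries:
            break
        country = rng.choice(open_countries)
        assignments.append(country)
        remaining[country] -= 1
    return assignments

quotas = {'US': 30, 'UK': 30, 'India': 30, 'Singapore': 10}
loc_assign = assign_with_quotas(quotas, seed=1)
print({country: loc_assign.count(country) for country in quotas})
# {'US': 30, 'UK': 30, 'India': 30, 'Singapore': 10}
```

The final counts are exact, but the order is not a uniform shuffle: because each open country is equally likely at every step, a small quota such as Singapore's tends to fill up early in the list.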

answered Apr 25, 2021 at 18:08

pakpe
