I need to run a program about 500 times with different inputs.
I'd like to use asyncio.create_subprocess_exec and want to limit the number of processes running at the same time so as not to clog up the machine.
Is there a way to set the concurrency level? For example, I'd expect something like AbstractEventLoop.set_max_tasks.
2 Answers
As suggested by @AndrewSvetlov, you can use an asyncio.Semaphore to enforce the limit:
async def run_program(input):
p = await asyncio.create_subprocess_exec(...)
# ... communicate with the process ...
p.terminate()
return something_useful
async def run_throttled(input, sem):
async with sem:
result = await run_program(input)
return result
LIMIT = 10
async def many_programs(inputs):
sem = asyncio.Semaphore(LIMIT)
results = await asyncio.gather(
*[run_throttled(input, sem) for input in inputs])
# ...
Comments
In situations where one wants to schedule enough program calls that creating each as a coroutine right away starts being an issue, a semaphore-backed task queue as in this answer can be useful:
tasks = TaskQueue(NUM_PARALLEL_PROCESSES)
for input in MANY_INPUTS:
await tasks.put(run_program(input))
tasks.join()
asyncio.Semaphore