
I'm looking to build a Python script that runs an infinite read loop over stdin, like for line in sys.stdin:. For each iteration, I would like to get a worker from a pool that executes in the background, using the line as input. On finishing its execution or timing out, the worker prints its result to stdout.

I am having a difficult time finding a worker-pool module that can work continuously. For example, multiprocessing's Pool only offers calls like join() that wait for all workers to finish all tasks. For the specification above, I cannot know all the tasks ahead of time and need to assign work to background processes as it arrives.
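
A minimal sketch of what this could look like, assuming the standard library's multiprocessing.Pool plus a small watcher thread per task to enforce the timeout; work(), the 5-second limit, and the pool size are illustrative choices, not part of any existing API:

import sys
import threading
from multiprocessing import Pool, TimeoutError

def work(line):
    return "done: %r" % line

def watch(async_result):
    try:
        # Block up to 5 seconds waiting for this one task's result.
        print(async_result.get(timeout=5))
    except TimeoutError:
        # Note: this only stops waiting; the worker process itself
        # keeps running on the task.
        print("timed out")

if __name__ == "__main__":
    pool = Pool(processes=4)
    for line in sys.stdin:  # read loop: runs until stdin is closed
        result = pool.apply_async(work, args=[line])
        threading.Thread(target=watch, args=(result,), daemon=True).start()
    pool.close()
    pool.join()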

  • You can use a separate Process that consumes the workers' results from a Queue (the workers being Processes as well) and prints them to stdout. Commented Jan 11, 2017 at 14:00
  • Restating the idea: each line is added to a queue, and each process continuously checks the queue for a line. (Do I need to lock the queue so multiple processes do not remove the same line?) If there is a line, a process removes it from the queue and prints the result to stdout, after which it returns to watching the queue? How do I force a process to time out and move on if the work takes too long? Do you know of any examples online? Commented Jan 11, 2017 at 14:05
  • You have your main loop spawning Process(..., args=(queue, line)) as each new line arrives. Meanwhile a previously spawned Process consumes the Queue and prints the results (a sketch of this pattern follows these comments). docs.python.org/3.6/library/multiprocessing.html Commented Jan 11, 2017 at 14:17
  • From what you are saying, every line spawns a new process with a queue? How do I reuse processes so that every line does not create a new one? Commented Jan 11, 2017 at 14:37
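
A minimal sketch of the producer/consumer pattern the comments describe, assuming Python 3; the worker/printer function names, the pool size of 4, and the None shutdown sentinel are illustrative. Long-lived worker processes pull lines from a shared task queue, so processes are reused rather than spawned per line, and a single printer process drains the result queue:

import sys
from multiprocessing import Process, Queue

def worker(tasks, results):
    # Queue.get() is already process-safe: each line is delivered to
    # exactly one worker, so no extra locking is needed.
    for line in iter(tasks.get, None):  # None is the shutdown sentinel
        results.put("processed: %r" % line)

def printer(results):
    # Single consumer that serializes all output to stdout.
    for result in iter(results.get, None):
        print(result)

if __name__ == "__main__":
    tasks, results = Queue(), Queue()
    workers = [Process(target=worker, args=(tasks, results)) for _ in range(4)]
    for w in workers:
        w.start()
    out = Process(target=printer, args=(results,))
    out.start()

    for line in sys.stdin:  # feed work as it arrives; processes are reused
        tasks.put(line)

    for _ in workers:       # stdin closed: stop the workers...
        tasks.put(None)
    for w in workers:
        w.join()
    results.put(None)       # ...then stop the printer
    out.join()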

2 Answers


This will keep running for as long as stdin stays open:

import sys
from multiprocessing import Pool


def function(line):
    """Process the line in a separate process."""
    print(line, end="")


if __name__ == "__main__":
    pool = Pool()
    # Iterating sys.stdin yields one line at a time; iterating
    # sys.stdin.readline() would iterate the characters of a single line.
    for line in sys.stdin:
        pool.apply_async(function, args=[line])
    # Without close()/join() the main process can exit before queued
    # tasks run, which makes apply_async look like it never executes.
    pool.close()
    pool.join()
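
As for the "forever" part: the loop only ends when stdin reaches EOF, so feeding the script a stream that never closes, e.g. tail -f some.log | python script.py, keeps it consuming lines indefinitely.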

1 Comment

Hey! I am pretty sure this does not work. Have you tried it? I tried something like this with a loop from 1 to 10, and apply_async never actually started executing the tasks.

Using Pool and imap might make this easier, though you have to fix a maximum number of workers up front (processes=5):

import multiprocessing
import sys


def worker(line):
    return "Worker got %r" % line


if __name__ == "__main__":
    pool = multiprocessing.Pool(processes=5)
    # imap pulls lines from stdin lazily and yields results in input order.
    for result in pool.imap(worker, sys.stdin):
        print("Result: %r" % result)

