1

I am new to python and try to execute two tasks simultanousely. These tasks are just fetching pages on a web server and one can terminate before the other. I want to display the result only when all requests are served. Easy in linux shell but I get nowhere with python and all the howto's I read look like black magic to a beginner like me. They all look over complicated to me compared with the simplicity of the bash script below.

Here is the bash script I would like to emulate in python:

# First request (in background). Result stored in file /tmp/p1
wget -q -O /tmp/p1 "http://ursule/test/test.php?p=1&w=5" &
PID_1=$!

# Second request. Result stored in file /tmp/p2
wget -q -O /tmp/p2 "http://ursule/test/test.php?p=2&w=2"
PID_2=$!

# Wait for the two processes to terminate before displaying the result
wait $PID_1 && wait $PID_2 && cat /tmp/p1 /tmp/p2

The test.php script is a simple:

<?php
printf('Process %s (sleep %s) started at %s ', $_GET['p'], $_GET['w'], date("H:i:s"));
sleep($_GET['w']);
printf('finished at %s', date("H:i:s"));
?>

The bash script returns the following:

$ ./multiThread.sh
Process 1 (sleep 5) started at 15:12:59 finished at 15:12:04
Process 2 (sleep 2) started at 15:12:59 finished at 15:12:01

What I have tried so far in python 3:

#!/usr/bin/python3.2

import urllib.request, threading

def wget (address):
    url = urllib.request.urlopen(address)
    mybytes = url.read()
    mystr = mybytes.decode("latin_1")
    print(mystr)
    url.close()

thread1 = threading.Thread(None, wget, None, ("http://ursule/test/test.php?p=1&w=5",))
thread2 = threading.Thread(None, wget, None, ("http://ursule/test/test.php?p=1&w=2",))

thread1.run()
thread2.run()

This doesn't work as expected as it returns:

$ ./c.py 
Process 1 (sleep 5) started at 15:12:58 finished at 15:13:03
Process 1 (sleep 2) started at 15:13:03 finished at 15:13:05
2
  • 3
    You want thread1.start(); thread2.start() and then join. See docs.python.org/2/library/threading.html for basic information on the threading module. Now, threading doesn't replicate the behavior you had with Bash. For that, you will want multiple processes and you should check the multiprocessing module docs.python.org/2/library/multiprocessing.html Commented Dec 31, 2012 at 14:40
  • It seems to work all right with join. I'll have a look at multiprocessing. Thanks for putting me on track. Commented Dec 31, 2012 at 15:53

2 Answers 2

1

Instead of using threading, it would be nice to use multiprocessing module as each task in independent. You may like to read more about GIL (http://wiki.python.org/moin/GlobalInterpreterLock).

Sign up to request clarification or add additional context in comments.

Comments

0

Following your advise I dived into the doc pages about multithreading and multiprocessing and, after having done a couple of benchmarks, I came to the conclusion that multiprocessing was better suited for the job. It scale up much better as the number of threads/processes increases. Another problem I was confronted with was how to store the results of all these processes. Using Queue.Queue did the trick. Here is the solution I came up with:

This snippet send concurrent http requests to my test rig that pauses for one second before sending the anwser back (see the php script above).

import urllib.request

# function wget arg(queue, adresse)
def wget (resultQueue, address):
    url = urllib.request.urlopen(address)
    mybytes = url.read()
    url.close()
    resultQueue.put(mybytes.decode("latin_1"))

numberOfProcesses = 20

from multiprocessing import Process, Queue

# initialisation
proc = []
results = []
resultQueue = Queue()

# creation of the processes and their result queue
for i in range(numberOfProcesses):
    # The url just passes the process number (p) to the my testing web-server
    proc.append(Process(target=wget, args=(resultQueue, "http://ursule/test/test.php?p="+str(i)+"&w=1",)))
    proc[i].start()

# Wait for a process to terminate and get its result from the queue
for i in range(numberOfProcesses):
    proc[i].join()
    results.append(resultQueue.get())

# display results
for result in results:
    print(result)

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.