4

In a Python + Python Imaging Library (PIL) script, there's a function called processPixel(image, pos) that calculates a mathematical index as a function of an image and a position in it. This index is computed for each pixel using a simple for loop:

for x in range(image.size[0]):
    for y in range(image.size[1]):
        myIndex[x, y] = processPixel(image, [x, y])

This is taking too much time. How could threading be implemented to split the work and speed it up? How much faster could multi-threaded code be? Specifically, is this determined by the number of processor cores?

Also, I'd be very willing to bet that processPixel could be "numpy-ified", in which case you'll see an immense speedup over your current approach. Commented Jan 11, 2012 at 15:23
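To make the "numpy-ified" suggestion concrete, here is a minimal sketch. The question does not say what processPixel computes, so the index below (a normalized difference of the red and green channels) is purely a made-up stand-in, and the names process_pixel and index_vectorized are hypothetical. It assumes the pixels are available as a (height, width, 3) array, which is what np.asarray(image) gives for an RGB PIL image.

```python
import numpy as np

def process_pixel(pixels, pos):
    """Hypothetical per-pixel index: normalized difference of red and green."""
    x, y = pos
    r, g = float(pixels[y, x, 0]), float(pixels[y, x, 1])
    return (r - g) / (r + g) if (r + g) else 0.0

def index_vectorized(pixels):
    """Same index for every pixel at once, using whole-array arithmetic."""
    r = pixels[..., 0].astype(np.float64)
    g = pixels[..., 1].astype(np.float64)
    total = r + g
    out = np.zeros_like(total)
    # Divide only where the denominator is non-zero; other entries stay 0.0
    np.divide(r - g, total, out=out, where=total != 0)
    return out

# Stand-in for np.asarray(image): a small random RGB array (4 rows, 6 columns)
rng = np.random.default_rng(0)
pixels = rng.integers(0, 256, size=(4, 6, 3), dtype=np.uint8)
index = index_vectorized(pixels)  # shape (4, 6), indexed as index[y, x]
```

Whether your real processPixel can be rewritten this way depends on what it does; this only works directly when the index can be expressed as arithmetic on whole arrays.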

3 Answers

7

In CPython, you cannot speed up this kind of CPU-bound loop using threading because of the Global Interpreter Lock (GIL). The lock protects certain internal state of the Python interpreter, which prevents different threads that need to modify that state from running concurrently.

You could speed it up by spawning actual processes using multiprocessing. Each process will run in its own interpreter, thus circumventing the limitation of threads. With multiprocessing, you can either use shared memory, or give each process its own copy/partition of the data.

Depending on your task, you can either parallelize the processing of a single image by partitioning it, or you can parallelize the processing of a list of images (the latter is easily done using a pool). If you want to do the former, you might want to store the image in a multiprocessing.Array that can be accessed as shared memory, but you'd still have to solve the problem of where to write the results (writing to shared memory can hurt performance badly). Also note that certain kinds of communication between processes (queues, pipes, or the parameter/return-value passing of some functions in the module) require the data to be serialized using pickle. This imposes certain limitations on the data, and might create significant performance overhead (especially if you have many small tasks).
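As a rough sketch of the "partition a single image" approach: the code below splits the rows into one contiguous band per worker and hands each band to a pool, so only a few large objects get pickled rather than one task per pixel. Everything here is an assumption for illustration: the image is a plain list of rows of (r, g, b) tuples rather than a PIL image, process_pixel is a placeholder (mean of the channels), and band_bounds, process_band and compute_index are hypothetical helper names.

```python
from multiprocessing import Pool

def process_pixel(pixels, pos):
    """Hypothetical stand-in for processPixel: mean of the RGB channels."""
    x, y = pos
    r, g, b = pixels[y][x]
    return (r + g + b) / 3.0

def band_bounds(height, n_bands):
    """Split range(height) into n_bands contiguous (start, stop) row ranges."""
    step, extra = divmod(height, n_bands)
    bounds, start = [], 0
    for i in range(n_bands):
        stop = start + step + (1 if i < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds

def process_band(rows):
    """Worker: compute the index for every pixel in one band of rows."""
    return [[process_pixel(rows, (x, y)) for x in range(len(row))]
            for y, row in enumerate(rows)]

def compute_index(pixels, workers=2):
    """Compute the index for the whole image, one band of rows per task."""
    bands = [pixels[a:b] for a, b in band_bounds(len(pixels), workers)]
    with Pool(workers) as pool:
        results = pool.map(process_band, bands)  # each band is pickled once
    # Stitch the per-band results back together in row order
    return [row for band in results for row in band]

if __name__ == "__main__":
    # Tiny synthetic image: 6 rows of 4 pixels
    image = [[(x, y, 0) for x in range(4)] for y in range(6)]
    index = compute_index(image, workers=2)
    print(len(index), len(index[0]))
```

This only works as written if the index for a pixel depends on that pixel alone. If processPixel looks at neighbouring pixels, each band would need some overlapping rows from its neighbours.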

Another way to improve the performance of such operations is to try writing them in Cython, which has its own support for parallelization using OpenMP. I have never used that though, so I don't know how much help it can be.


2 Comments

If you are processing images (or you're doing any operation requiring a lot of computing power) then you should also look at the GPU. Python surely supports it.
As @freakish suggests, you should use GPU-based solutions for this kind of problem. What you said about the GIL and multiprocessing is correct, but still won't help for image processing. And when it comes to array processing I recommend using NumPy since it was designed for efficient array handling.
1

Take a look at Doug Hellmann's tutorial on multiprocessing. As Björn points out, there are various issues regarding parallel processing which you'll need to get the hang of, but it can really be worth the effort.

Tip: You can use multiprocessing.cpu_count() to check the number of cores available to you.
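For example, a small sketch of using that count to size a pool. The helper pick_worker_count and its reserve parameter are made up for illustration; only multiprocessing.cpu_count() comes from the standard library.

```python
import multiprocessing

def pick_worker_count(n_tasks, reserve=0):
    """Hypothetical helper: at most one worker per core, never more
    workers than tasks, and always at least one."""
    cores = multiprocessing.cpu_count()
    return max(1, min(n_tasks, cores - reserve))

# e.g. 8 image bands to process
workers = pick_worker_count(n_tasks=8)
```

The result can then be passed as the process count when creating a multiprocessing.Pool. Note that cpu_count() reports the CPUs in the machine, which is not necessarily how many your process is allowed to use.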

1 Comment

linkrot on the tutorial
1

Here is a list of libraries that you'll want to explore for doing efficient image processing:

OpenCV is a library of programming functions for real-time computer vision and image manipulation, and it has Python bindings.

PyOpenCL lets you access GPUs and other massively parallel compute devices from Python.

PyCUDA is a sister project to PyOpenCL.

NumPy and SciPy are fundamental packages for doing scientific computing that may be helpful with the above packages for doing efficient image and array processing.

Also note that the multiprocessing library that some people suggest is not going to help you process images efficiently, so you should avoid using operating system threads for this. If for some reason you do need coarse-grained parallelism, you can use a Python library for MPI, but you probably want to stick with GPU-based libraries.
