I'm using urllib2, cStringIO and PIL. I need to tune this so that it runs in at most half the current time.
I fetch and load the image with the following:
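
    import urllib
    from cStringIO import StringIO
    from PIL import Image

    imageurl = "http://bit.ly/wOqVTE"

    @log_performance
    def get_image(imageurl):
        # Download the raw bytes, then let PIL decode them from memory.
        img_file = urllib.urlopen(imageurl)
        data = StringIO(img_file.read())
        im = Image.open(data)
        size = 128, 128
        im.thumbnail(size, Image.ANTIALIAS)
        return im
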
Then process the image using:
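
    @log_performance
    def process_image(image, sample_limit=10000, top=10):
        # getcolors returns (count, color) pairs, or None if the image
        # has more than sample_limit distinct colors.
        colors = image.getcolors(sample_limit)
        sc = sorted(colors, key=lambda x: x[0], reverse=True)
        return sc[:top]
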
This takes on average 0.6 seconds to get the image and around 0.006 seconds to process.
How can I speed up the get and load process?
The full gist can be found here: https://gist.github.com/1920167
>>>>Function: get_image, Executed:20, Avg Time:0.558275926113
>>>>Function: process_image, Executed:20, Avg Time:0.00609920024872
I will add a bounty of 50 for anyone who can halve the time.
Split get_image up to see how much time is being spent on network I/O and how much is being spent on PIL.
Use multiprocessing.Pool to get a few concurrent downloads.
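
A rough sketch of both suggestions, under some assumptions: it is written for Python 3 (so `urllib.request` and `io.BytesIO` stand in for `urllib2` and `cStringIO`), and the names `fetch_bytes`, `decode_thumbnail`, `time_stages` and `get_images` are made up for illustration, not taken from the gist. It also swaps in `multiprocessing.pool.ThreadPool` for `multiprocessing.Pool`: it exposes the same `map` interface, and since downloads are I/O bound, threads are probably enough and avoid pickling the results between processes.

```python
import time
from io import BytesIO
from multiprocessing.pool import ThreadPool
from urllib.request import urlopen


def fetch_bytes(url):
    """Network step only: download the raw image bytes."""
    with urlopen(url) as response:
        return response.read()


def decode_thumbnail(raw, size=(128, 128)):
    """PIL step only: decode the bytes and shrink them to a thumbnail."""
    from PIL import Image  # imported lazily so the network step can be timed alone
    im = Image.open(BytesIO(raw))
    im.thumbnail(size, Image.LANCZOS)  # LANCZOS is the current name for ANTIALIAS
    return im


def time_stages(url, fetch=fetch_bytes, decode=decode_thumbnail):
    """Return (image, seconds spent fetching, seconds spent decoding)."""
    start = time.perf_counter()
    raw = fetch(url)
    fetched = time.perf_counter()
    im = decode(raw)
    done = time.perf_counter()
    return im, fetched - start, done - fetched


def get_images(urls, workers=4, fetch=fetch_bytes, decode=decode_thumbnail):
    """Download several images concurrently, then decode them in order."""
    pool = ThreadPool(workers)
    try:
        raws = pool.map(fetch, urls)  # results keep the same order as urls
    finally:
        pool.close()
        pool.join()
    return [decode(raw) for raw in raws]
```

Calling `time_stages(imageurl)` should show whether the 0.6 seconds is mostly download or mostly decode. Note that `get_images` can only improve total throughput over many URLs; it will not make a single download any faster.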