Python vectorizing nested for loops in image processing

Question

I am trying to detect skin. I found a simple formula for detecting skin in an RGB picture. The only problem is that the for loops are very slow, and I need to speed the process up. From some research, vectorization could speed up my for loops, but I don't know how to apply it in my case.

Here is the code of my function:

The function receives one parameter: a numpy array with shape (144, 256, 3) and dtype=np.uint8.

The function returns the coordinates of the first detected skin-colored pixel (as a numpy.array [height, width]), the number of skin-colored pixels detected (int), and the calculated angle (from left to right) of the first skin-colored pixel (float).

import math
import numpy as np

# picture = numpy array with shape (144, 256, 3), dtype=np.uint8
def filter_image(picture):
    r = 0.0
    g = 0.0
    b = 0.0

    # In first_point I save first occurrence of skin colored pixel, so I can track person movement
    first_point = np.array([-1,-1])

    # counter is used to count how many skin colored pixels are in an image (to determine distance to target, because LIDAR isn't working)
    counter = 0

    # angle of first pixel with skin color (from left to right, calculated with Horizontal FOV)
    angle = 0.0

    H = picture.shape[0]
    W = picture.shape[1]

    # loop through each pixel
    for i in range(H):
        for j in range(W):
            # if all RGB are 0 (black), skip to the next pixel
            if(int(picture[i,j][0]+picture[i,j][1]+picture[i,j][2])) == 0:
                continue
            # else we calculate r,g,b used for skin recognition
            else:    
                r = picture[i,j][0]/(int(picture[i,j][0]+picture[i,j][1]+picture[i,j][2]))
                g = picture[i,j][1]/(int(picture[i,j][0]+picture[i,j][1]+picture[i,j][2]))
                b = picture[i,j][2]/(int(picture[i,j][0]+picture[i,j][1]+picture[i,j][2]))
            # if any of r, g, b is 0, skip to the next pixel
            if(g == 0 or r == 0 or b == 0):
                continue
            # if True, pixel is skin colored
            elif(r/g > 1.185 and (((r * b) / math.pow(r + b + g,2)) > 0.107) and ((r * g) / math.pow(r + b + g,2)) > 0.112):
                # if this is the first point with skin colors in the whole image, we save i,j coordinate
                if(first_point[0] == -1):
                    # save first skin color occurrence
                    first_point[0] = i
                    first_point[1] = j

                    # here angle is calculated, with width skin pixel coordinate, Hor. FOV of camera and constant
                    angle = (j+1)*91 *0.00390626

                # whenever we detect skin colored pixel, we increment the counter value
                counter += 1
                continue
    # function returns coordinates of first skin colored pixel, counter of skin colored pixels and calculated angle (from left to right, based on j coordinate of first pixel with skin color)
    return first_point,counter, angle

The function works well; the only problem is its speed!

Thank you for helping!

Is this algorithm described somewhere? It seems mad to test if all RGB components are zero, then to calculate some averages, then to test and continue out of the loop if any RGB component is zero. You might as well test if any is zero up front and continue out without calculating the averages, surely?
– Mark Setchell, Commented May 29, 2020 at 11:53
@MarkSetchell I found the equation here link (2.2 normalized RGB), this is just my implementation. I don't have any experience with image processing so I might've over-complicated a bit and as you see I also don't know how to use vectorization...
– Aljaž Gornik, Commented May 29, 2020 at 12:01

Mercury · Accepted Answer · 2020-05-29 13:17:45Z

You can skip all of the loops and do the operation with numpy's broadcasting. The process becomes even easier if the image is reshaped to 2D from 3D, giving you HxW rows of pixels to work with.

def filter(picture):
    H,W = picture.shape[0],picture.shape[1]
    picture = picture.astype('float').reshape(-1,3)
    # A pixel with any r,g,b equalling zero can be removed.
    picture[np.prod(picture,axis=1)==0] = 0

    # Divide non-zero pixels by their rgb sum
    picsum = picture.sum(axis=1)
    nz_idx = picsum!=0
    picture[nz_idx] /= (picsum[nz_idx].reshape(-1,1))

    nonzeros = picture[nz_idx]

    # Condition 1: r/g > 1.185
    C1 = (nonzeros[:,0]/nonzeros[:,1]) > 1.185
    # Condition 2: r*b / (r+g+b)^2 > 0.107
    C2 = (nonzeros[:,0]*nonzeros[:,2])/(nonzeros.sum(axis=1)**2) > 0.107 
    # Condition 3: r*g / (r+g+b)^2 > 0.112
    C3 = (nonzeros[:,0]*nonzeros[:,1])/(nonzeros.sum(axis=1)**2) > 0.112
    # Combine conditions
    C = ((C1*C2*C3)!=0)
    picsum[nz_idx] = C
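    skin_points = np.where(picsum!=0)[0]
    counter = len(skin_points)
    # No skin pixel found: return the same defaults as the loop version
    # instead of raising IndexError on skin_points[0]
    if counter == 0:
        return np.array([-1,-1]), 0, 0.0
    first_point = np.unravel_index(skin_points[0],(H,W))
    angle = (first_point[1]+1) * 91 * 0.00390626
    return first_point, counter, angle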
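
The same idea can probably also be written without reshaping, by building an (H, W) boolean mask directly. The following is only a sketch under some assumptions: the names `skin_mask` and `filter_image_vectorized` are made up for illustration, and it tests the raw channels instead of the normalized ones. That should be equivalent because dividing every channel by the same sum leaves r/g, r*b/(r+g+b)^2 and r*g/(r+g+b)^2 unchanged.

```python
import numpy as np

# Thresholds copied from the loop in the question.
RG_RATIO_MIN = 1.185
RB_TERM_MIN = 0.107
RG_TERM_MIN = 0.112


def skin_mask(picture):
    """Return an (H, W) boolean mask of skin-colored pixels (hypothetical helper)."""
    # Work in float so the channel sum cannot wrap around the way uint8 does.
    rgb = picture.astype(np.float64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    total = r + g + b

    # Pixels with any zero channel are rejected, as in the loop version.
    valid = (r > 0) & (g > 0) & (b > 0)

    # Give rejected pixels a dummy denominator of 1 to avoid division
    # warnings; `valid` masks them out of the result anyway.
    safe_g = np.where(valid, g, 1.0)
    safe_total_sq = np.where(valid, total, 1.0) ** 2

    return (
        valid
        & (r / safe_g > RG_RATIO_MIN)
        & (r * b / safe_total_sq > RB_TERM_MIN)
        & (r * g / safe_total_sq > RG_TERM_MIN)
    )


def filter_image_vectorized(picture):
    """Return (first_point, counter, angle) like the loop version (sketch)."""
    mask = skin_mask(picture)
    counter = int(mask.sum())
    if counter == 0:
        # Same defaults the loop version returns when nothing is detected.
        return np.array([-1, -1]), 0, 0.0

    # argmax over the flattened mask finds the first True in row-major
    # order, which is the scan order of the nested loops.
    i, j = np.unravel_index(np.argmax(mask), mask.shape)
    angle = float((j + 1) * 91 * 0.00390626)
    return np.array([i, j]), counter, angle
```

Keeping the mask two-dimensional means the row and column of the first hit come straight out of `np.unravel_index`, and the mask itself can be reused, for example to display the detected region.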

Tom O'Connell · Accepted Answer · 2020-05-30 01:39:37Z

One thing that is often nice to try first, when trying to improve the performance of code, is to see how much something like numba can make it faster basically for free.

Here's an example of how to use it for your code:

import math
import time

# I'm just importing numpy here so I can make a random input of the
# same dimensions that you mention in your question.
import numpy as np
from numba import jit

@jit(nopython=True)
def filter_image(picture):
    # ... I just copied the body of this function from your post above ...
    return first_point, counter, angle

def main():
    n_iterations = 10
    img = np.random.rand(144, 256, 3)
    before = time.time()
    for _ in range(n_iterations):
        # In Python 3, this was just a way I could get access to the original
        # function you defined, without having to make a separate function for
        # it (as the numba call replaces it with an optimized version).
        # It's equivalent to just calling your original function here.
        filter_image.__wrapped__(img)
    print(f'took: {time.time() - before:.3f} without numba')

    before = time.time()
    for _ in range(n_iterations):
        filter_image(img)
    print(f'took: {time.time() - before:.3f} WITH numba')

if __name__ == '__main__':
    main()

Output showing the time difference:

took: 1.768 without numba
took: 0.414 WITH numba

...actually optimizing this function could probably do a lot better, but if this speedup is enough so that you don't need to do other optimization, that's good enough!

Edit (as per macroeconomist's comment): the times I report above also include the upfront cost of numba just-in-time compiling your function, which happens on the first call. If you are making many calls to this function, the performance difference could actually be much more dramatic. Timing only the calls after the first one should make the comparison of per-call times more accurate.
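
One way to time only the calls after the first is a small helper that makes a few untimed warm-up calls before starting the clock. This is a minimal sketch, not part of the measurements above; `time_per_call` and its parameters are hypothetical names, and it uses only the standard library so it works for any callable.

```python
import time


def time_per_call(func, *args, n_iterations=10, n_warmup=1):
    """Return mean wall-clock seconds per call, excluding warm-up calls."""
    # Warm-up ("burn-in") calls absorb one-time costs such as JIT compilation.
    for _ in range(n_warmup):
        func(*args)

    start = time.perf_counter()
    for _ in range(n_iterations):
        func(*args)
    return (time.perf_counter() - start) / n_iterations
```

With the names from the script above, `time_per_call(filter_image, img)` would then report the compiled per-call time and `time_per_call(filter_image.__wrapped__, img)` the pure-Python one.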

edited May 30, 2020 at 1:39

answered May 29, 2020 at 11:35

3 Comments

macroeconomist Over a year ago

This is great, but I think you're actually underselling Numba, because these timings include the one-time cost of compilation. On my system, once that cost has passed, I get a 500-fold speedup relative to pure Python.

Tom O'Connell Over a year ago

I guess I didn't know when the compilation happened, but I suppose you're saying it (at least often) happens on the first call? I'll edit my post to acknowledge this caveat when interpreting the times.

macroeconomist Over a year ago

Yes, by default it happens on the first call in a Python session, although for special cases you can specify ahead-of-time compilation, or use a cached version, instead. I usually do a "burn-in" run of a Numba function before I time it to get rid of the compilation time (though as you indicate, if someone plans to only run once, then maybe the compilation time should be included).
