1

I have computed a segmentation of an image where every superpixel (region) is defined by the value of an entry in a 2D array of the same size as the image. I am trying to obtain a list of indices for each of the regions in order to perform per-region operations at a later time. Here is my current code:

index_list = []
for i in range(num_superpixels):
    indices = np.where(superpixels == i)
    index_list.append(indices)

The following is a minimal example with a 3x3 input containing 3 regions. In the practice I am working with 500-1000 superpixels obtained from 640x480 images, and things get very slow.

>>> superpixels
array([[0, 0, 2],
       [0, 0, 2],
       [1, 1, 2]])

>>> index_list
      [[array([0, 0, 1, 1]), array([0, 1, 0, 1])],
       [array([2, 2]), array([0, 1])],
       [array([0, 1, 2]), array([2, 2, 2])]]

Since each region is a continuous chunk (in the 2D image, but not in memory), using np.where in a loop is really inefficient - at every iteration it's traversing width*height entries to find a region of ~500 entries.

How do I speed this up?

5
  • "A region is defined by value of an entry in a 2D array of the same size as the image". Which 2D array? Do you mean superpixels? Are you calculating each region based on difference? Commented Apr 2, 2021 at 15:31
  • @Kevin indeed, superpixels is the 2D array containing the indices for each superpixel / region. The computation of the region is from an external API and how it's done may change but the output format is going to be the 2D array indexing the regions as described Commented Apr 2, 2021 at 20:05
  • 1
    So you don't know how the regions are computed? All you know is the input and output from this external API? Commented Apr 2, 2021 at 21:10
  • That's correct. Commented Apr 3, 2021 at 16:49
  • Most per-region operations are better performed on the labeled image itself than on your array with indices. This array is just a very inefficient representation of an image. Commented Apr 4, 2021 at 2:23

1 Answer 1

2

Firstly, a better algorithm can be design based on the direct indexing of regions. Indeed, the current code has a complexity of O(width * height * num_superpixels) while it is possible to achieve a O(width * height) complexity. The idea is to create num_superpixels bins and append the location of each cell (of the 2D array) in bin[cellValue].

Note that implementing that with a Python loop would be too slow, but you can speed up the implementation with Numba. Since Numba does not like variable-sized arrays (inefficient), a first pass can be applied to count the number of cell in each bin and then fill the cell positions.

Here is an example:

from numba import jit, njit, int32, int64, prange
from numba.types import UniTuple, List

@jit(List(UniTuple(int32[::1],2))(int64[:,::1], int64))
def fastCompute(superpixels, num_superpixels):
    # Count the number of elements
    binSize = np.zeros(num_superpixels, dtype=np.int32)
    for i in range(superpixels.shape[0]):
        for j in range(superpixels.shape[1]):
            binSize[superpixels[i,j]] += 1

    # Put the pixels location in the right bin
    result = [(np.empty(binSize[i], dtype=np.int32), np.empty(binSize[i], dtype=np.int32)) for i in range(num_superpixels)]
    binPos = np.zeros(num_superpixels, dtype=np.int32)
    for i in range(superpixels.shape[0]):
        for j in range(superpixels.shape[1]):
            binIdx = superpixels[i,j]
            tmp = result[binIdx]
            cellBinPos = binPos[binIdx]
            tmp[0][cellBinPos] = i
            tmp[1][cellBinPos] = j
            binPos[binIdx] += 1

    return result

On my machine, with the following random-based configuration, the above function is 120 times faster than the initial code.

# Generate a random input
num_superpixels = 500
superpixels = np.random.randint(np.ones((640, 480)) * num_superpixels)

The type of the output of fastCompute is similar to the initial code (except it use tuples and 32-bit integers for sake of performance), but it is not optimal as it contains pure-Python object types and it is not very compact. Tweaking the output type should result in an even faster code.

Sign up to request clarification or add additional context in comments.

1 Comment

Nice, I hadn't used numba before. Thanks!

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.