5

I have an RGB image composed of 7 different possible colors. I want to count how many pixels of each color are present in the image, in an efficient way. So, if possible, no loop over every pixel, or at least not a manual one (a NumPy operation is fine because it's much faster).

I tried loading it into a NumPy array, which gives me an N*M*3 array, but I can't figure out a way to count the pixels of a particular value... Any ideas?

Thank you!

6
  • Are you considering using a histogram? Commented Oct 4, 2018 at 10:35
  • Can you provide an image and a piece of code? Commented Oct 4, 2018 at 11:04
  • Possible duplicate of Count the number of pixels by color from an image loaded into a numpy array Commented Oct 4, 2018 at 11:22
  • Are the seven colors fixed? Do you need to know all color channels to decide which color? Because if for example the red (or any of the other two) channel were enough to discriminate between all seven colors, that would make things much easier. Commented Oct 4, 2018 at 11:24
  • @Brenlla I think it's not a direct dupe, because the small number of colors used may allow for some optimizations. Commented Oct 4, 2018 at 11:31

2 Answers

8

Just partially flatten and use np.unique with return_counts = True and axis = 0

flat_image = image.reshape(-1, 3)  # makes one long line of pixels
colors, counts = np.unique(flat_image, return_counts = True, axis = 0)

Or as one line:

colors, counts = np.unique(image.reshape(-1, 3), 
                           return_counts = True, 
                           axis = 0)
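
As a quick check of what this returns, here is a minimal sketch on a tiny hand-made image. The 2x3 `image` array below is a hypothetical test input, not data from the question; with `axis = 0`, np.unique treats each row of the flattened array as one pixel and returns the distinct rows in sorted order together with their counts.

```python
import numpy as np

# Hypothetical 2x3 RGB test image (uint8) containing three distinct colors.
image = np.array([[[255, 0, 0], [0, 255, 0], [255, 0, 0]],
                  [[0, 0, 255], [255, 0, 0], [0, 255, 0]]], dtype=np.uint8)

# Flatten to a (num_pixels, 3) array, then count the unique rows.
colors, counts = np.unique(image.reshape(-1, 3), return_counts=True, axis=0)

for color, count in zip(colors, counts):
    print(tuple(int(c) for c in color), int(count))
# (0, 0, 255) 1
# (0, 255, 0) 2
# (255, 0, 0) 3
```

The counts line up index by index with the rows of `colors`, so `dict(zip(map(tuple, colors.tolist()), counts.tolist()))` gives a color-to-count lookup if that is more convenient.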

1 Comment

Thank you! I'd come up with a similar but more complicated solution... This is perfect!
3

Since there are only seven colors, simple masking will, under reasonable assumptions, be quite competitive. Timings below are for random 100x100x3 8-bit images, in seconds per 1000 calls:

timings
np.unique 6.510251379047986
masking   0.2401340039796196

Note that much, but not all, of the speedup is due to merging the three channels into a single one.

Code:

import numpy as np

def create(M, N, k=7):
    while True:
        colors = np.random.randint(0, 256**3, (k,), dtype=np.int32)
        if np.unique(colors).size == k:
            break
    picture = colors[np.random.randint(0, k, (M, N))]
    RGB = np.s_[..., :-1] if picture.dtype.str.startswith('<') else np.s_[..., 1:]
    return picture.view(np.uint8).reshape(*picture.shape, 4)[RGB]

def f_df(image):
    return np.unique(image.reshape(-1, 3), 
                     return_counts = True, 
                     axis = 0)

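def f_pp(image, nmax=50):
    # Pad each pixel with a zero byte and view it as one 32-bit integer,
    # so that comparing colors takes one operation instead of three.
    iai32 = np.pad(image, ((0, 0), (0, 0), (0, 1)), mode='constant')
    iai32 = iai32.view(np.uint32).ravel()

    colors = np.empty((nmax+1,), np.uint32)
    counts = np.empty((nmax+1,), int)
    colors[0] = iai32[0]
    counts[0] = 0
    # match marks every pixel whose color has already been found
    match = iai32 == colors[0]
    for i in range(1, nmax+1):
        # cumulative number of pixels covered by the first i colors
        counts[i] = np.count_nonzero(match)
        if counts[i] == iai32.size:
            # all pixels accounted for: unpack the colors, difference the counts
            return colors.view(np.uint8).reshape(-1, 4)[:i, :-1], np.diff(counts[:i+1])
        # the first pixel not yet matched supplies the next new color
        colors[i] = iai32[match.argmin()]
        match |= iai32 == colors[i]
    raise ValueError('Too many colors')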



image = create(100, 100, 7)

col_df, cnt_df = f_df(image)
col_pp, cnt_pp = f_pp(image)
#print(col_df)
#print(cnt_df)
#print(col_pp)
#print(cnt_pp)
idx_df = np.lexsort(col_df.T)
idx_pp = np.lexsort(col_pp.T)

assert np.all(cnt_df[idx_df] == cnt_pp[idx_pp])

from timeit import timeit
print('timings')
print('np.unique', timeit(lambda: f_df(image), number=1000))
print('masking  ', timeit(lambda: f_pp(image), number=1000))
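
To illustrate the point above about merging the channels, here is a minimal sketch that packs each RGB pixel into a single integer and then runs np.unique on the resulting 1-D array. This is not the masking code from this answer; `count_colors_packed` is a hypothetical helper name, and the sketch assumes an 8-bit image of shape (M, N, 3). How much faster it is than the `axis = 0` variant will depend on the image and the NumPy version, so it is worth timing on your own data.

```python
import numpy as np

def count_colors_packed(image):
    """Count distinct colors of an (M, N, 3) uint8 image via packed integers.

    Hypothetical helper for illustration only.
    """
    px = image.reshape(-1, 3).astype(np.uint32)
    # Merge the three channels into one integer per pixel: 0xRRGGBB.
    packed = (px[:, 0] << 16) | (px[:, 1] << 8) | px[:, 2]
    values, counts = np.unique(packed, return_counts=True)
    # Unpack the distinct integers back into R, G, B columns.
    colors = np.stack([(values >> 16) & 0xFF,
                       (values >> 8) & 0xFF,
                       values & 0xFF], axis=1).astype(np.uint8)
    return colors, counts

# Small made-up example: a 4-color palette indexed at random.
rng = np.random.default_rng(0)
palette = np.array([[0, 0, 0], [255, 0, 0], [0, 255, 0], [1, 2, 3]], dtype=np.uint8)
demo = palette[rng.integers(0, len(palette), (20, 30))]

demo_colors, demo_counts = count_colors_packed(demo)
print(demo_colors)
print(demo_counts)
```

Because the packed value sorts in the same order as the (R, G, B) triple, the colors come back in the same order as from np.unique with `axis = 0`, which makes the two results easy to compare directly.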

1 Comment

Anecdotally, this code was a lot faster for me than np.unique. Thanks Paul!
