How to use OpenCV's connectedComponentsWithStats in Python?

Question

I am looking for an example of how to use OpenCV's connectedComponentsWithStats() function in Python. Note this is only available with OpenCV 3 or newer. The official documentation only shows the API for C++, even though the function exists when compiled for Python. I could not find it anywhere online.

For insights on using the labels to mask the image etc., see "Python OpenCV - Connected Component Labeling and Analysis" on GeeksforGeeks.
– nealmcb, Commented Mar 12, 2023 at 0:39

Zack Knopp · Accepted Answer · 2016-07-04 17:14:59Z

150

The function works as follows:

# Import the cv2 library
import cv2
# Read the image you want connected components of, as a single-channel grayscale image
# (Otsu thresholding needs a single-channel input)
src = cv2.imread('/directorypath/image.bmp', cv2.IMREAD_GRAYSCALE)
# Threshold it so it becomes binary
ret, thresh = cv2.threshold(src, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
# You need to choose 4 or 8 for connectivity type
connectivity = 4
# Perform the operation. connectivity and ltype are passed by keyword because, in the
# Python signature, they come after the optional labels/stats/centroids output arguments
output = cv2.connectedComponentsWithStats(thresh, connectivity=connectivity, ltype=cv2.CV_32S)
# Get the results
# The first cell is the number of labels
num_labels = output[0]
# The second cell is the label matrix
labels = output[1]
# The third cell is the stat matrix
stats = output[2]
# The fourth cell is the centroid matrix
centroids = output[3]

Labels is a matrix the size of the input image where each element has a value equal to its label.

Stats is a matrix of the statistics that the function calculates. It has one row per label and one column per statistic. The columns are described in the OpenCV documentation:

Statistics output for each label, including the background label, see below for available statistics. Statistics are accessed via stats[label, COLUMN] where available columns are defined below.

cv2.CC_STAT_LEFT The leftmost (x) coordinate which is the inclusive start of the bounding box in the horizontal direction.

cv2.CC_STAT_TOP The topmost (y) coordinate which is the inclusive start of the bounding box in the vertical direction.

cv2.CC_STAT_WIDTH The horizontal size of the bounding box

cv2.CC_STAT_HEIGHT The vertical size of the bounding box

cv2.CC_STAT_AREA The total area (in pixels) of the connected component

Centroids is a matrix with the x and y locations of each centroid. The row in this matrix corresponds to the label number.

edited Jul 4, 2016 at 17:14

answered Mar 7, 2016 at 21:16

Zack Knopp


12 Comments

Бојан Матовски Over a year ago

I must say that for some reason, I had to use cv2.THRESH_BINARY instead of cv2.THRESH_BINARY+cv2.THRESH_OTSU, then I had to cast src to integer and thresh to float in order for it to work. I don't know why, but it didn't work otherwise.

Zack Knopp Over a year ago

@ypnos You don't need to for connected components with stats, but do for connected components without stats. I think that part was just left over from me doing it the other way. I fixed it now. Cheers!

recurf Over a year ago

Can someone explain how to use the labels? How do I check which label a centroid belongs to?

krs013 Over a year ago

Each component in the image gets a number (label). The background is label 0, and the additional objects are numbered from 1 to num_labels-1. The centroids are indexed by the same numbers as the labels. centroids[0] isn't particularly useful--it's just the background. centroids[1:num_labels] is what you want.

smcs Over a year ago

@matchifang You could create an array with the component areas: areas=output[2][:,4] Then an array with the numbers of components: nr=np.arange(output[0]) Then sort them according to area size: ranked=sorted(zip(areas,nr)) With help from here: stackoverflow.com/questions/6618515/…
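The same ranking can also be sketched with `np.argsort`. The `stats` array below is made up for illustration, laid out the way the function returns it; column 4 is the same index as `cv2.CC_STAT_AREA`.

```python
import numpy as np

# Hypothetical stats array; columns are LEFT, TOP, WIDTH, HEIGHT, AREA.
stats = np.array([
    [0, 0, 20, 20, 355],   # label 0: background
    [2, 2, 3, 3, 9],       # label 1
    [8, 1, 6, 5, 30],      # label 2
    [4, 12, 3, 2, 6],      # label 3
], dtype=np.int32)

AREA_COLUMN = 4  # same index as cv2.CC_STAT_AREA

areas = stats[1:, AREA_COLUMN]          # skip the background row
order = np.argsort(areas)[::-1]         # positions in areas, largest first
ranked_labels = (order + 1).tolist()    # +1 because row 0 was skipped
ranked_areas = areas[order].tolist()
```

Skipping row 0 keeps the background, which is usually the largest region, out of the ranking.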


Dan Erez · Accepted Answer · 2018-11-13 19:57:09Z

22

I have come here a few times to remember how it works, and each time I have to reduce the above code to:

_, thresh = cv2.threshold(src, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)  # src must be a single-channel image
connectivity = 4  # You need to choose 4 or 8 for connectivity type
num_labels, labels, stats, centroids = cv2.connectedComponentsWithStats(thresh, connectivity=connectivity, ltype=cv2.CV_32S)

Hopefully, it's useful for everyone :)

answered Nov 13, 2018 at 19:57

Dan Erez


Barel Levy · Accepted Answer · 2018-03-28 10:52:38Z

11

Adding to Zack Knopp's answer: if you are using a grayscale image, you can simply use:

import cv2
import numpy as np

src = cv2.imread("path\\to\\image.png", 0)
binary_map = (src > 0).astype(np.uint8)
connectivity = 4 # or whatever you prefer

output = cv2.connectedComponentsWithStats(binary_map, connectivity=connectivity, ltype=cv2.CV_32S)

When I tried Zack Knopp's answer on a grayscale image, it didn't work, and this was my solution.

edited Mar 28, 2018 at 10:52

answered Mar 5, 2018 at 14:48

Barel Levy


Threadprogrammer · Accepted Answer · 2022-10-05 07:42:24Z

0

The input image needs to be single-channel, so convert it to grayscale first; otherwise OpenCV 4.x raises an error. After converting, follow Zack's answer.

import cv2 as cv  # this snippet uses the cv alias
src = cv.cvtColor(src, cv.COLOR_BGR2GRAY)

answered Oct 5, 2022 at 7:42

Threadprogrammer
