4

I have an image from which I want to extract each character individually.

I want output like this sample output, and so on for every character.

What would be an appropriate approach to do this using OpenCV and Python?

2
  • You have provided the same links for sample and output1. Commented Feb 28, 2017 at 7:04
  • @frederick99 Sorry, now please check it again. Commented Feb 28, 2017 at 7:08

2 Answers

8

A short addition to Amitay's awesome answer. You should negate the image using

cv2.THRESH_BINARY_INV

to capture black letters on white paper.

Another idea is to use the MSER blob detector, like this:

import cv2

img = cv2.imread('path to image')
(h, w) = img.shape[:2]
image_size = h * w
mser = cv2.MSER_create()
mser.setMaxArea(image_size // 2)  # the area limits must be integers
mser.setMinArea(10)

# The original snippet read from an undefined `filtered`; `img` is used here instead
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # convert to grayscale
_, bw = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)

# detectRegions returns the point sets and their bounding boxes
regions, rects = mser.detectRegions(bw)

# With the rects you can e.g. crop the letters
for (x, y, rw, rh) in rects:
    cv2.rectangle(img, (x, y), (x + rw, y + rh), color=(255, 0, 255), thickness=1)

This also detects every letter in the image.


3 Comments

Your idea is great and it works in most cases, but sometimes it detects two characters as one. Do you know a way to tune it to get a perfect character segmentation?
Besides tweaking the MSER parameters, you can use dilate + erode to increase the gap (use it on a mask and crop from the original image afterwards). Sorry for the late reply though.
In very difficult scenarios (a lot of noise in the image) this does not work very well with cv2. I'm going to build my own model to separate the chars.
1

You can do the following (OpenCV 3.0 and above):

  1. Run Otsu thresholding on the image (http://docs.opencv.org/3.2.0/d7/d4d/tutorial_py_thresholding.html)
  2. Run connected component labeling with stats on the thresholded image. (How to use openCV's connected components with stats in python?)
  3. For each connected component, take the bounding box from the stats you got in step 2. For each component the stats contain the following information: cv2.CC_STAT_LEFT, cv2.CC_STAT_TOP, cv2.CC_STAT_WIDTH, cv2.CC_STAT_HEIGHT.
  4. Using the bounding box, crop the component from the original image.
