I have a folder containing multiple image files. I want to extract the text from each file and save the output as a CSV file with two columns: the first for Image_no. and the second for Text.

TIA

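I have tried this code in Python:

import glob
import os

import cv2

img_dir = "MyFolder"  # Folder name containing image files
data_path = os.path.join(img_dir, '*g')  # Matches names ending in "g", e.g. .png, .jpg, .jpeg
files = glob.glob(data_path)
data = []
for f1 in files:
    img = cv2.imread(f1)  # Read the image as a pixel array
    data.append(img)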

Q1: How can I see the text that is extracted from each image?
Q2: How can I export the image name and the corresponding text to CSV?

  • Your current code simply appends the pixel data of each image to a single list. Look into Tesseract or other OCR libraries that you can easily integrate with OpenCV, for example: pyimagesearch.com/2018/09/17/… Commented Jul 30, 2019 at 10:53

1 Answer

Part 1:

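Please install the Tesseract OCR engine, then the Python packages pytesseract, Pillow and pandas. Tesseract itself is a separate program, not a pip package, so "pip install tesseract" does not provide it: on Windows use the Tesseract installer, and on Debian/Ubuntu "sudo apt install tesseract-ocr" should work.

pip install pillow
pip install pytesseract
pip install pandas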
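
Since pytesseract only wraps the Tesseract executable, it can help to confirm that the executable is reachable before running any OCR. A minimal sketch using only the standard library; find_tesseract is a hypothetical helper name, and the Windows path is simply the one used in Part 2 of this answer:

```python
import os
import shutil


def find_tesseract(extra_paths=()):
    """Return a path to the tesseract executable, or None if none is found."""
    # First look on the PATH, which is where package managers usually put it.
    found = shutil.which('tesseract')
    if found:
        return found
    # Then try explicit candidate locations (assumed, adjust for your system).
    for path in extra_paths:
        if os.path.isfile(path):
            return path
    return None


if __name__ == '__main__':
    print(find_tesseract(['C:/Program Files/Tesseract-OCR/tesseract.exe']))
```

If this prints None, install Tesseract or set pytesseract.pytesseract.tesseract_cmd to wherever it actually lives.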

Part 2:

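import os

import pandas as pd
import pytesseract
from PIL import Image

# Default install path of the 64-bit Windows installer; adjust for your system
pytesseract.pytesseract.tesseract_cmd = "C:/Program Files/Tesseract-OCR/tesseract.exe"

input_dir = r'C:/Users/suhas/Downloads/images/'
file_names = []
texts = []

for root, dirs, filenames in os.walk(input_dir):
    for filename in filenames:
        try:
            # Join with root so files inside subfolders resolve correctly
            img = Image.open(os.path.join(root, filename))
            text = pytesseract.image_to_string(img, lang='eng')
        except Exception as exc:
            # Skip files that cannot be opened or read as images
            print(f'Skipping {filename}: {exc}')
            continue
        # Append both values only after OCR succeeds, so the two lists stay aligned
        file_names.append(filename)
        texts.append(text)
        print(filename)
        print(text)
        print('-=' * 20)

df = pd.DataFrame(list(zip(file_names, texts)), columns=['file_Name', 'Text'])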

Output:

                     file_Name      Text
0   Screenshot_20191104-130254.png  MNP_6050
1   Screenshot_20191104-130336.png  MNP_6039
2   Screenshot_20191104-130943.png  MNP_6116
3   Screenshot_20191104-131248.png  MNP_6093
4   Screenshot_20191104-230714.png  MNP_6013
5   Screenshot_20191104-230834.png  MNP_6006
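
The question also asks for a CSV file, so the collected names and texts still need to be written out. A minimal sketch using the standard csv module, assuming the file names and texts are held in two parallel lists as in the loop above; save_ocr_results and the output path are hypothetical names chosen for illustration:

```python
import csv


def save_ocr_results(csv_path, file_names, texts):
    """Write one row per image: file name in column 1, extracted text in column 2."""
    # newline='' lets the csv module control line endings (avoids blank rows on Windows)
    with open(csv_path, 'w', newline='', encoding='utf-8') as handle:
        writer = csv.writer(handle)
        writer.writerow(['file_Name', 'Text'])
        # Multi-line OCR text is quoted automatically by the writer
        writer.writerows(zip(file_names, texts))
```

If the results are already in the pandas DataFrame, df.to_csv('ocr_output.csv', index=False) writes an equivalent two-column file.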

PS: To get clean text, you may need to post-process the OCR output with a regular expression.
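
As a rough illustration of that kind of cleanup, here is a sketch with the standard re module. Both function names are hypothetical, and the default pattern is only an assumption based on values like MNP_6050 in the output above; your images will likely need a different one:

```python
import re


def clean_ocr_text(text):
    """Collapse runs of whitespace (newlines, tabs, form feeds) into single spaces."""
    return re.sub(r'\s+', ' ', text).strip()


def extract_label(text, pattern=r'[A-Z]+_\d+'):
    """Return the first substring matching pattern, or None if nothing matches."""
    match = re.search(pattern, text)
    return match.group(0) if match else None
```

Either function could be applied to each text value before it is stored, for example texts.append(clean_ocr_text(text)).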
