I am trying to convert a PDF file to an image file. On my Ubuntu server I have installed:

  1. python2.7
  2. poppler-utils
  3. pdf2image==1.12.1

My code:

from pdf2image import convert_from_path, convert_from_bytes

images = convert_from_path("/home/user/pdf_file.pdf")

# OR

with open("/home/user/pdf_file.pdf", "rb") as pdf:
    images = convert_from_bytes(pdf.read())

OUTPUT

When I use the function "convert_from_path":

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/pdf2image/pdf2image.py", line 143, in convert_from_path
    thread_output_file = next(output_file)
TypeError: ThreadSafeGenerator object is not an iterator

When I use the function "convert_from_bytes":

Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "/usr/local/lib/python2.7/dist-packages/pdf2image/pdf2image.py", line 268, in convert_from_bytes
    paths_only=paths_only,
  File "/usr/local/lib/python2.7/dist-packages/pdf2image/pdf2image.py", line 143, in convert_from_path
    thread_output_file = next(output_file)
TypeError: ThreadSafeGenerator object is not an iterator

I started facing these problems after reinstalling all my utilities.

  • According to pypi.org/project/pdf2image, Python 2.7 does not seem to be supported. For version 1.12.1 it clearly says: "A python (3.5+) module that wraps pdftoppm and pdftocairo to convert PDF to a PIL Image object". Commented Mar 16, 2020 at 6:51

2 Answers

5

If you want to convert a PDF to an image, you can try the Python Ghostscript package:

pip install ghostscript

import ghostscript
import locale

def pdf2jpeg(pdf_input_path, jpeg_output_path):
    args = ["pdf2jpeg", # actual value doesn't matter
            "-dNOPAUSE",
            "-sDEVICE=jpeg",
            "-r144",
            "-sOutputFile=" + jpeg_output_path,
            pdf_input_path]

    encoding = locale.getpreferredencoding()
    args = [a.encode(encoding) for a in args]

    ghostscript.Ghostscript(*args)

pdf2jpeg(
    "...Fixate/ActiveState/pdf/a.pdf",
    "...Fixate/ActiveState/pdf/a.jpeg",
)

3 Comments

Thanks for your answer. I have tried your code and it is working well.
I would be glad to accept this as the answer, but it does not answer my question. Anyone with the same kind of problem would be confused, and I still need to solve the original problem. Thanks!
@RokiDGupta Ok! I got it
4

It failed for me on Python 2 too, but succeeded on Python 3.

The same issue was reported for another library: TypeError: 'threadsafe_iter' object is not an iterator

As they said there, it is a Python 2 vs. 3 issue caused by the next() function: Python 2's iterator protocol looks for a method named next(), whereas Python 3 looks for __next__().
If you rename __next__() to next() in the file /home/***/.local/lib/python2.7/site-packages/pdf2image/generators.py, it runs successfully on Python 2.
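
To illustrate why the rename helps, here is a minimal sketch of a thread-safe iterator wrapper that satisfies both protocols at once. The class name LockedIterator and the helper numbered_names are hypothetical and made up for this example; this is not pdf2image's actual code, only the general pattern.

```python
import threading


class LockedIterator(object):
    """Hypothetical thread-safe wrapper around any iterator or generator."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._lock = threading.Lock()

    def __iter__(self):
        return self

    def __next__(self):
        # Python 3's next() builtin looks for this method.
        with self._lock:
            return next(self._it)

    # Python 2's next() builtin looks for a method literally named "next".
    # Without this alias, Python 2 raises "... object is not an iterator".
    next = __next__


def numbered_names(prefix):
    """Hypothetical generator yielding prefix-0, prefix-1, ..."""
    n = 0
    while True:
        yield "%s-%d" % (prefix, n)
        n += 1


names = LockedIterator(numbered_names("page"))
first = next(names)
second = next(names)
```

Defining both names keeps a single class usable on either interpreter, which avoids patching the installed package by hand.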

By the way, I have opened a new issue with the pdf2image team:
TypeError: ThreadSafeGenerator object is not an iterator #133


Additional
The pdf2image README says it is a Python (3.5+) module.
pdf2image v1.7.1 works on Python 2.7. Try it with pip install pdf2image==1.7.1
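
If the same deployment script has to run under both interpreters, the version pin can be chosen at runtime. This is a small sketch, assuming (per this answer, not independently verified) that 1.7.1 is a release that still works on Python 2.7; the function name pdf2image_requirement is hypothetical.

```python
import sys


def pdf2image_requirement(version_info=sys.version_info):
    """Return a pip requirement string suited to the given interpreter version."""
    if version_info[0] < 3:
        # Assumption taken from this answer: 1.7.1 still runs on Python 2.7.
        return "pdf2image==1.7.1"
    # On Python 3, leave the version unpinned.
    return "pdf2image"


requirement = pdf2image_requirement()
```

The returned string can then be passed to pip, for example pip install followed by the value of requirement.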
