I am trying to use Python 3.7.2 with PyPDF2 1.26 to select some pages of an input PDF file and write the output to stdout (the actual code is more complicated; this is just an MCVE):

import sys
from PyPDF2 import PdfFileReader, PdfFileWriter

input = PdfFileReader("example.pdf")
output = PdfFileWriter()
output.addPage(input.getPage(0))

output.write(sys.stdout)

This fails with the following error:

UserWarning: File <<stdout>> to write to is not in binary mode. It may not be written to correctly. [pdf.py:453]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.7/site-packages/PyPDF2/pdf.py", line 487, in write
    stream.write(self._header + b_("\n"))
TypeError: write() argument must be str, not bytes

The problem seems to be that sys.stdout is not open in binary mode. As some of the answers suggest, I have tried the following:

output.write(sys.stdout.buffer)

This fails with the following error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.7/site-packages/PyPDF2/pdf.py", line 491, in write
    object_positions.append(stream.tell())
OSError: [Errno 29] Illegal seek

I have also tried the answer from "Changing the way stdin/stdout is opened in Python 3":

sout = open(sys.stdout.fileno(), "wb")
output.write(sout)

This fails with the same error as above.

How can I use the PyPDF2 library to output a PDF to standard output?

More generally, how do I correctly switch sys.stdout to binary mode (akin to Perl's binmode STDOUT)?

Note: There is no need to tell me that I can open a file in binary mode and write the PDF to that file. That works; however, I specifically want to write the PDF to stdout.

1 Answer

From the documentation:

write(stream)

Writes the collection of pages added to this object out as a PDF file.

Parameters: stream – An object to write the file to. The object must support the write method and the tell method, similar to a file object.

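It turns out that sys.stdout.buffer is not tellable unless stdout is redirected to a file: calling tell() on it fails with the "Illegal seek" error shown in the question. Hence you can't use it as the stream for PdfFileWriter.write in that case.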
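To see which case a script is in, one can probe the stream before handing it to the writer. The helper below is only a sketch; the name is_tellable is made up for illustration and is not part of PyPDF2.

```python
import sys


def is_tellable(stream):
    """Return True if stream.tell() succeeds, False otherwise.

    Hypothetical helper, not a PyPDF2 function. Pipes and terminals
    raise OSError ("Illegal seek") from tell(); objects that have no
    tell method at all raise AttributeError.
    """
    try:
        stream.tell()
    except (OSError, AttributeError):
        return False
    return True
```

Calling is_tellable(sys.stdout.buffer) should give False when the script's output goes into a pipe and True when it is redirected to a regular file.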

Say your script is called myscript. If you call just myscript, then you'll get this error, but if you use it with a redirection, as in:

myscript > myfile.pdf

then stdout is a regular file, which is seekable, so tell() works and you won't get the error.
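If the output really has to go into a pipe, a common way around the tell requirement is to let the writer fill an in-memory buffer, which is tellable, and then copy the bytes to stdout. The sketch below assumes only that the writer object has a write(stream) method, as PdfFileWriter does according to the documentation quoted above; the function name write_via_buffer is made up, and I have not tested it against PyPDF2 1.26 specifically.

```python
import io
import sys


def write_via_buffer(writer, out=None):
    """Render `writer` into memory, then copy the bytes to `out`.

    Hypothetical helper: `writer` is any object with a write(stream)
    method (assumed to match PdfFileWriter.write); `out` defaults to
    the binary layer underneath sys.stdout.
    """
    if out is None:
        out = sys.stdout.buffer
    buffer = io.BytesIO()          # in-memory stream, supports tell()
    writer.write(buffer)           # the writer may call tell() freely here
    out.write(buffer.getvalue())   # plain sequential write, no tell() needed
    out.flush()
```

In the question's script this would replace output.write(sys.stdout) with write_via_buffer(output). The whole PDF is held in memory before anything is written, which may matter for very large files.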

2 Comments

So what you are saying is that it is impossible to use PyPDF2 to write the output PDF to stdout unless the stdout is redirected to a file? That is unfortunate. I wanted to use the Python script with pipes.
@NikolaBenes I'm afraid it seems that it's not possible to have PyPDF2 write to a non-tellable stream.
