I am trying to use Python 3.7.2 with PyPDF2 1.26 to select some pages of an input PDF file and write the output to stdout (the actual code is more complicated, this is just a MCVE):
import sys
from PyPDF2 import PdfFileReader, PdfFileWriter
input = PdfFileReader("example.pdf")
output = PdfFileWriter()
output.addPage(input.getPage(0))
output.write(sys.stdout)
This fails with the following error:
UserWarning: File <<stdout>> to write to is not in binary mode. It may not be written to correctly. [pdf.py:453]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python3.7/site-packages/PyPDF2/pdf.py", line 487, in write
stream.write(self._header + b_("\n"))
TypeError: write() argument must be str, not bytes
The problem seems to be that sys.stdout is not open in binary mode. As some of the answers suggest, I have tried the following:
output.write(sys.stdout.buffer)
This fails with the following error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python3.7/site-packages/PyPDF2/pdf.py", line 491, in write
object_positions.append(stream.tell())
OSError: [Errno 29] Illegal seek
I have also tried the answer from Changing the way stdin/stdout is opened in Python 3:
sout = open(sys.stdout.fileno(), "wb")
output.write(sout)
This fails with the same error as above.
How can I use the PyPDF2 library to output a PDF to standard output?
More generally, how do I correctly switch sys.stdout to binary mode (akin to Perl's binmode STDOUT)?
Note: There is no need to tell me that I can open a file in binary mode and write the PDF to that file. That works; however, I specifically want to write the PDF to stdout.