Document type to bytes in Python

Question

I am creating a PDF in Python using the borb library. Everything is great and I've found it really easy to work with.

However, I now have an object that is of type 'Document' in memory. I don't want to actually save this file anywhere. I just want to upload it to S3.

What is the best way to convert this file to bytes so that it can be uploaded to S3?

Below is the actual code, as well as some things I've tried.

async def create_invoice_pdf():
    # Create document
    pdf = Document()

    # Add page
    page = Page()
    pdf.append_page(page)

    page_layout = SingleColumnLayout(page)
    page_layout.vertical_margin = page.get_page_info().get_height() * Decimal(0.02)

    page_layout.add(
        Image(
            "xxx.png",
            width=Decimal(250),
            height=Decimal(60),
        )
    )

    # Invoice information table  
    page_layout.add(_build_invoice_information())  
    
    # Empty paragraph for spacing  
    page_layout.add(Paragraph(" "))

    # Itemized description
    page_layout.add(_build_itemized_description_table())

    page_layout.add(Paragraph(" "))

    # Payment itemized description
    page_layout.add(_build_payment_itemized_description_table())

    upload_to_aws(pdf, "Borb/Test", INVOICE_BUCKET, "application/pdf")

Result:

TypeError: a bytes-like object is required, not 'Document'

Using

pdf_bytes = bytearray(pdf)
upload_to_aws(pdf_bytes, "Borb/Test", INVOICE_BUCKET, "application/pdf")

Result:

TypeError: 'Name' object cannot be interpreted as an integer

Using

s = str(pdf).encode()
pdf_bytes = bytearray(s)
upload_to_aws(pdf_bytes, "Borb/Test", INVOICE_BUCKET, "application/pdf")

Result:

The file is uploaded, but it is corrupted and cannot be opened after download.

I am able to save the file locally using:

with open("file.pdf", "wb") as pdf_file_handle:
    PDF.dumps(pdf_file_handle, pdf)

But I don't actually want to do this.

Any ideas? Thanks in advance.

You have an in-memory Python object. It is not a PDF. Converting it to bytes directly cannot work, because nothing you have tried actually creates a PDF except the dumps() method you say you don't want to use. You can't avoid that method: you need borb's facilities to turn its internal data structures into PDF-style compressed PostScript. The best you can do is create an in-memory file using io.BytesIO. You need to upload the compressed PostScript representation of your document, and you can't do that without first creating it.
– BoarGules, Commented May 31, 2022 at 8:52
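
To illustrate the point of the comment, here is a minimal sketch of why str(pdf).encode() uploads a corrupted file. ToyDocument and its dump() method are hypothetical stand-ins made up for this example, not borb's Document or PDF.dumps; the idea is only that str() returns the object's repr, while the serializer is what produces the actual file format.

```python
import io
from typing import BinaryIO


class ToyDocument:
    """Hypothetical stand-in for an in-memory document (not borb's Document)."""

    def __init__(self, text: str) -> None:
        self.text = text

    def dump(self, handle: BinaryIO) -> None:
        # Only the serializer knows how to produce the on-disk format.
        handle.write(b"%PDF-1.7\n" + self.text.encode("ascii") + b"\n%%EOF\n")


doc = ToyDocument("hello")

# str() falls back to the default repr ("<...ToyDocument object at 0x...>"),
# so encoding it yields bytes that have nothing to do with the file format.
wrong = str(doc).encode()

# Writing through the serializer into an in-memory buffer yields the real bytes.
buffer = io.BytesIO()
doc.dump(buffer)
right = buffer.getvalue()

print(wrong.startswith(b"%PDF"), right.startswith(b"%PDF"))
```

The same reasoning applies to bytearray(pdf): it tries to interpret the object as a sequence of integers rather than serializing it.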

bruzza42 · Accepted Answer · 2022-05-31 12:13:58Z

Managed to figure it out:

PDF.dumps can be used outside of the with open(...) block. It only needs something to write to, so a simple in-memory io.BytesIO buffer works:

buffer = io.BytesIO()

PDF.dumps(buffer, pdf)
buffer.seek(0)

upload_to_aws(buffer.read(), "Borb/Test.pdf", INVOICE_BUCKET, "application/pdf")
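
For reuse, the same pattern can be wrapped in a small helper. This is only a sketch: render_to_bytes and fake_pdf_writer are names invented here, and the fake writer stands in for the real serializer so the snippet runs without borb installed. It uses getvalue(), which returns the full buffer contents regardless of the current position, so the seek(0) step is not needed.

```python
import io
from typing import BinaryIO, Callable


def render_to_bytes(write: Callable[[BinaryIO], None]) -> bytes:
    """Run a writer against an in-memory binary buffer and return what it wrote."""
    buffer = io.BytesIO()
    write(buffer)
    # getvalue() ignores the current position, so no seek(0) is required.
    return buffer.getvalue()


def fake_pdf_writer(handle: BinaryIO) -> None:
    # Stand-in for a real serializer. With borb this would be the call from
    # the answer above: PDF.dumps(handle, pdf)
    handle.write(b"%PDF-1.7\n")
    handle.write(b"%%EOF\n")


pdf_bytes = render_to_bytes(fake_pdf_writer)
print(type(pdf_bytes).__name__, len(pdf_bytes))
```

With borb, the call would presumably look like render_to_bytes(lambda handle: PDF.dumps(handle, pdf)), and the resulting bytes can be passed to upload_to_aws exactly as in the snippet above.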

2 Comments

Mark Ransom Over a year ago

You should at least give credit to the comment that put you on the right track.

bruzza42 Over a year ago

@MarkRansom It didn't help me. I reached out to the creator of borb and he helped.
