0

Context: My code fetches a set of coordinates from some png documents and later on performs some redaction in certain fields (it uses these coordinates for drawing rectangles in certain areas).

I want my final output to be a pdf with each redacted image as page. I can achieve this with fpdf package with no problem.

However, I intend to send this pdf file as email (base64 encoded) attachment. Is there any way to get the base64 string from fpdf output?

On top of that, can I use image binary string in fpdf image method?

See the redact_pdf method below (I placed some comments there to be more clear)

Code:

class Redaction:
def __init__(self,png_image_list,df_coordinates):
    self.png_image_list = png_image_list
    self.df_coordinates = df_coordinates
    
def _redact_images(self):
    redacted_images_bin = []
    for page_num,page_data in enumerate(self.png_image_list):
        im_page = Image.open(io.BytesIO(page_data))
        draw = ImageDraw.Draw(im_page)
        df_filtered = self.df_coordinates[self.df_coordinates['page_number'] == page_num+1]
        for index, row in df_filtered.iterrows():
            x0 = row['x0'] * im_page.size[0]
            y0 = row['y0'] * im_page.size[1]
            x1 = row['x1'] * im_page.size[0]
            y1 = row['y1'] * im_page.size[1]
            x2 = row['x2'] * im_page.size[0]
            y2 = row['y2'] * im_page.size[1]
            x3 = row['x3'] * im_page.size[0]
            y3 = row['y3'] * im_page.size[1]
            coords = [x0,y0,x1,y1,x2,y2,x3,y3]
            draw.polygon(coords,outline='blue',fill='yellow')
        redacted_images_bin.append(im_page)
        
    return redacted_images_bin

def redacted_pdf(self):
    redacted_images = self._redact_images()
    pdf = FPDF()
    pdf.set_auto_page_break(0)
    for index,img_redacted in enumerate(redacted_images):
        img_redacted.save(f"image_{index}.png")
        pdf.add_page()
        pdf.image(f"image_{index}.png",w=210,h=297)
        os.remove(f"image_{index}.png") # I would like to avoid file handling!
    pdf.output("doc.pdf","F") # I would like to avoid file handling!
    #return pdf #this is what I want, to return the pdf as base64 or binary
2
  • 2
    if you save in file then you can read it as bytes and use standard module base64 -ie. base64.b64encode(). You may also use standard io.BytesIO() to create file in memory and then you don't have to save file on hard drive. The same way you may use io.BytesIO() to create file image in memory and use it instead of file on hard drive. Many functions may read/write io.BytesIO() (file-like object) instead of filename Commented Feb 17, 2022 at 20:02
  • 1
    on Stackoverflow you may find questions which show how to use io.BytesIO and base64 to send image (or other file) from Flask to HTML/JavaScript. Commented Feb 17, 2022 at 20:04

1 Answer 1

1

In documentation I found that you can get PDF as string using

pdf_string = pdf.output(dest='S')

so you can use standard module base64

import fpdf
import base64

pdf = fpdf.FPDF()

# ... add some elements ...

pdf_string = pdf.output(dest='S')
pdf_bytes  = pdf_string.encode('utf-8')

base64_bytes  = base64.b64encode(pdf_bytes)
base64_string = base64_bytes.decode('utf-8')

print(base64_string)

Result:

JVBERi0xLjMKMyAwIG9iago8PC9UeXBlIC9QYWdlCi9QYXJlbnQgMSAwIFIKL1Jlc291cmNlcyAyIDAgUgovQ29udGVudHMgNCAwIFI+PgplbmRvYmoKNCAwIG9iago8PC9GaWx0ZXIgL0ZsYXRlRGVjb2RlIC9MZW5ndGggMTk+PgpzdHJlYW0KeMKcM1LDsMOiMsOQMzVXKMOnAgALw7wCEgplbmRzdHJlYW0KZW5kb2JqCjEgMCBvYmoKPDwvVHlwZSAvUGFnZXMKL0tpZHMgWzMgMCBSIF0KL0NvdW50IDEKL01lZGlhQm94IFswIDAgNTk1LjI4IDg0MS44OV0KPj4KZW5kb2JqCjIgMCBvYmoKPDwKL1Byb2NTZXQgWy9QREYgL1RleHQgL0ltYWdlQiAvSW1hZ2VDIC9JbWFnZUldCi9Gb250IDw8Cj4+Ci9YT2JqZWN0IDw8Cj4+Cj4+CmVuZG9iago1IDAgb2JqCjw8Ci9Qcm9kdWNlciAoUHlGUERGIDEuNy4yIGh0dHA6Ly9weWZwZGYuZ29vZ2xlY29kZS5jb20vKQovQ3JlYXRpb25EYXRlIChEOjIwMjIwMjE3MjExMDE3KQo+PgplbmRvYmoKNiAwIG9iago8PAovVHlwZSAvQ2F0YWxvZwovUGFnZXMgMSAwIFIKL09wZW5BY3Rpb24gWzMgMCBSIC9GaXRIIG51bGxdCi9QYWdlTGF5b3V0IC9PbmVDb2x1bW4KPj4KZW5kb2JqCnhyZWYKMCA3CjAwMDAwMDAwMDAgNjU1MzUgZiAKMDAwMDAwMDE3NSAwMDAwMCBuIAowMDAwMDAwMjYyIDAwMDAwIG4gCjAwMDAwMDAwMDkgMDAwMDAgbiAKMDAwMDAwMDA4NyAwMDAwMCBuIAowMDAwMDAwMzU2IDAwMDAwIG4gCjAwMDAwMDA0NjUgMDAwMDAgbiAKdHJhaWxlcgo8PAovU2l6ZSA3Ci9Sb290IDYgMCBSCi9JbmZvIDUgMCBSCj4+CnN0YXJ0eHJlZgo1NjgKJSVFT0YK

As for image(): it needs filename (or url) and it can't work with string or io.BytesIO().

Eventually you may get source code and you can try to change it.

There is even request on GitHub: Support for StringIO objects as images


EDIT:

I found that there is fork fpdf2 which can use pillow.Image in image() - see fpdf2 Image

And in source code I found image() can also work with io.BytesIO()


Example code for fpdf2 (output() gives bytes instead of string)

import fpdf
import base64
from PIL import Image
import io

#print(fpdf.__version__)

pdf = fpdf.FPDF()

pdf.add_page()

pdf.image('lenna.png')

pdf.image('https://upload.wikimedia.org/wikipedia/en/7/7d/Lenna_%28test_image%29.png')

f = open('lenna.png', 'rb')
pdf.image(f)

f = Image.open('lenna.png')
pdf.image(f)

f = open('lenna.png', 'rb')
b = io.BytesIO()
b.write(f.read())
pdf.image(b)

# save in file
pdf.output('output.pdf')

# get as bytes
pdf_bytes = pdf.output()

#print(pdf_bytes)

base64_bytes  = base64.b64encode(pdf_bytes)
base64_string = base64_bytes.decode('utf-8')

print(base64_string)

Wikipedia: Lenna [image]


Test for writing in fpdf2

import fpdf   

pdf = fpdf.FPDF()

pdf.add_page()
pdf.image('https://upload.wikimedia.org/wikipedia/en/7/7d/Lenna_%28test_image%29.png')

# --- test 1 ---

pdf.output('output-test-1.pdf')

# --- test 2 ---

pdf_bytes = pdf.output()

with open('output-test-2.pdf', 'wb') as f:  # it will close automatically
    f.write(pdf_bytes)

# --- test 2 ---

pdf_bytes = pdf.output()

f = open('output-test-3.pdf', 'wb')
f.write(pdf_bytes)
f.close()  # don't forget to close when you write
Sign up to request clarification or add additional context in comments.

5 Comments

For some reason performing this: pdf_string = pdf.output(dest='S') pdf_bytes = pdf_string.encode('utf-8') Does not return a valid pdf binary (I tried to write it in a pdf file and then opened it and it was not the actual pdf, it was a blank pdf!)
first you could use print() to see what you get in variables. You could also open PDF in text viewer/editor to see what you get in file. But if it opens it without any errors then it is correct PDF but without content. In first example I don't add any pages/text/images so it create empty PDF.
BTW: after installing fpdf2 I don't have access to original fpdf (because both use the same names for modules) and I can't test this problem. Besides fpdf2 seems more useful so I would forget fpdf
I added code which writes PDF using three methods but I tested it only with fpdf2
I will have a look at fpdf2 (I did all my tests with fpdf). Really appreciate your help here. As soon as I test it I will write here what I saw. Thank you!

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.