How does one convert a file that has been base64 encoded back to its original format and write it to disk? For instance, I have a PDF file that has been base64 encoded as a data URI. The file starts with:

data:application/pdf;base64,JVBER

I would like to write this out to disk in the proper format. I have tried several libraries (e.g. ring.util.codec) that decode the string into a byte array, but if I write the resulting byte array out to a file (using spit), the file appears corrupted.

UPDATE:

The PHP function base64_decode appears to be doing what I am looking for, as it returns a string. What is the equivalent in Java?

3 Comments

  • Have you looked on the internet for base64 tools? Or, on Linux, have you searched your package repository? Commented Aug 21, 2011 at 22:37
  • stackoverflow.com/questions/469695/decode-base64-data-in-java Commented Aug 21, 2011 at 22:41
  • I have, and I also read the question referenced above. I can decode the string into a byte array, but how do I write this to a file in a way that turns the contents into the original file format? Commented Aug 21, 2011 at 22:47

2 Answers

In Clojure, there is data.codec (formerly in clojure-contrib).

Using Java interoperability :

These are the helper functions I used for images with data.codec:

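(require '[clojure.data.codec.base64 :as b64-codec]
         '[clojure.java.io :as io])

(defn in?
  "Returns true if the seq coll contains the element el, nil otherwise."
  [coll el]
  (some #(= el %) coll))

(defn b64-ext
  "Extracts the image type (png or jpeg) from the data URI header of s."
  [s]
  (if-let [ext (second (first (re-seq #"data:image/(.*);base64.*" s)))]
    (if (in? ["png" "jpeg"] ext)
      ext
      (throw (Exception. (str "Unsupported extension found for image " ext))))
    (throw (Exception. (str "No extension found for image " s)))))

(defn chop-header
  "Drops the leading data URI header and returns only the base64 payload."
  [s]
  (nth (first (re-seq #"(data:image/.*;base64,)(.*)" s)) 2))

(defn decode-str
  "Decodes a base64 string into a byte array."
  [s]
  (b64-codec/decode (.getBytes s)))

;; Defined last because it calls the helpers above.
(defn write-img!
  "Writes the decoded image to <id>.<ext>. io/copy writes the raw bytes, not text."
  [id b64]
  (io/copy
   (decode-str (chop-header b64))
   (java.io.File. (str "/Users/nha/tmp/" id "." (b64-ext b64)))))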

Any Java library should work (there is one from Apache Commons, and one written entirely in Clojure in clojure-contrib).

I suspect the content is being modified somewhere: for example, the bytes may be converted to a string using one encoding and then read back into bytes using a different one.

The first step may be to check that the file on the server side and the file you are trying to read have exactly the same number of bytes. Also, try to confirm that their checksums (e.g. MD5) match.
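
As a rough sketch of that check on the JVM (callable from Clojure through interop), something like the following could be run against both copies of the file. The class and method names here are made up for illustration; only `MessageDigest` and `Files` come from the JDK.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of any library mentioned in this thread.
final class FileFingerprint {

    /** Returns the MD5 digest of the given bytes as a lowercase hex string. */
    static String md5Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(data)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide MD5, so this is unexpected.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    /** Returns "<byte count> <md5 hex>" for the file, read as raw bytes. */
    static String fingerprint(Path file) throws IOException {
        byte[] data = Files.readAllBytes(file);
        return data.length + " " + md5Hex(data);
    }
}
```

If the two fingerprints differ, the bytes were changed somewhere between the original and the copy being read.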

In any case, a PDF is a binary file, so you should NOT convert it to a string anywhere; handle it as raw bytes all the way through.
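
To illustrate why, here is a small sketch (the class name and sample bytes are my own, chosen for the example) that pushes bytes through a String and back, which is roughly what a text-oriented writer does. ASCII-only content survives, while arbitrary binary bytes generally do not.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class for illustration only.
final class BinaryRoundTrip {

    /** Decodes the bytes as UTF-8 text and encodes that text back to bytes. */
    static byte[] throughString(byte[] data) {
        String text = new String(data, StandardCharsets.UTF_8);
        return text.getBytes(StandardCharsets.UTF_8);
    }

    /** Returns true only if the bytes come back unchanged from the round trip. */
    static boolean survivesStringRoundTrip(byte[] data) {
        return Arrays.equals(data, throughString(data));
    }
}
```

For instance, the ASCII bytes of `%PDF` come back intact, but a sequence such as `E2 E3 CF D3` is not valid UTF-8, so the decoder substitutes replacement characters and the bytes written out no longer match the input.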

3 Comments

  • I did check the integrity of the file and it has not been corrupted. It can also be converted using the PHP base64_decode function without any issues.
  • Can you make the raw bytes available somewhere?
  • I solved it. Apparently the data URI header was messing up the decoding. If I chop off the header "data:application/pdf;base64," it works.
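
For reference, the fix described in the last comment might look like this in plain Java. This is only a sketch, assuming Java 8 or later for `java.util.Base64`; the class and method names are invented for the example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;

// Hypothetical helper for illustration; not an existing library class.
final class DataUriDecoder {

    /** Decodes a base64 data URI, or a bare base64 string, into raw bytes. */
    static byte[] decode(String dataUri) {
        String marker = ";base64,";
        int idx = dataUri.indexOf(marker);
        // Drop everything up to and including the header, if one is present.
        String payload = idx >= 0 ? dataUri.substring(idx + marker.length()) : dataUri;
        // The MIME decoder skips line breaks that may be embedded in the payload.
        return Base64.getMimeDecoder().decode(payload);
    }

    /** Decodes the data URI and writes the bytes to the target file unchanged. */
    static void writeTo(Path target, String dataUri) throws IOException {
        Files.write(target, decode(dataUri));
    }
}
```

Calling `DataUriDecoder.decode("data:application/pdf;base64,JVBERi0=")` yields the five bytes of `%PDF-`, which is how a PDF file is expected to begin. `Files.write` takes the byte array directly, so no character encoding is involved on the way to disk.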
