Is there any simple way for me to read the contents of a binary file as a binary string, turn it into a normal (utf-8) string, do some operations with it, turn it back into a binary string and write it into a binary file? I tried doing something as simple as:

a_file = open('image1.png', 'rb')
text = b''
for a_line in a_file:
    text += a_line
a_file.close()
text2 = text.decode('utf-8')
text3 = text2.encode()
a_file = open('image2.png', 'wb')
a_file.write(text3)
a_file.close()

but I get 'Unicode can not decode bytes in position...'

What am I doing terribly wrong?

  • Why do you think a PNG file would contain text? Commented Oct 17, 2015 at 0:05
  • Not sure what you're trying to accomplish, but this answer to another question may help. Commented Oct 17, 2015 at 0:11

1 Answer

UTF-8 has enough structure that arbitrary arrangements of bytes are usually not valid UTF-8, which is why decoding the contents of a PNG fails. The best approach would be to simply work with the bytes read from the file (which you can extract in one step with text = a_file.read()). Binary strings (type bytes) have nearly all the string methods you'll want, even text-oriented ones like isupper() or swapcase(), which treat the data as ASCII. And then there's bytearray, a mutable counterpart to the bytes type.
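As a minimal sketch of the bytes-only approach: the snippet below writes some made-up sample data to a temporary file (the sample bytes and the temporary paths are assumptions for illustration, standing in for the real image1.png and image2.png), reads it back in one step, edits it through a bytearray, and writes the result in binary mode. No decoding happens anywhere.

```python
import os
import tempfile

# Hypothetical sample data: the 8-byte PNG signature plus a few arbitrary bytes.
sample = b'\x89PNG\r\n\x1a\n' + bytes([0, 255, 128, 10, 13])

with tempfile.TemporaryDirectory() as tmp:
    src_path = os.path.join(tmp, 'image1.png')
    dst_path = os.path.join(tmp, 'image2.png')
    with open(src_path, 'wb') as f:
        f.write(sample)

    # Read the whole file in one step; the result is a bytes object.
    with open(src_path, 'rb') as a_file:
        data = a_file.read()

    # bytes supports the usual search, slice, and replace operations.
    has_png_signature = data.startswith(b'\x89PNG')

    # bytearray is mutable, so individual bytes can be changed in place.
    buf = bytearray(data)
    buf[-1] = 0x00

    # Binary-mode files accept bytes and bytearray alike.
    with open(dst_path, 'wb') as a_file:
        a_file.write(buf)

    with open(dst_path, 'rb') as a_file:
        written = a_file.read()
```

The single changed byte here is only a placeholder for whatever operations are actually needed.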

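If for some reason you really want to turn your bytes into a str object, use a pure 8-bit encoding like Latin-1, and use the same encoding to convert back (a bare encode() defaults to UTF-8 and would produce different bytes). You'll get a Unicode string, which is what you are really after. (UTF-8 is just one encoding of Unicode, which is a very different thing.)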
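A short illustration of the difference, using made-up data rather than a real image: UTF-8 refuses a lone 0x89 byte (the first byte of a PNG file), while Latin-1 maps each byte value to the code point with the same number, so decoding cannot fail and encoding with the same codec gives the original bytes back.

```python
# Every possible byte value, as a stand-in for arbitrary binary data.
data = bytes(range(256))

# UTF-8 rejects many byte sequences, for example one starting with 0x89.
try:
    b'\x89PNG'.decode('utf-8')
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Latin-1 maps byte n to code point n, so this never raises.
text = data.decode('latin-1')

# Encoding with the same codec restores the original bytes exactly.
round_tripped = text.encode('latin-1')
```

Any operation on text that introduces a character above U+00FF would make the final encode('latin-1') fail, so this only round-trips cleanly while the string stays within that range.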

2 Comments

And note, if you settle on a working encoding (e.g. latin-1), you don't need to handle the encode/decode yourself in Python 3. Just change open('image1.png', 'rb') to open('image1.png', 'r', encoding='latin-1', newline=''), and for the output, use open('image2.png', 'w', encoding='latin-1', newline=''). You can then read and write without bothering to manually encode/decode; the data will have been decoded to str for you on read, and the str will be encoded for you on write. The newline='' argument matters for binary data: without it, text mode translates line endings and would alter \r and \n bytes in the file.
Good point; though opening the files in binary mode makes the code a little more transparent... I'm not sure the OP should be converting to str at all.
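For reference, a sketch of the text-mode variant described in the first comment. The sample bytes and temporary paths are made up for illustration, and newline='' is passed on both opens as an added precaution, because default newline handling would rewrite \r\n sequences such as the one in the PNG signature.

```python
import os
import tempfile

# Hypothetical sample data: PNG signature followed by every byte value.
sample = b'\x89PNG\r\n\x1a\n' + bytes(range(256))

with tempfile.TemporaryDirectory() as tmp:
    src_path = os.path.join(tmp, 'image1.png')
    dst_path = os.path.join(tmp, 'image2.png')
    with open(src_path, 'wb') as f:
        f.write(sample)

    # Text mode decodes for us; newline='' disables newline translation.
    with open(src_path, 'r', encoding='latin-1', newline='') as a_file:
        text = a_file.read()

    # Text mode encodes for us on the way out, again without translation.
    with open(dst_path, 'w', encoding='latin-1', newline='') as a_file:
        a_file.write(text)

    # Check the result in binary mode.
    with open(dst_path, 'rb') as f:
        result = f.read()
```

If the copy is unmodified, result should match the input byte for byte.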
