Note: I don't know much about encoding/decoding, and after running into this problem those words are complete jargon to me.
Question:
I'm a little confused here. I was playing around with encoding and decoding images so that I could store an image in a TextField in a Django model. Looking around Stack Overflow, I found that I could decode an image from ASCII (I think, or binary? Whatever encoding open('file', 'rb') uses; I'm assuming the default is ASCII) to latin1 and store it in a database with no problems.
The problem comes when I try to recreate the image from the latin1-decoded data. When attempting to write it to a file handle, I get a UnicodeEncodeError saying that ASCII encoding failed.
I think the problem is that when a file is opened as binary data (rb), its contents aren't proper ASCII, because they contain binary data. I then decode the binary data to latin1, but when it is converted back to ASCII (which happens automatically when I try to write to the file), it fails for some unknown reason.
One guess is that decoding to latin1 converts the raw binary data to something else, so that when I try to encode back to ASCII it can no longer be identified as what was once raw binary data (although the original and decoded data have the same length).
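A minimal sketch of how this guess could be checked, assuming the same round trip is applied to every possible byte value rather than to a real image (the names all_bytes, as_text and round_tripped are made up for illustration). latin1 maps each byte 0-255 to the Unicode code point with the same number, so nothing should be lost on the way there and back:

```python
# Every possible byte value, built in a way that works on Python 2 and 3.
all_bytes = bytes(bytearray(range(256)))

# Decode to text the same way the image data is decoded in the question.
as_text = all_bytes.decode('latin1')

# Encode back with the SAME codec that was used for decoding.
round_tripped = as_text.encode('latin1')

same_length = len(as_text) == len(all_bytes)
lossless = round_tripped == all_bytes
# Each character's code point equals the value of the byte it came from.
code_points_match = [ord(ch) for ch in as_text] == list(range(256))
```

If lossless comes out true, the decoded data still carries every original byte, which would suggest the failure lies in the later ASCII step and not in the latin1 decoding.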
Or the problem lies not with the decoding to latin1 but with my attempting to ASCII-encode binary data. In which case, how would I encode the latin1 data back to an image?
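A hedged sketch of what encoding back might look like, assuming the text is encoded with latin1 explicitly before it is written in binary mode. The sample bytes and the temporary path are placeholders for illustration, not a real JPEG or the actual Django storage:

```python
import os
import tempfile

# Stand-in for real image bytes (hypothetical data, not an actual JPEG file).
raw_image_data = bytes(bytearray([0xFF, 0xD8, 0xFF, 0xE0, 0x00, 0x10, 0x80, 0x7F]))

# Text form, as it would be stored in the TextField.
latin_image_data = raw_image_data.decode('latin1')

# Explicitly encode with latin1 instead of relying on an implicit ASCII encode.
restored = latin_image_data.encode('latin1')

# Write the bytes to a scratch location (assumed path, for the demo only).
path = os.path.join(tempfile.mkdtemp(), 'new_test_image.jpg')
with open(path, 'wb') as copy_image_handle:
    copy_image_handle.write(restored)

# Read the file back to compare it with the original bytes.
with open(path, 'rb') as check_handle:
    written = check_handle.read()
```

The only change from the failing attempt is the explicit .encode('latin1') before the write, so the file handle receives bytes rather than text.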
I know this is very confusing, but I'm confused by all of it, so I can't explain it well. If anyone can answer this question, they're probably a riddle master.
Some code to visualize:
>>> image_handle = open('test_image.jpg', 'rb')
>>>
>>> raw_image_data = image_handle.read()
>>> latin_image_data = raw_image_data.decode('latin1')
>>>
>>>
>>> # The raw data can't be processed by django
... # but in `latin1` it works
>>>
>>> # Analysis of the data
>>>
>>> type(raw_image_data), len(raw_image_data)
(<type 'str'>, 2383864)
>>>
>>> type(latin_image_data), len(latin_image_data)
(<type 'unicode'>, 2383864)
>>>
>>> len(raw_image_data) == len(latin_image_data)
True
>>>
>>>
>>> # How to write back to a file?
>>>
>>> copy_image_handle = open('new_test_image.jpg', 'wb')
>>>
>>> copy_image_handle.write(raw_image_data)
>>> copy_image_handle.close()
>>>
>>>
>>> copy_image_handle = open('new_test_image.jpg', 'wb')
>>>
>>> copy_image_handle.write(latin_image_data)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-3: ordinal not in range(128)
>>>
>>>
>>> latin_image_data.encode('ascii')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-3: ordinal not in range(128)
>>>
>>>
>>> latin_image_data.decode('ascii')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-3: ordinal not in range(128)
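The tracebacks above all complain about positions 0-3. A JPEG file begins with the bytes 0xFF 0xD8, both above 127, so ASCII has no way to represent them. The sketch below reproduces that failure on a four-byte header and compares it with latin1. The exact header bytes are an assumption (the fourth byte varies between files), and header_text, ascii_ok and latin1_bytes are illustrative names:

```python
# Assumed first four bytes of a JPEG (0xFF 0xD8 is the start-of-image marker;
# the last byte here is just one common value, picked for illustration).
header_bytes = b'\xff\xd8\xff\xe0'
header_text = header_bytes.decode('latin1')

# ASCII only covers code points 0-127, so encoding these characters fails.
try:
    header_text.encode('ascii')
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# latin1 covers 0-255, so the same text encodes back to the original bytes.
latin1_bytes = header_text.encode('latin1')
```

Under these assumptions ascii_ok ends up False while latin1_bytes matches the original header, which fits the idea that the error comes from the ASCII codec and not from the data being damaged.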