None of the similar questions on StackOverflow seem to describe the same problem as mine. The only one I've found is http://qnundrum.com/question/766895, which was never answered. I'm running Python 3.3 and Django 1.6, so Unicode is usually taken care of automatically. I'd appreciate any help I can get.

I'm trying to save automatically generated ebooks to my database for later retrieval. Some of the books contain non-ASCII characters. The generation itself works: the .mobi and .epub files come out as intended. Here's models.py:

class StoryDownload(models.Model):
    text = models.OneToOneField('stories.Story', primary_key=True, related_name='downloads')
    epub = models.FileField(upload_to='epub/', blank=True, null=True)
    mobi = models.FileField(upload_to='mobi/', blank=True, null=True)

    def update_downloads(self):
        #code to generate epub and mobi files from text
        ...
        self.epub = File(open('filename.epub', 'r'))
        self.mobi = File(open('filename.mobi', 'r'))
        self.save()
        ...

The error is raised on self.save(), which is what confuses me: if the files are accepted as Django File objects, why can't I save them?

Traceback:

File "C:\Users\Chris\Envs\stories\lib\site-packages\django\core\handlers\base.py" in get_response
  114.                     response = wrapped_callback(request, *callback_args, **callback_kwargs)
File "C:\Users\Chris\Envs\stories\lib\site-packages\django\views\generic\base.py" in view
  69.             return self.dispatch(request, *args, **kwargs)
File "C:\Users\Chris\Envs\stories\lib\site-packages\braces\views\_access.py" in dispatch
  64.             request, *args, **kwargs)
File "C:\Users\Chris\Envs\stories\lib\site-packages\guardian\mixins.py" in dispatch
  190.             **kwargs)
File "C:\Users\Chris\Envs\stories\lib\site-packages\django\views\generic\base.py" in dispatch
  87.         return handler(request, *args, **kwargs)
File "C:\Users\Chris\Envs\stories\lib\site-packages\django\views\generic\edit.py" in post
  228.         return super(BaseUpdateView, self).post(request, *args, **kwargs)
File "C:\Users\Chris\Envs\stories\lib\site-packages\django\views\generic\edit.py" in post
  171.             return self.form_valid(form)
File "C:\Users\Chris\Envs\stories\dev\akrito\chapters\views.py" in form_valid
  69.         self.chapter.story.save()
File "C:\Users\Chris\Envs\stories\dev\akrito\stories\models.py" in save
  87.             self.downloads.update_downloads()
File "C:\Users\Chris\Envs\stories\dev\akrito\stories\models.py" in update_downloads
  135.         self.save()
File "C:\Users\Chris\Envs\stories\lib\site-packages\django\db\models\base.py" in save
  545.                        force_update=force_update, update_fields=update_fields)
File "C:\Users\Chris\Envs\stories\lib\site-packages\django\db\models\base.py" in save_base
  573.             updated = self._save_table(raw, cls, force_insert, force_update, using, update_fields)
File "C:\Users\Chris\Envs\stories\lib\site-packages\django\db\models\base.py" in _save_table
  632.                       for f in non_pks]
File "C:\Users\Chris\Envs\stories\lib\site-packages\django\db\models\base.py" in <listcomp>
  632.                       for f in non_pks]
File "C:\Users\Chris\Envs\stories\lib\site-packages\django\db\models\fields\files.py" in pre_save
  252.             file.save(file.name, file, save=False)
File "C:\Users\Chris\Envs\stories\lib\site-packages\django\db\models\fields\files.py" in save
  86.         self.name = self.storage.save(name, content)
File "C:\Users\Chris\Envs\stories\lib\site-packages\django\core\files\storage.py" in save
  49.         name = self._save(name, content)
File "C:\Users\Chris\Envs\stories\lib\site-packages\django\core\files\storage.py" in _save
  203.                         for chunk in content.chunks():
File "C:\Users\Chris\Envs\stories\lib\site-packages\django\core\files\base.py" in chunks
  76.             data = self.read(chunk_size)
File "C:\Users\Chris\Envs\stories\lib\encodings\cp1252.py" in decode
  23.         return codecs.charmap_decode(input,self.errors,decoding_table)[0]

Exception Type: UnicodeDecodeError at /stories/2/1/e/
Exception Value: 'charmap' codec can't decode byte 0x81 in position 123: character maps to <undefined>
  • It looks like you are running on Windows; if your file names have non-ASCII characters, this could be the cause of your exception. Can you confirm the file names? Commented Apr 10, 2014 at 5:17
  • Filenames do not have non-ASCII characters. I'm testing with "The Trial" by Kafka, so filenames are 'the-trial.epub' and 'the-trial.mobi'. Inside the files, there are non-ASCII characters though. Commented Apr 10, 2014 at 5:22
  • How are you converting? Commented Apr 10, 2014 at 5:35
  • Pandoc and Kindlegen. The files themselves work fine, I can open them, view on devices, etc, the problem is in getting the db to take them. Commented Apr 10, 2014 at 5:38
  • I tried self.epub = File(open('the-trial.epub', 'r', encoding='utf-8')) and the same with the mobi, and it still doesn't work. The epub is 100% in utf-8 (Pandoc docs), and from what I can find it seems 95% likely the .mobi file is as well. Commented Apr 10, 2014 at 5:56

1 Answer

For a text file you need to call open with the desired encoding. The default encoding is locale.getpreferredencoding(False), which is why the traceback shows it attempting to decode using the Windows 1252 codepage.
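As a rough illustration of the text-file case, the sketch below writes a small UTF-8 file and reads it back twice: once with the matching encoding and once with cp1252, the codec seen in the traceback. The file name and sample string are made up for the example.

```python
import locale
import os
import tempfile

# Encoding that open() falls back to in text mode when none is given.
# On the asker's Windows machine this was evidently cp1252.
default_encoding = locale.getpreferredencoding(False)

# Hypothetical sample text containing one non-ASCII character.
sample = "Der Proze\u00df"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")  # hypothetical file name

    # Write the text with an explicit encoding.
    with open(path, "w", encoding="utf-8") as f:
        f.write(sample)

    # Reading with the same explicit encoding recovers the text.
    with open(path, "r", encoding="utf-8") as f:
        round_tripped = f.read()

    # Reading with the wrong codec silently garbles the non-ASCII character
    # (or raises UnicodeDecodeError for bytes the codec leaves undefined).
    with open(path, "r", encoding="cp1252") as f:
        misdecoded = f.read()

print(default_encoding)
print(round_tripped == sample)
print(misdecoded == sample)
```

Passing encoding explicitly makes the result independent of the machine's locale, which is the point of the first paragraph of the answer.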

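That said, EPUB and MOBI files are binary formats, not text (an EPUB is a ZIP archive, and a MOBI is a binary container), so they should be opened in binary mode, e.g. open('filename.epub', 'rb'). That is also why passing encoding='utf-8' does not help: the bytes are not encoded text in the first place.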
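A minimal sketch of the difference, using only the standard library: the payload below is a hypothetical stand-in for an ebook, not a real EPUB, and contains the byte 0x81 from the traceback. Text mode with cp1252 fails on it, while binary mode returns the bytes unchanged.

```python
import os
import tempfile

# Hypothetical stand-in for an ebook: ZIP magic bytes followed by 0x81,
# the byte that cp1252 could not decode in the traceback.
payload = b"PK\x03\x04" + b"\x81" * 8

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.epub")  # hypothetical file name
    with open(path, "wb") as f:
        f.write(payload)

    # Text mode tries to decode the bytes and fails on 0x81.
    try:
        with open(path, "r", encoding="cp1252") as f:
            f.read()
        text_mode_error = None
    except UnicodeDecodeError as exc:
        text_mode_error = exc

    # Binary mode performs no decoding and returns the raw bytes.
    with open(path, "rb") as f:
        data = f.read()

print(type(text_mode_error).__name__)
print(data == payload)
```

In update_downloads, this would presumably mean handing File a handle opened with 'rb' instead of 'r', so that the storage backend copies raw bytes in chunks rather than decoded text.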
