I'm trying to compare two strings: the first, s1, comes from MongoEngine and the second, s2, comes from a Django HTTP request.

They look like this:

>>> s1 = product_model.Product.objects.get(pk=1).name
>>> s1
u'Product \xe4 asdf'
>>> s2 = request.POST['name']
>>> s2
'Product \xc3\xa4 asdf'

They contain the same letter, the Swedish 'ä', but MongoEngine's (s1) is a Python unicode string, while Django's (s2) is a Python bytestring holding the UTF-8-encoded characters.

I can easily solve this by, e.g., converting the Python unicode string to a bytestring:

>>> s1.encode('utf-8') == s2
True

But I would like to think that the best-practice is to have all my Python strings encoded the same way in my system, correct?

How can I tell Django to use Python unicode strings instead? Or how can I tell MongoEngine to use UTF-8-encoded Python bytestrings?

  • I would not suggest working with encoded strings. As these slides say (farmdev.com/talks/unicode): "Decode early, Unicode everywhere, encode late". So I would suggest telling Django to use unicode strings, but I am not a Django expert, sorry. My approach: s1 == s2.decode("utf8"), so you have two Unicode strings to work with. Commented Dec 5, 2013 at 14:55
  • Something appears to be wrong here, because in Django request.POST['name'] should always give you a Unicode string. Django automatically decodes POST values to Unicode before it ever gets to your view. Commented Dec 5, 2013 at 15:22
  • Looks like @DanielRoseman is right: HttpRequest.POST -> A dictionary-like object containing all given HTTP POST parameters, providing that the request contains form data. See the QueryDict documentation below. And QueryDict -> docs.djangoproject.com/en/dev/ref/request-response/… works with Unicode strings... Commented Dec 5, 2013 at 15:30

1 Answer

The Django docs say:

General string handling

Whenever you use strings with Django – e.g., in database lookups, template rendering or anywhere else – you have two choices for encoding those strings. You can use Unicode strings, or you can use normal strings (sometimes called “bytestrings”) that are encoded using UTF-8.

In Python 3, the logic is reversed, that is normal strings are Unicode, and when you want to specifically create a bytestring, you have to prefix the string with a ‘b’. As we are doing in Django code from version 1.5, we recommend that you import unicode_literals from the __future__ library in your code. Then, when you specifically want to create a bytestring literal, prefix the string with ‘b’.

Python 2 legacy:

my_string = "This is a bytestring"
my_unicode = u"This is an Unicode string"

Python 2 with unicode literals or Python 3:

from __future__ import unicode_literals

my_string = b"This is a bytestring"
my_unicode = "This is an Unicode string"

If you are in Python 2, you can try that. As I said in the comment:

I would not suggest working with encoded strings. As these slides say (farmdev.com/talks/unicode): "Decode early, Unicode everywhere, encode late". So I would suggest telling Django to use unicode strings, but I am not a Django expert, sorry. My approach: s1 == s2.decode("utf8"), so you have two Unicode strings to work with.
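A minimal sketch of the "decode early" idea, using the two values from the question. The helper name to_text is hypothetical (it is not part of Django or MongoEngine), and UTF-8 is assumed as the encoding of any incoming bytestring:

```python
def to_text(value, encoding="utf-8"):
    """Return value as a Unicode string, decoding it first if it is a bytestring.

    Hypothetical helper for illustration; the utf-8 default is an assumption.
    """
    if isinstance(value, bytes):  # on Python 2, bytes is an alias for str
        return value.decode(encoding)
    return value


s1 = u"Product \xe4 asdf"       # Unicode string, like the MongoEngine value
s2 = b"Product \xc3\xa4 asdf"   # UTF-8 bytestring, like the value shown for request.POST

# Normalize both sides to Unicode at the boundary, then compare.
same = to_text(s1) == to_text(s2)
```

Calling the helper once where data enters the system means the rest of the code only ever compares Unicode strings, and values that are already Unicode pass through unchanged.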

Hope it works

EDIT: I suppose you are using Django's HttpRequest, so from the docs:

HttpRequest.encoding

A string representing the current encoding used to decode form submission data (or None, which means the DEFAULT_CHARSET setting is used). You can write to this attribute to change the encoding used when accessing the form data. Any subsequent attribute accesses (such as reading from GET or POST) will use the new encoding value. Useful if you know the form data is not in the DEFAULT_CHARSET encoding.
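To see why the encoding used for decoding matters, here is a plain-Python illustration. It does not call Django; it only decodes the bytes from the question with two different charsets, which is roughly the choice that the encoding attribute controls:

```python
raw = b"Product \xc3\xa4 asdf"  # the UTF-8 bytes from the question

as_utf8 = raw.decode("utf-8")      # the charset the data was actually encoded with
as_latin1 = raw.decode("latin-1")  # wrong charset: no error is raised, but the text is garbled

expected = u"Product \xe4 asdf"    # the Unicode value stored by MongoEngine

matches_utf8 = as_utf8 == expected      # True
matches_latin1 = as_latin1 == expected  # False: two characters appear where 'ä' should be
```

Decoding with the wrong charset fails silently here, so a mismatch between the form data and the configured encoding would show up as failed comparisons rather than as exceptions.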

1 Comment

Yeah, I tried to read this doc without really learning anything. Do you know where in Django I can set this in a general manner, e.g. in settings.py, so that all Django strings are in the s2.decode('utf8') format, as you say?
