In JavaScript I do the following:
encodeURIComponent(comments)
while in Python I do the following:
urllib2.unquote(comments)
For some reason, when I do the following:
encodeURIComponent('ø')
I get %C3%B8, but when I decode
urllib2.unquote('%C3%B8')
I get Ã¸ instead of ø, which is the original character.
What gives?
I'm on a platform that uses jQuery on the client side and Python/Django on the server side.
Use unicode strings internally, and encode and decode strings as appropriate at each boundary. (Python 3 makes this easier by giving you an error instead of mojibake when you get it wrong.)

Python 2 has two string types, str and unicode, and if you just use str you're dealing with bytes whose meaning is unspecified. (See what sys.getdefaultencoding() returns.) So get the charset the form uses, and decode the string into unicode to work with; when sending it back, encode to UTF-8 and set the charset (or, better, let Django take care of it, in case the browser sends an Accept-Charset for some reason).
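To make the boundary handling concrete, here is a minimal sketch of what happens to ø. It is written for Python 3, where the unquote function lives in urllib.parse; that choice, the variable names, and the text/plain content type are assumptions for illustration, not part of the original setup. quote(..., safe='') only stands in for the browser's encodeURIComponent for this one character.

```python
from urllib.parse import quote, unquote, unquote_to_bytes

# Stand-in for encodeURIComponent('ø'): percent-encode the UTF-8 bytes.
encoded = quote('ø', safe='')

# Boundary in: percent-decoding yields raw bytes, not text.
raw = unquote_to_bytes(encoded)

# Wrong charset: each byte is read as one Latin-1 character (mojibake).
mojibake = raw.decode('latin-1')

# Right charset: the two bytes are decoded together as UTF-8.
text = raw.decode('utf-8')

# In Python 3, unquote() performs both steps and assumes UTF-8 by default.
assert unquote(encoded) == text

# Boundary out: encode explicitly and declare the charset with the body.
body = text.encode('utf-8')
content_type = 'text/plain; charset=utf-8'  # hypothetical response header value

print(encoded, raw, mojibake, text)
```

On Python 2 the corresponding fix would likely be urllib2.unquote(comments).decode('utf-8'), assuming comments is a byte string rather than a unicode object.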