I have some URL-encoded data that I want to decode using Python. I tried the (accepted) answer here, but I am still not getting the correct decoding. My code is as follows:

import urllib2

name = '%D0%BD%D0%BE%D1%82%D0%B8%D1%84%D0%B8%D0%BA%D0%B0%D1%82%D0%BE%D1%80-%D0%BE%D0%BB%D0%B8%D0%BC%D0%BF%D0%B8%D0%B9%D1%81%D0%BA%D0%B8%D1%85-%D0%B8'

print urllib2.unquote(urllib2.quote(name.encode("utf8"))).decode("utf8")

This should print нотификатор-олимпийских-и, but it prints %D0%BD%D0%BE%D1%82%D0%B8%D1%84%D0%B8%D0%BA%D0%B0%D1%82%D0%BE%D1%80-%D0%BE%D0%BB%D0%B8%D0%BC%D0%BF%D0%B8%D0%B9%D1%81%D0%BA%D0%B8%D1%85-%D0%B8

So I tried unquoting it again:

print urllib2.unquote(urllib2.unquote(urllib2.quote(name.encode("utf8"))).decode("utf8"))

but that gives me ноÑиÑикаÑоÑ-олимпийÑкиÑ-и

I am not sure why this happens. Can anyone please explain what I am doing wrong and how to correct it?

1 Answer

There are too many quote/unquote operations. You start with a UTF-8 string that is already URL-encoded, so why UTF-8-encode and URL-encode it again? A single unquote followed by a UTF-8 decode is enough:

import urllib

unquoted = urllib.unquote(name)  # percent-decode into raw UTF-8 bytes
print unquoted.decode('utf-8')   # decode those bytes into unicode text
# нотификатор-олимпийских-и
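
For readers on Python 3, where `unquote` lives in `urllib.parse`, the same fix might look roughly like the sketch below. It also shows why the extra `quote` call in the question cancels out the first `unquote`. The variable names `decoded`, `double_encoded`, and `round_trip` are illustrative and not part of the original answer.

```python
from urllib.parse import quote, unquote

# The percent-encoded string from the question, split only for line length.
name = ('%D0%BD%D0%BE%D1%82%D0%B8%D1%84%D0%B8%D0%BA%D0%B0%D1%82%D0%BE%D1%80-'
        '%D0%BE%D0%BB%D0%B8%D0%BC%D0%BF%D0%B8%D0%B9%D1%81%D0%BA%D0%B8%D1%85-%D0%B8')

# One unquote is enough: Python 3 decodes the escapes as UTF-8 by default.
decoded = unquote(name)

# Quoting an already-encoded string escapes every '%' as '%25' ...
double_encoded = quote(name)

# ... so a single unquote only strips that extra layer and returns the input.
round_trip = unquote(double_encoded)

print(decoded)
print(round_trip == name)
```

In Python 3 there is no separate `.decode('utf-8')` step, because `unquote` returns a `str` and takes an `encoding` argument that defaults to UTF-8.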