Is there a simple method I'm missing in urllib or another library for URL-encoding a string? URL encoding replaces unsafe ASCII characters with a "%" followed by two hexadecimal digits.

Here's an example of an input and my expected output:

Mozilla/5.0 (Linux; U; Android 4.0; xx-xx; Galaxy Nexus Build/IFL10C) AppleWebKit/534.30 (KHTML, like Gecko) Version/4.0 Mobile Safari/534.30

Mozilla%2F5.0+%28Linux%3B+U%3B+Android+4.0%3B+xx-xx%3B+Galaxy+Nexus+Build%2FIFL10C%29+AppleWebKit%2F534.30+%28KHTML%2C+like+Gecko%29+Version%2F4.0+Mobile+Safari%2F534.30

3 Answers

53

For Python 2.x, use urllib.quote

Replace special characters in string using the %xx escape. Letters, digits, and the characters '_.-' are never quoted. By default, this function is intended for quoting the path section of the URL. The optional safe parameter specifies additional characters that should not be quoted — its default value is '/'.

example:

In [1]: import urllib

In [2]: urllib.quote('%')
Out[2]: '%25'
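
To illustrate the safe parameter described in the documentation excerpt, here is a minimal sketch. It assumes Python 3, where the function lives in urllib.parse rather than directly in urllib, and the sample path is made up for illustration.

```python
from urllib.parse import quote

path = "docs/my file.txt"  # hypothetical sample input

# Default safe='/' leaves slashes alone, so path separators survive.
default_quoted = quote(path)

# safe='' escapes the slash too, which suits a single path segment or a query value.
fully_quoted = quote(path, safe="")

print(default_quoted)  # docs/my%20file.txt
print(fully_quoted)    # docs%2Fmy%20file.txt
```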

EDIT:

In your case, to replace spaces with plus signs, you can use urllib.quote_plus

example:

In [4]: urllib.quote_plus('a b')
Out[4]: 'a+b'

For Python 3.x, use urllib.parse.quote

>>> import urllib.parse
>>> a = "asdas#@das"
>>> urllib.parse.quote(a)
'asdas%23%40das'

For strings containing spaces, use urllib.parse.quote_plus

>>> import urllib.parse
>>> a = "as da& s#@das"
>>> urllib.parse.quote_plus(a)
'as+da%26+s%23%40das'
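
Applied to the User-Agent string from the question, a short Python 3 sketch could look like this. The variable names are only illustrative.

```python
from urllib.parse import quote_plus

user_agent = (
    "Mozilla/5.0 (Linux; U; Android 4.0; xx-xx; Galaxy Nexus Build/IFL10C) "
    "AppleWebKit/534.30 (KHTML, like Gecko) Version/4.0 Mobile Safari/534.30"
)

# quote_plus turns spaces into '+' and, unlike quote, also escapes '/' by default.
encoded = quote_plus(user_agent)
print(encoded)
```

The printed value should match the expected output given in the question.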

3 Comments

or urllib.quote_plus, since OP wants + instead of %20.
but to get what the OP asks for, use urllib.quote_plus.
I believe, for Python 3.*, you should do import urllib.parse ... urllib.parse.quote ... or from urllib import parse ... parse.quote ... rather than import urllib ... urllib.parse.quote ..., which will result in AttributeError: module 'urllib' has no attribute 'parse', kind of similar to imports in werkzeug. Tested on Python 3.6.1.
3

Keep in mind that, in Python 2, both urllib.quote and urllib.quote_plus raise a KeyError if the input is a unicode string containing non-ASCII characters:

s = u'\u2013'
urllib.quote(s)

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python27\lib\urllib.py", line 1303, in quote
    return ''.join(map(quoter, s))
KeyError: u'\u2013'

As answered elsewhere on SO, the unicode string has to be encoded to UTF-8 explicitly first:

urllib.quote(s.encode('utf-8'))
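
For comparison, here is a sketch of the same call in Python 3, where this workaround should not be needed because urllib.parse.quote encodes str input as UTF-8 by default. It assumes Python 3 and reuses the character from the traceback.

```python
from urllib.parse import quote

s = "\u2013"  # EN DASH, the same character as in the Python 2 example

# In Python 3, quote encodes str input as UTF-8 by default before escaping.
from_text = quote(s)

# Passing already-encoded bytes is also accepted and gives the same result.
from_bytes = quote(s.encode("utf-8"))

print(from_text)   # %E2%80%93
print(from_bytes)  # %E2%80%93
```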

1

Also, if you have a dict of several values, the best way to encode it is urllib.urlencode (urllib.parse.urlencode in Python 3).
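
A minimal sketch of encoding a dict, assuming Python 3.7 or later (so the dict keeps insertion order). The parameter names and values are hypothetical.

```python
from urllib.parse import urlencode, quote

params = {"q": "a b", "ua": "Mozilla/5.0"}  # hypothetical parameters

# urlencode applies quote_plus to each key and value and joins the pairs with '&'.
query = urlencode(params)

# quote_via (Python 3.5+) swaps in quote, so spaces become %20 instead of '+'.
query_pct = urlencode(params, quote_via=quote)

print(query)      # q=a+b&ua=Mozilla%2F5.0
print(query_pct)  # q=a%20b&ua=Mozilla%2F5.0
```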
