How can I percent-encode URL parameters in Python?

Question

If I do

url = "http://example.com?p=" + urllib.quote(query)

It doesn't encode / to %2F (breaks OAuth normalization)
It doesn't handle Unicode (it throws an exception)

Is there a better library?

What is the language-agnostic canonical Stack Overflow question? (That is, only covering the encoding, not how it is achieved.)
– Peter Mortensen, Commented Nov 27, 2022 at 21:54
@JamieMarshall what should they be called then if not URL parameters?
– Ben Creasy, Commented Oct 13, 2023 at 16:32
@BenCreasy: attributes. The URL specification describes parameters as a separate part of the URL (not involving the query string). I can't tell you how much time I've lost trying to authenticate against an API because it asked for parameters and I was sending query attributes.
– Jamie Marshall, Commented Nov 2, 2023 at 19:12
@JamieMarshall even though "attributes" is the real name, I also found this question by searching for "parameter" in Google, and I would never think of searching for "url attributes".
– Kaki In, Commented Oct 11 at 12:32

Benjamin Loison · Accepted Answer · 2023-07-04 12:37:45Z

560

From the Python 3 documentation:

urllib.parse.quote(string, safe='/', encoding=None, errors=None)

Replace special characters in string using the %xx escape. Letters, digits, and the characters '_.-~' are never quoted. By default, this function is intended for quoting the path section of a URL. The optional safe parameter specifies additional ASCII characters that should not be quoted — its default value is '/'.

That means passing '' for safe will solve your first issue:

>>> import urllib.parse
>>> urllib.parse.quote('/test')
'/test'
>>> urllib.parse.quote('/test', safe='')
'%2Ftest'

(The function quote was moved from urllib to urllib.parse in Python 3.)
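
Applied to the URL from the question, this could look like the following minimal sketch. The helper name build_url and the sample value are made up for illustration; only quote comes from the standard library.

```python
from urllib.parse import quote


def build_url(base: str, query: str) -> str:
    """Append `query` as the value of parameter p (hypothetical helper).

    safe='' means no reserved character is left literal, so "/" becomes %2F.
    """
    return f"{base}?p={quote(query, safe='')}"


url = build_url("http://example.com", "a/b c&d=é")
print(url)  # http://example.com?p=a%2Fb%20c%26d%3D%C3%A9
```

Because "&" and "=" are encoded as well, the value cannot be mistaken for an additional parameter.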

By the way, have a look at urlencode.
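
A short sketch of urlencode, assuming Python 3.5 or later for the quote_via argument. The parameter names and values are invented for illustration. By default urlencode uses quote_plus, so spaces become "+"; passing quote_via=quote produces %20 instead, which may be closer to what a signature normalization such as OAuth's expects (check the relevant spec before relying on it).

```python
from urllib.parse import quote, urlencode

# Illustrative parameters, not taken from the question.
params = {"q": "El Niño/a b", "page": 2}

# Default behaviour: each key and value goes through quote_plus.
form_style = urlencode(params)

# quote_via=quote: spaces become %20. urlencode passes its own safe=''
# to the quoting function, so "/" is percent-encoded here too.
strict_style = urlencode(params, quote_via=quote)

print(form_style)    # q=El+Ni%C3%B1o%2Fa+b&page=2
print(strict_style)  # q=El%20Ni%C3%B1o%2Fa%20b&page=2
```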

About the second issue, there was a bug report about it and it was fixed in Python 3.

For Python 2, you can work around it by encoding as UTF-8 like this:

>>> import urllib
>>> query = urllib.quote(u"Müller".encode('utf8'))
>>> print urllib.unquote(query).decode('utf8')
Müller

edited Jul 4, 2023 at 12:37

Benjamin Loison

answered Nov 8, 2009 at 2:52

Nadia Alramli

8 Comments

Paul Tarjan Over a year ago

Thank you, both worked great. urlencode just calls quote_plus many times in a loop, which isn't the correct normalization for my task (OAuth).

Jeff Sheffield Over a year ago

The spec (RFC 2396) defines these as reserved: reserved = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" | "$" | ",", which is what urllib.quote is dealing with.

Andreas Haferburg Over a year ago

urllib.parse.quote docs

chrizonline Over a year ago

if you wanna retain the colon from http: , do urllib.parse.quote('http://example.com/some path/').replace('%3A', ':')

Pavel Vlasov Over a year ago

@chrizonline Just use urllib.parse.quote(url, safe=':/'). Even better, encode some path, then join strings. This is Python, not PHP.
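
A sketch of the two options mentioned in this comment, using the example URL from the comment above. The variable names are illustrative only.

```python
from urllib.parse import quote

base = "http://example.com/"
path = "some path/"

# Option 1: quote the whole URL, keeping ":" and "/" literal.
# Anything else reserved in the URL (such as "?" or "=") would be encoded too.
whole = quote(base + path, safe=":/")

# Option 2: quote only the variable path, then join it to the fixed part.
joined = base + quote(path)

print(whole)   # http://example.com/some%20path/
print(joined)  # http://example.com/some%20path/
```

Option 2 tends to be the safer habit, since only the part that can contain arbitrary text is passed through quote.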

Peter Mortensen · Accepted Answer · 2021-11-19 15:48:39Z

217

In Python 3, urllib.quote has been moved to urllib.parse.quote, and it does handle Unicode by default.

>>> from urllib.parse import quote
>>> quote('/test')
'/test'
>>> quote('/test', safe='')
'%2Ftest'
>>> quote('/El Niño/')
'/El%20Ni%C3%B1o/'
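
For the Unicode part of the question, here is a small sketch of how the encoding can be controlled and reversed in Python 3. The sample string is borrowed from the Python 2 workaround elsewhere on this page; the latin-1 variant is only there to show the encoding parameter, not a recommendation.

```python
from urllib.parse import quote, unquote

name = "Müller"

# str input is encoded as UTF-8 by default before being percent-encoded.
encoded = quote(name)

# A different byte encoding can be requested explicitly.
legacy = quote(name, encoding="latin-1")

# bytes input is percent-encoded as-is.
from_bytes = quote(name.encode("utf-8"))

print(encoded)     # M%C3%BCller
print(legacy)      # M%FCller
print(from_bytes)  # M%C3%BCller

# unquote reverses the process, given the matching encoding.
print(unquote(encoded))                     # Müller
print(unquote(legacy, encoding="latin-1"))  # Müller
```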

edited Nov 19, 2021 at 15:48

Peter Mortensen

answered Nov 29, 2012 at 11:52

Paolo Moretti

3 Comments

Luc Over a year ago

The name quote is rather vague as a global. It might be nicer to use something like urlencode: from urllib.parse import quote as urlencode.

jaymmer - Reinstate Monica Over a year ago

Note that there is a function named urlencode in urllib.parse already that does something completely different, so you'd be better off picking another name or risk seriously confusing future readers of your code.

Trevor Boyd Smith Over a year ago

(style suggestion: @Luc i agree that quote is "rather vague". rather than rename the variable/object to something else you can leave the name fully qualified as urllib.parse.quote. leaving it fully qualified does two things: takes a little extra time typing and saves time reading and maintaining the code. )

Peter Mortensen · Accepted Answer · 2021-11-19 15:59:54Z

65

I think module requests is much better. It's based on urllib3.

You can try this:

>>> from requests.utils import quote
>>> quote('/test')
'/test'
>>> quote('/test', safe='')
'%2Ftest'

My answer is similar to Paolo's answer.

edited Nov 19, 2021 at 15:59

Peter Mortensen

answered Jul 14, 2015 at 8:30

Aminah Nuraini

4 Comments

Cjkjvfnby Over a year ago

requests.utils.quote is just a reference to Python's own quote. See the requests source code.

Jeff Sheffield Over a year ago

requests.utils.quote is a thin compatibility wrapper to urllib.quote for python 2 and urllib.parse.quote for python 3

PythoNic Over a year ago

without reading the comments, this is creating confusion...

Jens Over a year ago

And: why take dependency on an external package, when the functionality is built into Python’s own stdlib?

Peter Mortensen · Accepted Answer · 2021-11-19 16:23:08Z

15

If you're using Django, you can use urlquote:

>>> from django.utils.http import urlquote
>>> urlquote(u"Müller")
u'M%C3%BCller'

Note that changes to Python mean that this is now a legacy wrapper. From the Django 2.1 source code for django.utils.http:

A legacy compatibility wrapper to Python's urllib.parse.quote() function.
(was used for unicode handling on Python 2)

edited Nov 19, 2021 at 16:23

Peter Mortensen

answered Oct 27, 2015 at 19:40

Rick Westera

1 Comment

mosi_kha Over a year ago

It's deprecated as of Django 3.0.

Peter Mortensen · Accepted Answer · 2021-11-19 15:56:06Z

7

It is better to use urlencode here. There isn't much difference for a single parameter, but, IMHO, it makes the code clearer. (A function named quote_plus looks confusing, especially to those coming from other languages.)

In [21]: query='lskdfj/sdfkjdf/ksdfj skfj'

In [22]: val=34

In [23]: from urllib.parse import urlencode

In [24]: encoded = urlencode(dict(p=query,val=val))

In [25]: print(f"http://example.com?{encoded}")
http://example.com?p=lskdfj%2Fsdfkjdf%2Fksdfj+skfj&val=34
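
The "+" in the output above comes from quote_plus, which urlencode uses by default. A brief sketch of how it differs from quote (the sample value is made up):

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

value = "a/b c+d"

# quote with safe='' encodes the space as %20.
percent = quote(value, safe="")

# quote_plus encodes the space as "+" and a literal "+" as %2B.
plus = quote_plus(value)

print(percent)  # a%2Fb%20c%2Bd
print(plus)     # a%2Fb+c%2Bd

# Decoding must use the matching function: unquote leaves "+" untouched.
print(unquote_plus(plus))  # a/b c+d
print(unquote(plus))       # a/b+c+d
```

The "+" form is the convention for HTML form data in query strings; whether a given API accepts it is something to check per service.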

Documentation

urlencode
quote_plus

edited Nov 19, 2021 at 15:56

Peter Mortensen

answered Nov 29, 2018 at 15:46

balki

BaiJiFeiLong · Accepted Answer · 2022-08-04 11:01:03Z

2

An alternative method using furl:

import furl

url = "https://httpbin.org/get?hello,world"
print(url)
url = furl.furl(url).url
print(url)

Output:

https://httpbin.org/get?hello,world
https://httpbin.org/get?hello%2Cworld
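
For comparison, a rough standard-library sketch that produces the same result for this particular URL. This is not how furl works internally; the helper name reencode_query is hypothetical, and the sketch assumes the query is not already percent-encoded (an existing "%" would be encoded again).

```python
from urllib.parse import quote, urlsplit, urlunsplit


def reencode_query(url: str) -> str:
    """Percent-encode the query component of `url` (hypothetical helper)."""
    parts = urlsplit(url)
    # Keep "=" and "&" literal so any key=value structure survives.
    # "%" is not in the safe set, so pre-encoded input gets double-encoded.
    query = quote(parts.query, safe="=&")
    return urlunsplit(parts._replace(query=query))


print(reencode_query("https://httpbin.org/get?hello,world"))
# https://httpbin.org/get?hello%2Cworld
```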

answered Aug 4, 2022 at 11:01

BaiJiFeiLong
