How do I remove a query string from URL using Python

Question

Example:

http://example.com/?a=text&q2=text2&q3=text3&q2=text4

After removing "q2", it will return:

http://example.com/?a=text&q3=text3

In this case, "q2" appeared more than once and every occurrence has been removed.

Benjamin Loison · Accepted Answer · 2024-03-20 13:44:20Z

89

import sys

if sys.version_info.major == 3:
    from urllib.parse import urlencode, urlparse, urlunparse, parse_qs
else:
    from urllib import urlencode
    from urlparse import urlparse, urlunparse, parse_qs

url = 'http://example.com/?a=text&q2=text2&q3=text3&q2=text4&b#q2=keep_fragment'
u = urlparse(url)
query = parse_qs(u.query, keep_blank_values=True)
query.pop('q2', None)
u = u._replace(query=urlencode(query, True))
print(urlunparse(u))

Output:

http://example.com/?a=text&q3=text3&b=#q2=keep_fragment
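The same idea can be wrapped in a small helper. This is only a sketch for Python 3: the function name `remove_query_params` is made up for illustration, and it uses `parse_qsl` instead of `parse_qs` so that the remaining parameters keep their original order and any duplicates.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def remove_query_params(url, names):
    """Return url without any query parameter whose key is in names.

    Hypothetical helper, not part of urllib.
    """
    parts = urlsplit(url)
    # parse_qsl yields (key, value) pairs in order, duplicates included
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in names
    ]
    # Rebuild the URL with the filtered query; the fragment is untouched
    return urlunsplit(parts._replace(query=urlencode(kept)))


cleaned = remove_query_params(
    'http://example.com/?a=text&q2=text2&q3=text3&q2=text4#frag', {'q2'}
)
print(cleaned)  # http://example.com/?a=text&q3=text3#frag
```

Note that `urlencode` re-encodes the values, so a URL that used unusual escaping may come back with a slightly different (but equivalent) query string.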

edited Mar 20, 2024 at 13:44

Benjamin Loison


answered Oct 12, 2011 at 2:42

Miki Tebeka


6 Comments

tug Over a year ago

Best answer. One addition: the geturl() method of the parsed result can be used instead of urlunparse, e.g. print(u.geturl())

Miki Tebeka Over a year ago

@kurifu The OP wanted to remove only one parameter, not the whole query.

Vanni Totaro Over a year ago

Python 3 imports: from urllib.parse import urlencode, urlparse, urlunparse, parse_qs

IamMashed Over a year ago

access to a protected member _replace of a class.... how can we avoid this warning message?

Miki Tebeka Over a year ago

@IamMashed This is how namedtuples work - docs.python.org/3/library/… It's probably some kind of linter adding the warning.
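To illustrate the two comments above, here is a short sketch (Python 3, with a made-up example URL). `_replace` is a documented namedtuple method: the leading underscore is there to avoid clashing with field names, not to mark it private. `geturl()` gives the same string as passing the result to `urlunparse()`.

```python
from urllib.parse import urlparse, urlunparse

u = urlparse('http://example.com/?a=text&q3=text3#frag')

# _replace returns a modified copy; the original tuple is left unchanged
u2 = u._replace(query='a=text')

# geturl() rebuilds the URL the same way urlunparse() does
rebuilt = u2.geturl()
print(rebuilt)  # http://example.com/?a=text#frag
print(rebuilt == urlunparse(u2))  # True
```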


Matthew D. Scholefield · Accepted Answer · 2019-12-14 16:25:55Z

88

To remove all query string parameters:

from urllib.parse import urljoin, urlparse

url = 'http://example.com/?a=text&q2=text2&q3=text3&q2=text4'
urljoin(url, urlparse(url).path)  # 'http://example.com/'

For Python 2, replace the import with:

from urlparse import urljoin, urlparse

edited Dec 14, 2019 at 16:25

Matthew D. Scholefield


answered Aug 25, 2015 at 21:56

png


3 Comments

Dolph Over a year ago

I like this approach a bit better than the popular answer because it doesn't call any internal APIs, but it will also eliminate URL fragments, whereas the popular answer will preserve them. It also doesn't solve the OP's exact question (it deletes all query string parameters), but it solves mine :)

merlin Over a year ago

Was first looking into furl, but this removes the need to install another library. Works perfectly!

merlin Over a year ago

This should be the accepted answer. I came here twice after a few weeks since it is hard to remember and searched again for it.
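As Dolph's comment points out, the `urljoin` approach also drops the fragment. A minimal sketch of the difference, using made-up example URLs: clearing only the `query` field keeps the fragment. It also shows one edge case worth checking, a URL with an empty path, where `urljoin` returns the base URL unchanged.

```python
from urllib.parse import urljoin, urlparse

url = 'http://example.com/path?a=text&q2=text2#section'

# Joining with only the path drops both the query and the fragment
no_query_no_fragment = urljoin(url, urlparse(url).path)
print(no_query_no_fragment)  # http://example.com/path

# Clearing just the query field leaves the fragment in place
no_query = urlparse(url)._replace(query='').geturl()
print(no_query)  # http://example.com/path#section

# Edge case: with an empty path, urljoin(base, '') returns base as-is
bare = 'http://example.com?a=1'
unchanged = urljoin(bare, urlparse(bare).path)
stripped = urlparse(bare)._replace(query='').geturl()
print(unchanged)  # http://example.com?a=1
print(stripped)   # http://example.com
```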

Clarius · Accepted Answer · 2019-02-06 08:03:11Z

29

Isn't this just a matter of splitting a string on a character?

>>> url = 'http://example.com/?a=text&q2=text2&q3=text3&q2=text4'
>>> url.split('?')[0]
'http://example.com/'

answered Feb 6, 2019 at 8:03

Clarius


4 Comments

Programer Beginner Over a year ago

I was thinking about this solution as well. Can anyone tell me if there are any issue (potential bug/loophole) in this proposed solution?

mevers303 Over a year ago

@ProgramerBeginner There isn't one, really!

conradlee Over a year ago

The problem will be clear if you carefully read the original question. The OP wanted to remove only one parameter, not all the query parameters.

Ahmed Djamel Over a year ago

This solution will work only if the parameters are always sent in the same order. If you work with URLs whose parameters are unordered, you will end up deleting the wrong parameter.

Mayank Jaiswal · Accepted Answer · 2016-12-02 09:08:38Z

11

Using furl, a third-party URL manipulation library for Python:

import furl
f = furl.furl("http://example.com/?a=text&q2=text2&q3=text3&q2=text4")
f.remove(['q2'])
print(f.url)

answered Dec 2, 2016 at 9:08

Mayank Jaiswal


1 Comment

Mattwmaster58 Over a year ago

Calling it 'python's url manipulation library' makes it sound like it's included in the standard lib, which it isn't.

4b0 · Accepted Answer · 2018-09-17 09:09:50Z

3

query_string = "https://example.com/api/api.php?user=chris&auth=true"
url = query_string[:query_string.find('?', 0)]

edited Sep 17, 2018 at 9:09

4b0


answered Sep 17, 2018 at 9:08

XKCD


2 Comments

Alexander Over a year ago

This does not exactly provide a solution for the given question. Please try improving your answer or deleting it.

Nic3500 Over a year ago

While this code may answer the question, providing additional context regarding how and/or why it solves the problem would improve the answer's long-term value.

Hamza · Accepted Answer · 2018-11-14 07:57:17Z

1

Or simply put, just use url_query_cleaner() from w3lib.url

from w3lib.url import url_query_cleaner

url = 'http://example.com/?a=text&q2=text2&q3=text3&q2=text4'
url_query_cleaner(url, ('q2',), remove=True)  # note the trailing comma: ('q2') is just the string 'q2'

Output: http://example.com/?a=text&q3=text3

answered Nov 14, 2018 at 7:57

Hamza


Spartacus · Accepted Answer · 2024-05-16 17:14:10Z

Another method that you can use to have more control over what you want to do is urlunparse() which takes a tuple of the parts returned from urlparse().

For example, recently I needed to change the path but keep the query:

from urllib.parse import urlparse, urlunparse

url = 'https://test.host.com/some/path?type_id=7'
parsed_url = urlparse(url)

modified_path = f'{parsed_url.path}/new_path_ending'

output_url = urlunparse((
    parsed_url.scheme,
    parsed_url.netloc,
    modified_path,
    parsed_url.params,
    parsed_url.query,
    parsed_url.fragment
))

print(output_url)
# https://test.host.com/some/path/new_path_ending?type_id=7

This method preserves all of the URL and gives you granular control of what you want to keep, change, and remove.

lc2817 · Accepted Answer · 2011-10-12 02:49:07Z

-2

import re

q = "http://example.com/?a=text&q2=text2&q3=text3&q2=text4"
todelete = "q2"
# Delete every "q2=value" pair, along with any "&" that follows it
r = re.sub(re.escape(todelete) + r'=[a-zA-Z_0-9]*&*', '', q)
# Delete the possible trailing "&"
r = re.sub(r'&$', '', r)

print(r)
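One thing to watch with the pattern above is that it matches "q2=" anywhere, including inside a longer key such as "xq2". A possible tightening, offered only as a sketch (the helper name `remove_param_regex` is hypothetical), is to require that the key directly follows "?" or "&". Parsing with urllib.parse is generally the more robust route; this version still does not distinguish the query from a fragment that happens to contain "&q2=".

```python
import re


def remove_param_regex(url, name):
    """Hypothetical helper: drop every name=value pair using regexes only."""
    # Match the key only when it directly follows "?" or "&"
    pattern = r'(?<=[?&])' + re.escape(name) + r'=[^&#]*&?'
    result = re.sub(pattern, '', url)
    # Drop a dangling "?" or "&" left just before a fragment or the end
    return re.sub(r'[?&](?=#|$)', '', result)


print(remove_param_regex(
    'http://example.com/?a=text&q2=text2&q3=text3&q2=text4', 'q2'))
# http://example.com/?a=text&q3=text3
print(remove_param_regex('http://example.com/?xq2=1&q2=2', 'q2'))
# http://example.com/?xq2=1
```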

answered Oct 12, 2011 at 2:49

lc2817


Benjamin Loison · Accepted Answer · 2024-03-20 13:44:33Z

-2

Or you could just use strip

>>> l='http://example.com/?a=text&q2=text2&q3=text3&q2=text4'
>>> l.strip('&q2=text4')
'http://example.com/?a=text&q2=text2&q3=text3'
>>>
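It helps to know that `str.strip` treats its argument as a set of characters to remove from both ends, not as a literal suffix, so the result above depends on which characters the URL happens to end with. A short sketch with a made-up URL shows the difference, next to an explicit suffix check (on Python 3.9+, `str.removesuffix` does the same thing). Either way, only the trailing "q2" is affected, not earlier occurrences.

```python
url = 'http://example.com/?a=text4'
# strip() removes any of the characters & q 2 = t e x 4 from both ends,
# so it eats into the value of "a" even though "q2" is not present
over_stripped = url.strip('&q2=text4')
print(over_stripped)  # http://example.com/?a

full = 'http://example.com/?a=text&q2=text2&q3=text3&q2=text4'
suffix = '&q2=text4'
# Remove the suffix only when the string really ends with it
trimmed = full[:-len(suffix)] if full.endswith(suffix) else full
print(trimmed)  # http://example.com/?a=text&q2=text2&q3=text3
```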

edited Mar 20, 2024 at 13:44

Benjamin Loison


answered Aug 30, 2021 at 10:01

Drew97
