Say I have a URL that may or may not be HTTPS, and whose host name I don't control, but which follows a format like: http://example.com/special/content [OR] https://example.com/special/content

Using Python, what would be the Pythonic way of changing the scheme to https and the path to /something/else?

My current approach is:

from urlparse import urlsplit, urljoin, urlunsplit
currenturl = "http://example.com/some/content"
parts = list(urlsplit(urljoin(currenturl, "/something/else")))
parts[0] = "https"
newurl = urlunsplit(parts)

Any suggestions?


Suggestion (from @ignacio-vazquez-abrams)

from urlparse import urlparse, urljoin, urlunparse
currenturl = "http://example.com/some/content"
parts = list(urlparse(currenturl))
parts[0] = "https"
parts[2] = "/something/else"  # If only the path needs changing (or see below...)
newurl = urlunparse(parts)
newurl = urljoin(newurl, "/something/else")  # If we need to rewrite everything
                                             # after the network location
1 Answer

You are so close. Use urlparse.urlparse() to split it up, take the parts you care about, and then use urlparse.urlunparse() to put it back together.
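
As a rough illustration of that split, modify, and reassemble approach, here is a minimal sketch. It assumes Python 3, where the functions of the old `urlparse` module live in `urllib.parse`; the helper name `to_https_with_path` is made up for this example and is not part of any library. Because the parse result is a named tuple, `_replace()` can swap individual fields without converting to a list first.

```python
from urllib.parse import urlparse, urlunparse


def to_https_with_path(url, new_path="/something/else"):
    """Return url with its scheme forced to https and its path replaced.

    Hypothetical helper for illustration; the host, query string and
    fragment of the input are carried over unchanged.
    """
    parts = urlparse(url)
    # ParseResult is a named tuple, so _replace() returns a modified copy.
    parts = parts._replace(scheme="https", path=new_path)
    return urlunparse(parts)


if __name__ == "__main__":
    print(to_https_with_path("http://example.com/special/content"))
    print(to_https_with_path("https://example.com/special/content?x=1#top"))
```

Unlike the `urljoin` variant in the question, this keeps any query string or fragment that was present on the original URL, which may or may not be what you want.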

3 Comments

What would be the difference in this case?
@Mark: One would be brute force, and the other would be Pythonic.
I meant why particularly did you suggest using urlparse/urlunparse rather than urlsplit/urlunsplit - the only difference from the documentation seems to be the handling of parameters, which doesn't seem to be an issue here.
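
To make the parameter-handling difference mentioned in that last comment concrete, here is a small sketch, again assuming Python 3's `urllib.parse`. The example URL with a `;v=1` path parameter and the two helper names are invented for illustration. `urlparse` splits the parameters of the last path segment into their own `params` field, while `urlsplit` leaves them inside `path`, so replacing the path behaves differently in the two cases.

```python
from urllib.parse import urlparse, urlsplit, urlunparse, urlunsplit

# Hypothetical URL carrying a path parameter (";v=1") on its last segment.
URL = "http://example.com/special/content;v=1"


def rewrite_with_urlparse(url, new_path="/something/else"):
    # params is a separate field here, so it survives the path change.
    parts = urlparse(url)._replace(scheme="https", path=new_path)
    return urlunparse(parts)


def rewrite_with_urlsplit(url, new_path="/something/else"):
    # ";v=1" is part of path here, so replacing path discards it.
    parts = urlsplit(url)._replace(scheme="https", path=new_path)
    return urlunsplit(parts)


if __name__ == "__main__":
    print(urlparse(URL).path, urlparse(URL).params)
    print(urlsplit(URL).path)
    print(rewrite_with_urlparse(URL))
    print(rewrite_with_urlsplit(URL))
```

For URLs without path parameters, such as the ones in the question, both pairs of functions should give the same result.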