What is the preferred way to check whether a URL is relative or absolute?
4 Answers
Python 2
You can use the urlparse module to parse a URL, then decide whether it is relative or absolute by checking whether the host name (netloc) is set.
>>> import urlparse
>>> def is_absolute(url):
...     return bool(urlparse.urlparse(url).netloc)
...
>>> is_absolute('http://www.example.com/some/path')
True
>>> is_absolute('//www.example.com/some/path')
True
>>> is_absolute('/some/path')
False
Python 3
urlparse has been moved to urllib.parse, so use the following:
from urllib.parse import urlparse
def is_absolute(url):
    return bool(urlparse(url).netloc)
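To make the edge cases raised in this thread concrete, here is a minimal sketch that labels a string by which parts urlsplit finds. The helper name classify and its three labels are assumptions for illustration, not part of the answer above, and it only considers http-style URLs (a scheme-only URL such as mailto: is not handled).

```python
from urllib.parse import urlsplit


def classify(url):
    """Label a URL string by the parts urlsplit finds (hypothetical helper)."""
    parts = urlsplit(url)
    if parts.scheme and parts.netloc:
        return "absolute"
    if parts.netloc:
        # A host but no scheme: the '//host/path' form.
        return "protocol-relative"
    return "relative"


for candidate in (
    "http://www.example.com/some/path",
    "//www.example.com/some/path",
    "/some/path",
    "www.example.com/some/path",  # no '//', so it is parsed as a plain path
):
    print(candidate, "->", classify(candidate))
```

Note that a host written without a leading // ends up in the path component, so a netloc-based check treats it as relative.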
6 Comments
www.example.com/some/path count as absolute too?
http:// by some pre-processing or not use urlparse.
//google.com is a protocol-relative url. And your code will return False for it.
urlsplit instead of urlparse. BTW, in Django you have a Python 2 & 3 compatible way: from django.utils.six.moves.urllib.parse import urlsplit, urlparse
If you want to know whether a URL is absolute or relative in order to join it with a base URL, I usually just call urllib.parse.urljoin anyway:
>>> from urllib.parse import urljoin
>>> urljoin('http://example.com/', 'http://example.com/picture.png')
'http://example.com/picture.png'
>>> urljoin('http://example1.com/', '/picture.png')
'http://example1.com/picture.png'
>>>
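As a rough illustration of why urljoin can stand in for an explicit check, the sketch below resolves four kinds of reference against one base. The base URL and the host names are made up for the example.

```python
from urllib.parse import urljoin

base = "http://example.com/dir/page.html"  # hypothetical base URL

# An absolute reference replaces the base entirely.
absolute = urljoin(base, "https://other.example/picture.png")

# A protocol-relative reference keeps only the base's scheme.
protocol_relative = urljoin(base, "//cdn.example/picture.png")

# A root-relative reference keeps the base's scheme and host.
root_relative = urljoin(base, "/picture.png")

# A bare path is resolved against the base's directory.
bare = urljoin(base, "picture.png")

for result in (absolute, protocol_relative, root_relative, bare):
    print(result)
```

A bare string is always treated as a path, so this approach cannot tell a scheme-less host name from a relative file name.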
2 Comments
With http://www.yahoo.com and www.google.com as inputs, this will give you http://www.yahoo.com/www.google.com as output, which probably isn't what you wanted. So you'll still have to check somehow whether the second one is a URL without a scheme or actually a relative path.
I can't comment on the accepted answer, so I'm writing this as a new answer. IMO, checking the scheme as in the accepted answer (bool(urlparse.urlparse(url).scheme)) is not a good idea: http://example.com/file.jpg, https://example.com/file.jpg and //example.com/file.jpg are all absolute URLs, but in the last case we get scheme = ''.
I use this code:
is_absolute = '//' in my_url
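One caveat worth checking before relying on a substring test: '//' can also appear inside a relative URL. The sketch below compares the substring test with a netloc test on a few made-up inputs; the helper names has_double_slash and has_netloc are hypothetical.

```python
from urllib.parse import urlsplit


def has_double_slash(url):
    """The substring test from the answer above."""
    return "//" in url


def has_netloc(url):
    """The host-based test: True only when urlsplit finds a network location."""
    return bool(urlsplit(url).netloc)


samples = [
    "//example.com/file.jpg",           # protocol-relative
    "/login?next=http://example.com/",  # relative path with a URL in the query
    "/a//b",                            # relative path with an empty segment
]

for url in samples:
    print(url, has_double_slash(url), has_netloc(url))
```

The two tests agree on the first sample and disagree on the other two, where the substring test reports an absolute URL although no host is present.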
1 Comment
pip install yarl
import yarl
if not yarl.URL(image).is_absolute():
    image = context["request"].build_absolute_uri(image)
because
yarl.URL("//google.com").is_absolute() is True
True
in contrast to
urllib.parse.urlsplit("//google.com").scheme == ""
True
The netloc is still set, though:
urllib.parse.urlsplit("//google.com").netloc == "google.com"
Pros.
- easier to read
- easier to test (you can mock one particular method)
Cons.
- an extra dependency (though a fairly stable one)
4 Comments
yarl? Please read How to Answer.
urllib.parse.urljoin recommended in this answer? Again, please read How to Answer.