
Using Python, how can I check if a website is up? From what I've read, I need to send an "HTTP HEAD" request and check for the status code "200 OK", but how do I do that?

Cheers

16 Answers

You could do this with getcode() from urllib:

import urllib.request

print(urllib.request.urlopen("https://www.stackoverflow.com").getcode())
200

For Python 2, use

print urllib.urlopen("http://www.stackoverflow.com").getcode()
200

9 Comments

Follow-up question: does urlopen.getcode fetch the entire page or not?
As far as I know, getcode retrieves the status from the response that is sent back.
@Oscar, there's nothing in urllib to indicate it uses HEAD instead of GET, but the duplicate question referenced by Daniel above shows how to do the former.
It seems there is no method urlopen in Python 3.x any more. All I keep getting is ImportError: cannot import name 'urlopen'. How can I work around this?
@l1zard like so: req = urllib.request.Request(url, headers = headers) resp = urllib.request.urlopen(req)
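As the comments note, urlopen issues a GET by default. In Python 3 you can send a true HEAD request with urllib.request; a minimal sketch (head_status is a hypothetical helper name, and the timeout is an arbitrary choice):

```python
import urllib.request

def head_status(url, timeout=5):
    # method="HEAD" asks the server for headers only, not the body
    req = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status
```

head_status("https://www.stackoverflow.com") should return 200 when the site is up; unreachable hosts raise URLError, so wrap the call in try/except if you need a boolean.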

I think the easiest way to do it is by using the requests module:

import requests

def url_ok(url):
    r = requests.head(url)
    return r.status_code == 200

6 Comments

This does not work here for url = "http://foo.example.org/"; I would expect a 404, but get a crash.
This returns False for any response code other than 200 (OK), so you wouldn't know if it's a 404. It only checks whether the site is up and publicly available.
@caisah, did you test it? Jonas is right; I get an exception; raise ConnectionError(e) requests.exceptions.ConnectionError: HTTPConnectionPool(host='nosuch.org2', port=80): Max retries exceeded with url: / (Caused by <class 'socket.gaierror'>: [Errno 8] nodename nor servname provided, or not known)
I tested it before posting. The thing is, this checks whether a site is up; it doesn't handle the situation when the host name is invalid or something else goes wrong. You should think of those exceptions and catch them.
In my view, this does not test if a website is up, as it crashes (as the commenters before have said). This is my try at a short, pythonic implementation: stackoverflow.com/a/57999194/5712053
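Along the lines the comments suggest, a defensive variant of url_ok that catches connection failures instead of crashing might look like this (a sketch; the timeout value is an arbitrary choice):

```python
import requests

def url_ok(url, timeout=5):
    try:
        r = requests.head(url, timeout=timeout, allow_redirects=True)
    except requests.exceptions.RequestException:
        # Covers DNS failures, refused connections, timeouts, etc.
        return False
    return r.status_code == 200
```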

You can use httplib (Python 2; renamed http.client in Python 3):

import httplib
conn = httplib.HTTPConnection("www.python.org")
conn.request("HEAD", "/")
r1 = conn.getresponse()
print r1.status, r1.reason

prints

200 OK

Of course, only if www.python.org is up.

1 Comment

This only checks domains; I need something efficient like this for individual web pages.
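In Python 3 the module is http.client, and the same HEAD technique works for a specific page by passing its path; a sketch with a hypothetical page_status helper:

```python
import http.client

def page_status(host, path="/", timeout=5):
    # HEAD retrieves only the response headers for the given page
    conn = http.client.HTTPSConnection(host, timeout=timeout)
    try:
        conn.request("HEAD", path)
        return conn.getresponse().status
    finally:
        conn.close()
```

page_status("www.python.org", "/downloads/") would return the status code for that page rather than just the domain root.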
from urllib.request import Request, urlopen
from urllib.error import URLError, HTTPError
req = Request("http://stackoverflow.com")
try:
    response = urlopen(req)
except HTTPError as e:
    print('The server couldn\'t fulfill the request.')
    print('Error code: ', e.code)
except URLError as e:
    print('We failed to reach a server.')
    print('Reason: ', e.reason)
else:
    print('Website is working fine')

Works on Python 3

import httplib
import socket
import re

def is_website_online(host):
    """ This function checks to see if a host name has a DNS entry by checking
        for socket info. If the website gets something in return, 
        we know it's available to DNS.
    """
    try:
        socket.gethostbyname(host)
    except socket.gaierror:
        return False
    else:
        return True


def is_page_available(host, path="/"):
    """ This function retrieves the status code of a website by requesting
        HEAD data from the host. This means that it only requests the headers.
        If the host cannot be reached or something else goes wrong, it returns
        False.
    """
    try:
        conn = httplib.HTTPConnection(host)
        conn.request("HEAD", path)
        if re.match(r"^[23]\d\d$", str(conn.getresponse().status)):
            return True
    except StandardError:
        return None

1 Comment

is_website_online just tells you if a host name has a DNS entry, not whether a website is online.
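Following up on that comment: a check that goes one step beyond DNS, by attempting an actual TCP connection, could be sketched like this (port 80 and the timeout are assumptions):

```python
import socket

def is_host_reachable(host, port=80, timeout=5):
    # A DNS entry alone doesn't prove a server is listening; try to connect.
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```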

I use requests for this; it is easy and clean. Instead of the print function you can define and call a new function (to notify via email, etc.). The try-except block is essential, because if the host is unreachable it will raise a lot of exceptions, so you need to catch them all.

import requests

URL = "https://api.github.com"

try:
    response = requests.head(URL)
except Exception as e:
    print(f"NOT OK: {str(e)}")
else:
    if response.status_code == 200:
        print("OK")
    else:
        print(f"NOT OK: HTTP response code {response.status_code}")


You may use the requests library to find out if a website is up, i.e. whether the status code is 200:

import requests
url = "https://www.google.com"
page = requests.get(url)
print(page.status_code)

>> 200


The HTTPConnection object from the httplib module in the standard library will probably do the trick for you. BTW, if you start doing anything advanced with HTTP in Python, be sure to check out httplib2; it's a great library.


If the server is down, urllib on Python 2.7 x86 Windows has no timeout and the program goes into a deadlock. So use urllib2:

import urllib2
import socket

def check_url( url, timeout=5 ):
    try:
        return urllib2.urlopen(url,timeout=timeout).getcode() == 200
    except urllib2.URLError as e:
        return False
    except socket.timeout:
        return False


print check_url("http://google.fr")  #True 
print check_url("http://notexist.kc") #False     


In my opinion, caisah's answer misses an important part of your question, namely dealing with the server being offline.

Still, using requests is my favorite option, albeit as such:

import requests

try:
    requests.get(url)
except requests.exceptions.ConnectionError:
    print(f"URL {url} not reachable")


If by up, you simply mean "the server is serving", then you could use cURL, and if you get a response then it's up.

I can't give you specific advice because I'm not a python programmer, however here is a link to pycurl http://pycurl.sourceforge.net/.
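If you'd rather drive cURL without the pycurl bindings, one option is to shell out to the curl binary with subprocess (a sketch; assumes curl is installed and on your PATH):

```python
import shutil
import subprocess

def curl_ok(url, timeout=5):
    # -s silences progress output, -I sends a HEAD request;
    # curl exits with 0 only if it got a response from the server.
    if shutil.which("curl") is None:
        raise RuntimeError("curl not found on PATH")
    result = subprocess.run(
        ["curl", "-sI", "--max-time", str(timeout), url],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        check=False,
    )
    return result.returncode == 0
```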


These functions can run an up test and a speed test for your web page:

from urllib.request import urlopen
from socket import socket
import time


def tcp_test(server_info):
    cpos = server_info.find(':')
    try:
        sock = socket()
        sock.connect((server_info[:cpos], int(server_info[cpos+1:])))
        sock.close()
        return True
    except Exception:
        return False


def http_test(server_info):
    try:
        # TODO: this data could later be used to find which sub-URLs are up or down
        start_time = time.time()
        data = urlopen(server_info).read()
        end_time = time.time()
        speed = end_time - start_time
        return {'status': 'up', 'speed': str(speed)}
    except Exception:
        return {'status': 'down', 'speed': str(-1)}


def server_test(test_type, server_info):
    if test_type.lower() == 'tcp':
        return tcp_test(server_info)
    elif test_type.lower() == 'http':
        return http_test(server_info)

Requests and httplib2 are great options:

# Using requests.
import requests
request = requests.get(value)
if request.status_code == 200:
    return True
return False

# Using httplib2.
import httplib2

try:
    http = httplib2.Http()
    response = http.request(value, 'HEAD')

    if int(response[0]['status']) == 200:
        return True
except:
    pass
return False

If using Ansible, you can use the fetch_url function:

from ansible.module_utils.basic import AnsibleModule
from ansible.module_utils.urls import fetch_url

module = AnsibleModule(
    dict(),
    supports_check_mode=True)

try:
    response, info = fetch_url(module, url)
    if info['status'] == 200:
        return True

except Exception:
    pass

return False


My 2 cents:

import urllib.request

def getResponseCode(url):
    conn = urllib.request.urlopen(url)
    return conn.getcode()

if getResponseCode(url) != 200:
    print('Wrong URL')
else:
    print('Good URL')
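One caveat: urlopen raises HTTPError for non-2xx responses, so the getResponseCode snippet above crashes on a 404 rather than reaching the 'Wrong URL' branch. A variant that reports the code either way (a sketch; the timeout is an assumption):

```python
import urllib.request
import urllib.error

def get_response_code(url):
    try:
        with urllib.request.urlopen(url, timeout=5) as conn:
            return conn.getcode()
    except urllib.error.HTTPError as e:
        # HTTPError carries the status code of the failed response
        return e.code
```

Other failures (DNS errors, refused connections) still raise URLError, which you may want to catch separately.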


Here's my solution using PycURL and validators:

import pycurl, validators


def url_exists(url):
    """
    Check if the given URL really exists
    :param url: str
    :return: bool
    """
    if validators.url(url):
        c = pycurl.Curl()
        c.setopt(pycurl.NOBODY, True)
        c.setopt(pycurl.FOLLOWLOCATION, False)
        c.setopt(pycurl.CONNECTTIMEOUT, 10)
        c.setopt(pycurl.TIMEOUT, 10)
        c.setopt(pycurl.COOKIEFILE, '')
        c.setopt(pycurl.URL, url)
        try:
            c.perform()
            response_code = c.getinfo(pycurl.RESPONSE_CODE)
            c.close()
            return True if response_code < 400 else False
        except pycurl.error as err:
            errno, errstr = err.args
            raise OSError('An error occurred: {}'.format(errstr))
    else:
        raise ValueError('"{}" is not a valid url'.format(url))


Sometimes some sites are blocked in certain countries (like Iran) and you can't access them directly from your IP. So I found a better solution in this link, which checks your domain with https://www.isitdownrightnow.com; that service requests your domain from several distributed servers around the world. So you can use this code:

import re
import requests

domain = 'your domain ex:a.com'
print(f"checking '{domain}' is up?")
isitdown_url = f'https://www.isitdownrightnow.com/check.php?domain={domain}'
r = requests.get(isitdown_url)
r_text = (r.text.lower().replace('</div>', ' '))
status = (re.compile(f'{domain} is (.*) (?:it is not|and reachable)').search(r_text).group(1).split(' ')[0])
if status == 'down':
    raise ValueError(f"'{domain}' is down all over the world! Please request later!")

2 Comments

I am wondering, is there any reason to hit the isitdownrightnow.com service instead of the actual website you are trying to check?
Yes. For example, I was using the Caltech electric-vehicle API for its charging-station sessions dataset in my thesis; it has about 20 gigabytes of data in about 1257 separate files, but while downloading the dataset ev.caltech.edu went down and I didn't know why my program failed. Also, some sites ban certain nations (like Iranians), so sometimes we would need a VPN to test; but this service has distributed servers in countries around the world, so no VPN is needed.
