Getting headers with Python requests library

Question

I am using the Python requests library to fetch the headers of HTML pages and read the encoding from them. For some links, however, requests fails to get the headers. In those cases I would like to fall back to the encoding "utf-8". How do I handle the error raised by requests.head?

Here is my code:

import requests

r = requests.head(link)  # how to handle the error in case this fails?
charset = r.encoding
if not charset:
    charset = "utf-8"

The error I get when requests fails to get the header:

 File "parsexml.py", line 78, in parsefile
  r = requests.head(link)
 File "/usr/lib/python2.7/dist-packages/requests/api.py", line 74, in head
   return request('head', url, **kwargs)
 File "/usr/lib/python2.7/dist-packages/requests/api.py", line 40, in request
   return s.request(method=method, url=url, **kwargs)
 File "/usr/lib/python2.7/dist-packages/requests/sessions.py", line 229, in request
   r.send(prefetch=prefetch)
 File "/usr/lib/python2.7/dist-packages/requests/models.py", line 605, in send
   raise ConnectionError(e)
 requests.exceptions.ConnectionError: HTTPConnectionPool(host='www.standardzilla.com', port=80): Max retries exceeded with url: /2008/08/01/diaries-of-a-freelancer-day-thirty-seven/

The server you are trying to connect to doesn't respond at all; I don't think this has anything to do with your HEAD request, really.
– Martijn Pieters, Commented Feb 18, 2014 at 10:59
Exception handling; catch the exception and move on. See the posted answer. But that's not really requests specific, let alone anything to do with testing for character sets. :-)
– Martijn Pieters, Commented Feb 18, 2014 at 11:02

Noel Evans · Accepted Answer · 2014-02-18 10:59:46Z

2

You should put your code in a try-except block that catches ConnectionError, like this:

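import requests

try:
    r = requests.head(link)  # raises ConnectionError if the server is unreachable
    charset = r.encoding
    if not charset:
        charset = "utf-8"
except requests.exceptions.ConnectionError:
    # The headers could not be fetched, so fall back to the default encoding.
    print('Unable to access ' + link)
    charset = "utf-8"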
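
The same idea can be wrapped in a small helper so that every failure ends in the "utf-8" fallback the question asks for. This is only a sketch under some assumptions: `get_charset` is a hypothetical name, the 10-second timeout is an arbitrary choice, and it catches `requests.exceptions.RequestException` (the base class of the exceptions requests raises itself) rather than only `ConnectionError`.

```python
import requests


def get_charset(link, default="utf-8", timeout=10):
    """Return the charset reported for `link`, or `default` on any failure."""
    try:
        # HEAD asks for the headers only; the timeout keeps an unresponsive
        # server from blocking the caller indefinitely.
        r = requests.head(link, timeout=timeout, allow_redirects=True)
    except requests.exceptions.RequestException:
        # Covers ConnectionError, Timeout, invalid URLs and similar errors.
        return default
    # r.encoding may be None, e.g. when there is no Content-Type header.
    return r.encoding or default
```

A call such as `charset = get_charset(link)` then replaces the whole try-except block in the calling code.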


10 Comments

Noel Evans Over a year ago

@Lanc Great. You can mark the answer as correct if that's the case

Lanc Over a year ago

Now I am getting this error; can you please help me with it? File "parsexml.py", line 79, in parsefile r = requests.head(link,timeout=100,allow_redirects=True) File "/usr/lib/python2.7/dist-packages/requests/api.py", line 74, in head return request('head', url, **kwargs) File "/usr/lib/python2.7/dist-packages/requests/api.py", line 40, in request return s.request(method=method, url=url, **kwargs) File "/usr/lib/python2.7/dist-packages/requests/sessions.py", line 229, in request r.send(prefetch=prefetch)

Lanc Over a year ago

File "/usr/lib/python2.7/dist-packages/requests/models.py", line 624, in send self._build_response(r) File "/usr/lib/python2.7/dist-packages/requests/models.py", line 301, in _build_response request.send() File "/usr/lib/python2.7/dist-packages/requests/models.py", line 468, in send url = self.full_url

Lanc Over a year ago

File "/usr/lib/python2.7/dist-packages/requests/models.py", line 411, in full_url url = requote_uri(url) File "/usr/lib/python2.7/dist-packages/requests/utils.py", line 448, in requote_uri return quote(unquote_unreserved(uri), safe="!#$%&'()*+,/:;=?@[]~") File "/usr/lib/python2.7/dist-packages/requests/utils.py", line 429, in unquote_unreserved c = chr(int(h, 16)) ValueError: invalid literal for int() with base 16: '&e'

Noel Evans Over a year ago

@Lanc It's hard to tell. It looks like you've added some more code.
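
The ValueError in the traceback above is raised where `unquote_unreserved` calls `int(h, 16)` on the two characters following a `%`. Here those characters were `&e`, which suggests the link contains a `%` that does not start a valid percent-escape. A minimal sketch of one possible workaround, assuming that is the cause: `escape_stray_percents` is a hypothetical helper, not part of requests, that rewrites such stray `%` signs as `%25` before the URL is handed to `requests.head`. Other versions of requests may treat such URLs differently, so this is aimed only at the behaviour shown in the traceback.

```python
import re

# A '%' that is NOT followed by two hexadecimal digits.
_STRAY_PERCENT = re.compile(r"%(?![0-9A-Fa-f]{2})")


def escape_stray_percents(url):
    """Replace each stray '%' with '%25', leaving valid escapes untouched."""
    return _STRAY_PERCENT.sub("%25", url)
```

Alternatively, adding ValueError to the except clause would let the code skip such links and fall back to "utf-8" instead of repairing them.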
