1

Folks,

The program below is meant to find the IP address shown on the page http://whatismyipaddress.com/:

import urllib2
import re

response = urllib2.urlopen('http://whatismyipaddress.com/')

p = response.readlines()
for line in p:
    ip = re.findall(r'(\d+.\d+.\d+.\d+)',line)
    print ip

But I cannot troubleshoot it, because it fails with the error below:

Traceback (most recent call last):
  File "Test.py", line 5, in <module>
  response = urllib2.urlopen('http://whatismyipaddress.com/')
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 154, in urlopen
  return opener.open(url, data, timeout)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 437, in open
  response = meth(req, response)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 550, in http_response
  'http', request, response, code, msg, hdrs)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 475, in error
  return self._call_chain(*args)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 409, in _call_chain
  result = func(*args)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 558, in http_error_default
  raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)

urllib2.HTTPError: HTTP Error 403: Forbidden

Does anyone know what change is required to remove the error and get the required output?

1
  • They are checking the "User-Agent" header Commented Aug 13, 2015 at 6:27

3 Answers

3

The HTTP status code 403 tells you that the server understood your request but refuses to fulfil it. In this case, I think it is rejecting the user agent of your query (the default used by urllib2).

You can change the user agent:

opener = urllib2.build_opener()
opener.addheaders = [('User-agent', 'Mozilla/5.0')]
response = opener.open('http://www.whatismyipaddress.com/')

Then your query will work.

But there is no guarantee that this will keep working. The site could decide to block automated queries.
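As a rough illustration of what those two lines do: build_opener() returns an opener object, and its addheaders attribute is a list of (name, value) tuples sent with every request made through it. The sketch below only inspects the headers and makes no network call. The helper name make_opener is made up for this example, and the import fallback assumes you may be running Python 3, where urllib2 became urllib.request.

```python
try:
    import urllib2                      # Python 2, as in the question
except ImportError:
    import urllib.request as urllib2    # assumption: Python 3, same calls

def make_opener(user_agent='Mozilla/5.0'):
    """Hypothetical helper: an opener that sends a custom User-agent."""
    opener = urllib2.build_opener()
    # addheaders is a list of (name, value) tuples added to every request
    opener.addheaders = [('User-agent', user_agent)]
    return opener

default_opener = urllib2.build_opener()
custom_opener = make_opener()

# The default header identifies the client as Python-urllib/<version>
print(default_opener.addheaders)
print(custom_opener.addheaders)
```

Calling custom_opener.open(url) would then send the request with the replaced header.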


4 Comments

Seems they are using a blacklist. 'wget' is blocked, but 'w' gets through :)
@chris It worked and produced output. Can you explain what exactly the 1st and 2nd lines of your code do?
@chris, p = response.readlines() for line in p: IP = re.finditer(r'(\d+.\d+.\d+.\d+)',line) print IP
@chris I added the above lines to the code you gave, but it still does not give the required output. Instead it prints <callable-iterator object at 0x102a92d50>
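On the <callable-iterator object ...> output mentioned above: re.finditer returns an iterator of match objects rather than a list, so printing it shows the iterator itself. A minimal sketch of one way to pull the matched text out, assuming the page contains the address as dotted-quad text. The helper find_ips and the sample lines are made up for illustration, and the dots are escaped so they match only a literal period.

```python
import re

# Escaped dots: an unescaped '.' matches any character, not just a period.
IP_PATTERN = re.compile(r'\d+\.\d+\.\d+\.\d+')

def find_ips(lines):
    """Hypothetical helper: collect every dotted-quad string found in lines."""
    found = []
    for line in lines:
        # finditer yields match objects; group(0) is the matched text
        for match in IP_PATTERN.finditer(line):
            found.append(match.group(0))
    return found

# Made-up sample lines standing in for response.readlines()
sample = ['<span>Your IP is 203.0.113.7</span>', '<p>no address here</p>']
print(find_ips(sample))
```

This pattern does not check that each number is in the 0 to 255 range, so it can also pick up version-like strings.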
0

Try this

>>> import urllib2
>>> import re
>>> site= 'http://whatismyipaddress.com/'
>>> hdr = {'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.64 Safari/537.11',
...        'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
...        'Accept-Charset': 'ISO-8859-1,utf-8;q=0.7,*;q=0.3',
...        'Accept-Encoding': 'none',
...        'Accept-Language': 'en-US,en;q=0.8',
...        'Connection': 'keep-alive'}
>>> req = urllib2.Request(site, headers=hdr)
>>> response = urllib2.urlopen(req)
>>> p = response.readlines()
>>> for line in p:
...     # escape the dots so they match a literal '.' rather than any character
...     ip = re.findall(r'(\d+\.\d+\.\d+\.\d+)',line)
...     print ip

urllib2-httperror-http-error-403-forbidden

0

You could try the requests package here instead of urllib2; it is much easier to use:

import requests
url = 'http://whereismyip.com'
header = {'User-Agent': 'curl/7.21.3'}
# pass the dict by keyword; the second positional argument of requests.get is params
r = requests.get(url, headers=header)

You can use curl's user agent string as the User-Agent.

2 Comments

I installed the requests module and tried the code below: import requests; r = requests.get('whatismyipaddress.com/'); print r.text. But this code gives no response.
@Maverick The URL must be valid and include the protocol, in this case http://. Try url = 'http://whatismyipaddress.com/' and then r = requests.get(url). If you need a custom header, pass it to the get method by keyword: headers = {'user-agent': 'Mozilla/5.0'}; r = requests.get(url, headers=headers)
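To illustrate the point about the missing protocol, here is a small sketch that checks whether a URL string carries a scheme before it is handed to an HTTP client. The helper has_scheme is made up for this example, and the import fallback assumes you may be on Python 3, where urlparse lives in urllib.parse.

```python
try:
    from urlparse import urlparse        # Python 2
except ImportError:
    from urllib.parse import urlparse    # assumption: Python 3

def has_scheme(url):
    """Hypothetical helper: True if the URL starts with http:// or https://."""
    return urlparse(url).scheme in ('http', 'https')

print(has_scheme('whatismyipaddress.com/'))
print(has_scheme('http://whatismyipaddress.com/'))
```

Without the scheme, the whole string is parsed as a path, which is why the client cannot tell which protocol to use.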
