I'm having trouble using Python's urllib.

Here is the code I have tried:

import urllib
s = urllib.urlopen("https://www.mci.ir/web/guest/login")

And here is the error I am seeing:

Traceback (most recent call last):
  File "<pyshell#3>", line 1, in <module>
    s = urllib.urlopen("https://www.mci.ir/web/guest/login")
  File "C:\Python27\lib\urllib.py", line 86, in urlopen
    return opener.open(url)
  File "C:\Python27\lib\urllib.py", line 207, in open
    return getattr(self, name)(url)
  File "C:\Python27\lib\urllib.py", line 450, in open_https
    return self.http_error(url, fp, errcode, errmsg, headers)
  File "C:\Python27\lib\urllib.py", line 371, in http_error
    result = method(url, fp, errcode, errmsg, headers)
  File "C:\Python27\lib\urllib.py", line 634, in http_error_302
    data)
  File "C:\Python27\lib\urllib.py", line 660, in redirect_internal
    return self.open(newurl)
  File "C:\Python27\lib\urllib.py", line 207, in open
    return getattr(self, name)(url)
  File "C:\Python27\lib\urllib.py", line 436, in open_https
    h.endheaders(data)
  File "C:\Python27\lib\httplib.py", line 954, in endheaders
    self._send_output(message_body)
  File "C:\Python27\lib\httplib.py", line 814, in _send_output
    self.send(msg)
  File "C:\Python27\lib\httplib.py", line 776, in send
    self.connect()
  File "C:\Python27\lib\httplib.py", line 1161, in connect
    self.sock = ssl.wrap_socket(sock, self.key_file, self.cert_file)
  File "C:\Python27\lib\ssl.py", line 381, in wrap_socket
    ciphers=ciphers)
  File "C:\Python27\lib\ssl.py", line 143, in __init__
    self.do_handshake()
  File "C:\Python27\lib\ssl.py", line 305, in do_handshake
    self._sslobj.do_handshake()
IOError: [Errno socket error] [Errno 8] _ssl.c:504: EOF occurred in violation of protocol
  • Somehow the actual error message is missing from the traceback. Commented Jun 18, 2015 at 12:58
  • I think there is a problem with the site somehow. Commented Jun 18, 2015 at 12:58
  • Yeah I'm not sure where the rest of the Traceback is :) Commented Jun 18, 2015 at 13:00
  • Could this be the problem? From the docs: "Deprecated since version 2.6: The urlopen() function has been removed in Python 3 in favor of urllib2.urlopen()." Commented Jun 18, 2015 at 13:19

2 Answers

The remote server appears to reject the default User-Agent header sent by urllib.urlopen() and urllib2.urlopen() (Python 2) and by urllib.request.urlopen() (Python 3). It closes the connection, which surfaces on the client as the SSL "EOF occurred in violation of protocol" error.
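
To see which User-Agent the standard library sends by default, you can inspect a freshly built opener without making any network request. This is a minimal sketch that assumes Python 3 (where urllib2 became urllib.request); the helper name default_user_agent is made up for illustration.

```python
import urllib.request


def default_user_agent():
    """Return the User-Agent that urllib.request sends when none is set."""
    opener = urllib.request.build_opener()
    # addheaders is a list of (name, value) tuples that the opener adds to
    # each request that does not already set that header itself.
    for name, value in opener.addheaders:
        if name.lower() == "user-agent":
            return value
    return None


if __name__ == "__main__":
    print(default_user_agent())
```

The value has the form Python-urllib/X.Y, where X.Y is the running interpreter's version.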

Issuing a request with the requests package does work:

>>> import requests
>>> r = requests.get('https://www.mci.ir/web/guest/login')
>>> r
<Response [200]>

Setting the User-Agent to the one sent by urllib/urllib2 reproduces the failure:

>>> r = requests.get('https://www.mci.ir/web/guest/login', headers={'User-Agent': 'Python-urllib/2.7'})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/mhawke/virtualenvs/py2/lib/python2.7/site-packages/requests/api.py", line 69, in get
    return request('get', url, params=params, **kwargs)
  File "/home/mhawke/virtualenvs/py2/lib/python2.7/site-packages/requests/api.py", line 50, in request
    response = session.request(method=method, url=url, **kwargs)
  File "/home/mhawke/virtualenvs/py2/lib/python2.7/site-packages/requests/sessions.py", line 465, in request
    resp = self.send(prep, **send_kwargs)
  File "/home/mhawke/virtualenvs/py2/lib/python2.7/site-packages/requests/sessions.py", line 594, in send
    history = [resp for resp in gen] if allow_redirects else []
  File "/home/mhawke/virtualenvs/py2/lib/python2.7/site-packages/requests/sessions.py", line 196, in resolve_redirects
    **adapter_kwargs
  File "/home/mhawke/virtualenvs/py2/lib/python2.7/site-packages/requests/sessions.py", line 573, in send
    r = adapter.send(request, **kwargs)
  File "/home/mhawke/virtualenvs/py2/lib/python2.7/site-packages/requests/adapters.py", line 431, in send
    raise SSLError(e, request=request)
requests.exceptions.SSLError: EOF occurred in violation of protocol (_ssl.c:590)

My advice is to use requests, as it is a much better library. However, if you must use the standard library, use urllib2 and set a User-Agent header that the remote server accepts:

import urllib2

# Send a browser-like User-Agent instead of the default Python-urllib one.
req = urllib2.Request('https://www.mci.ir/web/guest/login')
req.add_header('User-Agent', 'Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:38.0) Gecko/20100101 Firefox/38.0')
r = urllib2.urlopen(req)
html = r.read()
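
On Python 3 the same approach would look roughly like the sketch below, since urllib2 was merged into urllib.request. The helper names build_request and fetch_html are hypothetical, and the browser User-Agent string is simply the one used in the urllib2 snippet; whether this server still accepts it is an assumption.

```python
import urllib.request

# Browser-like User-Agent copied from the urllib2 example; any value the
# server accepts should work (assumption: this one still does).
BROWSER_UA = ("Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:38.0) "
              "Gecko/20100101 Firefox/38.0")


def build_request(url, user_agent=BROWSER_UA):
    """Create a Request that carries an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_html(url, user_agent=BROWSER_UA, timeout=10):
    """Fetch the URL and return the raw response body as bytes."""
    request = build_request(url, user_agent)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.read()


if __name__ == "__main__":
    html = fetch_html("https://www.mci.ir/web/guest/login")
    print(len(html))
```

Building the request separately from sending it makes it easy to check the headers before anything goes over the network.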

One other thing worth noting: once the remote server receives a request that it doesn't like (e.g. one with an unaccepted user agent), it appears to block requests from the originating IP address until some period has passed with no requests (or possibly for a random period).
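
If the server really does block an IP for a while after a rejected request, retrying immediately would probably only prolong the block. One way to handle that is to wait between attempts, as in this rough Python 3 sketch. The function name fetch_with_backoff, the delay values, and the choice of exception to retry on are all assumptions for illustration, not something measured against this server.

```python
import time


def fetch_with_backoff(fetch, retries=3, first_delay=30.0, sleep=time.sleep):
    """Call fetch() until it succeeds, doubling the wait after each failure.

    fetch is any zero-argument callable that performs the request.
    OSError is caught because urllib.error.URLError and ssl.SSLError
    both derive from it in Python 3.
    """
    delay = first_delay
    for attempt in range(retries):
        try:
            return fetch()
        except OSError:
            if attempt == retries - 1:
                raise  # out of attempts: surface the last error
            sleep(delay)  # stay quiet so the server can lift the block
            delay *= 2
```

The sleep parameter exists only so the waiting can be replaced in tests; in normal use the default time.sleep is fine.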

I was facing the same problem and fixed it by switching to Python 3.

  File "/usr/lib/python2.7/ssl.py", line 830, in do_handshake
    self._sslobj.do_handshake()
IOError: [Errno socket error] EOF occurred in violation of protocol (_ssl.c:590)
