Python - Page Source when calling a URL

Question

I'm looking for some really simple code to fetch a URL and print the HTML source. This is what I am using; it comes from an online course I'm following:

def get_page(url):
    try:
        import urllib
        return urllib.open(url).read()
    except:
        return ""

print(get_page('https://www.yahoo.com/'))

This prints nothing, but it also raises no errors. Alternatively, from browsing these forums, I've tried:

from urllib.request import urlopen

print (urlopen('https://xkcd.com/353/'))

When I do this, I get:

<http.client.HTTPResponse object at 0x000001E947559710>

Check this out print source

– Arun, Commented Mar 11, 2017 at 8:37

Smart Manoj · Accepted Answer · 2017-03-11 08:41:19Z

0

from urllib.request import urlopen
print(urlopen('https://xkcd.com/353/').read().decode())
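
For readers wondering why the version in the question printed an object instead of HTML, the one-liner can be taken apart step by step. This is only a sketch, not part of the original answer: SAMPLE_URL is a made-up `data:` URL used so the snippet runs without network access, and a real `https://` address would be handled the same way.

```python
from urllib.request import urlopen

# Assumption: a data: URL stands in for a real page so this runs offline.
SAMPLE_URL = "data:text/html;charset=utf-8,%3Cp%3Ehello%3C%2Fp%3E"

# urlopen() returns a response object, not the page itself.
with urlopen(SAMPLE_URL) as response:
    raw = response.read()  # read() returns the body as bytes

text = raw.decode()  # decode() turns the bytes into a str (UTF-8 by default)

print(type(raw).__name__)
print(text)
```

Printing the response object directly, as in the question, only shows its repr; the HTML appears once the body is read and decoded.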


Gerry Hernandez · Accepted Answer · 2017-03-11 08:45:57Z

0

This assumes the page is encoded as UTF-8:

from urllib import request

def get_src_code(url):
    # Pass the url argument itself, not the literal string "url"
    r = request.urlopen(url)
    byte_code = r.read()
    # decode() uses UTF-8 by default
    src_code = byte_code.decode()
    return src_code
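
If the UTF-8 assumption does not hold for a given page, one possible variation is to use the charset declared in the response's Content-Type header and fall back to UTF-8 only when none is given. The sketch below is an illustration rather than part of the original answer; the name get_page_text and the fallback parameter are hypothetical.

```python
from urllib.request import urlopen

def get_page_text(url, fallback="utf-8"):
    """Hypothetical helper: return the body of url as text."""
    with urlopen(url) as response:
        # get_content_charset() returns None when the Content-Type
        # header does not declare a charset, so fall back in that case.
        charset = response.headers.get_content_charset() or fallback
        # errors="replace" avoids a UnicodeDecodeError on stray bytes.
        return response.read().decode(charset, errors="replace")
```

Usage would look like print(get_page_text('https://xkcd.com/353/')). Note that a page can also declare its encoding in a meta tag inside the HTML, which this sketch does not inspect.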


salmanwahed · Accepted Answer · 2017-04-05 13:00:04Z

0

Your function prints an empty string because that is what the except block returns. The code raises an error because the urllib module has no attribute called open. You can't see the error because the bare try-except block returns an empty string for every exception. You can surface the error like this:

def get_page(url):
    try:
        import urllib
        return urllib.open(url).read()
    except Exception as e:
        return e.args[0]

To get your expected output, do it like this:

def get_page(url):
    try:
        from urllib.request import urlopen
        return urlopen(url).read().decode('utf-8')
    except Exception as e:
        return e.args[0]
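
The point about the bare except hiding the failure can be checked without any network access, since the AttributeError is raised before a request is ever made. The following is a minimal sketch for Python 3; the two function names are made up for the comparison.

```python
import urllib

def get_page_silent(url):
    # Same shape as the code in the question: every error is swallowed.
    try:
        return urllib.open(url).read()  # urllib has no open(), so this raises
    except:
        return ""

def get_page_reporting(url):
    # Same call, but the exception type and message are returned instead.
    try:
        return urllib.open(url).read()
    except Exception as e:
        return f"{type(e).__name__}: {e}"

print(repr(get_page_silent("https://www.yahoo.com/")))
print(get_page_reporting("https://www.yahoo.com/"))
```

The first call yields an empty string, while the second reports an AttributeError, which is the error the original code was hiding.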

edited Apr 5, 2017 at 13:00

answered Mar 11, 2017 at 8:54
