How can I copy the source code of a website into a text file in Python 3?
EDIT: To clarify my issue, here's what I have:
import urllib.request

def extractHTML(url):
    f = open('temphtml.txt', 'w')
    page = urllib.request.urlopen(url)
    pagetext = page.read()
    f.write(pagetext)
    f.close()

extractHTML('http://www.google.com')
I get the following error for the f.write() function:
builtins.TypeError: must be str, not bytes
pagetext is NOT a string. It's actually bytes. So to convert it to a string, you need to use f.write(pagetext.decode('utf-8')), which decodes the bytes as UTF-8 and writes the resulting string to the file.

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa0 in position 8482: invalid start byte

I just literally copied down my answer without the str() and put f.write(pagetext.decode('utf-8')) in the place of f.write(pagetext). Any idea why this is not working for me?

If you are using Python 2, that might be why.
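
The UnicodeDecodeError above usually means the page is not actually UTF-8 (byte 0xa0 is a non-breaking space in Latin-1, for instance). A minimal sketch of two ways around it, assuming the server may or may not declare a charset in its Content-Type header. The names decode_body, extract_html and save_raw are made up for this example and are not from the original code:

```python
import urllib.request


def decode_body(raw, charset=None, fallback="utf-8"):
    """Decode response bytes with the declared charset (or a fallback).

    errors="replace" swaps undecodable bytes for U+FFFD instead of raising
    UnicodeDecodeError.
    """
    return raw.decode(charset or fallback, errors="replace")


def extract_html(url, path="temphtml.txt"):
    """Fetch url and save the page as UTF-8 text."""
    with urllib.request.urlopen(url) as page:
        raw = page.read()  # bytes, not str
        # Charset from the Content-Type header; None if the server sent none.
        charset = page.headers.get_content_charset()
    with open(path, "w", encoding="utf-8") as f:
        f.write(decode_body(raw, charset))


def save_raw(url, path="temphtml.txt"):
    """Alternative: skip decoding and write the bytes exactly as received."""
    with urllib.request.urlopen(url) as page, open(path, "wb") as f:
        f.write(page.read())
```

Calling extract_html('http://www.google.com') should then write the file without the TypeError. If you only need a byte-for-byte copy of the page, save_raw sidesteps the encoding question entirely, because opening the file with 'wb' accepts bytes.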