For the following lines that use urllib:
import urllib.request

# some request object exists
response = urllib.request.urlopen(request)
html = response.read().decode("utf8")
What format of string does read() return? I've been trying to figure that out from Python's documentation, but I can't find it mentioned anywhere. Why is there a decode? Does decode convert an object to UTF-8 or from UTF-8? From what format to what format does it decode? The decode documentation doesn't say anything about that either. Is Python's documentation really that bad, or is there some standard convention I don't understand?
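To make the question concrete, here is a minimal sketch of how I understand the two types to relate, assuming Python 3, where read() on the response returns a bytes object. The payload is a made-up literal standing in for a real response body, so nothing here touches the network.

```python
# Hypothetical stand-in for response.read(): raw bytes as they arrive.
raw = "<p>café</p>".encode("utf-8")

# decode() goes FROM encoded bytes TO a str of Unicode characters.
html = raw.decode("utf-8")

# encode() is the reverse direction: from str back to bytes.
round_trip = html.encode("utf-8")

print(type(raw).__name__, type(html).__name__)  # bytes str
print(len(raw), len(html))                      # 12 11 ("é" is 2 bytes in UTF-8)
```

If that is right, the argument to decode names the encoding the bytes are already in, not a target format.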
I want to store that HTML in a UTF-8 file. Would I just do a regular write, or do I need to "encode" back into something and write that?
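This is roughly what I have in mind for the write, as a sketch only. The file name, the temporary directory, and the sample text are placeholders for illustration, and it assumes Python 3, where open() accepts an encoding argument in text mode.

```python
import os
import tempfile

# Hypothetical decoded page text, standing in for the real html variable.
html = "<p>café</p>"

# Hypothetical output location inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "page.html")

# Text mode with an explicit encoding: the str is encoded to UTF-8 on write.
with open(path, "w", encoding="utf-8") as f:
    f.write(html)

# Read the file back in binary mode to see the bytes that landed on disk.
with open(path, "rb") as f:
    on_disk = f.read()

print(on_disk == html.encode("utf-8"))  # True
```

If the bytes from read() are already UTF-8, I suppose they could also be written unchanged to a file opened in "wb" mode, skipping the decode and encode round trip.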
Note: I know urllib is deprecated, but I cannot switch to urllib2 right now.