I'm trying to parse a web page with the Python HTMLParser. I want to get the content of a tag, but I'm not sure how to do it. This is the code I have so far:
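import urllib.request
from html.parser import HTMLParser

class MyHTMLParser(HTMLParser):
    def handle_data(self, data):
        # Called for each run of text found between tags
        print("Encountered some data:", data)

url = "website"  # placeholder for the real URL
page = urllib.request.urlopen(url).read()  # read() returns bytes
parser = MyHTMLParser()  # the strict argument was removed in Python 3.5
# Decode the bytes first; str(page) would feed the "b'...'" repr (UTF-8 is assumed here)
parser.feed(page.decode("utf-8"))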
If I understand correctly, the handle_data() method is called with the text between tags. How do I specify which tags to get the data from, and how do I collect that data instead of just printing it?
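
One possible approach, as a minimal sketch rather than a definitive answer: handle_data() is not told which tag it is inside, so the parser can track that itself by overriding handle_starttag() and handle_endtag() and only collecting text while the wanted tag is open. The names TagTextParser, extract_tag_text and sample below are hypothetical, chosen for illustration, and the sample HTML stands in for the downloaded page.

```python
from html.parser import HTMLParser


class TagTextParser(HTMLParser):
    """Collect the text found inside every occurrence of one tag (hypothetical helper)."""

    def __init__(self, target_tag):
        super().__init__()
        self.target_tag = target_tag.lower()  # HTMLParser reports tag names in lowercase
        self.depth = 0      # how many target tags are currently open
        self.chunks = []    # text pieces of the element being read
        self.results = []   # finished text, one entry per outermost element

    def handle_starttag(self, tag, attrs):
        if tag == self.target_tag:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == self.target_tag and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                # The outermost target element just closed: store its text
                self.results.append("".join(self.chunks))
                self.chunks = []

    def handle_data(self, data):
        # Keep text only while we are inside the target tag
        if self.depth > 0:
            self.chunks.append(data)


def extract_tag_text(html, tag):
    """Return a list with the text of each `tag` element in `html`."""
    parser = TagTextParser(tag)
    parser.feed(html)
    parser.close()
    return parser.results


# Stand-in for the decoded page contents
sample = "<html><body><h1>Title</h1><p>First <b>bold</b> text</p><p>Second</p></body></html>"
print(extract_tag_text(sample, "p"))  # ['First bold text', 'Second']
```

Storing the text on the parser instance (here in results) is what makes it available after feed() returns. Note that this sketch relies on closing tags being present; elements left unclosed in the page, such as a <p> without </p>, would need extra handling.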