from html.parser import HTMLParser

class MyHTMLParser(HTMLParser):

    b1 = False

    def handle_starttag(self, tag, attrs):
        if tag == "ul":
            self.b1 = True

    def handle_data(self, data):
        if self.b1:
            print(data)
            self.b1 = False

parser = MyHTMLParser()

parser.feed('<ul class="player-metadata floatleft"></ul><p>Gros caca</p><p>Zuul</p>')

I want to extract the data between <ul class="player-metadata floatleft"> and </ul> which is empty. However, even though I flagged the <ul> tag, the handle_data function prints the first data found after <ul class="player-metadata floatleft"></ul>:

"Gros caca"

Instead, I would like it to print "nothing", with len(data) returning 0.

Could you please help me? I am also not allowed to use BeautifulSoup.

1 Answer


This is pretty much a duplicate of this question.

The idea is to hold on to the start tag and the enclosed data as the parser goes along, then use them to decide what to do when the parser handles the end tag, like so:

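from html.parser import HTMLParser

class MyHTMLParser(HTMLParser):
    _data = ''
    _starttag = ''

    def handle_starttag(self, tag, attrs):
        # Remember the tag and reset the buffer so text from an
        # earlier element is not mistaken for this element's content.
        self._starttag = tag
        self._data = ''

    def handle_data(self, data):
        # Only called when there is text; never called for an empty element.
        self._data = data

    def handle_endtag(self, tag):
        if self._starttag == 'ul' and self._data == '':
            print('nothing')
        elif (...):  # placeholder: other cases go here
            (...)
        else:
            print(self._data)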

For the input in the question, the empty <ul> element leaves self._data as an empty string, so len(self._data) is 0.
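
For completeness, here is a minimal runnable sketch of the same idea, not the only way to do it. The class name UlTextParser and the attributes depth, buffer and results are my own hypothetical names, not part of html.parser. It collects the text of each <ul> into a list instead of printing it straight away, and it accumulates the chunks because handle_data can be called more than once for a single element. The original code printed "Gros caca" because handle_data is never called for an empty element, so the flag set by <ul> was still True when the first <p> text arrived.

```python
from html.parser import HTMLParser


class UlTextParser(HTMLParser):
    """Collect the text found inside each top-level <ul> element."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # number of <ul> elements currently open
        self.buffer = []    # text chunks seen inside the current <ul>
        self.results = []   # one string per top-level <ul>

    def handle_starttag(self, tag, attrs):
        if tag == "ul":
            if self.depth == 0:
                self.buffer = []  # start fresh for a new top-level <ul>
            self.depth += 1

    def handle_data(self, data):
        # Only keep text that appears while a <ul> is open.
        if self.depth > 0:
            self.buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "ul" and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                # An empty <ul></ul> yields an empty string here.
                self.results.append("".join(self.buffer))


parser = UlTextParser()
parser.feed('<ul class="player-metadata floatleft"></ul><p>Gros caca</p><p>Zuul</p>')
parser.close()

for text in parser.results:
    print(text if text else "nothing")
```

Because the decision is made in handle_endtag, text that follows the closing </ul> can no longer be attributed to the list, and len(parser.results[0]) is 0 for the empty element.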


3 Comments

Thank you very much. I wish I could upvote your answer! Also I believe you meant print(self._data) for the last line.
Yeah, but you get the gist. ;) Glad I could help.
I believe you can accept the answer even with a low score, which will remove the unanswered tag from the question.
