
I want to parse an HTML table into a 2D array (rows and columns) in Python using only HTMLParser. I don't want to use BeautifulSoup or other non-standard libraries.

This is for a personal project, doing this for fun :P

Anyway, here's my code. It's giving me a really messed-up error. It says

1 Answer

I haven't checked exactly what you want to do, but you assign a string to self.txt and then try to use it as a list.

In the constructor, you initialize self.txt with an empty list:

def __init__(self):
    ...
    self.txt = []
    ...

and then in the handle_data method:

def handle_data(self, text):
    if (len(self.txt) > 0 ) :
        self.txt.append(text + " ") # <-- Here you consider self.txt is a list

    if (self.in_table == 1 and self.in_th == 0):
        self.txt = text.lstrip() # <-- Here you **assign a string** to self.txt
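
One way to avoid the mix-up is to keep self.txt a list for the whole lifetime of a cell and only join it when the cell closes. Below is a minimal sketch of that idea, not your original code: the class name TableParser, the rows and row attributes, and the parse_table helper are all made up for illustration. It assumes Python 3 (html.parser from the standard library) and a table whose td, th and tr tags are explicitly closed.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect table cells into a list of rows (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.rows = []    # finished rows, each a list of cell strings
        self.row = None   # row being built, or None outside <tr>
        self.txt = None   # list of text fragments for the current cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.row = []
        elif tag in ("td", "th") and self.row is not None:
            self.txt = []  # always a list, never reassigned to a string

    def handle_data(self, text):
        # Only collect text while inside a cell.
        if self.txt is not None:
            self.txt.append(text)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self.txt is not None:
            # Join the fragments once and collapse runs of whitespace.
            self.row.append(" ".join("".join(self.txt).split()))
            self.txt = None
        elif tag == "tr" and self.row is not None:
            self.rows.append(self.row)
            self.row = None
            self.txt = None  # drop any cell left unclosed in this row


def parse_table(html):
    """Return the table cells in html as a 2D list (rows of columns)."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    return parser.rows
```

Calling parse_table on a string of HTML returns a list of lists, one inner list per tr. Header cells (th) are treated like ordinary cells here; skip them in handle_starttag if you only want the data rows.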

1 Comment

Could you check what I did, though? I'm trying to get this done today. Basically, I'm trying to add the dehtml'ed data to a new list and then join the list elements to create one big blob of dehtml'ed text. That's why self.txt is a list.
