3

I need to retrieve some data from a web page. After analysing the page's HTML, I found that the data I need is embedded in a table with a unique table id. I don't know whether a unique id is something HTML requires, but either way it should make parsing easier.

The data in the table is arranged as below (various attributes and tags have been omitted to make the data structure clear):

<table .... id = "tablename" .... >
    <tr>
         <td .... >field1</td>
             ....
         <td .... >fieldn</td>
    </tr>
         #several "trs" here
    <tr>
         <td .... >field1</td>
             ....
         <td .... >fieldn</td>
    </tr>
</table>

So my question is: how can I use one of Perl's HTML parsing modules to extract the fields from each row of this table?

Thanks in advance.

3 Answers

12

HTML::TableExtract sounds exactly like what you are looking for.
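For illustration, here is a minimal sketch of how that could look, assuming the page has already been fetched into a string. The sample markup in `$html`, the cell values, and the `@records` variable are made up for the example; only the `tablename` id comes from the question. The `attribs` option tells HTML::TableExtract to match only tables carrying the given attribute values.

```perl
use strict;
use warnings;
use HTML::TableExtract;

# Stand-in for the fetched page (hypothetical content); the real markup will differ.
my $html = <<'HTML';
<html><body>
<table id="other"><tr><td>ignore</td></tr></table>
<table border="1" id="tablename">
  <tr><td>a1</td><td>a2</td><td>a3</td></tr>
  <tr><td>b1</td><td>b2</td><td>b3</td></tr>
</table>
</body></html>
HTML

# Match only the table whose id attribute is "tablename".
my $te = HTML::TableExtract->new( attribs => { id => 'tablename' } );
$te->parse($html);

my @records;
for my $table ( $te->tables ) {
    for my $row ( $table->rows ) {
        # Each row is an array ref of cell text; empty cells may be undef.
        push @records, [ map { defined $_ ? $_ : '' } @$row ];
    }
}

# Print one tab-separated line per table row.
print join( "\t", @$_ ), "\n" for @records;
```

If the page lives in a local file rather than a string, `$te->parse_file($path)` should work in place of `parse`, since the module is built on HTML::Parser.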


2

Use HTML::Table.


-1

Look at Ken MacFarlane's "Parsing HTML with HTML::Parser" in The Perl Journal. I'm not sure if that's the parser you're referring to, but it looks like it can do what you want, or at least point you in the right direction.
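As a rough idea of what the lower-level approach involves, here is a sketch using HTML::Parser's version 3 handler API directly. It is not taken from the article; the sample markup, the state variables (`$in_table`, `$in_cell`), and the `@rows` result are assumptions for illustration, and it ignores nested tables and `<th>` cells.

```perl
use strict;
use warnings;
use HTML::Parser;

# Stand-in for the fetched page (hypothetical content).
my $html = <<'HTML';
<table id="other"><tr><td>ignore</td></tr></table>
<table id="tablename">
  <tr><td>a1</td><td>a2</td></tr>
  <tr><td>b1</td><td>b2</td></tr>
</table>
HTML

my ( $in_table, $in_cell ) = ( 0, 0 );
my ( @rows, @row );
my $cell = '';

my $parser = HTML::Parser->new(
    api_version => 3,
    # Start tags: track when we enter the target table, a row, or a cell.
    start_h => [ sub {
        my ( $tag, $attr ) = @_;
        if ( $tag eq 'table' ) {
            $in_table = 1 if ( $attr->{id} // '' ) eq 'tablename';
        }
        elsif ( $in_table && $tag eq 'tr' ) { @row = () }
        elsif ( $in_table && $tag eq 'td' ) { $in_cell = 1; $cell = '' }
    }, 'tagname, attr' ],
    # Text: accumulate decoded text only while inside a cell.
    text_h => [ sub { $cell .= $_[0] if $in_cell }, 'dtext' ],
    # End tags: close the cell, the row, or the table.
    end_h => [ sub {
        my ($tag) = @_;
        return unless $in_table;
        if    ( $tag eq 'td' )    { $in_cell = 0; push @row, $cell }
        elsif ( $tag eq 'tr' )    { push @rows, [@row] }
        elsif ( $tag eq 'table' ) { $in_table = 0 }
    }, 'tagname' ],
);

$parser->parse($html);
$parser->eof;

# Print one tab-separated line per table row.
print join( "\t", @$_ ), "\n" for @rows;
```

The amount of state tracking needed even for this simple case is a fair argument for using a module built on top of HTML::Parser instead.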

1 Comment

You shouldn't have to reach down into HTML::Parser for this. There are many tools built on top of it that should be able to handle the job.
