4

I'm trying to parse an HTML page with XPathDocument, but it throws an error because the HTML is not valid XML. Is there a way to do this?

1

2 Answers

7

You should use HtmlAgilityPack. It is still the best option!

3

Use something like Html Agility Pack, which can load your HTML into a DOM object that can then be traversed with, for example, XPath queries.

Unless your HTML is in fact XHTML, it is usually not well-formed XML: tags are often left unclosed or improperly nested, which is why an XML parser rejects it.
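
As a rough illustration of the approach, here is a minimal sketch that assumes the HtmlAgilityPack NuGet package is referenced. The sample markup and the `//a[@href]` query are made up for the example. Note that `SelectNodes` returns `null` rather than an empty collection when nothing matches, so an unchecked query can look like a silent failure.

```csharp
using System;
using HtmlAgilityPack; // assumes the "HtmlAgilityPack" NuGet package is installed

class Program
{
    static void Main()
    {
        // Hypothetical sample markup: the unclosed <p> and <br> tags would make
        // an XML parser such as XPathDocument throw, but are tolerated here.
        const string html =
            "<html><body><p>First<br><a href='/a'>A</a>" +
            "<p>Second <a href='/b'>B</a></body></html>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html); // returns void; doc.Load(path) reads from a file instead

        // XPath query against the parsed DOM.
        var links = doc.DocumentNode.SelectNodes("//a[@href]");

        // SelectNodes yields null (not an empty list) when there are no matches.
        if (links == null)
        {
            Console.WriteLine("No matching nodes.");
            return;
        }

        foreach (var link in links)
        {
            Console.WriteLine($"{link.InnerText} -> {link.GetAttributeValue("href", "")}");
        }
    }
}
```

If a query unexpectedly matches nothing, inspecting `doc.ParseErrors` and simplifying the XPath expression are reasonable first steps.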

2 Comments

I would like to upvote this answer, but HtmlAgilityPack does not work with the document I'm giving it: the LoadFile() method has no return value and does not throw an exception either. Querying the document does not return anything either, so I'm assuming the code has "silently failed" when this happens?
Hi @ConradB, have you tried the sample at htmlagilitypack.codeplex.com/wikipage?title=Examples? Load should not return anything, but it should let you loop over nodes and run selections.
