C#: parse HTML using XPathDocument

Question

I'm trying to parse an HTML page with XPathDocument, but it throws an error because the HTML is not valid XML. Is there a way to do this or not?

check here: stackoverflow.com/questions/56107/…

– pinichi, Commented Oct 15, 2010 at 7:26

carla · Accepted Answer · 2017-11-26 04:52:47Z

7

You should use HtmlAgilityPack. It is still the best option!

edited Nov 26, 2017 at 4:52

carla

2,147 reputation, 1 gold badge, 34 silver badges, 48 bronze badges

answered Oct 15, 2010 at 7:25

pinichi

2,215 reputation, 16 silver badges, 17 bronze badges


Mikael Svenson · Accepted Answer · 2010-10-15 07:25:50Z

3

Use something like Html Agility Pack, which can load your HTML into a DOM object that you can then traverse with, for example, XPath queries.

Unless your HTML is in fact XHTML, it is usually not a valid XML structure with correctly matched opening and closing tags, which is why XPathDocument rejects it.
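
To illustrate the approach, here is a minimal sketch, assuming the HtmlAgilityPack NuGet package is referenced. The sample markup and the `//a[@href]` query are made up for the example; the HTML deliberately contains unclosed tags of the kind an XML parser would reject.

```csharp
using System;
using HtmlAgilityPack; // assumption: the "HtmlAgilityPack" NuGet package is installed

class Program
{
    static void Main()
    {
        // Illustrative, deliberately sloppy HTML: the <p> and <br> tags are never closed.
        string html =
            "<html><body><p>First<br><a href='/one'>One</a>" +
            "<p>Second <a href='/two'>Two</a></body></html>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html); // returns void; the parsed tree is available via doc.DocumentNode

        // SelectNodes returns null, not an empty collection, when nothing matches.
        HtmlNodeCollection links = doc.DocumentNode.SelectNodes("//a[@href]");
        if (links == null)
        {
            Console.WriteLine("No links found.");
            return;
        }

        foreach (HtmlNode link in links)
        {
            // GetAttributeValue falls back to the supplied default if the attribute is missing.
            Console.WriteLine(link.GetAttributeValue("href", "") + " -> " + link.InnerText);
        }
    }
}
```

If you specifically need an XPathNavigator, as you would get from XPathDocument, HtmlDocument should also expose CreateNavigator() in builds of the library that include XPath support.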

answered Oct 15, 2010 at 7:25

Mikael Svenson

39.8k reputation, 8 gold badges, 76 silver badges, 80 bronze badges

2 Comments

user337598 Over a year ago

I would like to upvote this answer, but HtmlAgilityPack does not work with the document I'm giving it. The LoadFile() method has no return value and does not throw an exception either. Querying the document also returns nothing, so I'm assuming the code has "silently failed" when this happens?

Mikael Svenson Over a year ago

Hi @ConradB, have you tried the sample at htmlagilitypack.codeplex.com/wikipage?title=Examples? Load does not return anything, but afterwards you should be able to loop over nodes by running selections on the document.
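
One way to investigate a load that appears to fail silently is sketched below, again assuming the HtmlAgilityPack NuGet package. The file name `page.html` and the `//p` query are placeholders, not taken from the comments above. The idea is to inspect the parse errors the library collected and to check for a null result from SelectNodes, which is what it returns when a query matches nothing.

```csharp
using System;
using System.IO;
using HtmlAgilityPack; // assumption: the "HtmlAgilityPack" NuGet package is installed

class Program
{
    static void Main(string[] args)
    {
        // Hypothetical input path; pass a real file as the first argument.
        string path = args.Length > 0 ? args[0] : "page.html";
        if (!File.Exists(path))
        {
            Console.WriteLine("File not found: " + path);
            return;
        }

        var doc = new HtmlDocument();
        doc.Load(path); // returns void; results are exposed through the document object

        // Malformed markup is recorded here rather than raised as an exception.
        foreach (HtmlParseError error in doc.ParseErrors)
        {
            Console.WriteLine("Line " + error.Line + ": " + error.Reason);
        }

        // A quick check that something was actually parsed.
        Console.WriteLine("Top-level nodes: " + doc.DocumentNode.ChildNodes.Count);

        // Placeholder query; a null result means no match, not a failed load.
        HtmlNodeCollection paragraphs = doc.DocumentNode.SelectNodes("//p");
        Console.WriteLine(paragraphs == null
            ? "Query matched nothing."
            : "Query matched " + paragraphs.Count + " node(s).");
    }
}
```

If the node count is non-zero but the query still matches nothing, the XPath expression is the more likely culprit than the load itself.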
