I'm looking for a C++ parser for HTML, but there seem to be only XML parsers for C++. Various sources allude to XML parsers being able to parse HTML, but I can't find any concrete information confirming that an XML parser is an acceptable way to parse HTML.

If an XML parser can parse HTML, why is that possible when they are different languages? As far as I know, HTML is not a subset of XML.

  • This article explains the relationships among HTML 4.0, XML 1.x, and XHTML: webdesign.about.com/cs/xhtmlxml/a/aa013100a.htm Commented Nov 19, 2015 at 1:51
  • The relationship is somewhat like the one between JavaScript objects and JSON... Commented Nov 19, 2015 at 1:52

1 Answer

Some HTML can be parsed with an XML parser; some HTML cannot.

SGML begat both XML and HTML. Unlike XML, SGML and HTML do not require a closing tag for every element (among other differences), so HTML cannot be parsed by an XML parser in the general case. XHTML, on the other hand, is by definition well-formed XML and can therefore be parsed by any XML parser.
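To make the closing-tag point concrete, here is a minimal sketch in C++ of the nesting check that XML well-formedness implies. The function `tagsAreBalanced` is a hypothetical helper written for this answer, not part of any library, and it is far from a real parser: it assumes the markup contains no comments, CDATA sections, doctype declarations, or `>` characters inside attribute values.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not from any library): returns true when every start
// tag in `markup` is closed by a matching end tag in properly nested order.
// This is only one of the rules an XML parser enforces.
bool tagsAreBalanced(const std::string& markup) {
    std::vector<std::string> open;  // stack of currently open element names
    std::size_t pos = 0;
    while ((pos = markup.find('<', pos)) != std::string::npos) {
        const std::size_t end = markup.find('>', pos);
        if (end == std::string::npos) return false;  // unterminated tag
        std::string tag = markup.substr(pos + 1, end - pos - 1);
        pos = end + 1;
        if (tag.empty()) return false;               // "<>" is not a tag
        if (tag.back() == '/') continue;             // self-closing, e.g. <br/>
        const bool closing = (tag.front() == '/');
        if (closing) tag.erase(0, 1);
        // The element name runs up to the first whitespace (attributes follow).
        const std::string name = tag.substr(0, tag.find_first_of(" \t\r\n"));
        if (closing) {
            // An end tag must match the most recently opened element.
            if (open.empty() || open.back() != name) return false;
            open.pop_back();
        } else {
            open.push_back(name);
        }
    }
    return open.empty();  // anything still open was never closed
}
```

With this sketch, `tagsAreBalanced("<p>Hello<br>world</p>")` returns false, because the HTML-style `<br>` is never closed, while the XHTML-style `tagsAreBalanced("<p>Hello<br/>world</p>")` returns true. A real XML parser would reject the first input for the same reason, which is why it only works reliably on XHTML.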
