I'm trying to build something that parses custom HTML, similar to what you can do in Vue.js or React. Those allow empty attributes (attributes without a value), but PHP's \DOMDocument gives me a warning for markup like this:

<div><div foo></div></div>

PHP Warning: DOMDocument::loadXML(): Specification mandates value for attribute foo in Entity, line: 11

Just do this to reproduce the problem:

$html = '<div><div foo></div></div>'; // the markup shown above
$document = new \DOMDocument();
$document->loadXML($html);

I've already read https://www.php.net/manual/en/domdocument.loadxml.php and https://www.php.net/manual/en/libxml.constants.php and tried LIBXML_NOWARNING, but the warning still appeared for whatever reason. Then I tried LIBXML_NOERROR, but this led to no output at all.

I'm intentionally not using $document->loadHTML($html); because with unknown tags you end up with this warning:

PHP Warning: DOMDocument::loadHTML(): Tag mytag invalid in Entity.

I know that I can suppress this warning, but I would prefer not to suppress warnings at all. There might be other warnings, and I don't consider it good coding style to suppress them: they shouldn't happen, because there can be side effects. I don't mind switching to loadHTML() if there is another way to prevent this warning.

So is there any way to deal with attributes that have no value defined at all in the markup using \DOMDocument?

  • Why don't you load HTML? $document->loadHTML($html). Commented Sep 21, 2019 at 11:49
  • Because this method gives me this warning, since it doesn't like unknown tags: PHP Warning: DOMDocument::loadHTML(): Tag mytag invalid in Entity. I don't mind changing it if there is a way to make it accept any tags. Commented Sep 21, 2019 at 11:55
  • Then suppress warnings. That's why the libxml options exist. When you parse something as XML that isn't XML, it errors. HTML is much looser. Commented Sep 21, 2019 at 11:59
  • Consider this library: simplehtmldom.sourceforge.io. It's open source. I use it for scraping HTML, but you can give it hardcoded HTML and it should work well. Give it a try, you will love it. You can even read its code, check how it parses HTML, and implement the idea on your side. Commented Sep 21, 2019 at 12:05
  • @MarkusZeller I mean generic possible side effects when ignoring warnings, nothing specific to libxml. I would simply prefer a solution that doesn't require any form of suppression. BlackXero, thanks, I'll check this lib out! What I'm trying to get done is this: github.com/Phauthentic/custom-html-parser. Clone it, go to the example folder and run php example.php. Commented Sep 21, 2019 at 12:18

1 Answer

PHP's standard built-in DOMDocument class does not support HTML5 notation.

I just gave the following library a quick try, and it imports and exports your snippet without warnings:

HTML5DOMDocument
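
If adding a dependency is not an option, one possible middle ground with the built-in DOM extension is to let libxml collect its errors in a buffer instead of raising PHP warnings, and then inspect them explicitly. The following is only a sketch under that assumption, not a confirmed fix for the project in the question: parseFragment() is a hypothetical helper name, and which error codes to tolerate (for example the one reported for unknown tags) is left to the caller. It relies on libxml_use_internal_errors(), libxml_get_errors() and libxml_clear_errors() from the libxml extension.

```php
<?php
// Sketch: parse markup with valueless attributes and custom tags using only
// the built-in DOM extension. libxml errors are collected for inspection
// rather than emitted as PHP warnings. parseFragment() is a hypothetical
// helper name, not part of any library.

function parseFragment(string $html): array
{
    // Route libxml errors into an internal buffer; remember the old setting.
    $previous = libxml_use_internal_errors(true);

    $document = new DOMDocument();
    // These flags ask libxml not to add <html>/<body> wrappers or a doctype
    // around the fragment.
    $document->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

    // Take the collected errors, empty the buffer, restore the old setting.
    $errors = libxml_get_errors();
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return [$document, $errors];
}

[$document, $errors] = parseFragment('<div><mytag foo></mytag></div>');

$tag = $document->getElementsByTagName('mytag')->item(0);
var_dump($tag->hasAttribute('foo')); // whether the valueless attribute was kept
var_dump($tag->getAttribute('foo')); // the value stored for it

foreach ($errors as $error) {
    // Each $error is a LibXMLError; decide here which codes are acceptable
    // and which should be treated as real failures.
    printf("libxml code %d: %s\n", $error->code, trim($error->message));
}
```

Nothing is discarded silently this way: every problem libxml reports ends up in $errors, so unexpected ones can still be logged or turned into exceptions while the expected "unknown tag" reports are tolerated.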
