
If I use saveHTML() without the optional DOMNode parameter, it works as expected:

$html = '<html><body><div>123</div><div>456</div></body></html>';
$dom = new DOMDocument;
$dom->preserveWhiteSpace = true;
$dom->formatOutput = false;
$dom->loadHTML($html, LIBXML_HTML_NODEFDTD);
echo $dom->saveHTML();

Output:

<html><body><div>123</div><div>456</div></body></html>

But when I add a DOMNode parameter to output a subset of the document it seems to ignore the formatOutput property and adds a bunch of unwanted whitespace:

$body = $dom->getElementsByTagName('body')->item(0);
echo $dom->saveHTML($body);

Output:

<body>
<div>123</div>
<div>456</div>
</body>

What gives? Is this a bug? Is there a workaround?

3 Answers


Is this a bug?

Yes, it's a bug, and it has been reported.

Is there a workaround?

Stick with Nigel's solution (saveXML() instead of saveHTML()) for now.

Did they fix it?

Yes, the bug is fixed as of PHP 7.3.0 alpha3.
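Given the fix version mentioned above, one option is to pick the serializer at runtime. The following is only a sketch: serializeNode() is a hypothetical helper, not part of the DOM extension, and it assumes that every build reporting PHP_VERSION_ID >= 70300 contains the fix (pre-release builds before alpha3 report the same ID and would not). On older versions it falls back to saveXML(), which is only appropriate when the markup is also valid XML.

```php
<?php
// Hypothetical helper: choose a serializer based on the running PHP version.
function serializeNode(DOMDocument $dom, DOMNode $node): string
{
    // Assumption: PHP_VERSION_ID >= 70300 means saveHTML($node) respects formatOutput.
    if (PHP_VERSION_ID >= 70300) {
        return $dom->saveHTML($node);
    }

    // Older versions: fall back to saveXML(), which emits XML-style markup.
    return $dom->saveXML($node);
}

$html = '<html><body><div>123</div><div>456</div></body></html>';
$dom = new DOMDocument;
$dom->preserveWhiteSpace = true;
$dom->formatOutput = false;
$dom->loadHTML($html, LIBXML_HTML_NODEFDTD);

$body = $dom->getElementsByTagName('body')->item(0);
$out = serializeNode($dom, $body);
echo $out;
```

The version check keeps saveHTML() semantics where they work and confines the XML-style output to the versions that need it.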


If you know your document is going to be valid XML as well, you can use saveXML() instead...

$html = '<html><body><div>123</div><div>456</div></body></html>';
$dom = new DOMDocument;
$dom->preserveWhiteSpace = true;
$dom->formatOutput = false;
$dom->loadHTML($html, LIBXML_HTML_NODEFDTD);
$body = $dom->getElementsByTagName('body')->item(0);
echo $dom->saveXML($body);

which gives...

<body><div>123</div><div>456</div></body>

1 Comment

Looking deeper into this, saveXML() respects the formatOutput property with or without the argument, while saveHTML() only works without argument. Seems like a bug to me. Thanks for the workaround!

Well, it's a pretty ugly workaround, but it gets the job done:

$html = '<html><body><div>123</div><div>456</div></body></html>';
$dom = new DOMDocument;
$dom->preserveWhiteSpace = true;
$dom->formatOutput = false;
$dom->loadHTML($html, LIBXML_HTML_NODEFDTD);
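// Serialize only <body>, strip the line breaks saveHTML() added, then reload the result.
$body = $dom->getElementsByTagName('body')->item(0);
$compact = str_replace("\n", "", $dom->saveHTML($body));
$dom->loadHTML($compact, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);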

echo $dom->saveHTML();


Since saveHTML() returns a string, pass the node to it, strip the line breaks from the result, and then pass that string back to loadHTML().
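If only the serialized string is needed, the reload step can be skipped. Here is a minimal sketch of that idea; saveHtmlCompact() is a hypothetical helper name, not something the DOM extension provides. Note the caveat that it removes every "\n", including line breaks inside text content such as a pre element.

```php
<?php
// Hypothetical helper: serialize one node as HTML and strip the line breaks
// that saveHTML($node) may insert on affected PHP versions.
// Caveat: this also removes line breaks that are part of the text content.
function saveHtmlCompact(DOMDocument $dom, DOMNode $node): string
{
    return str_replace("\n", '', $dom->saveHTML($node));
}

$html = '<html><body><div>123</div><div>456</div></body></html>';
$dom = new DOMDocument;
$dom->loadHTML($html, LIBXML_HTML_NODEFDTD);

$body = $dom->getElementsByTagName('body')->item(0);
$compact = saveHtmlCompact($dom, $body);
echo $compact; // <body><div>123</div><div>456</div></body>
```

Matching on "\n" rather than PHP_EOL keeps the replacement independent of the platform's own line ending.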

4 Comments

Hehe, that is ugly, but thanks. At the moment I'm doing this: str_replace(PHP_EOL, '', $dom->saveHTML($dom->getElementsByTagName('body')->item(0))); which I think yields the same result with a bit less effort. I'm still looking for a better method but appreciate your suggestion.
Okay. In the future, if you already have a solution, but just don't like it, you should include that in your question.
Agreed, Patrick. I actually posted the question before I came up with that workaround and was in the process of editing my question to make that clear when you posted this. That said, I've upvoted your answer; I think it's useful, and you probably wouldn't have written it if I'd posted my workaround to begin with. All's well that ends well.
Yup. I like Nigel's better anyway :)
