You really shouldn't set about modifying a DOM using regex. There are DOM parsers to do this kind of thing. It's not even that hard:
$html = '<p><br></p><div align="justify"><b>Some Text</b></div>
<p>foobar</p>
<p></p>';//empty
$dom = new DOMDocument;
$dom->loadHTML($html);
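// getElementsByTagName() returns a live list, so copy it into an array first:
// removing nodes while iterating the list itself can make the loop skip elements.
$pars = iterator_to_array($dom->getElementsByTagName('p'));
foreach ($pars as $tag)
{
    // Compare against '' explicitly, so a paragraph containing only "0" is kept.
    if (trim($tag->textContent) === '')
    {
        $tag->parentNode->removeChild($tag);
    }
}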
That's all. You select all of the p tags, then check whether each one's trimmed text content is empty. If it is, you remove the node by selecting its parent and invoking the DOMNode::removeChild method.
The snippet above removes 2 of the 3 paragraph nodes; the one containing foobar is left as is. I think that's what you are trying to do.
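If you need this in more than one place, the same idea can be wrapped in a small function. This is only a sketch: removeEmptyParagraphs is a hypothetical name, not something DOMDocument provides. It takes a snapshot of the node list before removing anything, because the list returned by getElementsByTagName is live and removals during iteration can cause adjacent nodes to be skipped.

```php
<?php
// Hypothetical helper, not part of the DOM extension.
function removeEmptyParagraphs(DOMDocument $dom): int
{
    // Snapshot the live DOMNodeList so removals cannot shift the iteration.
    $pars = iterator_to_array($dom->getElementsByTagName('p'));
    $removed = 0;
    foreach ($pars as $tag) {
        // Strict comparison keeps paragraphs whose only text is "0".
        if (trim($tag->textContent) === '') {
            $tag->parentNode->removeChild($tag);
            $removed++;
        }
    }
    return $removed;
}

$dom = new DOMDocument;
// Two adjacent empty paragraphs, then two that should survive.
$dom->loadHTML('<p></p><p> </p><p>0</p><p>foobar</p>');
$removed = removeEmptyParagraphs($dom);
$left = $dom->getElementsByTagName('p')->length;
echo $removed, ' removed, ', $left, ' left', PHP_EOL;
```

Returning the count is just a convenience for checking what happened; drop it if you don't need it.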
To get the actual dom fragment, after removing the tags that needed to be removed, you can simply do this:
echo trim(
    substr(
        $dom->saveHTML($dom->documentElement), // omit the doctype
        12, -14 // 12 skips <html><body>, -14 drops </body></html>
    )
);
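If counting characters feels brittle, one alternative (a sketch, not the only way to do it) is to serialize each child of the body element on its own and concatenate the results. It assumes the parser wrapped your markup in html and body elements, which loadHTML does by default. The sample markup below is only a stand-in.

```php
<?php
$dom = new DOMDocument;
$dom->loadHTML('<div align="justify"><b>Some Text</b></div><p>foobar</p>');

// Serialize every child of <body> separately instead of cutting fixed offsets.
$body = $dom->getElementsByTagName('body')->item(0);
$fragment = '';
foreach ($body->childNodes as $child) {
    $fragment .= $dom->saveHTML($child);
}
echo trim($fragment), PHP_EOL;
```

This keeps working if the wrapper ever gains attributes or extra whitespace, at the cost of a few more lines.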
preg_replace($re, '', $str);
The # on either end of that regex is a delimiter. PHP Live Regex forces the delimiters to be /, which breaks the /?s in the pattern and makes the #s be interpreted as regular characters. As others have posted, this works fine in PHP itself.
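To make the delimiter point concrete, here is a small sketch. The pattern is a made-up stand-in (the actual $re isn't reproduced here), written once with # delimiters and once with / delimiters. With /, every literal slash inside the pattern has to be escaped.

```php
<?php
// Hypothetical input and pattern, for illustration only.
$str = '<p><br/></p><p>foobar</p>';

// With # as the delimiter, the / in "/?" is an ordinary character.
$withHash = preg_replace('#<p>\s*(?:<br\s*/?>)?\s*</p>#i', '', $str);

// With / as the delimiter, each literal / must be written as \/.
$withSlash = preg_replace('/<p>\s*(?:<br\s*\/?>)?\s*<\/p>/i', '', $str);

// Both calls remove the same empty paragraph.
var_dump($withHash === $withSlash);
```

So if a tool insists on / as the delimiter, strip the #s and escape the slashes before pasting the pattern in.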