I want to remove HTML comments using preg_replace_callback, but I also want to keep comments that are inside a <script> element, e.g.:

b/w <script> <!-- Keep Me--></script >

My code:

$str = '
    <script>
    <!-- keep1 -->
    keep </script> <!-- del me1 --> <body> <script> <!-- Keep2 --></script> <!-- Del me2 --> <script><!-- Keep3 --></script> </body><!-- del me 3 -->';


$str =    preg_replace_callback('/(<([^script]\/?)(\w|\d|\n|\r|\v)>)*((.*(<?!--.*-->)|(\w|\d|\n|\r|\v)*)+)(<\/?[^script](\w|\d)*>)/s',
    function($matches) {
        print_r($matches);
        return preg_replace('/<!--.*?-->/s', ' ', $matches[2]);
    }, $str);
2
  • Do you want to remove or keep the html comments? You should clarify your question. Commented Jul 13, 2015 at 11:39
  • @Dzienny- I want to remove <!-- comments --> except those between script tag. Commented Jul 13, 2015 at 11:43

2 Answers

3

Technically, "HTML comments" between script tags are no longer HTML comments: they are part of the script's text content. If you use a DOM approach, these comments are not selected:

$dom = new DOMDocument;
$dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

$xp = new DOMXPath($dom);
$comments = $xp->query('//comment()');

foreach ($comments as $comment) {
    $comment->parentNode->removeChild($comment);
}

$result = $dom->saveHTML();
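To make the behaviour concrete, here is a minimal self-contained sketch of the same approach. The sample markup is hypothetical, and a complete document is loaded without the LIBXML_HTML_NOIMPLIED flag, which is an assumption made to keep the example predictable rather than part of the answer above.

```php
<?php
// Hypothetical sample document, used only for illustration.
$html = '<!DOCTYPE html><html><head><title>t</title>'
      . '<script><!-- keep1 --></script></head>'
      . '<body><!-- del me1 --><p>text</p>'
      . '<script><!-- keep2 --></script><!-- del me2 --></body></html>';

// Collect parser warnings instead of printing them.
libxml_use_internal_errors(true);

$dom = new DOMDocument;
$dom->loadHTML($html);

$xp = new DOMXPath($dom);

// Copy the node list into a plain array before removing nodes.
// Only real comment nodes are selected; the text inside <script> is not one.
foreach (iterator_to_array($xp->query('//comment()')) as $comment) {
    $comment->parentNode->removeChild($comment);
}

$result = $dom->saveHTML();
echo $result;
```

The comments inside the two script elements should survive because the parser treats script content as plain text, while the two comments in the body are removed.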

About conditional comments:

If you want to preserve conditional comments, you need to check the beginning of the comment. You can do it in two ways.

The first way is to check the comment in the foreach loop, and when the test is negative, you remove the node.
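This first way could be sketched roughly as follows. The sample markup and the exact pattern (which tolerates leading whitespace) are assumptions made for illustration.

```php
<?php
// Hypothetical input: one conditional comment and one ordinary comment.
$html = '<!DOCTYPE html><html><head><title>t</title>'
      . '<!--[if IE]><link href="ie.css" rel="stylesheet"><![endif]-->'
      . '</head><body><!-- del me --><p>text</p></body></html>';

// Collect parser warnings instead of printing them.
libxml_use_internal_errors(true);

$dom = new DOMDocument;
$dom->loadHTML($html);

$xp = new DOMXPath($dom);

foreach (iterator_to_array($xp->query('//comment()')) as $comment) {
    // Assumed test: a conditional comment starts with "[if " or "<![endif]",
    // optionally preceded by whitespace.
    $isConditional = preg_match('~\A\s*(?:\[if\s|<!\[endif\])~', $comment->nodeValue) === 1;

    // Remove the node only when the test is negative.
    if (!$isConditional) {
        $comment->parentNode->removeChild($comment);
    }
}

$result = $dom->saveHTML();
echo $result;
```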

But since you use the XPath way (which consists of selecting what you want once and for all), you can follow the same logic and change the XPath query to:

//comment()[not(starts-with(., "[if") or starts-with(., "<![endif]"))]

The content between square brackets is called a "predicate" (a condition on the current node), and the dot represents this current node or its text content (depending on the context).

However, although this works most of the time, the slightest leading whitespace makes it fail. You need something more flexible than starts-with.
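One possible way to tolerate leading whitespace while staying in plain XPath 1.0 is to wrap the node in normalize-space() before testing it. This is a sketch under that assumption, with hypothetical sample markup; it is not taken from the answer itself.

```php
<?php
// Hypothetical input: the conditional comment has a leading space.
$html = '<!DOCTYPE html><html><head><title>t</title>'
      . '<!-- [if IE]><link href="ie.css" rel="stylesheet"><![endif]-->'
      . '</head><body><!-- del me --><p>text</p></body></html>';

// Collect parser warnings instead of printing them.
libxml_use_internal_errors(true);

$dom = new DOMDocument;
$dom->loadHTML($html);

$xp = new DOMXPath($dom);

// normalize-space() strips leading and trailing whitespace before the test.
$query = '//comment()[not(starts-with(normalize-space(.), "[if")'
       . ' or starts-with(normalize-space(.), "<![endif]"))]';

foreach (iterator_to_array($xp->query($query)) as $comment) {
    $comment->parentNode->removeChild($comment);
}

$result = $dom->saveHTML();
echo $result;
```

This only handles surrounding whitespace; anything that needs a real pattern match still calls for a different mechanism.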

It is possible to register your own PHP function and use it in the XPath query, like this:

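// Returns true when the comment text starts like a conditional comment,
// allowing leading whitespace. XPath passes the node-set as an array of nodes.
function isConditionalComment($commentNode) {
    return preg_match('~\A\s*(?:\[if\s|<!\[endif])~', $commentNode[0]->nodeValue) === 1;
}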

$dom = new DOMDocument;
$dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

$xp = new DOMXPath($dom);

$xp->registerNamespace('php', 'http://php.net/xpath');
$xp->registerPHPFunctions('isConditionalComment');

$comments = $xp->query('//comment()[not(php:function("isConditionalComment", .))]');

foreach ($comments as $comment) {
    $comment->parentNode->removeChild($comment);
}

Note: DOMDocument doesn't support the default Microsoft syntax (the one nobody uses), which is not an HTML comment:

<![if !IE]>
<link href="non-ie.css" rel="stylesheet">
<![endif]>

This syntax causes a warning (since it is not HTML), and the "tag" is ignored and disappears from the DOM tree.


2 Comments

Conditional comments are also getting removed using this solution, which should not happen
@HarshalShah: conditional comments are comments, so they are removed. If you want to prevent this, there are two ways: 1) check $comment->nodeValue before removing the node in the foreach loop; 2) change the XPath query to select only comments that are not conditional comments. See the edit.
-1

You can try this code:

$str= preg_replace('/<!--(\w|\s)*-->/', '', $str);

And in your JavaScript, you can use the following (instead of <!-- -->):

/* Keep me comment */

4 Comments

except from <script> tag
You didn't say to do anything with the <script> tag, you only asked to remove comments. I updated the sample, which now returns 'b/w <script> Keep Me </script>'
No, that does not check whether the comment is inside a <script><!-- keep me --> </script> tag. I want to preserve comments inside script tags but remove the ones outside of them
The javascript comments are different, use /* comment */
