I need to remove all block comments from PHP source using preg_replace(). For example, this input:

/**asdfasdf
fasdfasdf*/
echo "hello World\n";

should become:

echo "hello World\n";

I tried some solutions from this site, but none of them works for me. My code:

$file  = file_get_contents($fileinput);
$file = preg_replace('/\/\*([^\\n]*[\\n]?)*\*\//', '', $file);
echo $file;

My output is the same as the input. Link to my regex test

  • possible duplicate of How to remove JS comments using PHP? Commented Mar 6, 2014 at 15:47
  • I know, but that solution is not correct for me. For example, if I have echo"/*asdfasdf*/" it will delete the string contents and the output will be echo"" Commented Mar 6, 2014 at 15:50
  • php.net/manual/en/function.php-strip-whitespace.php Commented Mar 6, 2014 at 15:53
  • @JakubBaskiGabčo: Then please edit your question and explain all the requirements clearly. Commented Mar 6, 2014 at 15:55
  • This probably isn't a good job for preg_replace - as PHP isn't a regular language, you're going to have trouble with comments that are in and out of strings. You can get close, but what you really want is probably some kind of tokenizer for PHP which will issue you COMMENT_START and COMMENT_END tokens. Commented Mar 6, 2014 at 15:56

3 Answers

2

Use the tokenizer function token_get_all() (http://www.php.net/manual/en/function.token-get-all.php):

$file  = file_get_contents($fileinput);
$tokens = token_get_all($file); // prepend an open tag if your file doesn't have one

$plain = '';
foreach ($tokens as $token) {
    if (is_array($token)) {
        list($number, $string) = $token;
        if (!in_array($number, [T_OPEN_TAG, T_COMMENT, T_DOC_COMMENT])) { // list every token type you want to drop
             $plain .= $string;
        }
    } else {
        $plain .= $token;
    }
}
print_r($plain);

Output:

 echo "hello World\n";

Here is a list of all PHP tokens:

http://www.php.net/manual/en/tokens.php
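
Note that T_COMMENT also covers single-line comments (// and #), so the loop above drops those as well. If only block comments should go, one possible variation is to check the text of each comment token. The following is a minimal sketch under that assumption; the function name strip_block_comments and the sample input are hypothetical, and unlike the code above it keeps the open tag:

```php
<?php
// Hypothetical helper: removes only /* ... */ and /** ... */ comments,
// leaving // and # comments, strings and all other tokens untouched.
function strip_block_comments(string $source): string
{
    $plain = '';
    foreach (token_get_all($source) as $token) {
        if (!is_array($token)) {
            // Single-character tokens such as ";" arrive as plain strings.
            $plain .= $token;
            continue;
        }
        [$id, $text] = $token;
        $isComment = $id === T_COMMENT || $id === T_DOC_COMMENT;
        if ($isComment && strncmp($text, '/*', 2) === 0) {
            continue; // skip block comments only
        }
        $plain .= $text;
    }
    return $plain;
}

// Sample input (an assumption for illustration): a block comment,
// a string that merely looks like a comment, and a single-line comment.
$source = "<?php\n/**asdfasdf\nfasdfasdf*/\necho \"/*kept*/\"; // also kept\n";
echo strip_block_comments($source);
```

Because the tokenizer has already decided what is a string and what is a comment, the comment-like text inside the double-quoted string is not touched.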


0

Try this:

$file = preg_replace('/^\s*?\/\*.*?\*\//ms', '', $file); // the s modifier lets . match newlines, so multi-line comments are matched too

0

The best way to parse PHP code is to use the tokenizer.

However, it is not that difficult to do with a regex. You only need to skip all strings:

$pattern = <<<'EOD'
~
(?(DEFINE)
    (?<sq> ' (?>[^'\\]++|\\{2}|\\.)* ' )   # single quotes
    (?<dq> " (?>[^"\\]++|\\{2}|\\.)* " )   # double quotes
    (?<hd> <<< \s* (["']?)(\w+)\g{-2} \R .*? (?<=\n) \g{-1} ;? (\R|$) ) # heredoc like
    (?<string> \g<sq> | \g<dq> | \g<hd>)
)
\g<string> (*SKIP)(*FAIL) | /\* .*? \*/
~xs
EOD;

$result = preg_replace($pattern, '', $data);
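
As a rough check of the pattern, the sketch below repeats it so that the snippet runs on its own and applies it to a small sample. The sample text in $data is an assumption made for illustration: it contains one real block comment and one double-quoted string that only looks like a comment.

```php
<?php
// Same pattern as above, repeated so this snippet is self-contained.
$pattern = <<<'EOD'
~
(?(DEFINE)
    (?<sq> ' (?>[^'\\]++|\\{2}|\\.)* ' )   # single quotes
    (?<dq> " (?>[^"\\]++|\\{2}|\\.)* " )   # double quotes
    (?<hd> <<< \s* (["']?)(\w+)\g{-2} \R .*? (?<=\n) \g{-1} ;? (\R|$) ) # heredoc like
    (?<string> \g<sq> | \g<dq> | \g<hd>)
)
\g<string> (*SKIP)(*FAIL) | /\* .*? \*/
~xs
EOD;

// Hypothetical sample input: a block comment followed by a string
// whose contents resemble a block comment.
$data = <<<'SRC'
/**asdfasdf
fasdfasdf*/
echo "/*not a comment*/";
SRC;

// Strings are matched first and skipped via (*SKIP)(*FAIL), so only
// the block comment outside the string is replaced.
$result = preg_replace($pattern, '', $data);
echo $result, "\n";
```

The design relies on alternation order: a string is always tried before the comment branch, so by the time the comment branch can match, the engine is known to be outside any quoted text.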

2 Comments

Can you please convert it to a one-line regex, because I don't know how to do that? And can you also fix it? It doesn't work for single-line comments (// ...). Thanks.
@ValentinTanasescu: Comments are not the place to ask questions. Also, there is nothing to fix since the question is how to remove block comments, not single line comments.
