I've been trying to remove single-line comments from JavaScript files using the following regex:

Pattern p = Pattern.compile("(?m)(?:[\\(|\\)|;|\\}|\\{])\\s*/{2}(.*?)$");

The pattern works when I test it in something like RegexPal using the "^$ matches line break" option on some sample JavaScript source.

However, when I put it into my Java program, the "m" flag doesn't seem to work correctly. Even though I specify the flag with "(?m)" at the beginning of the pattern (I've also tried Pattern.MULTILINE), it seems to be ignored completely, so my $ matches everything through the end of the entire document rather than just the end of the line.

  • It looks like you forgot to include the '^' in the regex, so it is just matching the $ at the final EOL. Commented Apr 7, 2011 at 19:23
  • I'm not trying to match anything at the beginning of the line. Commented Apr 7, 2011 at 19:30
  • It would be helpful if you could show some code. Are you operating on the whole file as a chunk, or are you reading it line by line in a loop and processing each line? Are you using matches() or find()? Also, what good is it doing you to match the comments when you're actually trying to remove them? i.e. are you trying to kill them to make your source smaller, or extract them to look at them or print them out? Commented Apr 7, 2011 at 19:47
  • I feel so stupid now. I'm reading in a file line-by-line into a StringBuilder, but I'm never appending a newline at the end! Thanks Carl for bringing that idea to my attention! Commented Apr 7, 2011 at 19:59
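
The cause described in the last comment can be reproduced in isolation. BufferedReader.readLine() returns each line without its terminator, so appending only the line to a StringBuilder collapses the file into one long line and leaves the regex no line ends to match. Below is a minimal sketch of the fix, assuming the file is read with a BufferedReader. The class and method names are hypothetical, and a StringReader stands in for the real file:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class ReadWithNewlines {

    // Reads every line and re-appends the terminator that readLine() strips.
    public static String readAll(BufferedReader reader) {
        StringBuilder sb = new StringBuilder();
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                sb.append(line).append('\n');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // StringReader stands in for a FileReader on the real JavaScript file.
        String js = "var i = 1; // first\ni++; // second";
        String text = readAll(new BufferedReader(new StringReader(js)));
        System.out.print(text);
    }
}
```

Note that this normalizes every line terminator to \n and adds one after the last line even if the file had none.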

1 Answer

Works for me:

import java.util.regex.Matcher;
import java.util.regex.Pattern;


public class MultilinePattern {

   public static void main( String[] args ) {
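      // (?m) makes $ match before each line terminator, not only at the end of the input
      Pattern p = Pattern.compile("(?m)(?:[\\(|\\)|;|\\}|\\{])\\s*/{2}(.*?)$");
      String multilineJS = "var i = 1; // this is the first comment\n" +
         " i++; // this is the second comment\n" +
         " alert(i);";
      Matcher matcher = p.matcher(multilineJS);
      // find() moves to each successive match; group(1) is the text after the "//"
      while ( matcher.find() ) {
         System.out.println(matcher.group(1));
      }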
   }
}

This snippet yields:

this is the first comment
this is the second comment

What about the line breaks in the String you use to test the Pattern: are they correct for your OS? Are you certain they are in your String at all?
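
To see why the line breaks matter, the sketch below runs the question's pattern against the same source with and without its \n characters. With (?m), $ matches before a line terminator or at the end of the input, so when no terminators are present the first comment match runs all the way to the end. The class and method names are hypothetical, and the captured text is trimmed only to keep the output tidy:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MissingNewlines {

    // Same pattern as in the question.
    static final Pattern COMMENT =
        Pattern.compile("(?m)(?:[\\(|\\)|;|\\}|\\{])\\s*/{2}(.*?)$");

    // Collects the trimmed text of capture group 1 for every match.
    public static List<String> comments(String source) {
        List<String> found = new ArrayList<>();
        Matcher m = COMMENT.matcher(source);
        while (m.find()) {
            found.add(m.group(1).trim());
        }
        return found;
    }

    public static void main(String[] args) {
        String withBreaks = "var i = 1; // first\ni++; // second\nalert(i);";
        String withoutBreaks = withBreaks.replace("\n", "");
        // Two separate comments: [first, second]
        System.out.println(comments(withBreaks));
        // One match that swallows the rest of the input
        System.out.println(comments(withoutBreaks));
    }
}
```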

2 Comments

Yeah, the regex is good enough for that, but as soon as you have var url = 'http://www.google.com'; you're SOL. So the idea was to use the faux-backtrack to match a // only when it comes after certain characters such as ;, }, etc.
Really? My snippet (and thus your Pattern) also works with your line. Since you know that, update your question to show what's not working.
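
Since the stated goal is removal and the URL case is the concern, here is a minimal sketch of stripping the comments with replaceAll. It uses a variant of the question's pattern, not the original: the leading delimiter is captured so it can be written back, and the | characters are dropped from the character class because inside [...] they are literal pipes rather than alternation. The class and method names are hypothetical:

```java
import java.util.regex.Pattern;

public class StripComments {

    // Group 1 is the delimiter that must precede the "//"; the comment text
    // itself is matched but not captured.
    static final Pattern COMMENT =
        Pattern.compile("(?m)([(){};])\\s*//.*$");

    // Removes each matched comment and keeps the delimiter in front of it.
    public static String strip(String source) {
        return COMMENT.matcher(source).replaceAll("$1");
    }

    public static void main(String[] args) {
        String js = "var url = 'http://www.google.com'; // home page\n"
                  + "i++; // bump the counter\n";
        // The "//" inside the URL follows ':', which is not in the class,
        // so only the two trailing comments are removed.
        System.out.print(strip(js));
    }
}
```

This is still a heuristic. A string literal that contains one of the delimiters directly before //, such as 'a;//b', would be truncated, so a real JavaScript tokenizer is the safer choice for arbitrary input.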
