Regex for matching "non-javadoc" comment in Eclipse

Question

I'm trying to write a regular expression that matches the (non-javadoc) comments in the format

/*
 * (non-javadoc)
 *
 * some other comment here
 *
 */

So far I have (?s)/\*\R.*?non-Javadoc.*?\*/, but that is actually matching too much. I have a header at the top of my file that is something like

/*
 * header text
 */
 public class MyClass {

 }

and it is matching the /* at the top of the file, but I really only want to match the generated (non-javadoc) comment. Can anyone help me fix up this regex?

EDIT: I'm trying to use the Eclipse Find/Replace dialog, but I am open to using external tools if needed.

I don't know which language that is, but if it is PHP, use the Tokenizer.
– KingCrunch, Commented May 31, 2011 at 23:11
(+1) for the question, as I am also hunting down this ugly, zero-information code rubbish in Eclipse... :(
– Tom Fink, Commented Mar 19, 2015 at 11:27

Alan Moore · Accepted Answer · 2015-03-19 18:33:31Z

11

This should do it:

(?s)/\*[^*](?:(?!\*/).)*\(non-javadoc\)(?:(?!\*/).)*\*/

/\*[^*] matches the beginning of a C-style comment (/* */) but not the beginning of a Javadoc comment (/** */), because the character after /* must be something other than *.

(?!\*/). matches any single character unless it's the beginning of a */ sequence. Searching for (?:(?!\*/).)* instead of .*? makes it impossible for a match to start in one comment and end in another.
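To make the difference concrete, here is a minimal Java sketch (Eclipse's regex search uses a Java-style flavor) that runs the question's pattern and this answer's pattern against a small sample file. The class name, the sample text, and the use of Pattern.CASE_INSENSITIVE (standing in for an unchecked "Case sensitive" box in the Find/Replace dialog) are assumptions made for illustration only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NonJavadocMatchDemo {
    // The question's pattern: the lazy .*? is still allowed to run past a closing */.
    static final Pattern LAZY = Pattern.compile(
            "(?s)/\\*\\R.*?non-Javadoc.*?\\*/", Pattern.CASE_INSENSITIVE);

    // This answer's pattern: (?:(?!\*/).)* can never consume the start of a */.
    static final Pattern TEMPERED = Pattern.compile(
            "(?s)/\\*[^*](?:(?!\\*/).)*\\(non-javadoc\\)(?:(?!\\*/).)*\\*/",
            Pattern.CASE_INSENSITIVE);

    // Hypothetical sample file: a header comment followed by a generated comment.
    static final String SAMPLE = String.join("\n",
            "/*",
            " * header text",
            " */",
            "public class MyClass {",
            "    /*",
            "     * (non-javadoc)",
            "     *",
            "     * some other comment here",
            "     *",
            "     */",
            "    void run() {}",
            "}");

    // Returns the first match of the pattern in the text, or null if there is none.
    static String firstMatch(Pattern pattern, String text) {
        Matcher m = pattern.matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println("--- lazy pattern matches ---");
        System.out.println(firstMatch(LAZY, SAMPLE));
        System.out.println("--- tempered pattern matches ---");
        System.out.println(firstMatch(TEMPERED, SAMPLE));
    }
}
```

With this sample, the lazy pattern's match begins at the header comment and runs through the end of the generated comment, while the tempered pattern's match is confined to the generated comment.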

UPDATE: In (belated) response to the comment by Jacek: yes, you'll probably want to add something to the end of the regex so you can replace it with an empty string and not leave a lot of blank lines in your code. But Jacek's solution is more complicated than it needs to be. All you need to add is \s*

The \R escape sequence matches many kinds of newline, including the Unicode Line Separator (\u2028) and Paragraph Separator (\u2029) and the DOS/network carriage-return+linefeed sequence (\r\n). But those are all whitespace characters, so \s matches them (in Eclipse, at least; according to the docs, it's equivalent to [\t\n\f\r\p{Z}]).

The \s* in Jacek's addition was only meant to match whatever horizontal whitespace (spaces or tabs) might exist before the newline, plus the indentation following it. (You have to remove it because you're not removing the indentation before the first line of the comment.) But it turns out \s* can do the whole job:

(?s)/\*[^*](?:(?!\*/).)*\(non-javadoc\)(?:(?!\*/).)*\*/\s*
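As a rough check of what the trailing \s* buys you, the following Java sketch replaces the comment with an empty string, once without and once with \s*. The class name, method name, and sample source are hypothetical, and Pattern.CASE_INSENSITIVE is again an assumption. Note that in java.util.regex, \s covers only ASCII whitespace by default, which is enough for this sample.

```java
import java.util.regex.Pattern;

class NonJavadocStripDemo {
    static final String COMMENT =
            "(?s)/\\*[^*](?:(?!\\*/).)*\\(non-javadoc\\)(?:(?!\\*/).)*\\*/";

    // Without \s*: the newline and indentation after the comment are left behind.
    static final Pattern WITHOUT_WS = Pattern.compile(COMMENT, Pattern.CASE_INSENSITIVE);

    // With \s*: the match also swallows the following line break and indentation.
    static final Pattern WITH_WS = Pattern.compile(COMMENT + "\\s*", Pattern.CASE_INSENSITIVE);

    // Hypothetical sample source containing one generated comment.
    static final String SAMPLE =
            "public class MyClass {\n"
            + "    /*\n"
            + "     * (non-javadoc)\n"
            + "     * some other comment here\n"
            + "     */\n"
            + "    @Override\n"
            + "    public String toString() {\n"
            + "        return \"x\";\n"
            + "    }\n"
            + "}\n";

    // Replaces every match of the pattern with an empty string.
    static String strip(Pattern pattern, String source) {
        return pattern.matcher(source).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println("--- without \\s* ---");
        System.out.print(strip(WITHOUT_WS, SAMPLE));
        System.out.println("--- with \\s* ---");
        System.out.print(strip(WITH_WS, SAMPLE));
    }
}
```

In the second result, the indentation that originally preceded the comment now sits in front of @Override, so no whitespace-only line is left where the comment used to be.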

edited Mar 19, 2015 at 18:33

answered Jun 1, 2011 at 0:24

2 Comments

Jacek Kołodziejczyk Over a year ago

It would also be good to add (\s*)\R(\s*) at the end of the above regular expression to include the following line break and indentation characters. Otherwise, replacing with an empty string will leave an empty line in place of each comment.

Alan Moore Over a year ago

Someone felt strongly enough about this to edit Jacek's addition into my regex. My thanks to the reviewers who rejected that edit, and thank you, @Tom, for bringing this incomplete answer back to my attention.

ikegami · Answer · 2011-05-31 23:06:32Z

0

In Perl, it would look like

/
   \/\*
   (?: (?! \*\/ ) . )*
   non-javadoc
   (?: (?! \*\/ ) . )*
   \*\/
/sx

answered May 31, 2011 at 23:06

1 Comment

Jeff Storey Over a year ago

I should have specified that I am using the Eclipse Find/Replace dialog.
