
I have this Java code for identifying comments in source code and printing them:

import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Solution {
    public static void main(String[] args) {
        Pattern pattern = Pattern.compile("(\\/\\*((.|\n)*)\\*\\/)|\\/\\/.*");
        String code = "";
        Scanner scan = new Scanner(System.in);
        while(scan.hasNext())
        {
            code+=(scan.nextLine()+"\n");

        }
        Matcher matcher = pattern.matcher(code);
        int nxtBrk=code.indexOf("\n");
        while(matcher.find())
        {

            int i=matcher.start(),j=matcher.end();
            if(nxtBrk<i)
            {
                System.out.print("\n");
            }
            System.out.print(code.substring(i,j));
            nxtBrk = code.indexOf("\n",j);

        }



    scan.close();
    }

}

When I run the code against this input:

 /*This is a program to calculate area of a circle after getting the radius as input from the user*/  
#include<stdio.h>  
int main()  
{ //something

it correctly outputs only the comments. But when I give it this input:

 /*This is a program to calculate area of a circle after getting the radius as input from the user*/  
#include<stdio.h>  
int main()  
{//ok
}  
/*A test run for the program was carried out and following output was observed  
If 50 is the radius of the circle whose area is to be calculated
The area of the circle is 7857.1429*/  

the program outputs the whole code instead of just the comments. I don't understand why adding those last lines breaks it.

EDIT: A parser is not an option, because I am solving a challenge and have to use a plain programming language. Link: https://www.hackerrank.com/challenges/ide-identifying-comments

  • Re "parser is not an option", not using a parser is not an option either unless you want to find spurious comments in "/* A string, not a comment */", "http://foo", "/path/*.txt" /* A file path */. You need to recognize all tokens that can contain comment boundaries to recognize comment boundaries correctly. Commented Jan 12, 2014 at 16:08
  • How can I do that in that website? Commented Jan 12, 2014 at 16:33
  • A parser most certainly is an option, especially as you only really need the lexer part (generally the simplest part if you've already got regular expression support available). Beware! This is quite a deep topic to get into properly; it was part of a second-year course back when I took CS (years ago…) Commented Jan 12, 2014 at 17:40
  • @Unbound, As Donal suggested, you can lex (tokenize) using a single regular expression and then filter out the matches that are not comment tokens. For example, Pattern.compile("(?:" + COMMENT_REGEX + ")|(?:" + STRING_REGEX + ")", ...) where STRING_REGEX = "\"(?:[^\"\\\\]|\\\\.)*\"|'(?:[^'\\\\]|\\\\.)*'". That way, quotes will match as string tokens which will effectively hide any apparent comment boundaries inside string tokens. Commented Jan 13, 2014 at 0:15

2 Answers

3

Parsing source code with regular expressions is very unreliable. I'd suggest using a dedicated parser instead. Creating one is fairly simple with ANTLR, and since you seem to be parsing C source files, you can use the existing C grammar.


2

Your pattern, shorn of its Java quoting (and some unnecessary backslashes), is this:

(/\*((.|
)*)\*/)|//.*

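That's fine, except that it uses only greedy quantifiers, which means it will match from the first /* to the last */ and swallow everything in between. That is why your second input, which has two block comments, comes out as the whole code. You want a non-greedy (reluctant) quantifier instead, giving this pattern: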

(/\*((.|
)*?)\*/)|//.*

Small change, big consequence, since it now matches only up to the first */ after the /*. Re-encoded as Java code:

Pattern pattern = Pattern.compile("(/\\*((.|\n)*?)\\*/)|//.*");
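
To see the difference, here is a minimal sketch that runs both patterns over a small input with two block comments. The class name GreedyVsReluctant, the helper findAll and the sample input are made up for illustration; only the two patterns come from the question and this answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyVsReluctant {
    // Hypothetical helper: returns every match of the regex in the input, in order.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        // Made-up sample: two block comments with code between them.
        String code = "/*first*/\nint x;\n/*second*/\n";

        String greedy = "(/\\*((.|\n)*)\\*/)|//.*";     // pattern from the question
        String reluctant = "(/\\*((.|\n)*?)\\*/)|//.*"; // pattern with *? instead of *

        // Greedy: a single match running from the first /* to the last */.
        System.out.println(findAll(greedy, code).size());  // prints 1

        // Reluctant: each comment is matched on its own.
        System.out.println(findAll(reluctant, code));      // prints [/*first*/, /*second*/]
    }
}
```

With the greedy version, the single match includes the "int x;" line, which is the same effect you saw with your second input.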

(Be aware that you are very close to the limit of what it is sensible to match with regular expressions. Indeed, the pattern is actually incorrect, since you might have string literals containing /* or //. But you'll probably get away with it…)
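
If string literals do turn out to matter, one way around it is the approach suggested in the comments under the question: match string literals and comments with a single pattern and keep only the comment matches. The following is a rough sketch of that idea, not a full C lexer. STRING_REGEX is taken from that comment; the class name CommentLexer, the method comments, the exact COMMENT_REGEX and the sample input are my own assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CommentLexer {
    // Assumed comment pattern: a reluctant block comment (may span lines) or a line comment.
    static final String COMMENT_REGEX = "/\\*[\\s\\S]*?\\*/|//.*";
    // String pattern from the comment thread: quoted literals with backslash escapes.
    static final String STRING_REGEX = "\"(?:[^\"\\\\]|\\\\.)*\"|'(?:[^'\\\\]|\\\\.)*'";
    // Group 1 captures comments; string literals are matched but not captured.
    static final Pattern TOKEN =
            Pattern.compile("(" + COMMENT_REGEX + ")|(?:" + STRING_REGEX + ")");

    // Returns the comments in the code, skipping comment-like text inside string literals.
    static List<String> comments(String code) {
        List<String> found = new ArrayList<>();
        Matcher m = TOKEN.matcher(code);
        while (m.find()) {
            if (m.group(1) != null) { // null means the match was a string literal
                found.add(m.group(1));
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Made-up sample: the first "/* ... */" sits inside a string literal.
        String code = "char *s = \"/* not a comment */\"; // real\n/* block */\n";
        System.out.println(comments(code)); // prints [// real, /* block */]
    }
}
```

Because the string literal is consumed as one token, the matcher never gets a chance to start a comment match inside it. This only covers quoted literals; other constructs that can contain comment-like text would each need their own alternative in the pattern.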
