I have several directories which contain large Java files and I would like to pull out all the log messages. This includes log.error, .info, etc. In general, they look something like this:

logger.error("some message here");

The problem is that some of these messages include line breaks, and therefore grep is not picking up the full message:

logger.debug("operation [" + j + "] = whatever " + ids[j] + 
" name: " + names[j] + " time: " + times[j]);

Is there a way that I can use regular expressions to get the entire Java statement, up to the semicolon?

Here is what I have so far:

grep -rn --include \*.java "\b\.error(\"\b" *
  • Is this for removing log calls from release builds? Or do you want to get the log lines for some other reason? If it's for removing the log calls, then use a tool like ProGuard, which is designed to do that already. Commented Jul 20, 2016 at 17:28
  • There will always be cases that don't work. Regular expressions are just not the right tool for parsing a programming language. Commented Jul 20, 2016 at 17:28
  • I'm using it for analysis purposes. I'd like to make sure all the messages are consistent across different modules and different files. And I'm no regex pro, so I had no idea if there was some solution I was just overlooking. I figured it wouldn't be pretty though! Commented Jul 20, 2016 at 17:31

1 Answer

Try:

find . -iname '*.java' -exec awk '/logger/,/;/' {} +

As an example, let's consider this test file:

$ cat file.java 
some(text);
logger.debug("operation [" + j + "] = whatever " + ids[j] + 
" name: " + names[j] + " time: " + times[j]);
other(text);
logger.error("some message here");
more(text); 

Let's extract its logger statements:

$ find . -iname '*.java' -exec awk '/logger/,/;/' {} +
logger.debug("operation [" + j + "] = whatever " + ids[j] + 
" name: " + names[j] + " time: " + times[j]);
logger.error("some message here");

This works by using an awk range pattern: printing starts at each line that contains logger and continues through the first line that contains ;.

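As Henry points out in the comments, regex-based approaches like this are not foolproof. For example, a ; inside a string literal ends the match early, and any line that merely mentions logger (such as a field declaration) starts one. But if you are using this just for visual inspection, it should be a good start.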
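To make that limitation concrete, here is a small sketch of one case where the range pattern cuts a statement short. The sample input is made up for illustration: its first line already contains a ; inside the string literal, so awk treats that line as both the start and the end of the range.

```shell
# Hypothetical input: the ";" inside the string literal sits on the same
# line as "logger", so the range begins and ends on that first line.
sample='logger.info("step one; step two" +
    " done");
next();'

# Same range pattern as above, fed from stdin instead of from find.
truncated=$(printf '%s\n' "$sample" | awk '/logger/,/;/')
printf '%s\n' "$truncated"
```

Only the first line of the statement is printed; the continuation line with " done"); is lost. If your messages often contain semicolons, the output will need a manual check.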

If you also want to record the file name and line number:

$ find . -iname '*.java' -exec awk '/logger/,/;/{printf "%s:%s: %s\n",FILENAME,FNR,$0}' {} +
./file.java:2: logger.debug("operation [" + j + "] = whatever " + ids[j] + 
./file.java:3: " name: " + names[j] + " time: " + times[j]);
./file.java:5: logger.error("some message here");
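Since the goal is to compare messages across files, it may be easier to have each statement on a single line. The following is only a sketch under the same heuristic as above (a statement starts on a line containing logger and ends at the first line containing ;). The function name extract_log_statements is made up for this example and takes the directory to search as its argument.

```shell
# Sketch: print each logger statement on one line as file:line: statement,
# where line is the number of the line the statement starts on.
extract_log_statements() {   # hypothetical helper name
  find "${1:-.}" -iname '*.java' -exec awk '
    FNR == 1 { inside = 0 }                       # reset state at the start of each file
    /logger/ && !inside { inside = 1; start = FNR; stmt = "" }
    inside {
      line = $0
      gsub(/^[ \t]+|[ \t]+$/, "", line)           # trim leading and trailing whitespace
      stmt = (stmt == "" ? line : stmt " " line)  # append to the statement so far
      if (line ~ /;/) {                           # first ";" ends the statement
        printf "%s:%d: %s\n", FILENAME, start, stmt
        inside = 0
      }
    }
  ' {} +
}
```

Running extract_log_statements . from the project root gives one statement per line, which you can then pipe into grep or sort. It shares the limitations of the range pattern: a ; inside a string literal still ends the statement early.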