I have a scenario where I want to do nested expression matching in Java.

Consider the following expression:

SUM_ALL(2:3,4:5)>20

where SUM_ALL is a reserved operator in the application. I want to extract the operator name and its arguments from a given expression. To do so, I have defined my pattern as follows:

Pattern testPattern = Pattern.compile("[^a-zA-Z]*([a-zA-Z_]+)\\s*\\(\\s*([0-9:,]+)\\s*\\).*");

This works fine as long as the expression is limited to the form above. Here is the output:

Group 1: SUM_ALL
Group 2: 2:3,4:5

However, I may not know in advance how many such operators a given expression contains. For example, consider the following case:

SUM_ALL(4:5,6:7)>MAX(2:3,4:4)+MIN(3:4,5:7)

Here I want to extract each operator and its arguments, perform the calculation according to each operator's reserved meaning, and then evaluate the resulting simple math expression.

If the Java pattern matcher had a nesting capability, it would help to extract the operators one by one, considering the rest of the expression once an operator is resolved. I know I can capture the rest of the expression in a separate group, run the matcher on that group's value, and repeat until I reach the end of the expression, but I would like to know whether the pattern matcher has built-in functionality for this.

  • When managing such formulae, which can have free forms, you should start looking into lexers/parsers, like Parboiled or JavaCC... Commented Oct 25, 2012 at 11:00
  • ... or ANTLR, a fine tool for parsing your own grammar. Commented Oct 25, 2012 at 11:15
  • The absolute best resource to learn regex for Java (and Perl, .NET, PHP/PCRE, etc.) is of course the book Mastering Regular Expressions (3rd Edition) by Jeffrey Friedl. Hands down the most useful book I've ever read. Commented Oct 25, 2012 at 12:57

2 Answers

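You can call Matcher.find() in a loop; each call resumes where the previous match ended, so no nesting is needed:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

String str = "SUM_ALL(4:5,6:7)>MAX(2:3,4:4)+MIN(3:4,5:7)";
// find() scans forward on its own, so the pattern needs no leading ".*?"
Matcher m =
    Pattern.compile("(?i)([a-z_]+)\\s*\\(\\s*([\\d:,]+)\\s*\\)").matcher(str);
while (m.find()) {
    System.out.printf("%s :: %s%n", m.group(1), m.group(2));
}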

OUTPUT:

SUM_ALL :: 4:5,6:7
MAX :: 2:3,4:4
MIN :: 3:4,5:7
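
If you need the operators as data rather than printed text, the same find() loop can collect them. The following is only a sketch: the class name OperatorExtractor, the holder class Call, and the choice to split the arguments on commas are my assumptions, not something from the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OperatorExtractor {
    // Hypothetical holder for one operator call: its name and raw arguments.
    static final class Call {
        final String name;
        final List<String> args;

        Call(String name, List<String> args) {
            this.name = name;
            this.args = args;
        }
    }

    // Same shape as the pattern above: NAME ( digits, colons and commas ).
    private static final Pattern CALL =
        Pattern.compile("([A-Za-z_]+)\\s*\\(\\s*([\\d:,]+)\\s*\\)");

    // Collects every operator call in the order it appears in the expression.
    static List<Call> extract(String expr) {
        List<Call> calls = new ArrayList<>();
        Matcher m = CALL.matcher(expr);
        while (m.find()) {
            // Assumption: arguments are comma-separated, e.g. "4:5,6:7".
            calls.add(new Call(m.group(1), Arrays.asList(m.group(2).split(","))));
        }
        return calls;
    }

    public static void main(String[] args) {
        for (Call c : extract("SUM_ALL(4:5,6:7)>MAX(2:3,4:4)+MIN(3:4,5:7)")) {
            System.out.println(c.name + " -> " + c.args);
        }
    }
}
```

Each Call can then be handed to whatever code implements the operator's reserved meaning; m.start() and m.end() are also available inside the loop if you need to know where each call sits in the expression.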

2 Comments

Thanks for a wonderful answer. Can you please suggest some good sources to read on regex and parsing in Java?
The official Java tutorial (docs.oracle.com/javase/tutorial/essential/regex) and this tutorial (vogella.com/articles/JavaRegularExpressions/article.html) are good starting points.

Well, have this:

(?:(SUM_ALL|MAX|MIN|addmorehere)\(((?:\d+:\d+,?){2})\)[-+><*/addmorehere]?)+

It's not escaped for Java or any other language (in a Java string literal, every backslash has to be doubled), but you get the idea.
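
For what it's worth, here is a rough Java version of the whitelist idea. It is a sketch under a few assumptions of mine: the class name KnownOperatorScanner is made up, the operator list is just the three names from the question, and one or more a:b arguments are allowed instead of exactly two. It also drops the outer (...)+ repetition and loops with find() instead, because a repeated capturing group in Java only retains its last iteration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class KnownOperatorScanner {
    // Assumed whitelist of reserved operators; extend the alternation as needed.
    // \b stops MAX from matching inside a longer identifier such as XMAX.
    private static final Pattern CALL = Pattern.compile(
        "\\b(SUM_ALL|MAX|MIN)\\((\\d+:\\d+(?:,\\d+:\\d+)*)\\)");

    // Returns one "NAME :: args" entry per whitelisted operator call, in order.
    static List<String> scan(String expr) {
        List<String> found = new ArrayList<>();
        Matcher m = CALL.matcher(expr);
        while (m.find()) {
            found.add(m.group(1) + " :: " + m.group(2));
        }
        return found;
    }

    public static void main(String[] args) {
        for (String call : scan("SUM_ALL(4:5,6:7)>MAX(2:3,4:4)+MIN(3:4,5:7)")) {
            System.out.println(call);
        }
    }
}
```

Compared with a generic [a-z_]+ name pattern, the whitelist silently skips anything that is not a known operator, which may or may not be what you want.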
