I'm trying to extract a list of function names from a formula, but my regex is not working.

Given ( aaa(111) + bbb(222) ) / ccc ( 333 ), I need to obtain an array of strings containing aaa, bbb and ccc. Instead, I'm getting aaa(, bbb( and ccc (. How can I make this work?

This is my attempt:

    String formula = "( aaa(111) + bbb(222) ) / ccc ( 333 )";
    Pattern pattern = Pattern.compile("((\\w+)\\s*\\()");
    Matcher matcher = pattern.matcher(formula);

    while(matcher.find()){
        System.out.println(matcher.group(1));
    }
  • Looks like you'll need a parser. You haven't specified the exact rules, but a regex generally isn't powerful enough to do this. (Commented Apr 28, 2015 at 23:18)
  • You should take group(2) from the match. (Commented Apr 28, 2015 at 23:22)

2 Answers

You have 2 capturing groups in your pattern:

  • the external parentheses
  • the ones around \\w+

Since you are only interested in the second one, you should either

  • take the second group: matcher.group(2)
  • remove the external parentheses:

    Pattern pattern = Pattern.compile("(\\w+)\\s*\\(");
    

Note that matcher.group(0) always returns the match of the whole pattern, so here it is equivalent to the group formed by your external parentheses (group 1).
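To make the second option concrete, here is a minimal sketch of the corrected loop that collects the names into a list. The class name FunctionNames and the helper extract are illustrative names, not part of the original question; the pattern is the one suggested above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FunctionNames {
    // Hypothetical helper: returns every identifier that is directly
    // followed (optionally after whitespace) by an opening parenthesis.
    static List<String> extract(String formula) {
        // Only one capturing group is left, so the name is group(1).
        Pattern pattern = Pattern.compile("(\\w+)\\s*\\(");
        Matcher matcher = pattern.matcher(formula);
        List<String> names = new ArrayList<>();
        while (matcher.find()) {
            names.add(matcher.group(1));
        }
        return names;
    }

    public static void main(String[] args) {
        // prints [aaa, bbb, ccc]
        System.out.println(extract("( aaa(111) + bbb(222) ) / ccc ( 333 )"));
    }
}
```

If you keep the original pattern with the external parentheses instead, the same loop works with matcher.group(2) in place of matcher.group(1).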

You've got nested capture groups. Make the outer one a non-capturing group, so that the function name ends up in group(1):

    Pattern pattern = Pattern.compile("(?:(\\w+)\\s*\\()");

Or, as pointed out by Didier L, just remove the outer group:

    Pattern pattern = Pattern.compile("(\\w+)\\s*\\(");
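As a rough illustration of how the group numbering changes, the sketch below compares the original pattern with the non-capturing variant. The class GroupNumbering and the helper firstGroup are made-up names for this example only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupNumbering {
    // Hypothetical helper: returns group 1 of the first match, or null if nothing matches.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String formula = "( aaa(111) + bbb(222) ) / ccc ( 333 )";
        String nested = "((\\w+)\\s*\\()";        // two capturing groups
        String nonCapturing = "(?:(\\w+)\\s*\\()"; // one capturing group

        // groupCount() reports the number of capturing groups in the pattern.
        System.out.println(Pattern.compile(nested).matcher(formula).groupCount());       // 2
        System.out.println(Pattern.compile(nonCapturing).matcher(formula).groupCount()); // 1

        // With the nested groups, group 1 still includes the parenthesis.
        System.out.println(firstGroup(nested, formula));       // aaa(
        // With the non-capturing outer group, group 1 is just the name.
        System.out.println(firstGroup(nonCapturing, formula)); // aaa
    }
}
```

Groups are numbered by the position of their opening parenthesis, and (?:...) groups are skipped in that count, which is why the name moves from group 2 to group 1.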
