
I have defined the following pattern to parse our own custom format for a function call, which is <functionName>[<arg1>, <arg2>, ..]. The pattern is:
([a-zA-Z0-9]+)\\[(([a-zA-Z0-9]+)(,([a-zA-Z0-9])+)*?)*?\\]

Now when I run this against an example function: properCase[ARG1,ARG2], I get the following output:

Total matches are: 5
Group number 0is: properCase[ARG1, ARG2]
Group number 1is: properCase
Group number 2is: ARG1
Group number 3is: ARG1
Group number 4is: ,ARG2

Code:

        Matcher m = funcPattern.matcher("properCase[ARG1, ARG2]");
        System.out.println("Match found: " + m.matches());
        System.out.println("Total matches are: " + m.groupCount());
        if (m.matches())
        {
            for (int index= 0 ; index < m.groupCount(); index++)
            {
                System.out.println("Group number "+ index + "is: " +m.group(index));
            }
        }

How can I extract only the function name (as group 1) and the arguments (as group 2, group 3)? I am not able to eliminate the , from the group.

2 Answers


I wasn't able to get the regex you provided to match properCase[ARG1, ARG2], but to answer your question more generally: use non-capturing groups, (?:your_regex), for any part of the pattern that you don't want to appear in the matcher's groups.
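As a rough illustration of the non-capturing idea, here is a minimal sketch. The pattern is an assumption of mine, not the asker's exact regex: it keeps capturing groups only for the function name and the arguments and wraps the comma handling in (?:...). The class name NonCapturingDemo is hypothetical. Note that a repeated capturing group only retains its last iteration, so this sketch is only useful for up to two arguments.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NonCapturingDemo {
    // Assumed pattern: (?:...) groups give structure without adding numbered groups.
    // Group 1 = function name, group 2 = first argument,
    // group 3 = the last argument matched by the repeated part.
    static final Pattern FUNC = Pattern.compile(
        "([a-zA-Z0-9]+)\\[(?:([a-zA-Z0-9]+)(?:,([a-zA-Z0-9]+))*)?\\]");

    public static void main(String[] args) {
        Matcher m = FUNC.matcher("properCase[ARG1,ARG2]");
        if (m.matches()) {
            System.out.println("Name: " + m.group(1));
            System.out.println("First argument: " + m.group(2));
            System.out.println("Last repeated argument: " + m.group(3));
        }
    }
}
```

With this input the comma never shows up in a group, because the only group that contains it is non-capturing.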

EDIT:

If you aren't married to using a single regex for the parsing, consider the following: first split the string into a function-name group and an arguments group, and then split the arguments group on the delimiter ,

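import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FunctionParser {
    public static void main(String[] args) {
        // Group 1 is the function name; group 2 is the raw argument list.
        String regex = "([a-zA-Z0-9]+)\\[([ ,.a-zA-Z0-9]+)\\]";
        Pattern funcPattern = Pattern.compile(regex);
        Matcher m = funcPattern.matcher("properCase[ARG1, ARG2, class.otherArg]");
        boolean matched = m.matches();
        System.out.println("Match found: " + matched);
        System.out.println("Total matches are: " + m.groupCount());
        if (matched) {
            // groupCount() does not count group 0, so the loop must use <=.
            for (int index = 0; index <= m.groupCount(); index++) {
                System.out.println("Group number " + index + "is: " + m.group(index));
            }
            // Split the argument list on the delimiter.
            System.out.println("Arguments: " + Arrays.toString(m.group(2).split(",")));
        }
    }
}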

Produces:

Match found: true
Total matches are: 2
Group number 0is: properCase[ARG1, ARG2, class.otherArg]
Group number 1is: properCase
Group number 2is: ARG1, ARG2, class.otherArg
Arguments: [ARG1,  ARG2,  class.otherArg]
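The extra spaces in the Arguments line come from splitting on a bare comma, which leaves the whitespace after each comma attached to the next argument. One possible way to avoid that, sketched here under the assumption that arguments never contain commas themselves, is to split on a comma together with any surrounding whitespace. The class ArgumentSplitter and the helper parseArguments are hypothetical names used only for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ArgumentSplitter {
    // Same two-group pattern as above: function name, then the raw argument list.
    static final Pattern FUNC =
        Pattern.compile("([a-zA-Z0-9]+)\\[([ ,.a-zA-Z0-9]+)\\]");

    // Hypothetical helper: returns the arguments with surrounding whitespace removed.
    static List<String> parseArguments(String call) {
        Matcher m = FUNC.matcher(call);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a function call: " + call);
        }
        // Split on a comma plus any whitespace on either side of it.
        return Arrays.asList(m.group(2).trim().split("\\s*,\\s*"));
    }

    public static void main(String[] args) {
        // prints [ARG1, ARG2, class.otherArg]
        System.out.println(parseArguments("properCase[ARG1, ARG2, class.otherArg]"));
    }
}
```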

1 Comment

Thank you! You won't be able to capture if the input string has spaces between the arguments; ARG1,ARG2 should work fine.

Enclose the comma in its own group to work around it.

([a-zA-Z0-9]+)\\[(([a-zA-Z0-9]+)(,)(([a-zA-Z0-9])+)*?)*?\\]
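To see which group each piece ends up in with this pattern, a small sketch like the following can print every group. The class name CommaGroupDemo is hypothetical, and the input is the space-free form of the example. With two arguments, the function name should be in group 1, the first argument in group 3, the comma alone in group 4, and the second argument in group 5.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CommaGroupDemo {
    // The pattern from this answer, with the comma captured in its own group (4).
    static final Pattern FUNC = Pattern.compile(
        "([a-zA-Z0-9]+)\\[(([a-zA-Z0-9]+)(,)(([a-zA-Z0-9])+)*?)*?\\]");

    public static void main(String[] args) {
        Matcher m = FUNC.matcher("properCase[ARG1,ARG2]");
        if (m.matches()) {
            // groupCount() excludes group 0, hence <=.
            for (int i = 0; i <= m.groupCount(); i++) {
                System.out.println("Group " + i + ": " + m.group(i));
            }
        }
    }
}
```

Because the (,) group is mandatory inside the repeated part, a single-argument call such as properCase[ARG1] does not match this pattern as written.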
