
Given the string "${test.one}${test.two}", I would like my regex to return two matches, namely "test.one" and "test.two". To do that I have the following snippet:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexTester {

    private static final Pattern pattern = Pattern.compile("\\$\\{((?:(?:[A-z]+(?:\\.[A-z0-9()\\[\\]\"]+)*)+|(?:\"[\\w/?.&=_\\-]*\")+)+)}+$");

    public static void main(String[] args) {
        String testString = "${test.one}${test.two}";

        Matcher matcher = pattern.matcher(testString);

        while (matcher.find()) {
            for (int i = 0; i <= matcher.groupCount(); i++) {
                System.out.println(matcher.group(i));
            }
        }
    }
}

I have some other stuff in there as well, because I want this to also be a valid match: ${test.one}${"hello"}.

So, basically, I just want it to match anything inside of ${} as long as it follows one of these formats: something.somethingelse (alphanumeric only there), something.somethingElse(), or "something inside of quotations" (alphanumeric plus some other characters). I have the main regex working, or so I think, but when I run the code, it finds two groups:

${test.two} test.two

I want the output to be

test.one test.two

  • Something like \$\{(\"[^\"]*\"|\w+(?:\(\))?(?:\.\w+(?:\(\))?)*)}? See regex101.com/r/ILmyTj/1 Commented May 1, 2020 at 17:40
  • And ignore group zero, which represents the entire expression. Commented May 1, 2020 at 17:43
  • I expect you meant ${test.one} test.two for your third-to-last line. Commented May 1, 2020 at 18:25
  • @WiktorStribiżew - that basically works. I tweaked it a little. So, what was the issue with the regex that I had? I'd just like to understand what I was doing wrong so I can hopefully not repeat the same mistake :) Commented May 1, 2020 at 18:46
  • @cloudwalker I tried to explain, but your regex is too cumbersome. If you need more drill-through, please let me know. Commented May 1, 2020 at 19:20

2 Answers

Basically, the main problem with your regex is that it matches only at the end of the string, and [A-z] matches many more chars than just letters. Your grouping also seems off.

If you load your regex at regex101, you will see it matches:

  • \$\{
  • ( - start of a capturing group
    • (?: - start of a non-capturing group
      • (?:[A-z]+ - start of a non-capturing group, and it matches 1+ chars between A and z (your first mistake)
        • (?:\.[A-z0-9()\[\]\"]+)* - 0 or more repetitions of a . and then 1+ letters, digits, (, ), [, ], ", \, ^, _, and a backtick
      • )+ - repeat the non-capturing group 1 or more times
      • | - or
      • (?:\"[\w/?.&=_\-]*\")+ - 1 or more occurrences of ", 0 or more word, /, ?, ., &, =, _, - chars and then a "
      • )+ - end of the outer non-capturing group, repeated 1+ times
    • ) - end of the capturing group
  • }+ - 1+ } chars
  • $ - end of string.
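
The two main issues listed above (the over-broad [A-z] range and the trailing $ anchor) can be reproduced in isolation. The following is only a minimal sketch: the class name, the helper methods, and the simplified patterns are assumptions made for illustration and are not the regex from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OriginalRegexPitfalls {

    // Hypothetical helper: true if the whole input matches the regex.
    static boolean matchesFully(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    // Hypothetical helper: counts how many matches find() reports.
    static int countMatches(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // [A-z] spans code points 65..122, so it also accepts [ \ ] ^ _ and `
        System.out.println(matchesFully("[A-z]+", "[\\]^_`"));    // true
        System.out.println(matchesFully("[A-Za-z]+", "[\\]^_`")); // false

        String input = "${test.one}${test.two}";
        // Simplified pattern with a trailing $: only the last placeholder can match.
        System.out.println(countMatches("\\$\\{[^}]*}$", input)); // 1
        // Same pattern without the anchor: both placeholders are found.
        System.out.println(countMatches("\\$\\{[^}]*}", input));  // 2
    }
}
```

If only ASCII letters are wanted, [A-Za-z] is the usual replacement for [A-z].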

To match any occurrence of your pattern inside a string, you need to use

\$\{(\"[^\"]*\"|\w+(?:\(\))?(?:\.\w+(?:\(\))?)*)}

See the regex demo, and get the Group 1 value after a match is found. Details:

  • \$\{ - a ${ substring
  • (\"[^\"]*\"|\w+(?:\(\))?(?:\.\w+(?:\(\))?)*) - Capturing group 1:
    • \"[^\"]*\" - ", 0+ chars other than " and then a "
    • | - or
    • \w+(?:\(\))? - 1+ word chars and an optional () substring
    • (?:\.\w+(?:\(\))?)* - 0 or more repetitions of . and then 1+ word chars and an optional () substring
  • } - a } char.
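
Because group 0 is always the entire match, printing every group (as the nested loop in the question does) shows both the full ${...} text and its contents. The sketch below illustrates the difference with the pattern explained above; the class name and the extract helper are hypothetical and exist only for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PlaceholderExtractor {

    // The pattern described in the list above, written as a Java string literal.
    private static final Pattern PLACEHOLDER = Pattern.compile(
            "\\$\\{(\"[^\"]*\"|\\w+(?:\\(\\))?(?:\\.\\w+(?:\\(\\))?)*)}");

    // Hypothetical helper: collects the requested group from every match.
    static List<String> extract(String input, int group) {
        List<String> result = new ArrayList<>();
        Matcher matcher = PLACEHOLDER.matcher(input);
        while (matcher.find()) {
            result.add(matcher.group(group));
        }
        return result;
    }

    public static void main(String[] args) {
        String input = "${test.one}${test.two}";
        // Group 0 is the whole match, including the ${ and } delimiters.
        System.out.println(extract(input, 0)); // [${test.one}, ${test.two}]
        // Group 1 is only the captured text between the delimiters.
        System.out.println(extract(input, 1)); // [test.one, test.two]
    }
}
```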

See the Java demo:

// Requires java.util.regex.Matcher and java.util.regex.Pattern
String s = "${test.one}${test.two}\n${test.one}${test.two()}\n${test.one}${\"hello\"}";
Pattern pattern = Pattern.compile("\\$\\{(\"[^\"]*\"|\\w+(?:\\(\\))?(?:\\.\\w+(?:\\(\\))?)*)}");
Matcher matcher = pattern.matcher(s);
while (matcher.find()) {
    // Group 1 holds the text between ${ and }
    System.out.println(matcher.group(1));
}

Output:

test.one
test.two
test.one
test.two()
test.one
"hello"

1 Comment

Thank you, that was super helpful! I've been trying to get into regex stuff more here lately as it is something that I do not have a ton of experience with in the past, and it's definitely been a rocky start :)

You could use the regular expression

(?<=\$\{")[a-z]+(?="\})|(?<=\$\{)[a-z]+\.[a-z]+(?:\(\))?(?=\})

which has no capture groups. The characters classes [a-z] can be modified as required provided they do not include a double-quote, period or right brace.

Java's regex engine performs the following operations.

(?<=\$\{")  # match '${"' in a positive lookbehind
[a-z]+      # match 1+ lowercase letters 
(?="\})     # match '"}' in a positive lookahead
|           # or 
(?<=\$\{)   # match '${' in a positive lookbehind
[a-z]+      # match 1+ lowercase letters 
\.[a-z]+    # match '.' followed by 1+ lowercase letters
(?:\(\))?   # optionally match `()`
(?=\})      # match '}' in a positive lookahead
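
Since this expression has no capture groups, the wanted text is the whole match, so matcher.group() with no argument is enough. The following is a minimal sketch of how it might be used from Java; the class name and the extract helper are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaroundExtractor {

    // The lookaround expression above, written as a Java string literal.
    private static final Pattern LOOKAROUND = Pattern.compile(
            "(?<=\\$\\{\")[a-z]+(?=\"\\})|(?<=\\$\\{)[a-z]+\\.[a-z]+(?:\\(\\))?(?=\\})");

    // Hypothetical helper: collects every whole match.
    static List<String> extract(String input) {
        List<String> result = new ArrayList<>();
        Matcher matcher = LOOKAROUND.matcher(input);
        while (matcher.find()) {
            // The delimiters sit in lookarounds, so they are not part of the match.
            result.add(matcher.group());
        }
        return result;
    }

    public static void main(String[] args) {
        String input = "${test.one}${test.two()}${\"hello\"}";
        System.out.println(extract(input)); // [test.one, test.two(), hello]
    }
}
```

Two things to keep in mind with this variant: the quoted alternative returns the text without its surrounding quotes, and the unquoted alternative requires exactly one period, so ${test} and ${a.b.c} produce no match.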
