0

I'm struggling with a simple regex that I can't seem to get right.

I have some text like so:

This comment is great **[@madeUpUser1](/madeUpUser1)** You said something similar did you mate? **[@madeUpUser2](/madeUpUser2)**

What I would like to end up with is a list containing the usernames between the parentheses, i.e.:

0.madeUpUser1
1.madeUpUser2

And here is the code I have so far:

List<String> matches = Pattern.compile("\\((.+?)\\)")
        .matcher("This comment is great **[@madeUpUser1](/madeUpUser1)** You said something similar did you mate? **[@madeUpUser2](/madeUpUser2)**")
        .results()
        .map(MatchResult::group)
        .collect(Collectors.toList());

However what I'm getting back is this:

0."(/madeUpUser1)"
1."(/madeUpUser2)"

Again, where I want:

0.madeUpUser1
1.madeUpUser2

i.e. without the parentheses and without the forward slash.

Can anyone shed any light on what I'm doing wrong with my regex please?

2
  • You can try (?<=\(\/)[^)]+(?=\)). Commented Jan 5, 2022 at 13:22
  • I suggest adjusting the title of the question. So far, the accepted answer has no solution for "Text extraction into an array list", only a regex that can be used to do that. Maybe it should read "Extraction of string between parentheses excluding the leading slash", or something similar. Commented Jan 7, 2022 at 1:00

3 Answers

1

Try this regex:

(?<=\\(/)[^)]+(?=\\))

Explanation

  • (?<=\\(/) - positive lookbehind to make sure that the current position is preceded by (/

  • [^)]+ - matches 1 or more occurrences (as many as possible) of any character that is not a )

  • (?=\\)) - positive lookahead to make sure that the current position is followed by a )
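
As a minimal sketch, here is how the lookaround pattern above could be dropped into the stream pipeline from the question. The class name LookaroundDemo and the method name extract are made up for illustration; Matcher.results() requires Java 9 or later. Because the lookarounds consume no characters, the whole match is already the bare username, so MatchResult::group can stay as it was:

```java
import java.util.List;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical demo class; the name is not from the original answer.
class LookaroundDemo {
    // Lookbehind requires "(/" before the match, lookahead requires ")" after it.
    private static final Pattern USERNAME = Pattern.compile("(?<=\\(/)[^)]+(?=\\))");

    // Returns every username found between "(/" and ")".
    static List<String> extract(String text) {
        return USERNAME.matcher(text)
                .results()
                .map(MatchResult::group) // whole match, no delimiters included
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String text = "This comment is great **[@madeUpUser1](/madeUpUser1)** "
                + "You said something similar did you mate? **[@madeUpUser2](/madeUpUser2)**";
        extract(text).forEach(System.out::println);
    }
}
```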

With the regex you are using, \\((.+?)\\), the following happens:

  • \\( - matches the opening parenthesis (
  • (.+?) - matches any character (except a newline character) 1 or more times, as few as possible. This subpattern keeps expanding the match until it reaches the ). That's why it matches everything between the parentheses (even the /)
  • \\) - matches the closing parenthesis )
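
One more detail may be worth spelling out: in the question's code, MatchResult::group returns the whole match (group 0), which is why the parentheses show up in the output as well. The sketch below, with a made-up class name GroupDemo and made-up method names, compares group() with group(1) for the original pattern. Even group(1) still contains the leading slash, so the pattern itself has to change too:

```java
import java.util.List;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical demo class; the names are not from the original post.
class GroupDemo {
    // The pattern from the question, unchanged.
    private static final Pattern ORIGINAL = Pattern.compile("\\((.+?)\\)");

    // group() is group 0: the entire match, parentheses included.
    static List<String> wholeMatches(String text) {
        return ORIGINAL.matcher(text).results()
                .map(MatchResult::group)
                .collect(Collectors.toList());
    }

    // group(1) is only what the capturing group matched.
    static List<String> firstGroups(String text) {
        return ORIGINAL.matcher(text).results()
                .map(mr -> mr.group(1))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String text = "**[@madeUpUser1](/madeUpUser1)** and **[@madeUpUser2](/madeUpUser2)**";
        System.out.println(wholeMatches(text)); // parentheses and slash included
        System.out.println(firstGroups(text));  // slash still included
    }
}
```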

2 Comments

Wow thank you for the fast response, this works as expected. May I be a pain and ask can you explain it a bit for me so I can understand why it works? once again thank you very much!!
Performance-wise, Wiktor's answer is more efficient. I made use of lookarounds which are more expensive.
1

You can match ](/ and then capture any zero or more chars other than ( and ) till the next ), and collect Group 1 matches only:

import java.util.*;
import java.util.regex.*;
import java.util.stream.Collectors;


class Test
{
    public static void main (String[] args) throws java.lang.Exception
    {
        String text = "This comment is great **[@madeUpUser1](/madeUpUser1)** You said something similar did you mate? **[@madeUpUser2](/madeUpUser2)**";

        Pattern p = Pattern.compile("]\\(/([^()]*)\\)");
        List<String> results = p.matcher(text)
            .results()
            .map(mr -> mr.group(1))
            .collect(Collectors.toList());
        
        // Or, to get a string array:
        // String[] results = p.matcher(text).results().map(mr -> mr.group(1)).toArray(String[]::new);

        for (String x: results) {
            System.out.println(x);
        }
    }
}

Output:

madeUpUser1
madeUpUser2

Details:

  • ]\(/ - a ](/ string
  • ([^()]*) - Capturing group 1: any zero or more chars other than ) and (
  • \) - a ) char.

1

You can use a capture group and match the surrounding parentheses:

\(/([^\s()]+)\)
  • \(/ Match (/
  • ( Capture group 1
    • [^\s()]+ Match 1+ chars other than a whitespace char or ( )
  • ) Close group 1
  • \) Match )

List<String> matches = Pattern.compile("\\(/([^\\s()]+)\\)")
    .matcher("This comment is great **[@madeUpUser1](/madeUpUser1)** You said something similar did you mate? **[@madeUpUser2](/madeUpUser2)**")
    .results()
    .map(m -> m.group(1))
    .collect(Collectors.toList());

for (String s : matches)
    System.out.println(s);

Output

madeUpUser1
madeUpUser2

Alternatively, since the string between the square brackets in the example appears to be the same username, another option using the same code is:

\[@([^\s\]\[]+)]
  • \[@ match [@
  • ( Capture group 1
    • [^\s\]\[]+ Match 1+ chars other than a whitespace char or [ ]
  • ) Close group 1
  • ] Match ]
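
For completeness, a sketch of how the square-bracket pattern might look as a Java string literal, where every backslash has to be doubled. The class name BracketDemo and the method name extract are assumptions for illustration only:

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical demo class; the name is not from the original answer.
class BracketDemo {
    // \[@([^\s\]\[]+)] written as a Java string literal.
    private static final Pattern MENTION = Pattern.compile("\\[@([^\\s\\]\\[]+)]");

    // Returns the usernames that appear as [@username].
    static List<String> extract(String text) {
        return MENTION.matcher(text)
                .results()
                .map(m -> m.group(1)) // capture group 1 excludes "[@" and "]"
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String text = "This comment is great **[@madeUpUser1](/madeUpUser1)** "
                + "You said something similar did you mate? **[@madeUpUser2](/madeUpUser2)**";
        extract(text).forEach(System.out::println);
    }
}
```

Note that this variant reads the display text rather than the link target, so it only gives the same result when the two are identical, as they are in the example.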
