138

I have this code, and I want to know whether I can replace only the groups (not the whole pattern) in a Java regex. Code:

//...
Pattern p = Pattern.compile("(\\d).*(\\d)");
String input = "6 example input 4";
Matcher m = p.matcher(input);
if (m.find()) {

    // Now I want to replace group one ( (\\d) ) with number
    // and group two (also (\\d) ) with 1, but I don't know how.

}
    Can you clarify your question, like maybe give the expected output for that input? Commented Jun 12, 2009 at 20:12

8 Answers

163

Use $n (where n is a digit) to refer to captured subsequences in replaceFirst(...). I'm assuming you wanted to replace the first group with the literal string "number" and the second group with the value of the first group.

Pattern p = Pattern.compile("(\\d)(.*)(\\d)");
String input = "6 example input 4";
Matcher m = p.matcher(input);
if (m.find()) {
    // replace first number with "number" and second number with the first
    // the added group ("(.*)" which is $2) captures unmodified text to include it in the result
    String output = m.replaceFirst("number$2$1"); // "number example input 6"
}

Consider (\D+) for the second group instead of (.*). The * quantifier is greedy, so (.*) will at first consume everything up to the end of the input, including the last digit. The matcher then has to backtrack when it finds that the final (\d) has nothing left to match, before it can match the final digit.
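
As a rough illustration of that suggestion, the sketch below runs the same "number$2$1" replacement with both middle groups. The class and method names are made up for this example. On the question's input both variants give the same result; they differ only when more digits follow.

```java
import java.util.regex.Pattern;

public class SwapDigits {
    // Greedy middle group: spans up to the LAST digit in the input.
    private static final Pattern GREEDY_MIDDLE = Pattern.compile("(\\d)(.*)(\\d)");
    // Non-digit middle group: stops at the NEXT digit, with no backtracking needed.
    private static final Pattern NON_DIGIT_MIDDLE = Pattern.compile("(\\d)(\\D+)(\\d)");

    static String withGreedyMiddle(String input) {
        return GREEDY_MIDDLE.matcher(input).replaceFirst("number$2$1");
    }

    static String withNonDigitMiddle(String input) {
        return NON_DIGIT_MIDDLE.matcher(input).replaceFirst("number$2$1");
    }

    public static void main(String[] args) {
        System.out.println(withGreedyMiddle("6 example input 4"));   // number example input 6
        System.out.println(withNonDigitMiddle("6 example input 4")); // number example input 6
        System.out.println(withGreedyMiddle("6 a 4 b 2"));           // number a 4 b 6
        System.out.println(withNonDigitMiddle("6 a 4 b 2"));         // number a 6 b 2
    }
}
```

Which variant is appropriate depends on whether the text between the two digits may itself contain digits.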

Edit

Years later, this still gets votes, and the comments and edits (which broke the answer) show there is still confusion on what the question meant. I've fixed it, and added the much needed example output.

The edits to the replacement (some thought $2 should not be used) actually broke the answer. Though the continued votes show the answer hits the key point (use $n references within replaceFirst(...) to reuse captured values), the edits lost the fact that the unmodified text needs to be captured as well and used in the replacement, so that only the groups (not the whole pattern) are replaced.

The question, and thus this answer, is not concerned with iterating. This is intentionally an MRE.


6 Comments

Would have been nice if you would have posted an example output
This works on the first match, but won't work if there are many groups and you are iterating over them with a while(m.find())
I agree with Hugo, this is a terrible way to implement the solution... Why on Earth is this the accepted answer and not acdcjunior's answer, which is the perfect solution: small amount of code, high cohesion and low coupling, much less chance (if not no chance) of unwanted side effects... sigh...
This answer is currently not valid. The m.replaceFirst("number $2$1"); should be m.replaceFirst("number $3$1");
If we want the output proposed in this answer, "number example input 6", then to avoid confusion over the group numbers $n you can name the groups, e.g. start, middle, end: Pattern p = Pattern.compile("(?<start>\\d)(?<middle>.*)(?<end>\\d)"); and then refer to them by name: m.replaceFirst("number${middle}${start}");
76

You could use Matcher#start(group) and Matcher#end(group) to build a generic replacement method:

public static String replaceGroup(String regex, String source, int groupToReplace, String replacement) {
    return replaceGroup(regex, source, groupToReplace, 1, replacement);
}

public static String replaceGroup(String regex, String source, int groupToReplace, int groupOccurrence, String replacement) {
    Matcher m = Pattern.compile(regex).matcher(source);
    for (int i = 0; i < groupOccurrence; i++)
        if (!m.find()) return source; // pattern not met, may also throw an exception here
    return new StringBuilder(source).replace(m.start(groupToReplace), m.end(groupToReplace), replacement).toString();
}

public static void main(String[] args) {
    // replace with "%" what was matched by group 1 
    // input: aaa123ccc
    // output: %123ccc
    System.out.println(replaceGroup("([a-z]+)([0-9]+)([a-z]+)", "aaa123ccc", 1, "%"));

    // replace with "!!!" what was matched the 4th time by the group 2
    // input: a1b2c3d4e5
    // output: a1b2c3d!!!e5
    System.out.println(replaceGroup("([a-z])(\\d)", "a1b2c3d4e5", 2, 4, "!!!"));
}

Check online demo here.
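
If the group has to be replaced in every match rather than in a single occurrence, one possible variation is to build the result in a single pass instead of copying the whole string per replacement. This is only a sketch; the helper name replaceGroupInAllMatches is hypothetical and not part of the answer above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReplaceGroupEverywhere {

    // Hypothetical helper: replaces the text captured by `group` in every match of `regex`.
    // The replacement is inserted literally ("$" and "\" have no special meaning here).
    static String replaceGroupInAllMatches(String regex, String source, int group, String replacement) {
        Matcher m = Pattern.compile(regex).matcher(source);
        StringBuilder out = new StringBuilder();
        int last = 0; // end of the part of `source` already copied to `out`
        while (m.find()) {
            if (m.start(group) < 0) {
                continue; // the group did not participate in this match
            }
            out.append(source, last, m.start(group)).append(replacement);
            last = m.end(group);
        }
        return out.append(source, last, source.length()).toString();
    }

    public static void main(String[] args) {
        // replace every digit captured by group 2 with "!"
        System.out.println(replaceGroupInAllMatches("([a-z])(\\d)", "a1b2c3", 2, "!")); // a!b!c!
    }
}
```

Because the output is appended left to right, no index adjustment is needed after each replacement.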

3 Comments

This really should be the accepted answer it's the most complete and "ready to go" solution without introducing a level of coupling to the accompanying code. Although I would recommend changing the method names of one of those. At first glance it looks like a recursive call in the first method.
Missed edit opportunity. Take back the part about the recursive call, didn't analyse the code properly. The overloads work well together
This solution out-of-the-box is fit only to replace a single occurrence and one group, and because it copies the full string with each replace it would be highly suboptimal for any other purpose. But it's a good starting point. A pity Java has a lot of nonsense, but is lacking basic string manipulation facilities.
41

Sorry to beat a dead horse, but it is kind-of weird that no-one pointed this out - "Yes you can, but this is the opposite of how you use capturing groups in real life".

If you use Regex the way it is meant to be used, the solution is as simple as this:

"6 example input 4".replaceAll("(?:\\d)(.*)(?:\\d)", "number$11");

Or, as rightfully pointed out by shmosel below,

"6 example input 4".replaceAll("\\d(.*)\\d", "number$11");

...since in your regex there is no good reason to group the decimals at all.

You don't usually use capturing groups on the parts of the string you want to discard, you use them on the part of the string you want to keep.
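
A minimal sketch of the same idea with a precompiled Pattern, assuming the goal is still "number", the kept middle text, then a literal 1. The named group middle is an illustrative choice; it sidesteps the ambiguity of a group reference that is directly followed by a digit.

```java
import java.util.regex.Pattern;

public class KeepMiddle {
    // Compiled once and reused; only the part we want to KEEP is captured.
    private static final Pattern DIGIT_WRAPPED = Pattern.compile("\\d(?<middle>.*)\\d");

    static String rewrite(String input) {
        // ${middle} is the captured text; the trailing 1 is a literal character.
        return DIGIT_WRAPPED.matcher(input).replaceAll("number${middle}1");
    }

    public static void main(String[] args) {
        System.out.println(rewrite("6 example input 4")); // number example input 1
    }
}
```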

If you really want groups that you want to replace, what you probably want instead is a templating engine (e.g. Mustache, EJS, StringTemplate, ...).


As an aside for the curious, even non-capturing groups in regexes are just there for the case that the regex engine needs them to recognize and skip variable text. For example, in

(?:abc)*(capture me)(?:bcd)*

you need them if your input can look either like "abcabccapture mebcdbcd" or "abccapture mebcd" or even just "capture me".

Or to put it the other way around: if the text is always the same, and you don't capture it, there is no reason to use groups at all.
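
To make the three inputs above concrete, here is a small sketch (class and method names are invented for the example) that checks what group 1 holds in each case:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OptionalWrappers {
    // The non-capturing groups only let the engine skip repeated, optional text.
    private static final Pattern P = Pattern.compile("(?:abc)*(capture me)(?:bcd)*");

    // Returns the text of group 1, or null if the whole input does not match.
    static String captured(String input) {
        Matcher m = P.matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(captured("abcabccapture mebcdbcd")); // capture me
        System.out.println(captured("abccapture mebcd"));       // capture me
        System.out.println(captured("capture me"));             // capture me
    }
}
```

In all three cases the pattern has exactly one capturing group, so group 1 is always the kept text.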

4 Comments

The non-capturing groups are unnecessary; \d(.*)\d will suffice.
I don't understand the $11 here. Why 11 ?
@Alexis - This is a Java regex quirk: if group 11 does not exist in the pattern, Java interprets $11 as $1 followed by a literal 1.
Won't this approach result in the regular expression being compiled over and over again, with each use? Is there a similar approach using a precompiled Pattern?
4

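Here is a different solution that also allows the replacement of a single group in multiple matches. It pushes the group positions of every match onto stacks and then applies the replacements from the last match to the first, so that each replacement leaves the indices of the not-yet-processed (earlier) matches unchanged.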

private static void demo () {

    final String sourceString = "hello world!";

    final String regex = "(hello) (world)(!)";
    final Pattern pattern = Pattern.compile(regex);

    String result = replaceTextOfMatchGroup(sourceString, pattern, 2, world -> world.toUpperCase());
    System.out.println(result);  // output: hello WORLD!
}

public static String replaceTextOfMatchGroup(String sourceString, Pattern pattern, int groupToReplace, Function<String,String> replaceStrategy) {
    Stack<Integer> startPositions = new Stack<>();
    Stack<Integer> endPositions = new Stack<>();
    Matcher matcher = pattern.matcher(sourceString);

    while (matcher.find()) {
        startPositions.push(matcher.start(groupToReplace));
        endPositions.push(matcher.end(groupToReplace));
    }
    StringBuilder sb = new StringBuilder(sourceString);
    while (! startPositions.isEmpty()) {
        int start = startPositions.pop();
        int end = endPositions.pop();
        if (start >= 0 && end >= 0) {
            sb.replace(start, end, replaceStrategy.apply(sourceString.substring(start, end)));
        }
    }
    return sb.toString();       
}


4

Replace the values of the password fields in an input such as:

{"_csrf":["9d90c85f-ac73-4b15-ad08-ebaa3fa4a005"],"originPassword":["uaas"],"newPassword":["uaas"],"confirmPassword":["uaas"]}



  private static final Pattern PATTERN = Pattern.compile(".*?password.*?\":\\[\"(.*?)\"\\](,\"|}$)", Pattern.CASE_INSENSITIVE);

  private static String replacePassword(String input, String replacement) {
    Matcher m = PATTERN.matcher(input);
    StringBuffer sb = new StringBuffer();
    while (m.find()) {
      Matcher m2 = PATTERN.matcher(m.group(0));
      if (m2.find()) {
        StringBuilder stringBuilder = new StringBuilder(m2.group(0));
        String result = stringBuilder.replace(m2.start(1), m2.end(1), replacement).toString();
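        // quoteReplacement keeps any "$" or "\" in the text from being read as a group reference or escape
        m.appendReplacement(sb, Matcher.quoteReplacement(result));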
      }
    }
    m.appendTail(sb);
    return sb.toString();
  }

  @Test
  public void test1() {
    String input = "{\"_csrf\":[\"9d90c85f-ac73-4b15-ad08-ebaa3fa4a005\"],\"originPassword\":[\"123\"],\"newPassword\":[\"456\"],\"confirmPassword\":[\"456\"]}";
    String expected = "{\"_csrf\":[\"9d90c85f-ac73-4b15-ad08-ebaa3fa4a005\"],\"originPassword\":[\"**\"],\"newPassword\":[\"**\"],\"confirmPassword\":[\"**\"]}";
    Assert.assertEquals(expected, replacePassword(input, "**"));
  }


3

You can use the matcher.start(group) and matcher.end(group) methods to get the positions of each group. Using these positions you can replace the text of any group.
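
A minimal sketch of that approach, applied to the pattern and input from the question and assuming group one should become the literal text number and group two the literal 1 (the class and method names are made up for the example):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReplaceByPosition {

    static String replaceBothDigits(String input) {
        Matcher m = Pattern.compile("(\\d).*(\\d)").matcher(input);
        if (!m.find()) {
            return input; // nothing to replace
        }
        StringBuilder sb = new StringBuilder(input);
        // Replace the later group first so the earlier group's indices stay valid.
        sb.replace(m.start(2), m.end(2), "1");
        sb.replace(m.start(1), m.end(1), "number");
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(replaceBothDigits("6 example input 4")); // number example input 1
    }
}
```

The text between the two groups is never captured or rebuilt; it simply stays in place in the StringBuilder.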


1

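Since Java 9 you can use the Matcher.replaceAll overload that takes a Function<MatchResult, String> (the String.formatted call used below additionally requires Java 15). The usage is as follows: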

Pattern p = Pattern.compile("(\\d)(.*)(\\d)");
String input = "6 example input 4";
Matcher matcher = p.matcher(input);
String output = matcher.replaceAll(matchResult -> "%s%s%s".formatted("number", matchResult.group(2), matchResult.group(1) ));

The output is "number example input 6".

matchResult.group(0) is the whole match, so the capturing groups are indexed from 1.
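
One detail worth keeping in mind: the string returned by the function is still processed as a replacement string, so a "$" or "\" in it is interpreted rather than inserted as is. The sketch below (Java 9+, with illustrative names) wraps the result in Matcher.quoteReplacement to insert it literally.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuoteInFunction {
    private static final Pattern DIGIT = Pattern.compile("(\\d)");

    // Puts a literal "$" in front of every digit.
    static String prefixDigitsWithDollar(String input) {
        return DIGIT.matcher(input)
                .replaceAll(r -> Matcher.quoteReplacement("$" + r.group(1)));
    }

    public static void main(String[] args) {
        System.out.println(prefixDigitsWithDollar("cost 5 or 7")); // cost $5 or $7
    }
}
```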


0

Here is a search / replace solution which supports:

  1. Regex with groups OR no groups.
  2. Replace just the first match OR all matches.

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Utils {

    public static String searchReplace(String content, 
                                       String matchRegex, 
                                       String replaceText, 
                                       boolean firstOnly) {
        Pattern pattern = Pattern.compile(matchRegex, Pattern.MULTILINE);
        Matcher matcher = pattern.matcher(content);
        int group = matcher.groupCount() > 0 ? 1 : 0;

        StringBuilder output = new StringBuilder();
        while (matcher.find()) {
            Matcher m = pattern.matcher(matcher.group(0));
            if (m.find()) {
                StringBuilder stringBuilder = new StringBuilder(m.group(0));
                String result = stringBuilder.replace(m.start(group), m.end(group), replaceText).toString();
                matcher.appendReplacement(output, result);
            }
            if (firstOnly) {
                break;
            }
        }
        matcher.appendTail(output);
        return output.toString();
    }

}

To test all these scenarios, here is a unit test (it uses the AssertJ library):

import org.junit.jupiter.api.Test;
import static org.assertj.core.api.Assertions.assertThat;

public class UtilsTest {

    @Test
    public void shouldSearchReplace() {
        String input = "black cat, black cat, black dog, white cat";

        // 1. Search with (a) NO GROUPS + (b) REPLACE FIRST ONLY
        assertThat(Utils.searchReplace(input, "black", "red", true))
                .isEqualTo("red cat, black cat, black dog, white cat");

        // 2. Search with (a) NO GROUPS + (b) REPLACE ALL
        assertThat(Utils.searchReplace(input, "black", "red", false))
                .isEqualTo("red cat, red cat, red dog, white cat");

        // 3. Search with (a) GROUPS + (b) REPLACE FIRST ONLY
        assertThat(Utils.searchReplace(input, "black (cat)", "pig", true))
                .isEqualTo("black pig, black cat, black dog, white cat");

        // 4. Search with (a) GROUPS + (b) REPLACE ALL
        assertThat(Utils.searchReplace(input, "black (cat)", "pig", false))
                .isEqualTo("black pig, black pig, black dog, white cat");
    }
}
