I put together this simplified version of my code to demonstrate the issue:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Demo {
    public static void main(String[] args) {
        String content = "1 [thing i want]\n" +
            "2 [thing i dont want]\n" +
            "3 [thing i dont want] [thing i want]\n" +
            "4 // [thing i want]\n" +
            "5 [thing i want]  // [thing i want]\n";

        // Intended to skip targets that sit inside a // comment
        String BASE_REGEX = "(?!//)\\[%s\\]";
        Pattern myRegex = Pattern.compile(String.format(BASE_REGEX, "thing i want"));
        Matcher m = myRegex.matcher(content);
        System.out.println("match? " + m);
        String newContent = m.replaceAll("best thing ever");
        System.out.println("regex " + myRegex);
        System.out.println("content:\n" + content);
        System.out.println("new content:\n" + newContent);
    }
}

I expect my output to be:

new content:
1 best thing ever
2 [thing i dont want]
3 [thing i dont want] best thing ever
4 // [thing i want]
5 best thing ever  // [thing i want]

but I see:

new content:
1 best thing ever
2 [thing i dont want]
3 [thing i dont want] best thing ever
4 // best thing ever
5 best thing ever  // best thing ever

How do I fix the regex?

The unmodified string:

content:
1 [thing i want]
2 [thing i dont want]
3 [thing i dont want] [thing i want]
4 // [thing i want]
5 [thing i want]  // [thing i want]
  • The (?!//) lookahead always succeeds, because the next character to be consumed is [. It looks like you want to avoid replacing inside single-line comments, right? Match those comments as well, and only replace the matches found in other contexts. Commented Aug 31, 2016 at 19:06
  • I don't see the relationship between the things you want and the things you don't. Could you post a separate text block showing the string content as it appears when printed? Commented Aug 31, 2016 at 19:09
  • @sln I added it to the original question. Commented Aug 31, 2016 at 19:58
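
The suggestion in the first comment (match the comments too, and only replace matches found elsewhere) can be sketched roughly as follows. This is only one possible reading of that comment, not code posted by the commenter: the class name SkipCommentsDemo and the helper replaceOutsideComments are made up for illustration, and it assumes that every // starts a comment (so a // inside a string literal would be treated as a comment too).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SkipCommentsDemo {
    // Hypothetical helper: replaces [target] only where it is not inside a // comment.
    static String replaceOutsideComments(String content, String target, String replacement) {
        // Alternative 1 consumes a whole // comment up to the end of the line;
        // alternative 2 captures the bracketed target in group 1.
        Pattern p = Pattern.compile("//.*|(\\[" + Pattern.quote(target) + "\\])");
        Matcher m = p.matcher(content);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            // Group 1 is null when the comment alternative matched: put that text back unchanged.
            String out = (m.group(1) != null) ? replacement : m.group();
            m.appendReplacement(sb, Matcher.quoteReplacement(out));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        String content = "1 [thing i want]\n"
                + "4 // [thing i want]\n"
                + "5 [thing i want]  // [thing i want]\n";
        System.out.print(replaceOutsideComments(content, "thing i want", "best thing ever"));
    }
}
```

Because the comment alternative comes first and swallows the rest of the line, a target inside a comment is never reached by the second alternative.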

1 Answer

There's no really simple way to test whether something is inside an inline comment. The Java regex engine can look backward, but only over a limited "distance" (in other words, a lookbehind may be variable-length only if its maximum length is bounded), and I'm not sure a pattern built on this feature would be very efficient.
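
For illustration, a bounded lookbehind of that kind could look like the sketch below. The limit of 200 characters is an arbitrary assumption (a target further than that from the // on its line would be replaced anyway), and the class name BoundedLookbehindDemo is made up.

```java
import java.util.regex.Pattern;

class BoundedLookbehindDemo {
    // Assumption: at most 200 characters separate "//" from the target on a line.
    // [^\n] keeps the lookbehind from reaching back into a previous line.
    static final Pattern NOT_IN_COMMENT =
            Pattern.compile("(?<!//[^\\n]{0,200})\\[thing i want\\]");

    static String replace(String content) {
        return NOT_IN_COMMENT.matcher(content).replaceAll("best thing ever");
    }

    public static void main(String[] args) {
        System.out.print(replace("4 // [thing i want]\n5 [thing i want]  // [thing i want]\n"));
    }
}
```

The engine has to try many possible start positions for the lookbehind at every candidate, which is why I doubt it is the efficient choice.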

What you can do instead is match everything from the start of each line up to the target with:

(?m)((?:\G|^)[^\[/\n]*+(?:\[(?!thing i want\])[^\[/\n]*|/(?!/)[^\[/\n]*)*+)\[thing i want\]

(double each backslash to write the pattern as a Java string literal)

With the replacement:

$1best thing ever
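
Written as Java, the pattern and replacement above could be used like this. It is a minimal sketch: the class name LineStartAnchoredReplace and the helper replace are made up for illustration, and the input is the sample string from the question.

```java
import java.util.regex.Pattern;

class LineStartAnchoredReplace {
    // The pattern from above, with every backslash doubled for a Java string literal.
    static final Pattern TARGET = Pattern.compile(
            "(?m)((?:\\G|^)[^\\[/\\n]*+"
            + "(?:\\[(?!thing i want\\])[^\\[/\\n]*|/(?!/)[^\\[/\\n]*)*+)"
            + "\\[thing i want\\]");

    static String replace(String content) {
        // $1 puts back the text that was consumed before the target.
        return TARGET.matcher(content).replaceAll("$1best thing ever");
    }

    public static void main(String[] args) {
        String content = "1 [thing i want]\n"
                + "2 [thing i dont want]\n"
                + "3 [thing i dont want] [thing i want]\n"
                + "4 // [thing i want]\n"
                + "5 [thing i want]  // [thing i want]\n";
        System.out.print(replace(content));
    }
}
```

On the sample input, lines 1, 3 and 5 have their uncommented target replaced, while everything after a // is left as it was.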

Explanation: the goal is to capture everything that precedes the target, starting either from the start of the line or from the end of the previous match on the same line. This way you can describe precisely what is allowed before a target occurrence (anything that isn't the target itself or two consecutive slashes).

(?m) # switch the multi-line mode on: the ^ means "start of the line"
(    # open the capture group $1
    (?:    # non-capturing group: two possible starts
        \G # contiguous to a previous match (on the same line) 
      |    # OR
        ^  # at the start of the line
    )

    [^\[/\n]*+ # all that is not an opening bracket, a slash or a newline
              # * stands for "0 or more times"; the + after it forbids
              # backtracking into this part if the pattern fails later
              # ("*+" is called a "possessive quantifier")
    (?:
        \[                   # literal [
         (?!thing i want\])  # not followed by "thing i want]"
         [^\[/\n]*            
      |                      # OR
         /                   # literal /
         (?!/)               # not followed by another /
         [^\[/\n]*
     )*+  # zero or more times
) # close the capture group $1
\[thing i want\] # the target
4 Comments

Are there two missing ] in the above?
@MDKF: No, but the [ must be escaped inside character classes in Java. My mistake, it's corrected now.
Thanks! It's really close. I'm losing the two square brackets during replacement. Do I have to reinsert them in the replacement string ("$1[best thing ever]"), or can the regex be changed to keep them?
@MDKF: Reinserting them in the replacement string is simpler.
