
I have the following string

aaaa#include(soap1.xml)bbbb #include(soap2.xml)cccc #include(soap2.xml)

I want to find all occurrences of #include([anyfilename]) where [anyfilename] varies.

I have the regex (?<=#include\()(.*?)(?=\)*\)) which matches [anyfilename], but performing a replace with it leaves behind the #include().

Can someone show me how to find/replace the entire #include([anyfilename])?

  • What is the expected result? Please add your current code to the answer. Do you want to remove #include(soap2.xml) and #include(soap1.xml)? Commented Feb 26, 2016 at 21:56
  • "performing a replace using this leaves behind there #include()" well, look-around mechanisms are zero-length (they are not included in match - the one you want to replace) so that behaviour is expected. What else did you expect and why? Commented Feb 26, 2016 at 22:02
  • I shared a wrong link :) I will post an answer. Commented Feb 26, 2016 at 22:13
  • \)*\) is the same as \)+, which is clearer on intent. As for that, why match multiple close-parenthesis? Commented Feb 26, 2016 at 22:15
  • Right now, the positive look-behind/-ahead will check for existence, without being included. If you want to include them in the match, just drop the look-behind/-ahead parts. And since your capture group would then equal the entire match, there's no need for capturing. So that would result in: #include\(.*?\)+ (see regex101 for result). Commented Feb 26, 2016 at 22:18

1 Answer

You may use the following regex:

#include\(([^)]*)\)

See the regex demo

I replaced the lookarounds (zero-width assertions that do not consume text and therefore are not part of the match value) with consuming equivalents.
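To make the difference concrete, here is a minimal sketch that runs both patterns through replaceAll. The class name LookaroundDemo and the helper strip are illustrative names, not part of the original post; the first pattern is the one from the question, the second is the consuming version given above.

```java
public class LookaroundDemo {
    // Pattern from the question: the lookbehind and lookahead only assert
    // context, so "#include(" and ")" are never part of the match.
    static final String LOOKAROUND = "(?<=#include\\()(.*?)(?=\\)*\\))";

    // Consuming pattern: the whole directive is inside the match.
    static final String CONSUMING = "#include\\(([^)]*)\\)";

    // Hypothetical helper: removes every match of regex from input.
    static String strip(String input, String regex) {
        return input.replaceAll(regex, "");
    }

    public static void main(String[] args) {
        String s = "aaaa#include(soap1.xml)bbbb";
        System.out.println(strip(s, LOOKAROUND)); // => aaaa#include()bbbb
        System.out.println(strip(s, CONSUMING));  // => aaaabbbb
    }
}
```

With the lookaround version only the file name is removed, which is exactly the leftover #include() described in the question.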

The regex breakdown:

  • #include\( - match a sequence of literal symbols #include(
  • ([^)]*) - Group 1 (we'll refer to the value inside the group with matcher.group(1)) matching zero or more characters other than )
  • \) - match a literal )

The same pattern can be used to retrieve the filenames, and remove whole #include()s from the input.

IDEONE demo:

import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
    public static void main(String[] args) {
        String str = "aaaa#include(soap1.xml)bbbb#include(soap2.xml)cccc";
        String p = "#include\\(([^)]*)\\)";
        Pattern ptrn = Pattern.compile(p);
        Matcher matcher = ptrn.matcher(str);
        List<String> arr = new ArrayList<>();
        while (matcher.find()) {
            arr.add(matcher.group(1));       // Get the Group 1 value, the file name
        }
        System.out.println(arr); // => [soap1.xml, soap2.xml]
        System.out.println(str.replaceAll(p, "")); // => aaaabbbbcccc
    }
}
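If the goal is to replace each #include(...) with something that depends on the file name rather than simply removing it, the same pattern should work with a replacement function. The sketch below assumes Java 9 or later (for Matcher.replaceAll with a function, and Map.of); the class IncludeExpander, the method expand, and the contents map are hypothetical names and data used only for illustration.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class IncludeExpander {
    private static final Pattern INCLUDE = Pattern.compile("#include\\(([^)]*)\\)");

    // Replaces each #include(name) with contents.get(name).
    // Directives whose name is not in the map are left unchanged.
    static String expand(String input, Map<String, String> contents) {
        Matcher m = INCLUDE.matcher(input);
        return m.replaceAll(mr -> {
            String replacement = contents.getOrDefault(mr.group(1), mr.group());
            // Escape '$' and '\' so the text is inserted literally.
            return Matcher.quoteReplacement(replacement);
        });
    }

    public static void main(String[] args) {
        // Made-up replacement values, standing in for real file contents.
        Map<String, String> contents = Map.of("soap1.xml", "<a/>", "soap2.xml", "<b/>");
        String str = "aaaa#include(soap1.xml)bbbb#include(soap2.xml)cccc";
        System.out.println(expand(str, contents)); // => aaaa<a/>bbbb<b/>cccc
    }
}
```

Matcher.quoteReplacement matters here because replacement strings treat $ and \ specially, and real file contents may contain either.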