I found that String.replaceAll() in Java uses regular expressions under the hood. It works fine for short strings, but for long strings I need a more efficient algorithm than String.replaceAll(). Could anyone give some advice? Thanks!

  • You'll need to provide more details if you want a meaningful answer. Commented Aug 19, 2010 at 13:43
  • What language are you talking about? Commented Aug 19, 2010 at 13:43
  • replaceAll() is quite tuned for the general case. Sharing the specific string you are trying to search for might help come up with a better answer. Commented Aug 19, 2010 at 14:10

2 Answers

If you want to do incremental replacement, you can use an explicit appendReplacement/appendTail loop that appends to a StringBuffer (unfortunately, as of this writing there are no StringBuilder overloads).

Here's the idiom from the documentation:

 Pattern p = Pattern.compile(PATTERN);
 Matcher m = p.matcher(INPUT_SOURCE);

 StringBuffer sb = new StringBuffer();
 while (m.find()) {
     m.appendReplacement(sb, REPLACEMENT);
 }
 m.appendTail(sb);

 System.out.println(sb.toString());

This is pretty much how replaceAll is implemented.

The benefit of this method is that since you have full control over the replacement iteration, you don't have to store the entire output in memory at any given time as a potentially long String. You can incrementally build the output, flushing the StringBuffer content to the disk periodically. That is, using this method can be more memory efficient than using replaceAll.
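As a rough sketch of that incremental approach, the loop below appends each replacement to a StringBuffer and hands the buffer off to a Writer whenever it grows past a threshold. The class name, the method name replaceTo and the FLUSH_THRESHOLD value are made up for illustration; any Writer (for example a FileWriter) could stand in for the StringWriter used in main. Note that only the output is streamed here; the input is still held in memory as a CharSequence.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class IncrementalReplace {
    // Hypothetical threshold: hand the buffer to the writer once it exceeds this many chars.
    static final int FLUSH_THRESHOLD = 8192;

    static void replaceTo(CharSequence input, Pattern pattern, String replacement, Writer out) {
        Matcher m = pattern.matcher(input);
        StringBuffer sb = new StringBuffer();
        try {
            while (m.find()) {
                // Appends the text since the previous match, then the replacement.
                m.appendReplacement(sb, replacement);
                if (sb.length() >= FLUSH_THRESHOLD) {
                    out.append(sb);
                    // Safe to clear: the matcher tracks its own append position in the input.
                    sb.setLength(0);
                }
            }
            // Appends whatever follows the last match.
            m.appendTail(sb);
            out.append(sb);
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        StringWriter out = new StringWriter();
        replaceTo("one cat, two cats", Pattern.compile("cat"), "dog", out);
        System.out.println(out); // prints "one dog, two dogs"
    }
}
```

Clearing the buffer with setLength(0) works because appendReplacement copies from the matcher's own position in the input rather than relying on what the buffer already contains.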

(You can also do fancy replacements that are not supported by the current replacement syntax, e.g. a toUpperCase() transformation.)
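A minimal sketch of such a computed replacement, using the same loop: each match is upper-cased in Java code before being appended. The class and method names are hypothetical. Matcher.quoteReplacement is used so that any `$` or `\` in the computed text is taken literally rather than as replacement syntax.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UpperCaseReplace {
    // Replaces every match of the pattern with its upper-cased form.
    static String upperCaseMatches(String input, Pattern pattern) {
        Matcher m = pattern.matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            String replacement = m.group().toUpperCase(Locale.ROOT);
            // Quote the computed text so '$' and '\' are not treated as group references.
            m.appendReplacement(sb, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        // Upper-cases every word that starts with 'b'.
        System.out.println(upperCaseMatches("hello big world", Pattern.compile("\\bb\\w*")));
        // prints "hello BIG world"
    }
}
```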

Note that there is a request for enhancement for Matcher to be able to append to any Appendable. If granted, not only can you use StringBuilder, but you can also directly replace to e.g. a FileWriter.

You could try String.replace(CharSequence target, CharSequence replacement). It still uses a pattern and a matcher but target is not a regular expression.
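To illustrate the difference in how the target is interpreted, here is a small sketch (the class and method names are made up for this example). With replace() a "." matches only a literal dot, while with replaceAll() the same "." is a regex that matches any character.

```java
class LiteralReplace {
    // Treats the target as plain text.
    static String literal(String input) {
        return input.replace(".", "-");
    }

    // Treats the target as a regular expression, where "." matches any character.
    static String regex(String input) {
        return input.replaceAll(".", "-");
    }

    public static void main(String[] args) {
        System.out.println(literal("a.b.c")); // prints "a-b-c"
        System.out.println(regex("a.b.c"));   // prints "-----"
    }
}
```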

1 Comment

Precisely because this uses a Pattern/Matcher underneath, it's wrong to think that this is a lot more efficient performance-wise (though it should be acceptable).
