Requirement: a String "richText" that can contain plain text plus anchor tags. Each anchor tag is rewritten to modify its target, append JS, etc.

Issue: The Matcher find() and appendReplacement() loop works fine as long as the anchor tag contains no special character "$". It throws an exception when "$" is part of the anchor tag.

Line 1 avoids the exception but creates a new problem: if "$" or "\" is present in the plain text, the plain text now carries additional escape characters in front of those two special characters (because of quoteReplacement()). How do I strip the additional escape characters from the plain text, i.e. undo the effect of quoteReplacement()?

Method:

    String richText = Matcher.quoteReplacement(rText); //Line 1-escape characters   
    String anchorTagPattern = "<a[^>]*?href\\s*=[^>]*>(.*?)</a>";
    StringBuffer result = new StringBuffer(richText.length());
    Pattern pattern = Pattern.compile(anchorTagPattern);
    Matcher matcher = pattern.matcher(richText);
    while (matcher.find()) {
               String aTag = matcher.group();
               .......
               String formattedAnchorTag = rewriteTag(aTag);
               matcher.appendReplacement(result, formattedAnchorTag); ....
    }
    matcher.appendTail(result);
    // Plain text containing $ or \ now has additional escape characters because of Line 1. How to remove them?

rText entered is

Plain text having $. Anchor tag to be rewritten is <a href=\"http://www.google.com\">google$</a>

If Line 1 (the quoteReplacement call) is commented out, I get java.lang.IllegalArgumentException: Illegal group reference at java.util.regex.Matcher.appendReplacement(Matcher.java:724)

If I leave it in, the exception goes away, but the returned string is

Plain text having \$. Anchor tag to be rewritten is <a href="http://www.google.com" target="_blank">google$</a>
  • After richText is assigned, but before anchorTagPattern is created, what does richText look like for the failing inputs? Also, what does rText look like for those inputs? Commented Mar 14, 2012 at 19:29

1 Answer

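Matcher.quoteReplacement should not be called on rText: the input text is never interpreted as a replacement string, so it needs no escaping. Only the string passed to appendReplacement, i.e. the result of rewriteTag, can trigger the exception, so quote that instead. The first question mark in the pattern also seems superfluous.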


    formattedAnchorTag = Matcher.quoteReplacement(formattedAnchorTag);
    matcher.appendReplacement(result, formattedAnchorTag);
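For completeness, here is a minimal, self-contained sketch of the whole loop with the quoting moved to the replacement. The class name AnchorRewriter, the method rewriteAnchors, and the body of rewriteTag are assumptions made up for illustration (the question does not show rewriteTag); here it only adds target="_blank", to match the output shown in the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AnchorRewriter {
    // Same pattern as in the question.
    private static final Pattern ANCHOR =
            Pattern.compile("<a[^>]*?href\\s*=[^>]*>(.*?)</a>");

    // Hypothetical stand-in for the question's rewriteTag():
    // inserts target="_blank" before the first '>' of the tag.
    static String rewriteTag(String aTag) {
        int end = aTag.indexOf('>');
        return aTag.substring(0, end) + " target=\"_blank\"" + aTag.substring(end);
    }

    static String rewriteAnchors(String rText) {
        // No quoteReplacement on the input: text between matches is copied verbatim.
        Matcher matcher = ANCHOR.matcher(rText);
        StringBuffer result = new StringBuffer(rText.length());
        while (matcher.find()) {
            String formattedAnchorTag = rewriteTag(matcher.group());
            // Quote only the replacement, so "$" and "\" in it are taken literally.
            matcher.appendReplacement(result, Matcher.quoteReplacement(formattedAnchorTag));
        }
        matcher.appendTail(result);
        return result.toString();
    }

    public static void main(String[] args) {
        String rText = "Plain text having $. Anchor tag to be rewritten is "
                + "<a href=\"http://www.google.com\">google$</a>";
        System.out.println(rewriteAnchors(rText));
    }
}
```

With the sample input from the question, the "$" in the plain text should come through without a leading backslash, since nothing is ever escaped there in the first place and so nothing has to be undone afterwards.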
1 Comment

That should not matter. The source text may contain $ without special significance.
