Using Java, I am writing a script to add anchor links to an HTML bibliography. That is, going from: [1,2] to: <a href="o100701.html#bib1">[1, 2]</a>

I think I have found the right regular expression: \[.*?\]

What I am having trouble with is writing the code that retains the values inside the match while surrounding them with the link tags.

This is the most I can come up with:

while(myScanner.hasNext())
{
 line = myScanner.nextLine();
 myMatcher = myPattern.matcher(line);
 ...
 outputBufferedWritter.write(line+"\n");
}

The files aren't very large and there are almost always fewer than 100 matches, so I don't care about performance.

  • Will the link URL be different depending on what's between the brackets? Commented Jul 27, 2010 at 9:22
  • Yes, so [1] should go to anchor 1, [29] to anchor 29. I am not quite sure what I'm going to do for [1,29] or [1-29]; maybe just go to the first anchor. Commented Jul 27, 2010 at 17:52

1 Answer

First of all, I think a better pattern to match the [tag] content is [^\[\]]* instead of .*? (i.e. anything but opening and closing brackets).
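To see why the negated character class is safer, here is a minimal sketch comparing the two patterns on an input with a stray opening bracket. The class name BracketPatternDemo, the helper matches, and the sample text are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BracketPatternDemo {
    // Returns every substring of text matched by regex, in order of appearance.
    static List<String> matches(String regex, String text) {
        List<String> found = new ArrayList<String>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // Hypothetical input with an unbalanced '[' before the first citation.
        String text = "see [ref [1,2] and [3]";

        // Lazy dot: starts at the stray '[' and runs to the first ']',
        // so the first match is "[ref [1,2]" and the second is "[3]".
        System.out.println(matches("\\[.*?\\]", text));

        // Negated class: cannot cross another bracket,
        // so the matches are "[1,2]" and "[3]".
        System.out.println(matches("\\[[^\\[\\]]*\\]", text));
    }
}
```

On well-formed input with no nested or stray brackets, both patterns find the same matches.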

For the replacement, if the URL varies depending on the [tag] content, then you need an explicit Matcher.find() loop combined with appendReplacement/Tail.

Here's an example that sets up a Map<String,String> of the URLs and a Matcher.find() loop for the replacement:

    Map<String,String> hrefs = new HashMap<String,String>();
    hrefs.put("[1,2]", "one-two");
    hrefs.put("[3,4]", "three-four");
    hrefs.put("[5,6]", "five-six");

    String text = "p [1,2] \nq [3,4] \nr [5,6] \ns";

    Matcher m = Pattern.compile("\\[[^\\[\\]]*\\]").matcher(text);
    StringBuffer sb = new StringBuffer();
    while (m.find()) {
        String section = m.group(0);
        String url = String.format("<a href='%s'>%s</a>",
            hrefs.get(section),
            section
        );
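        // quoteReplacement keeps any '$' or '\' in the link from being read as a group reference
        m.appendReplacement(sb, Matcher.quoteReplacement(url));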
    }
    m.appendTail(sb);

    System.out.println(sb.toString());

This prints:

p <a href='one-two'>[1,2]</a> 
q <a href='three-four'>[3,4]</a> 
r <a href='five-six'>[5,6]</a> 
s

Note that before Java 9, appendReplacement and appendTail have no StringBuilder overloads, so StringBuffer must be used. Java 9 added StringBuilder versions of both methods.
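If the link should simply point at the first number inside the brackets, as suggested in the comments on the question, the lookup map can be replaced by a capturing group. The following is only a sketch under that assumption: the class name BibLinker and the method link are made up, the "#bib" prefix and the file name o100701.html are taken from the example in the question, and brackets that do not start with a digit are left untouched.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BibLinker {
    // Matches [..] whose content starts with digits; group 1 captures that first number.
    private static final Pattern CITATION =
            Pattern.compile("\\[(\\d+)[^\\[\\]]*\\]");

    // page is the target file; assumed to be the same for every citation in the text.
    static String link(String text, String page) {
        Matcher m = CITATION.matcher(text);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            String anchor = "<a href=\"" + page + "#bib" + m.group(1) + "\">"
                    + m.group() + "</a>";
            // Quote the replacement so '$' or '\' in it is taken literally.
            m.appendReplacement(sb, Matcher.quoteReplacement(anchor));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(link("as shown in [1,2] and [29]", "o100701.html"));
        // prints: as shown in <a href="o100701.html#bib1">[1,2]</a> and <a href="o100701.html#bib29">[29]</a>
    }
}
```

With this pattern, [1,29] and [1-29] both link to #bib1. Inside the loop from the question, calling link(line, "o100701.html") on each line before writing it out should be enough.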
