11

I am trying to replace a URL inside a String with a browser-compatible HTML link.

My initial String looks like this:

"hello, i'm some text with an url like http://www.the-url.com/ and I need to have an hypertext link !"

What I want to get is a String that looks like this:

"hello, i'm some text with an url like <a href="http://www.the-url.com/">http://www.the-url.com/</a> and I need to have an hypertext link !"

I can match the URL with this line of code:

String withUrlString = myString.replaceAll(".*://[^<>[:space:]]+[[:alnum:]/]", "<a href=\"null\">HereWasAnURL</a>");

The regular expression may need some correction, but it seems to work; I still need to test it further.

So the question is: how do I keep the text matched by the regexp and just add what is needed to create the link around the matched string?

Thanks in advance for your interest and responses!

1

6 Answers

7
public static String textToHtmlConvertingURLsToLinks(String text) {
    if (text == null) {
        return text;
    }

    String escapedText = HtmlUtils.htmlEscape(text);

    return escapedText.replaceAll("(\\A|\\s)((http|https|ftp|mailto):\\S+)(\\s|\\z)",
        "$1<a href=\"$2\">$2</a>$4");
}

There may be better regexes out there, but this does the trick as long as there is whitespace after the end of the URL or the URL is at the end of the text. This particular implementation also uses org.springframework.web.util.HtmlUtils to escape any other HTML that may have been entered.


1 Comment

Doesn't work for two links that are just separated by one space.
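
The adjacent-links case fails because the trailing (\\s|\\z) group consumes the space that the next match would need as its leading delimiter. Here is a minimal sketch of one way around that, assuming a lookbehind is acceptable: it only asserts that the URL is not preceded by a non-whitespace character, so nothing around the URL is consumed. The class name AdjacentLinks and the method linkify are made up for illustration, and the input is assumed to be HTML-escaped already.

```java
import java.util.regex.Pattern;

public class AdjacentLinks {
    // (?<!\S) asserts "start of text or preceded by whitespace" without
    // consuming the delimiter, so one space can separate two matches.
    // \S+ is greedy, so each match runs up to the next whitespace or the end.
    private static final Pattern URL = Pattern.compile(
            "(?<!\\S)(?:https?|ftp|mailto):\\S+");

    // Hypothetical helper: wraps every matched URL in an anchor tag.
    public static String linkify(String escapedText) {
        // $0 is the whole match, so no capturing groups are needed.
        return URL.matcher(escapedText).replaceAll("<a href=\"$0\">$0</a>");
    }

    public static void main(String[] args) {
        System.out.println(linkify("see http://a.example/ http://b.example/ now"));
    }
}
```

Like the original, this still treats any trailing punctuation that is glued to the URL as part of the link.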
6

Try to use:

myString.replaceAll("(.*://[^<>[:space:]]+[[:alnum:]/])", "<a href=\"$1\">HereWasAnURL</a>");

I didn't check your regex.

Parentheses create a capturing group, and $1 in the replacement string refers to the first group, so $1 is replaced by the matched URL.

I asked a similar question: my question
Some examples: Capturing Text in a Group in a regular expression

1 Comment

This doesn't work for multiple links in a text because the .* takes too much.
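
To illustrate the point about .* being too greedy, here is a hedged sketch that swaps the leading .* for a pattern matching only the characters of a URL scheme, so each match starts at its own URL. It also uses \\s and \\p{Alnum}, because as far as I know Java's regex engine does not support POSIX bracket names such as [:space:] and treats them as ordinary nested character classes. The class name GroupReplace and the method linkify are assumptions for illustration, not part of the original answer.

```java
import java.util.regex.Pattern;

public class GroupReplace {
    // The scheme is limited to letters, digits, '+', '.' and '-', instead of ".*",
    // so text before the URL is never swallowed into the match.
    // The final class requires the URL to end in an alphanumeric character or '/'.
    private static final Pattern URL = Pattern.compile(
            "(\\b[a-zA-Z][a-zA-Z0-9+.-]*://[^<>\\s]+[\\p{Alnum}/])");

    // Hypothetical helper: $1 re-inserts the captured URL as both href and link text.
    public static String linkify(String text) {
        return URL.matcher(text).replaceAll("<a href=\"$1\">$1</a>");
    }

    public static void main(String[] args) {
        System.out.println(linkify("a http://x.com/ b https://y.org c"));
    }
}
```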
5

For anybody who is looking for a more robust solution, I can suggest the Twitter Text libraries.

Replacing the URLs with this library works like this:

new Autolink().autoLink(plainText)

1 Comment

The URL must be well formed. It is unable to detect www.example.com (http:// missing). :(
2

The code below replaces links starting with "http" or "https", then links starting with just "www.", and finally email addresses.

  Pattern httpLinkPattern = Pattern.compile("(http[s]?)://(www\\.)?([\\S&&[^.@]]+)(\\.[\\S&&[^@]]+)");

  Pattern wwwLinkPattern = Pattern.compile("(?<!http[s]?://)(www\\.+)([\\S&&[^.@]]+)(\\.[\\S&&[^@]]+)");

  Pattern mailAddressPattern = Pattern.compile("[\\S&&[^@]]+@([\\S&&[^.@]]+)(\\.[\\S&&[^@]]+)");

    String textWithHttpLinksEnabled = 
  "ajdhkas www.dasda.pl/asdsad?asd=sd www.absda.pl [email protected] klajdld http://dsds.pl httpsda http://www.onet.pl https://www.onsdas.plad/dasda";

    if (Objects.nonNull(textWithHttpLinksEnabled)) {

      Matcher httpLinksMatcher = httpLinkPattern.matcher(textWithHttpLinksEnabled);
      textWithHttpLinksEnabled = httpLinksMatcher.replaceAll("<a href=\"$0\" target=\"_blank\">$0</a>");

      final Matcher wwwLinksMatcher = wwwLinkPattern.matcher(textWithHttpLinksEnabled);
      textWithHttpLinksEnabled = wwwLinksMatcher.replaceAll("<a href=\"http://$0\" target=\"_blank\">$0</a>");

      final Matcher mailLinksMatcher = mailAddressPattern.matcher(textWithHttpLinksEnabled);
      textWithHttpLinksEnabled = mailLinksMatcher.replaceAll("<a href=\"mailto:$0\">$0</a>");

      System.out.println(textWithHttpLinksEnabled);
    }

Prints:

ajdhkas <a href="http://www.dasda.pl/asdsad?asd=sd" target="_blank">www.dasda.pl/asdsad?asd=sd</a> <a href="http://www.absda.pl" target="_blank">www.absda.pl</a> <a href="mailto:[email protected]">[email protected]</a> klajdld <a href="http://dsds.pl" target="_blank">http://dsds.pl</a> httpsda <a href="http://www.onet.pl" target="_blank">http://www.onet.pl</a> <a href="https://www.onsdas.plad/dasda" target="_blank">https://www.onsdas.plad/dasda</a>


0

Assuming your regex captures the correct text, you can use backreferences in your substitution. See the Java regexp tutorial.

In that case, you'd write the following, where $0 stands for the whole match (Java replacement strings use $n rather than \n):

myString.replaceAll(....., "<a href=\"$0\">$0</a>")
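
As a small illustration of how Java replacement strings refer back to the match, the sketch below shows the three forms I am aware of: $0 for the whole match, $n for a numbered group, and ${name} for a named group. The pattern, the class name BackrefDemo, and the method names are invented for this example and are deliberately simpler than a real URL matcher.

```java
import java.util.regex.Pattern;

public class BackrefDemo {
    // Group 1 is the named group "scheme"; group 2 is everything after "://".
    private static final Pattern P = Pattern.compile("(?<scheme>https?)://(\\S+)");

    // $0 inserts the entire matched text.
    static String whole(String s) {
        return P.matcher(s).replaceAll("[$0]");
    }

    // $2 inserts the second capturing group.
    static String numbered(String s) {
        return P.matcher(s).replaceAll("[$2]");
    }

    // ${scheme} inserts the named group.
    static String named(String s) {
        return P.matcher(s).replaceAll("[${scheme}]");
    }

    public static void main(String[] args) {
        String input = "go http://x.io now";
        System.out.println(whole(input));
        System.out.println(numbered(input));
        System.out.println(named(input));
    }
}
```

A literal dollar sign or backslash in the replacement has to be escaped, for example by passing the literal part through Matcher.quoteReplacement.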


0

For multiline text you can use this, where (?m) makes ^ and $ match at line boundaries:

text.replaceAll("(?m)(\\s|^|\\A)((http|https|ftp|mailto):\\S+)(\\s|$|\\z)",
        "$1<a href='$2'>$2</a>$4");

And here is a fuller example from my code, where I need to show users' posts with URLs in them:

private static final Pattern urlPattern = Pattern.compile(
        "(?m)(\\s|^|\\A)((http|https|ftp|mailto):\\S+)(\\s|$|\\z)");

String userText = ""; // user content from db
// Escape any HTML in the user content before inserting our own tags
String replacedValue = HtmlUtils.htmlEscape(userText);
// Wrap each URL in an anchor, keeping the surrounding whitespace ($1 and $4)
replacedValue = urlPattern.matcher(replacedValue).replaceAll("$1<a href=\"$2\">$2</a>$4");
// Preserve line breaks in the rendered HTML
replacedValue = StringUtils.replace(replacedValue, "\n", "<br>");
System.out.println(replacedValue);
