
My function works without any problems with ftp, http and https URLs:

function makeClickableLinks($s) {
    return preg_replace('!(((f|ht)tp(s)?://)[-a-zA-Zа-яА-Я()0-9@:%_+.~#?&;//=]+)!i',
        '<a href="$1">$1</a>', $s);
}

However, it doesn't make the URL clickable if it is www.example.org (i.e. there is no http).

If I replace ((f|ht)tp(s)?://) with www, it works; however, if the URL does have http, only the part after http becomes clickable.

How can I make it work correctly both with and without http?

  • Can you make a regex101.com demo? Commented Oct 21, 2016 at 8:53
    If the URL is just www.example.org, you'll have to fix it up into http://www.example.org to begin with; just href="www.example.org" either won't work as you expect or is invalid, depending on your definition. So, you'll need to produce a separate regex and replace for that, you cannot cram it into this regex. Commented Oct 21, 2016 at 9:06
    This seems to work: regex101.com/r/s49eS9/1 - But just like @deceze says, this will produce an invalid link for all www links (those missing http(s)). You can solve this by simply running str_replace('a href="www.', 'a href="http://www.', $string); after your regex. Commented Oct 21, 2016 at 9:10
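The fix-up suggested in the comment above could look like the following sketch. It combines the www-matching variant of the question's regex (with an ASCII-only character class for brevity) with the str_replace post-fix; the function name is made up for illustration.

```php
<?php
// Variant of the question's regex that also matches bare www. links,
// followed by the str_replace post-fix suggested in the comment.
function makeClickableLinksWithFix($s) {
    $html = preg_replace(
        '!((((f|ht)tp(s)?://)|www)[-a-zA-Z()0-9@:%_+.~#?&;/=]+)!i',
        '<a href="$1">$1</a>',
        $s
    );
    // Prepend a protocol to every href that starts with a bare www.
    return str_replace('a href="www.', 'a href="http://www.', $html);
}
```

Note that this only rewrites the href attribute; the visible link text keeps the bare www. form.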

1 Answer


This regex seems to do the trick. It checks whether the string starts with http, https, ftp or www.

It also fixes all invalid links (those that start with just www).

Here you can test just the regex: https://regex101.com/r/s49eS9/2

function makeClickableLinks($s)
{
    return preg_replace_callback('/((((f|ht)tp(s)?:\/\/)|www)[-a-zA-Zа-яА-Я()0-9@:%_+.~#?&;\/\/=]+)/i', function($matches) {
        if (substr($matches[0], 0, 4) == 'www.') {
            // The match starts with www., add a protocol (http:// being the most common).
            $matches[0] = 'http://' . $matches[0];
        }

        return '<a href="' . $matches[0] . '">' . $matches[0] . '</a>';
    }, $s);
}

Note: Just like @deceze points out in his comment, this will not work for ALL URLs, like example.com. Making a regex that converts all versions of all valid URLs would be a much bigger task, and you would probably need to list all valid TLDs.

Edit: Changed from str_replace() to preg_replace_callback() to sort out the invalid-www-link situation, as suggested by @deceze.


7 Comments

This of course doesn't work for just example.com, or www2.example.com, or any other domain not starting with www. (which is becoming increasingly less common, I'd say). It also feels pretty icky to me to do a str_replace on the replaced HTML…
@deceze - I know that this won't work for a bunch of different URLs. Today, it's almost impossible to make this work 100% since there are so many new TLDs (and more keep getting added). But this does answer the OP's question. I don't agree with the icky part, though. It's not worse than running multiple regexes, for example.
I'd use a preg_replace_callback and fix up the link in there before putting it into HTML… Again, it just feels icky to me… Just my .02€.
@deceze - That was actually a valid and good .02€ that I didn't think of!
Really, in practice, I'd use a dedicated library for this task. As you say, this can get really complex really quickly and is really not a wheel that needs reinventing…