Replace URLs in text with HTML links

Question

Here is a design thought: for example, if I put a link such as

http://example.com

in a textarea, how do I get PHP to detect that it's an http:// link and then print it as

print "<a href='http://www.example.com'>http://www.example.com</a>";

I remember doing something like this before; however, it was not foolproof and kept breaking on complex links.

Another good idea would be if you have a link such as

http://example.com/test.php?val1=bla&val2blablabla%20bla%20bla.bl

fix it so it does

print "<a href='http://example.com/test.php?val1=bla&val2=bla%20bla%20bla.bla'>";
print "http://example.com/test.php";
print "</a>";

This one is just an afterthought. Stack Overflow could probably use this as well :D

Any ideas?

ooo, I see Stack Overflow already does the first part. Post the code, you know you want to :D
– Angel.King.47, Commented Jul 27, 2009 at 13:30

Søren Løvborg · Accepted Answer · 2019-09-06 18:11:57Z

124

Let's look at the requirements. You have some user-supplied plain text, which you want to display with hyperlinked URLs.

1. The "http://" protocol prefix should be optional.
2. Both domains and IP addresses should be accepted.
3. Any valid top-level domain should be accepted, e.g. .aero and .xn--jxalpdlp.
4. Port numbers should be allowed.
5. URLs must be allowed in normal sentence contexts. For instance, in "Visit stackoverflow.com.", the final period is not part of the URL.
6. You probably want to allow "https://" URLs as well, and perhaps others too.
7. As always when displaying user-supplied text in HTML, you want to prevent cross-site scripting (XSS). Also, you'll want ampersands in URLs to be correctly escaped as &amp;.
8. You probably don't need support for IPv6 addresses.
9. Edit: As noted in the comments, support for email addresses is definitely a plus.
10. Edit: Only plain text input is to be supported; HTML tags in the input should not be honoured. (The Bitbucket version supports HTML input.)

Edit: Check out GitHub for the latest version, with support for email addresses, authenticated URLs, URLs in quotes and parentheses, HTML input, as well as an updated TLD list.

Here's my take:

<?php
$text = <<<EOD
Here are some URLs:
stackoverflow.com/questions/1188129/pregreplace-to-detect-html-php
Here's the answer: http://www.google.com/search?rls=en&q=42&ie=utf-8&oe=utf-8&hl=en. What was the question?
A quick look at http://en.wikipedia.org/wiki/URI_scheme#Generic_syntax is helpful.
There is no place like 127.0.0.1! Except maybe http://news.bbc.co.uk/1/hi/england/surrey/8168892.stm?
Ports: 192.168.0.1:8080, https://example.net:1234/.
Beware of Greeks bringing internationalized top-level domains: xn--hxajbheg2az3al.xn--jxalpdlp.
And remember.Nobody is perfect.

<script>alert('Remember kids: Say no to XSS-attacks! Always HTML escape untrusted input!');</script>
EOD;

$rexProtocol = '(https?://)?';
$rexDomain   = '((?:[-a-zA-Z0-9]{1,63}\.)+[-a-zA-Z0-9]{2,63}|(?:[0-9]{1,3}\.){3}[0-9]{1,3})';
$rexPort     = '(:[0-9]{1,5})?';
$rexPath     = '(/[!$-/0-9:;=@_\':;!a-zA-Z\x7f-\xff]*?)?';
$rexQuery    = '(\?[!$-/0-9:;=@_\':;!a-zA-Z\x7f-\xff]+?)?';
$rexFragment = '(#[!$-/0-9:;=@_\':;!a-zA-Z\x7f-\xff]+?)?';

// Solution 1:

function callback($match)
{
    // Prepend http:// if no protocol specified
    $completeUrl = $match[1] ? $match[0] : "http://{$match[0]}";

    return '<a href="' . $completeUrl . '">'
        . $match[2] . $match[3] . $match[4] . '</a>';
}

print "<pre>";
print preg_replace_callback("&\\b$rexProtocol$rexDomain$rexPort$rexPath$rexQuery$rexFragment(?=[?.!,;:\"]?(\s|$))&",
    'callback', htmlspecialchars($text));
print "</pre>";

To properly escape < and & characters, I throw the whole text through htmlspecialchars before processing. This is not ideal, as the HTML escaping can cause misdetection of URL boundaries.
As demonstrated by the "And remember.Nobody is perfect." line (in which remember.Nobody is treated as a URL, because of the missing space), further checking on valid top-level domains might be in order.
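The first problem can be reproduced with a URL in double quotes. The sketch below reuses the regex parts from Solution 1 unchanged; the sample sentence and the variable names $rexUrl, $raw, $rawMatch and $escapedMatch are made up for illustration.

```php
<?php
// Regex parts copied from Solution 1 above.
$rexProtocol = '(https?://)?';
$rexDomain   = '((?:[-a-zA-Z0-9]{1,63}\.)+[-a-zA-Z0-9]{2,63}|(?:[0-9]{1,3}\.){3}[0-9]{1,3})';
$rexPort     = '(:[0-9]{1,5})?';
$rexPath     = '(/[!$-/0-9:;=@_\':;!a-zA-Z\x7f-\xff]*?)?';
$rexQuery    = '(\?[!$-/0-9:;=@_\':;!a-zA-Z\x7f-\xff]+?)?';
$rexFragment = '(#[!$-/0-9:;=@_\':;!a-zA-Z\x7f-\xff]+?)?';
$rexUrl = "&\\b$rexProtocol$rexDomain$rexPort$rexPath$rexQuery$rexFragment(?=[?.!,;:\"]?(\s|$))&";

// Hypothetical sample input: a URL wrapped in double quotes.
$raw = 'See "http://example.com/" for details.';

// Match once against the raw text and once against the escaped text.
preg_match($rexUrl, $raw, $rawMatch);
preg_match($rexUrl, htmlspecialchars($raw), $escapedMatch);

echo $rawMatch[0], "\n";     // http://example.com/
echo $escapedMatch[0], "\n"; // http://example.com/&quot
```

After escaping, the closing quote has become &quot; and its characters are all legal in the path character class, so the match swallows part of the entity.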

Edit: The following code fixes the above two problems, but is quite a bit more verbose since I'm more or less re-implementing preg_replace_callback using preg_match.

// Solution 2:

$validTlds = array_fill_keys(explode(" ", ".aero .asia .biz .cat .com .coop .edu .gov .info .int .jobs .mil .mobi .museum .name .net .org .pro .tel .travel .ac .ad .ae .af .ag .ai .al .am .an .ao .aq .ar .as .at .au .aw .ax .az .ba .bb .bd .be .bf .bg .bh .bi .bj .bm .bn .bo .br .bs .bt .bv .bw .by .bz .ca .cc .cd .cf .cg .ch .ci .ck .cl .cm .cn .co .cr .cu .cv .cx .cy .cz .de .dj .dk .dm .do .dz .ec .ee .eg .er .es .et .eu .fi .fj .fk .fm .fo .fr .ga .gb .gd .ge .gf .gg .gh .gi .gl .gm .gn .gp .gq .gr .gs .gt .gu .gw .gy .hk .hm .hn .hr .ht .hu .id .ie .il .im .in .io .iq .ir .is .it .je .jm .jo .jp .ke .kg .kh .ki .km .kn .kp .kr .kw .ky .kz .la .lb .lc .li .lk .lr .ls .lt .lu .lv .ly .ma .mc .md .me .mg .mh .mk .ml .mm .mn .mo .mp .mq .mr .ms .mt .mu .mv .mw .mx .my .mz .na .nc .ne .nf .ng .ni .nl .no .np .nr .nu .nz .om .pa .pe .pf .pg .ph .pk .pl .pm .pn .pr .ps .pt .pw .py .qa .re .ro .rs .ru .rw .sa .sb .sc .sd .se .sg .sh .si .sj .sk .sl .sm .sn .so .sr .st .su .sv .sy .sz .tc .td .tf .tg .th .tj .tk .tl .tm .tn .to .tp .tr .tt .tv .tw .tz .ua .ug .uk .us .uy .uz .va .vc .ve .vg .vi .vn .vu .wf .ws .ye .yt .yu .za .zm .zw .xn--0zwm56d .xn--11b5bs3a9aj6g .xn--80akhbyknj4f .xn--9t4b11yi5a .xn--deba0ad .xn--g6w251d .xn--hgbk6aj7f53bba .xn--hlcj6aya9esc7a .xn--jxalpdlp .xn--kgbechtv .xn--zckzah .arpa"), true);

$position = 0;
while (preg_match("{\\b$rexProtocol$rexDomain$rexPort$rexPath$rexQuery$rexFragment(?=[?.!,;:\"]?(\s|$))}", $text, $match, PREG_OFFSET_CAPTURE, $position))
{
    list($url, $urlPosition) = $match[0];

    // Print the text leading up to the URL.
    print(htmlspecialchars(substr($text, $position, $urlPosition - $position)));

    $domain = $match[2][0];
    $port   = $match[3][0];
    $path   = $match[4][0];

    // Check if the TLD is valid - or that $domain is an IP address.
    $tld = strtolower(strrchr($domain, '.'));
    if (preg_match('{\.[0-9]{1,3}}', $tld) || isset($validTlds[$tld]))
    {
        // Prepend http:// if no protocol specified
        $completeUrl = $match[1][0] ? $url : "http://$url";

        // Print the hyperlink.
        printf('<a href="%s">%s</a>', htmlspecialchars($completeUrl), htmlspecialchars("$domain$port$path"));
    }
    else
    {
        // Not a valid URL.
        print(htmlspecialchars($url));
    }

    // Continue text parsing from after the URL.
    $position = $urlPosition + strlen($url);
}

// Print the remainder of the text.
print(htmlspecialchars(substr($text, $position)));

edited Sep 6, 2019 at 18:11

answered Jul 27, 2009 at 14:55

Søren Løvborg

25 Comments

Angel.King.47 Over a year ago

I will try and test your implementation, my friend, and then mark your answer as correct if it works. It will take some time though, because I'm not at home. PS: Thanks for releasing it into the public domain :D

Søren Løvborg Over a year ago

@Rahul: Simply make the regular expression case insensitive: In the call to preg_match, add an i after the final } in the regular expression.

bart Over a year ago

I suggest doing a detection whether the url is enclosed by <a href=''></a>. If so, do nothing.
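One way to sketch this suggestion is to let the pattern consume existing anchors first, so URLs inside them are never seen by the URL alternative. The function name linkifyOutsideAnchors and the deliberately simple URL pattern are assumptions for illustration, and the input is assumed to be trusted HTML, so nothing is escaped here.

```php
<?php
/**
 * Link bare http(s) URLs while leaving existing <a>...</a> elements untouched.
 * Assumes $html is trusted HTML; no escaping is performed.
 */
function linkifyOutsideAnchors(string $html): string
{
    // The first alternative consumes a whole existing anchor, so the second
    // alternative never gets to match the URL inside it.
    $pattern = '~<a\b[^>]*>.*?</a>|https?://[^\s<]+~is';

    return preg_replace_callback($pattern, function (array $m): string {
        if (stripos($m[0], '<a') === 0) {
            return $m[0]; // already a link: do nothing
        }
        return '<a href="' . $m[0] . '">' . $m[0] . '</a>';
    }, $html);
}

echo linkifyOutsideAnchors('See <a href="http://a.example/">here</a> and http://b.example/x'), "\n";
```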

Søren Løvborg Over a year ago

@Guy: That is not a URL. :) Rather, it is an IRI. But feel free to create an enhancement request on Bitbucket, and I may look into whether it's feasible to support.

Søren Løvborg Over a year ago

@Sajad: The two problems are listed just above the last "Edit", most importantly that htmlspecialchars can turn a valid URL into an invalid one. And you should not use either version shown here; use the up-to-date version on Bitbucket. The code here just demonstrates the general idea, while the Bitbucket version contains numerous bugfixes.

Raheel Hasan · Answer · 2015-04-20 10:13:09Z

18

You guys are talking about advanced and complex stuff, which is good for some situations, but mostly we need a simple, careless solution. How about simply this?

preg_replace('/(http[s]{0,1}\:\/\/\S{4,})\s{0,}/ims', '<a href="$1" target="_blank">$1</a> ', $text_msg);

Just try it and let me know what crazy URL it doesn't satisfy.

edited Apr 20, 2015 at 10:13

answered Apr 20, 2015 at 9:57

Raheel Hasan

3 Comments

pperrin Over a year ago

Yes... but... why not add the code to make it cut/pasteable?!?! $text_msg= preg_replace('/(http[s]{0,1}\:\/\/\S{4,})\s{0,}/ims', '<a href="$1" target="_blank">$1</a> ', $text_msg);

user5147563 Over a year ago

Good solution, but if you have HTML in the string, then you might want to replace \S with [^<]

mickmackusa Over a year ago

[s] is too verbose. {0,1} is too verbose. \: is too verbose. {0,} is too verbose. ms is nonsensical. I do not endorse this answer.
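For comparison, here is a sketch of the same pattern with those redundancies removed. The sample text and variable names are made up; the check at the end only shows that both patterns behave identically on this one input.

```php
<?php
// Pattern from the answer, unchanged.
$original = '/(http[s]{0,1}\:\/\/\S{4,})\s{0,}/ims';

// Same pattern with the redundancies removed: s? for [s]{0,1}, a ~ delimiter
// so slashes need no escaping, * for {0,}, and no m or s modifiers.
$simplified = '~(https?://\S{4,})\s*~i';

$replacement = '<a href="$1" target="_blank">$1</a> ';

// Hypothetical sample text with two URLs.
$text = "Visit http://example.com/page now\nor HTTPS://example.org";

$a = preg_replace($original, $replacement, $text);
$b = preg_replace($simplified, $replacement, $text);

var_dump($a === $b); // bool(true)
echo $b, "\n";
```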

Angel.King.47 · Answer · 2009-07-27 14:24:38Z

15

Here is something I found that is tried and tested:

function make_links_blank($text)
{
  return  preg_replace(
     array(
       '/(?(?=<a[^>]*>.+<\/a>)
             (?:<a[^>]*>.+<\/a>)
             |
             ([^="\']?)((?:https?|ftp|bf2|):\/\/[^<> \n\r]+)
         )/iex',
       '/<a([^>]*)target="?[^"\']+"?/i',
       '/<a([^>]+)>/i',
       '/(^|\s)(www.[^<> \n\r]+)/iex',
       '/(([_A-Za-z0-9-]+)(\\.[_A-Za-z0-9-]+)*@([A-Za-z0-9-]+)
       (\\.[A-Za-z0-9-]+)*)/iex'
       ),
     array(
       "stripslashes((strlen('\\2')>0?'\\1<a href=\"\\2\">\\2</a>\\3':'\\0'))",
       '<a\\1',
       '<a\\1 target="_blank">',
       "stripslashes((strlen('\\2')>0?'\\1<a href=\"http://\\2\">\\2</a>\\3':'\\0'))",
       "stripslashes((strlen('\\2')>0?'<a href=\"mailto:\\0\">\\0</a>':'\\0'))"
       ),
       $text
   );
}

It works for me, and it handles both emails and URLs. Sorry to answer my own question. :(

But this one is the only one that works.

Here is the link where I found it: http://www.experts-exchange.com/Web_Development/Web_Languages-Standards/PHP/Q_21878567.html

Sorry in advance for it being from Experts Exchange.

answered Jul 27, 2009 at 14:24

Angel.King.47

7 Comments

Søren Løvborg Over a year ago

I'll just note that this solution fails most of the requirements I suggested, namely #1, 2, 3, 5 and 7, but if this meets your requirements, great. Just don't use it on untrusted input, since it performs no HTML escaping. :-)

Angel.King.47 Over a year ago

You talk about this escaping.. if you could explain what this escaping is, it may make it better for me and who knows someone else, to better understand your answer :D

Søren Løvborg Over a year ago

To prevent cross site scripting, you must never allow a visitor to add arbitrary HTML code to a page. A simple example is a form handler which simply does a print($_POST["text"]);. The simplest (and safest) way to prevent this is to run all user supplied text through htmlspecialchars(), which escapes HTML tags and entities, effectively turning them into plain text. For this question, you want to allow some HTML in the output (namely, link tags), which complicates matters, since we can no longer simply use htmlspecialchars().
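A minimal sketch of that escaping step, assuming a made-up input string in place of $_POST["text"]:

```php
<?php
// Hypothetical untrusted input, standing in for $_POST["text"].
$userText = '<script>alert("XSS")</script> & more';

// Unsafe: print($userText) would hand the <script> element to the browser.
// Safe: escape first, so the markup is displayed as plain text.
$safe = htmlspecialchars($userText, ENT_QUOTES, 'UTF-8');

echo $safe, "\n";
// &lt;script&gt;alert(&quot;XSS&quot;)&lt;/script&gt; &amp; more
```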

Benjamin Crouzier Over a year ago

As stackoverflow does, you could add rel="nofollow" to user links

Cedric Ipkiss Over a year ago

If the string you're converting is coming from user input stored somewhere like a database, you could prevent XSS by escaping before saving, so you retrieve the escaped text to use with this function

Armand · Answer · 2015-05-02 15:26:34Z

4

I've been using this function, it works for me

function AutoLinkUrls($str,$popup = FALSE){
    if (preg_match_all("#(^|\s|\()((http(s?)://)|(www\.))(\w+[^\s\)\<]+)#i", $str, $matches)){
        $pop = ($popup == TRUE) ? " target=\"_blank\" " : "";
        for ($i = 0; $i < count($matches['0']); $i++){
            $period = '';
            if (preg_match("|\.$|", $matches['6'][$i])){
                $period = '.';
                $matches['6'][$i] = substr($matches['6'][$i], 0, -1);
            }
            $str = str_replace($matches['0'][$i],
                    $matches['1'][$i].'<a href="http'.
                    $matches['4'][$i].'://'.
                    $matches['5'][$i].
                    $matches['6'][$i].'"'.$pop.'>http'.
                    $matches['4'][$i].'://'.
                    $matches['5'][$i].
                    $matches['6'][$i].'</a>'.
                    $period, $str);
        }//end for
    }//end if
    return $str;
}//end AutoLinkUrls

All credit goes to http://snipplr.com/view/68586/

Enjoy!

answered May 2, 2015 at 15:26

Armand

1 Comment

dan-iel Over a year ago

This one has an issue if your string has comma separated URLs, for example "google.com, google.com". The first URL in this example would end up with a href="google.com," including the comma. A URL that ends with a comma is valid, so I guess this is up to the use case if you think it is more likely the string intended the comma as punctuation or as part of the URL.
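If you decide that trailing punctuation is more likely to belong to the sentence than to the URL, one possible sketch is to split it off before building the link. splitTrailingPunctuation is a hypothetical helper, and the set of characters it strips is an assumption you may want to adjust.

```php
<?php
/**
 * Split a matched URL into the URL proper and any trailing sentence
 * punctuation. Returns [url, trailing punctuation].
 */
function splitTrailingPunctuation(string $url): array
{
    $trimmed = rtrim($url, '.,;:!?');
    return [$trimmed, (string) substr($url, strlen($trimmed))];
}

[$url, $tail] = splitTrailingPunctuation('http://google.com,');

// The comma ends up after the link instead of inside the href.
echo '<a href="', htmlspecialchars($url), '">', htmlspecialchars($url), '</a>', $tail, "\n";
```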

Dharmendra Jadon · Answer · 2016-10-11 18:46:11Z

4

Here is a function that does this using regular expressions:

<?php
//Function definitions
function MakeUrls($str)
{
$find=array('`((?:https?|ftp)://\S+[[:alnum:]]/?)`si','`((?<!//)(www\.\S+[[:alnum:]]/?))`si');

$replace=array('<a href="$1" target="_blank">$1</a>', '<a href="http://$1" target="_blank">$1</a>');

return preg_replace($find,$replace,$str);
}
//Function testing
$str="www.cloudlibz.com";
$str=MakeUrls($str);
echo $str;
?>

edited Oct 11, 2016 at 18:46

answered Mar 21, 2014 at 23:51

Dharmendra Jadon

3 Comments

Amien Over a year ago

Does this cater for multiple url's in a string?

Amien Over a year ago

Sweet, it caters for multiple url's in a string, you're just missing a "<" at $replace=array('a href

mickmackusa Over a year ago

The s pattern modifier is pointless if there is no . (any character) in your pattern.

fresskoma · Answer · 2009-07-27 13:29:00Z

1

This RegEx should match any link except for these new 3+ character top-level domains...

{
  \\b
  # Match the leading part (proto://hostname, or just hostname)
  (
    # http://, or https:// leading part
    (https?)://[-\\w]+(\\.\\w[-\\w]*)+
  |
    # or, try to find a hostname with more specific sub-expression
    (?i: [a-z0-9] (?:[-a-z0-9]*[a-z0-9])? \\. )+ # sub domains
    # Now ending .com, etc. For these, require lowercase
    (?-i: com\\b
        | edu\\b
        | biz\\b
        | gov\\b
        | in(?:t|fo)\\b # .int or .info
        | mil\\b
        | net\\b
        | org\\b
        | [a-z][a-z]\\.[a-z][a-z]\\b # two-letter country code
    )
  )

  # Allow an optional port number
  ( : \\d+ )?

  # The rest of the URL is optional, and begins with /
  (
    /
    # The rest are heuristics for what seems to work well
    [^.!,?;"\\'()\[\]\{\}\s\x7F-\\xFF]*
    (
      [.!,?]+ [^.!,?;"\\'()\\[\\]\{\\}\s\\x7F-\\xFF]+
    )*
  )?
}ix

It wasn't written by me, and I'm not quite sure where I got it from. Sorry that I can't give credit...

answered Jul 27, 2009 at 13:29

fresskoma

1 Comment

Angel.King.47 Over a year ago

I understand that the above are patterns, but I'm so lost, sorry.

Stephen Fuhry · Answer · 2009-07-27 13:41:31Z

1

this should get you email addresses:

$string = "bah bah [email protected] foo";
$match = preg_match('/[^\x00-\x20()<>@,;:\\".[\]\x7f-\xff]+(?:\.[^\x00-\x20()<>@,;:\\".[\]\x7f-\xff]+)*\@[^\x00-\x20()<>@,;:\\".[\]\x7f-\xff]+(?:\.[^\x00-\x20()<>@,;:\\".[\]\x7f-\xff]+)+/', $string, $array);
print_r($array);

// outputs:
Array
(
    [0] => [email protected]
)

answered Jul 27, 2009 at 13:41

Stephen Fuhry

lepe · Answer · 2012-04-05 05:32:51Z

1

I know this answer has been accepted and that this question is quite old, but it can be useful for other people looking for other implementations.

This is a modified version of the code posted by: Angel.King.47 on July 27,09:

$text = preg_replace(
 array(
   '/(^|\s|>)(www.[^<> \n\r]+)/iex',
   '/(^|\s|>)([_A-Za-z0-9-]+(\\.[A-Za-z]{2,3})?\\.[A-Za-z]{2,4}\\/[^<> \n\r]+)/iex',
   '/(?(?=<a[^>]*>.+<\/a>)(?:<a[^>]*>.+<\/a>)|([^="\']?)((?:https?):\/\/([^<> \n\r]+)))/iex'
 ),  
 array(
   "stripslashes((strlen('\\2')>0?'\\1<a href=\"http://\\2\" target=\"_blank\">\\2</a>&nbsp;\\3':'\\0'))",
   "stripslashes((strlen('\\2')>0?'\\1<a href=\"http://\\2\" target=\"_blank\">\\2</a>&nbsp;\\4':'\\0'))",
   "stripslashes((strlen('\\2')>0?'\\1<a href=\"\\2\" target=\"_blank\">\\3</a>&nbsp;':'\\0'))",
 ),  
 $text
);

Changes:

I removed rules #2 and #3 (I'm not sure in which situations they are useful).
Removed email parsing as I really don't need it.
I added one more rule which allows the recognition of URLs in the form: [domain]/* (without www). For example: "example.com/faq/" (Multiple tld: domain.{2-3}.{2-4}/)
When parsing strings starting with "http://", it removes it from the link label.
Added "target='_blank'" to all links.
Urls can be specified just after any(?) tag. For example: <b>www.example.com</b>

As "Søren Løvborg" has stated, this function does not escape the URLs. I tried his/her class but it just didn't work as I expected (If you don't trust your users, then try his/her code first).

answered Apr 5, 2012 at 5:32

lepe

1 Comment

mickmackusa Over a year ago

preg_replace() will not respect any function calls in the replacement string. The x pattern modifier seems needlessly or incorrectly used. An unescaped dot will mean any character. Much is wrong about this answer.
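As a rough illustration of these points, the www rule could be written with preg_replace_callback, an escaped dot, and no e or x modifiers. linkWww is a hypothetical name, and this sketch leaves out the trailing &nbsp; that the answer appends.

```php
<?php
/**
 * Link "www." hosts that appear at the start of the text, after
 * whitespace, or directly after a tag.
 */
function linkWww(string $text): string
{
    return preg_replace_callback(
        // Dot after "www" is escaped so it only matches a literal dot.
        '/(^|\s|>)(www\.[^<> \n\r]+)/i',
        function (array $m): string {
            // $m[1] is the leading context, $m[2] the host and path.
            return $m[1] . '<a href="http://' . $m[2] . '" target="_blank">' . $m[2] . '</a>';
        },
        $text
    );
}

echo linkWww('go to www.example.com now'), "\n";
echo linkWww('<b>www.example.com</b>'), "\n";
```

Like the answer it is based on, this does not escape anything, so it is only suitable for trusted input.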

Svetoslav Marinov · Answer · 2016-10-14 10:35:19Z

As I mentioned in one of the comments above, my VPS, which is running PHP 7, started emitting this warning: "Warning: preg_replace(): The /e modifier is no longer supported, use preg_replace_callback instead". The buffer after the replacement was empty/false.

I have rewritten the code and made some improvements. If you think that you should be in the author section feel free to edit the comment above the function make_links_blank name. I am intentionally not using the closing php ?> to avoid inserting whitespace in the output.

<?php

class App_Updater_String_Util {
    public static function get_default_link_attribs( $regex_matches = [] ) {
        $t = ' target="_blank" ';
        return $t;
    }

    /**
     * App_Updater_String_Util::set_protocol();
     * @param string $link
     * @return string
     */
    public static function set_protocol( $link ) {
        if ( ! preg_match( '#^https?#si', $link ) ) {
            $link = 'http://' . $link;
        }
        return $link;
    }

    /**
     * Goes through text and makes whatever text that look like a link an html link
     * which opens in a new tab/window (by adding target attribute).
     * 
     * Usage: App_Updater_String_Util::make_links_blank( $text );
     * 
     * @param str $text
     * @return str
     * @see http://stackoverflow.com/questions/1188129/replace-urls-in-text-with-html-links
     * @author Angel.King.47 | http://dashee.co.uk
     * @author Svetoslav Marinov (Slavi) | http://orbisius.com
     */
    public static function make_links_blank( $text ) {
        $patterns = [
            '#(?(?=<a[^>]*>.+?<\/a>)
                 (?:<a[^>]*>.+<\/a>)
                 |
                 ([^="\']?)((?:https?|ftp):\/\/[^<> \n\r]+)
             )#six' => function ( $matches ) {
                $r1 = empty( $matches[1] ) ? '' : $matches[1];
                $r2 = empty( $matches[2] ) ? '' : $matches[2];
                $r3 = empty( $matches[3] ) ? '' : $matches[3];

                $r2 = empty( $r2 ) ? '' : App_Updater_String_Util::set_protocol( $r2 );
                $res = ! empty( $r2 ) ? "$r1<a href=\"$r2\">$r2</a>$r3" : $matches[0];
                $res = stripslashes( $res );

                return $res;
             },

            '#(^|\s)((?:https?://|www\.|https?://www\.)[^<>\ \n\r]+)#six' => function ( $matches ) {
                $r1 = empty( $matches[1] ) ? '' : $matches[1];
                $r2 = empty( $matches[2] ) ? '' : $matches[2];
                $r3 = empty( $matches[3] ) ? '' : $matches[3];

                $r2 = ! empty( $r2 ) ? App_Updater_String_Util::set_protocol( $r2 ) : '';
                $res = ! empty( $r2 ) ? "$r1<a href=\"$r2\">$r2</a>$r3" : $matches[0];
                $res = stripslashes( $res );

                return $res;
            },

            // Remove any target attribs (if any)
            '#<a([^>]*)target="?[^"\']+"?#si' => '<a\\1',

            // Put the target attrib
            '#<a([^>]+)>#si' => '<a\\1 target="_blank">',

            // Make emails clickable Mailto links
            '/(([\w\-]+)(\\.[\w\-]+)*@([\w\-]+)
                (\\.[\w\-]+)*)/six' => function ( $matches ) {

                $r = $matches[0];
                $res = ! empty( $r ) ? "<a href=\"mailto:$r\">$r</a>" : $r;
                $res = stripslashes( $res );

                return $res;
            },
        ];

        foreach ( $patterns as $regex => $callback_or_replace ) {
            if ( is_callable( $callback_or_replace ) ) {
                $text = preg_replace_callback( $regex, $callback_or_replace, $text );
            } else {
                $text = preg_replace( $regex, $callback_or_replace, $text );
            }
        }

        return $text;
    }
}

The s pattern modifier is pointless if there is no . (any character) in your pattern. It only affects dots in a pattern and otherwise serves no purpose.

Max · Answer · 2014-03-05 04:53:35Z

0

If you want to trust the IANA, you can fetch the current list of officially supported TLDs from there like this:

  $validTLDs = explode("\n", file_get_contents('http://data.iana.org/TLD/tlds-alpha-by-domain.txt')); //get the official list of valid tlds
  array_shift($validTLDs); //throw away first line containing meta data
  array_pop($validTLDs); //throw away last element which is empty

Makes Søren Løvborg's solution #2 a bit less verbose and spares you the hassle of updating the list, nowadays new tlds are thrown out so carelessly ;)
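One detail to watch: the IANA file lists TLDs in upper case and without a leading dot, while Solution 2 looks up lower-case keys such as ".com". A possible adapter is sketched below; buildTldLookup is a hypothetical helper name and the sample string is made up so the snippet runs without network access.

```php
<?php
/**
 * Turn the contents of tlds-alpha-by-domain.txt into a lookup array keyed
 * the same way as $validTlds in Solution 2 (".com" => true).
 */
function buildTldLookup(string $listContents): array
{
    $lookup = [];
    foreach (preg_split('/\R/', $listContents) as $line) {
        $line = trim($line);
        // Skip blank lines and the leading "# Version ..." comment line.
        if ($line === '' || $line[0] === '#') {
            continue;
        }
        $lookup['.' . strtolower($line)] = true;
    }
    return $lookup;
}

// Made-up sample in the same layout as the IANA file.
$sample = "# Version 0, Last Updated (sample)\nAERO\nCOM\nXN--JXALPDLP\n";
$validTlds = buildTldLookup($sample);

var_dump(isset($validTlds['.com']));    // bool(true)
var_dump(isset($validTlds['.nobody'])); // bool(false)
```

In real use you would pass in the result of file_get_contents() on the IANA URL, ideally cached rather than fetched on every request.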

answered Mar 5, 2014 at 4:53

Max

OneOfOne · Answer · 2009-07-27 13:30:11Z

-1

Something along the lines of :

<?php
if(preg_match('@^http://(.*)\s|$@g', $textarea_url, $matches)) {
    echo '<a href="http://', $matches[1], '">', $matches[1], '</a>';
}
?>

answered Jul 27, 2009 at 13:30

OneOfOne

1 Comment

mickmackusa Over a year ago

PHP does not respect the g modifier. .* doesn't seem like a reliable pattern for isolating a url.
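In PHP, the usual stand-in for a global flag is preg_match_all(). A small sketch, using \S+ instead of .* so each match stops at whitespace; the sample text and the variable name $textareaUrl are assumptions.

```php
<?php
// Hypothetical textarea content with two URLs.
$textareaUrl = "http://example.com/a and http://example.org/b";

// preg_match_all() collects every match; no g modifier is needed.
preg_match_all('~https?://\S+~i', $textareaUrl, $matches);

foreach ($matches[0] as $url) {
    $safe = htmlspecialchars($url, ENT_QUOTES, 'UTF-8');
    echo '<a href="', $safe, '">', $safe, "</a>\n";
}
```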

amarjit singh · Answer · 2013-07-21 12:06:00Z

-1

This class converts links in the text to plain text while keeping links to the home URL as they are. I hope it helps and saves you some time. Enjoy.

class RegClass 
{ 

     function preg_callback_url($matches) 
     { 
        //var_dump($matches); 
        //Get the matched URL  text <a>text</a>
        $text = $matches[2];
        //Get the matched URL link <a href ="http://www.test.com">text</a>
        $url = $matches[1];

        if($url=='href ="http://www.test.com"'){
         //replace all a tag as it is
         return '<a href='.$url.' rel="nofollow"> '.$text.' </a>'; 

         }else{
         //replace all a tag to text
         return " $text " ;
         }
} 
function ParseText($text){ 

    $text = preg_replace( "/www\./", "http://www.", $text );
        $regex ="/http:\/\/http:\/\/www\./";
    $text = preg_replace( $regex, "http://www.", $text );
        $regex2 = "/https:\/\/http:\/\/www\./";
    $text = preg_replace( $regex2, "https://www.", $text );

        return preg_replace_callback('/<a\s(.+?)>(.+?)<\/a>/is',
                array( &$this,        'preg_callback_url'), $text); 
      } 

} 
$regexp = new RegClass();
echo $regexp->ParseText($text);

edited Jul 21, 2013 at 12:06

answered May 12, 2013 at 15:48

amarjit singh

1 Comment

amarjit singh Over a year ago

This class uses the preg_replace_callback function to search for URLs and replace them with text. If you get an error in the ParseText function, just replace $regex and $regex2 with the actual patterns.

Shawn Gervais · Answer · 2017-09-02 18:18:55Z

-1

This worked for me (turned one of the answers into a PHP function)

function make_urls_from_text ($text){
   return preg_replace('/(http[s]{0,1}\:\/\/\S{4,})\s{0,}/ims', '<a href="$1" target="_blank">$1 </a>', $text);
}

edited Sep 2, 2017 at 18:18

OmniPotens

answered Jul 21, 2015 at 19:02

Shawn Gervais

1 Comment

mickmackusa Over a year ago

There is much to fix with this answer. [s] is the same as s. {0,1} is more simply written as ? Colons do not require escaping. Using a pattern delimiter other than a forward slash will remove the need to escape forward slashes inside the pattern. {0,} is more simply written as *. The m modifier doesn't have a ^ or $ to affect. The s pattern modifier doesn't have a . to affect.

user13611442 · Answer · 2020-05-27 21:33:50Z

-1

This class I created works for my needs, though admittedly it needs some work:

class addLink
{
    public function link($string)
    {
        $expression = "/(?i)\b((?:https?:\/\/|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,63}\/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'\".,<>?«»“”‘’]))/";
        if(preg_match_all($expression, $string, $matches) == 1)// If the pattern is found then
        {
            $string = preg_replace($expression, '<a href="'.$matches[0][0].'" target="_blank">$1</a>', $string);
        }

        return $string;       
    }
}

An example of using this code:

include 'PHP/addLink.php';

if(class_exists('addLink')) 
{                  
    $al = new addLink();                  
}
else{
    echo 'Class not found...';
} 

$paragraph = $al->link($paragraph);

edited May 27, 2020 at 21:33

answered May 26, 2020 at 10:36

user13611442

5 Comments

Toto Over a year ago

[a-z]{2,4} is really short for TLDs, have a look at: TLD list

Toto Over a year ago

moreover, your regex matches http://qdj$$$-=, demo, not sure it's a valid URL ;)

user13611442 Over a year ago

I changed the TLD length to 63 as per RFC 1034 and updated above...

user13611442 Over a year ago

I'm currently reading RFC 1035 to fix my regex pattern matching...

mickmackusa Over a year ago

When you feel like you need preg_match_all() followed by preg_replace() calls, you probably mean to use preg_replace_callback().
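A rough sketch of what that could look like. The pattern here is a simplified stand-in rather than the expression from the answer, and addLinks is a hypothetical function name. Each match gets its own href, which the preg_match_all() plus preg_replace() combination does not do when the text holds more than one URL.

```php
<?php
/**
 * Wrap every http(s):// or www. URL in $text in its own link.
 * Only the URL is escaped; the surrounding text is left untouched.
 */
function addLinks(string $text): string
{
    // Simplified stand-in pattern (an assumption), not the one from the answer.
    $pattern = '~\b(?:https?://|www\.)[^\s()<>]+~i';

    return preg_replace_callback($pattern, function (array $m): string {
        $url  = $m[0];
        // Prepend a scheme for bare www. hosts.
        $href = stripos($url, 'www.') === 0 ? "http://$url" : $url;

        return sprintf(
            '<a href="%s" target="_blank">%s</a>',
            htmlspecialchars($href, ENT_QUOTES, 'UTF-8'),
            htmlspecialchars($url, ENT_QUOTES, 'UTF-8')
        );
    }, $text);
}

echo addLinks('See www.example.com and https://example.org/a?b=1&c=2'), "\n";
```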

dan-iel · Answer · 2020-07-30 00:17:01Z

-1

This is just a variation of the solution posted by Dharmendra Jadon, so if you like it, upvote his instead!

I just added a parameter to make opening the link in a new window (target="_blank") optional, as I saw this in some of the other solutions and liked the flexibility:

function MakeUrls($str, $popup = FALSE)
{
    $find=array('`((?:https?|ftp)://\S+[[:alnum:]]/?)`si','`((?<!//)(www\.\S+[[:alnum:]]/?))`si');

    $replace=array('<a href="$1"' . ($popup ? ' target="_blank"' : '') . '>$1</a>', '<a href="http://$1"' . ($popup ? ' target="_blank"' : '') . '>$1</a>');

    return preg_replace($find,$replace,$str);
}

answered Jul 30, 2020 at 0:17

dan-iel

2 Comments

mickmackusa Over a year ago

The s pattern modifier is useless if there are no "any character" dots in the pattern.

abe1432181 Over a year ago

This will fail if your link is within quotes (e.g. xxxxxxxx "http://www.bbc.com/list"<br>Received yyyyy) see regex101.com/r/puRu94/1
