This is my PHP function. It's replacing things that have already been replaced, which breaks the HTML. How do I prevent text that was replaced the first time from being replaced a second time?

function text2link($str){
$str="\r\n$str\r\n";
$pattern= array(  
'/(http:\/\/)(.*?)(\n|\<|"|\s)/is', 
'/(https:\/\/)(.*?)(\n|\<|"|\s)/is', 
'/\[url\=(.*?)\](.*?)\[\/url\]/is' 
//'/[^\"|\>](http:\/\/)([a-zA-Z0-9\?\&\%\.\;\:\/\=\+\_\-]*)[^\"|\<]/is', 
//'/[^\"|\>](https:\/\/)([a-zA-Z0-9\?\&\%\.\;\:\/\=\+\_\-]*)[^\"|\<]/is', 
//'/[^\"|\>](ftp:\/\/)([a-zA-Z0-9\?\&\%\.\;\:\/\=\+\_\-]*)[^\"|\<]/is' 
 );
$replace=array(  ' <a target="_blank" href="http://$2">$2</a> $3', ' <a target="_blank" href="https://$2">$2</a> $3', '<a target="_blank" href="$1">$2</a>', 
//' <a target="_blank" href="http://$2">$2</a> ', ' <a target="_blank" href="https://$2">$2</a> ', ' <a target="_blank" href="ftp://$2">ftp: $2</a> '
);
$str = preg_replace( $pattern, $replace, $str);
return $str;
}


echo text2link(' A link to [url=https://www.google.com] secure google [/url] and www.google.com this is http://www.google.com and another [url=http://www.google.com] google [/url]  '); exit;

If you run the code above, you'll see the first link as:

<a target="_blank" href=" <a target="_blank" href="https://www.google.com">">www.google.com]</a>  secure google </a>  

It should be:

<a target="_blank" href="https://www.google.com"> secure google </a>

For some reason the http part is getting replaced again.

The links already handled by the [url] pattern are getting replaced again by the earlier patterns. The commented-out patterns are where I tried to detect a quote or a greater-than/less-than sign and skip the replacement. That didn't work...

  • Also, how do I add a pattern for text starting with www and not http? Thanks Commented May 6, 2017 at 17:36
  • It would be better if you updated your post with the input string and the expected output. Commented May 6, 2017 at 17:38
  • regex101.com/r/VswcRB/1 or '/\[url=([^\]]+)\](.+?)(\[\/url\])/', '<a target="_blank" href="/$1">$2</a>' Commented May 6, 2017 at 17:44
  • @SahilGulati You might want to read: meta.stackoverflow.com/q/253833 Commented May 6, 2017 at 18:27
  • @SumitKumar This looks like the MyBB tag structure, but it could be the syntax of many other software packages. What is your project using that produces square-bracketed URLs? It may help people find your question if you include that name in the question (or even the title). Commented May 7, 2017 at 20:13

3 Answers

If I understand you correctly, you wish to only replace the square bracketed links with html <a> links.

This will execute that:

$str=' A link to [url=https://www.google.com] secure google [/url] and www.google.com this is http://www.google.com and another [url=http://www.google.com] google [/url]  ';
$pattern="/(\[url=(https?[^]]+)\] ?(.*?) ?\[\/url\])/i";
$replace="<a target=\"_blank\" href=\"$2\">$3</a>";
echo preg_replace($pattern,$replace,$str);  // I recommend trim() around preg_replace() here

Here is a Regex Pattern Demo so you can see how the pattern works.

Output:

 A link to <a target="_blank" href="https://www.google.com">secure google</a> and www.google.com this is http://www.google.com and another <a target="_blank" href="http://www.google.com">google</a>

If you want to include square bracketed urls that do not have a protocol use:

(\[url=((?:https?:\/\/)?[^]]+)\] ?(.*?) ?\[\/url\])

Regex Pattern Demo
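
A minimal usage sketch for the protocol-optional pattern above, assuming the same replacement string as in the first snippet. The sample input is made up for illustration.

```php
<?php
// Pattern from above: the protocol inside [url=...] is optional.
$pattern = '/(\[url=((?:https?:\/\/)?[^]]+)\] ?(.*?) ?\[\/url\])/i';
$replace = '<a target="_blank" href="$2">$3</a>';

// Hypothetical sample input: one tag without a protocol, one with.
$str = 'See [url=google.com] google [/url] and [url=https://www.google.com] secure google [/url]';

$result = preg_replace($pattern, $replace, $str);
echo $result, "\n";
```

Note that a protocol-less href such as "google.com" is treated by browsers as a relative path, so you may still want to prepend a scheme before output.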


If you want to add www. when it is missing:

Regex Pattern Demo

Code:

$str='An https://www link to [url=https://www.google.com] secure google [/url] and www.google.com this is http://www.google.com and another [url=http://www.google.com] google [/url] and this is just www. [url=www.google.com] google [/url] and this url has no www. [url=google.com] google [/url]';
$pattern="/(\[url=(https?:\/\/)?(www\.)?([^]]+)\] ?(.*?) ?\[\/url\])/i";  // dot escaped so it matches a literal "."
$replace='<a target="_blank" href="$2www.$4">$5</a>';  // single-quoted string: the double quotes need no backslashes
echo preg_replace($pattern,$replace,$str);

Output:

An https://www link to <a target="_blank" href="https://www.google.com">secure google</a> and www.google.com this is http://www.google.com and another <a target="_blank" href="http://www.google.com">google</a> and this is just www. <a target="_blank" href="www.google.com">google</a> and this url has no www. <a target="_blank" href="www.google.com">google</a>

I hope this is what you are looking for. Here we use preg_match_all to gather all the matches we want to replace, then replace them one by one.

Regex: \[([a-z]+)\=((?:https?:\/\/)?(?:www\.)?[^\]]+)\](.*?)\[\/\\1\]

1. \[([a-z]+)\= This matches [, then one or more characters a-z (captured as group 1), then =.

2. ((?:https?:\/\/)?(?:www\.)?[^\]]+)\] This captures the complete link (group 2) and then matches ].

3. (.*?)\[ This lazily captures everything up to the [ of the closing tag (group 3).

4. \/\\1\] This matches /, then whatever the first group captured (here, url), then ] at the end.
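
To make the breakdown above concrete, here is a small sketch that runs the same regex through preg_match and prints the capture groups. The variable names and the sample string are only for illustration; the pattern is written in a single-quoted string, so the backreference is \1 rather than \\1.

```php
<?php
// Same regex as described above, in a single-quoted PHP string.
$regex = '/\[([a-z]+)=((?:https?:\/\/)?(?:www\.)?[^\]]+)\](.*?)\[\/\1\]/';

// Illustrative sample taken from the question's input.
$sample = '[url=https://www.google.com] secure google [/url]';

preg_match($regex, $sample, $m);

// $m[0] is the whole tag, $m[1] the tag name,
// $m[2] the link, $m[3] the text between the tags.
print_r($m);
```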

Try this code snippet here

function text2link($str)
{
    preg_match_all("/\[([a-z]+)\=((?:https?:\/\/)?(?:www\.)?[^\]]+)\](.*?)\[\/\\1\]/", $str,$matches);
    foreach($matches[0] as $key => $toReplace)
    {
        $str=str_replace($toReplace, '<a target="_blank" href="'.$matches[2][$key].'">'.$matches[3][$key]."</a>", $str);
    }
    return $str;
}

echo text2link(' A link to [url=https://www.google.com] secure google [/url] and www.google.com this is http://www.google.com and another [url=http://www.google.com] google [/url]  ');

Output:

A link to <a target="_blank" href="https://www.google.com"> secure google </a> and www.google.com this is http://www.google.com and another <a target="_blank" href="http://www.google.com"> google </a>

2 Comments

That is not the string I want replaced. That is the output I got when I ran preg_replace on my original string. The original string is in the echo text2link() line. Also, I cannot use str_replace because the string comes from the user. It's not going to be google all the time...
@SumitKumar I am sorry, I misunderstood your question. I have updated the post; I hope it helps you now.

@mickmackusa Thanks for your code! I extended it a little, and the version below worked for the various types of URL in the string. Posting it here so someone else can use it as well. Thanks :)

$str=' A link to [url=https://www.google.com] secure google [/url] and www.google.com this is http://www.google.com and another [url=http://www.google.com] google [/url]  ';

$pattern=array(
'/(\[url=(https?[^]]+)\] ?(.*?) ?\[\/url\])/is',
 '/([\\s|\\n])(http:\/\/)(.*?)([\\n|\<|\"|\\s])/is',
 '/([\\s|\\n])(https:\/\/)(.*?)([\\n|\<|\"|\\s])/is',
'/([\\s|\\n])(www\.)(.*?)([\\n|\<|\"|\\s])/is'
);
$replace=array(
' <a target="_blank" href="$2">$3</a> ',
' <a target="_blank" href="http://$3">$3</a> ',
' <a target="_blank" href="https://$3">$3</a> ',
' <a target="_blank" href="http://www.$3">$3</a> '
);
echo preg_replace($pattern,$replace,$str);
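
Another way to avoid the duplicate replacement, sketched here as an alternative rather than as part of the code above, is to match everything in a single pass with preg_replace_callback. Because the regex engine never re-scans text it has already replaced, the generated HTML cannot be matched again. The function name text2link_single_pass and the exact character classes are assumptions for illustration.

```php
<?php
// Hypothetical helper: one pass over the input, one alternation per link type.
function text2link_single_pass(string $str): string
{
    $pattern = '~\[url=([^\]]+)\]\s*(.*?)\s*\[/url\]'  // groups 1, 2: [url=link]label[/url]
             . '|\b(https?://[^\s<"]+)'                 // group 3: bare http(s) URL
             . '|\b(www\.[^\s<"]+)~is';                 // group 4: bare www. host

    return preg_replace_callback($pattern, function (array $m): string {
        if (!empty($m[4])) {           // www. without a protocol
            $href = 'http://' . $m[4];
            $text = $m[4];
        } elseif (!empty($m[3])) {     // bare http(s) URL
            $href = $text = $m[3];
        } else {                       // [url=...] tag
            $href = $m[1];
            $text = $m[2];
        }
        return '<a target="_blank" href="' . $href . '">' . $text . '</a>';
    }, $str);
}

$input = ' A link to [url=https://www.google.com] secure google [/url] and www.google.com this is http://www.google.com and another [url=http://www.google.com] google [/url]  ';
$out = text2link_single_pass($input);
echo $out, "\n";
```

This sketch does not escape anything, and a bare URL followed directly by punctuation would include that punctuation in the link. If the text comes from users, running it through htmlspecialchars() before linking is probably a good idea.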

1 Comment

It's not an extended scenario. The problem in the first code was that when the "[URL]" pattern replaced the http/https, the other patterns later "re-replaced" the already replaced http and https that were inside the URL pattern. The answer I posted here solves that "duplicate replacement" problem, i.e. the subject of this question. If you still think I should edit the question, let me know and I'll edit it. Thanks :)
