I need to find plain-text http URLs and replace them with hyperlinks. These URLs are inside span tags.
$text contains an HTML page. One of the span tags looks something like this:
<span class="styleonetwo" >http://www.cnn.com/live-event</span>
Here is my code:
$doc = new DOMDocument();
$doc->loadHTML($text);
foreach ($doc->getElementsByTagName('span') as $anchor) {
    $link = $anchor->nodeValue;
    if (substr($link, 0, 4) == "http") {
        $link = "<a href=\"$link\">$link</a>";
    }
    if (substr($link, 0, 3) == "www") {
        $link = "<a href=\"http://$link\">$link</a>";
    }
    $anchor->nodeValue = $link;
}
echo $doc->saveHTML();
It works OK. However, I want this to work even if the span contains other text around the URL, something like:
<span class="styleonetwo" > sometexthere http://www.cnn.com/live-event somemoretexthere</span>
Obviously the code above won't work in this situation. Is there a way to search and replace a pattern using DOMDocument without using preg_replace?
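For illustration, here is a minimal sketch of one way this might be approached: walk the text nodes inside each span, split each one around the URLs, and rebuild it as a mix of text nodes and real `<a>` elements. It still relies on a regular expression (preg_split rather than preg_replace), so it is not a regex-free solution. The function name `linkifySpans` and the simplified URL pattern are assumptions made for this example, not part of the original code, and the pattern may capture trailing punctuation.

```php
<?php
// Hypothetical helper (not from the original post): wraps URLs found in the
// text nodes of every <span> in real <a> elements.
function linkifySpans(string $html): string
{
    $doc = new DOMDocument();
    // Suppress parser warnings caused by imperfect real-world markup.
    @$doc->loadHTML($html);

    $xpath = new DOMXPath($doc);
    // Illustrative pattern only: "http://", "https://" or "www." followed by
    // anything up to the next whitespace or "<".
    $pattern = '~((?:https?://|www\.)[^\s<]+)~i';

    // Copy the node list into an array first, because the loop edits the tree.
    $textNodes = iterator_to_array(
        $xpath->query('//span//text()[not(ancestor::a)]')
    );

    foreach ($textNodes as $textNode) {
        $parts = preg_split($pattern, $textNode->nodeValue, -1, PREG_SPLIT_DELIM_CAPTURE);
        if (count($parts) < 2) {
            continue; // no URL in this text node
        }

        $fragment = $doc->createDocumentFragment();
        foreach ($parts as $i => $part) {
            if ($part === '') {
                continue; // skip empty pieces before or after a URL
            }
            if ($i % 2 === 1) {
                // Odd indexes hold the captured URLs.
                $a = $doc->createElement('a');
                $a->appendChild($doc->createTextNode($part));
                // Prefix bare "www." links with a scheme, as in the original code.
                $href = stripos($part, 'www.') === 0 ? 'http://' . $part : $part;
                $a->setAttribute('href', $href);
                $fragment->appendChild($a);
            } else {
                // Even indexes hold the surrounding plain text.
                $fragment->appendChild($doc->createTextNode($part));
            }
        }
        $textNode->parentNode->replaceChild($fragment, $textNode);
    }

    return $doc->saveHTML();
}

echo linkifySpans('<span class="styleonetwo" > sometexthere http://www.cnn.com/live-event somemoretexthere</span>');
```

Because the links are created with createElement and createTextNode instead of being assigned to nodeValue as a string, the markup is inserted as real nodes and is not escaped as text.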
Update: To answer phil's question regarding preg_replace:
I used regexpal.com to test the following pattern:
\b(?:(?:https?|ftp|file)://|(www|ftp)\.)[-A-Z0-9+&@#/%?=~_|$!:,.;]*[-A-Z0-9+&@#/%=~_|$]
It works great in the regex tester provided by regexpal. When I used the same pattern in PHP code, I got tons of weird errors. I got an "unknown modifier" error even for an escape character! Here is my code for preg_replace:
$httpRegex = '/\b(\?:(\?:https?|ftp|file):\/\/|(www|ftp)\.)[-A-Z0-9+&@#/%\?=~_|$!:,.;]*[-A-Z0-9+&@#/%=~_|$]/';
$cleanText = preg_replace($httpRegex, "<a href='$0'>$0</a>", $text);
I got so frustrated with the "unknown modifier" errors that I turned to DOMDocument to solve my problem.
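For reference, a hedged sketch of how the regexpal pattern might be carried over to PHP. In PCRE, the first character of the pattern string is the delimiter, so with `/` as the delimiter every unescaped `/` inside the pattern (including the ones inside the character classes) ends the pattern early, and whatever follows is read as modifiers; that is a likely source of the "unknown modifier" messages. The sketch also assumes the `(?:` groups should stay unescaped (writing `\?:` makes the `?` a literal character) and that the regexpal test ran case-insensitively, hence the `i` flag. The sample `$text` value is made up for the example.

```php
<?php
// Same pattern as tested on regexpal, with every "/" escaped because "/" is
// the delimiter, and the "i" flag added because the classes only list A-Z.
$httpRegex = '/\b(?:(?:https?|ftp|file):\/\/|(www|ftp)\.)[-A-Z0-9+&@#\/%?=~_|$!:,.;]*[-A-Z0-9+&@#\/%=~_|$]/i';

// Made-up sample input for illustration.
$text = 'sometexthere http://www.cnn.com/live-event somemoretexthere';

// $0 refers to the whole match.
$cleanText = preg_replace($httpRegex, '<a href="$0">$0</a>', $text);

echo $cleanText, "\n";
```

Note that running this over a whole HTML page would also rewrite URLs that already sit inside attributes such as href or src, and that links matched by the `www` branch get an href without a scheme.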