my $match = q(<a href="#google"><h1><b>Google</b></h1></a>);
my $title;
if ($match =~ /<a.*?href.*?><.?>(.*?)<\/a>/) {
    $title = $1;
} else {
    $title = "";
}
print $title;
Output: Google</b></h1>
Expected output: Google
I am unable to extract the link text using a regex in Perl. The markup may have one more or one fewer level of nesting, for example:
<h1><b><i>Google</i></b></h1>
Please try it with these inputs:
1) <td><a href="/wiki/Unix_shell" title="Unix shell">Unix shell</a>
2) <a href="http://www.hp.com"><h1><b>HP</b></h1></a>
3) <a href="/wiki/Generic_programming" title="Generic programming">generic</a></td>
4) <a href="#cite_note-1"><span>[</span>1<span>]</span></a>
Expected output:
Unix shell
HP
generic
[1]
Your capture group (.*?) matches everything up to "</a>", and that's what you get. You need to use <\/b> to end the capture.
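Since the nesting depth varies, anchoring on one specific closing tag only works for one shape of input. A rough sketch of another approach, assuming the link text itself never contains a literal "<": capture everything between the opening <a ... href ...> tag and </a>, then strip whatever tags remain. The helper name link_text is made up for illustration, and for messy real-world HTML a proper parser would likely be more robust than any regex.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the visible text of the first <a href=...> link
# in $html, or an empty string if no such link is found.
sub link_text {
    my ($html) = @_;
    # Capture everything between the opening <a ... href ...> tag and </a>.
    return "" unless $html =~ m{<a\b[^>]*\bhref\b[^>]*>(.*?)</a>}is;
    my $text = $1;
    # Remove any nested tags such as <h1>, <b>, <i>, or <span>.
    $text =~ s/<[^>]+>//g;
    return $text;
}

my @samples = (
    q(<a href="#google"><h1><b>Google</b></h1></a>),
    q(<td><a href="/wiki/Unix_shell" title="Unix shell">Unix shell</a>),
    q(<a href="http://www.hp.com"><h1><b>HP</b></h1></a>),
    q(<a href="/wiki/Generic_programming" title="Generic programming">generic</a></td>),
    q(<a href="#cite_note-1"><span>[</span>1<span>]</span></a>),
);

print link_text($_), "\n" for @samples;
```

With the sample inputs from the question this should print Google, Unix shell, HP, generic, and [1], one per line. Stripping tags in a second step keeps the first pattern independent of how many levels of markup wrap the text.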