2

Is there a really easy way to grab the text of a rel attribute? For example:

<a href='#' rel='i want this text here'></a>

I tried this morning with regex but had no luck.

1
  • 1
    Have you tried a parser? Commented Mar 5, 2010 at 12:25

5 Answers

4

Don't use regular expressions on a non-regular language like HTML. You can do this with XPath instead. Example:

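$doc = new DOMDocument();
$doc->loadHTML($htmlAsString);
$xpath = new DOMXPath($doc);
// Select every <a> element that has a rel attribute.
$nodelist = $xpath->query('//a[@rel]');
foreach ($nodelist as $node) {
    echo $node->getAttribute('rel'), "\n";
}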
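The query above returns the matching <a> elements. If only the values are needed, one option is to select the attribute nodes directly with '//a/@rel'. The following is a minimal sketch, assuming PHP's bundled DOM extension; the function name extractRelValues is made up for illustration, and libxml_use_internal_errors() is there only to keep parser warnings about sloppy markup out of the output.

```php
<?php
// Hypothetical helper (the name is not from the answer above):
// returns the rel value of every <a> element that has one.
function extractRelValues(string $html): array
{
    // Collect libxml parse warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $doc->loadHTML($html);

    // Discard collected warnings and restore the previous setting.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $values = [];
    // '//a/@rel' selects the rel attribute nodes, not the <a> elements.
    foreach ($xpath->query('//a/@rel') as $attr) {
        $values[] = $attr->nodeValue;
    }
    return $values;
}

print_r(extractRelValues("<a href='#' rel='i want this text here'></a>"));
```

Links without a rel attribute are simply skipped, so the caller gets back a plain array of strings.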

1

Unless the HTML is 100% static and under your control, I recommend using an HTML parser: either a built-in one such as DOMDocument, or the PHP Simple HTML DOM Parser. It takes more effort to set up than a simple regex, but it handles variations like these far more reliably:

 <a href='#' rel="i want this text here"></a>
 <a href='#' REL="i want this text here"></a>
 <a rEL='i want this text here' href='#' ></a>
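To illustrate the point, here is a small sketch that feeds those three variants to DOMDocument. It relies on my understanding that the underlying libxml HTML parser normalises attribute names to lower case while parsing, so treat that as an assumption to verify on your own PHP build; the variable names are illustrative.

```php
<?php
// The three variants from above: different quote styles,
// different attribute-name casing, different attribute order.
$html = <<<'HTML'
<a href='#' rel="i want this text here"></a>
<a href='#' REL="i want this text here"></a>
<a rEL='i want this text here' href='#' ></a>
HTML;

$doc = new DOMDocument();
$doc->loadHTML($html);

$rels = [];
foreach ($doc->getElementsByTagName('a') as $link) {
    // Assumed: the parser has already lower-cased REL / rEL to rel.
    if ($link->hasAttribute('rel')) {
        $rels[] = $link->getAttribute('rel');
    }
}

print_r($rels);
```

The same lookup code covers every variant, whereas a regex would have to spell out each quoting and casing rule itself.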


0

This should work:

preg_match_all('%<a[^>]+rel=("([^"]+)"|\'([^\']+)\')[^>]*>%i', $html, $matches);
print_r($matches);
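With the pattern above, a double-quoted value lands in one capture group and a single-quoted value in another, so the caller has to merge them. A possible variation, sketched here under the assumption that the markup is simple enough for a regex at all, uses a PCRE branch-reset group so both quote styles capture into group 1. The sample input and variable names are made up for illustration.

```php
<?php
// Sample input (hypothetical): one double-quoted and one single-quoted rel.
$html = '<a href="#" rel="first value"></a> <a rel=\'second value\' href="#"></a>';

// (?| ... ) is a branch-reset group: each alternative numbers its
// capture groups from the same starting point, so both fill group 1.
$pattern = '%<a\s[^>]*?\brel=(?|"([^"]*)"|\'([^\']*)\')[^>]*>%i';

preg_match_all($pattern, $html, $matches);

// All rel values, regardless of which quote style was used.
$relValues = $matches[1];
print_r($relValues);
```

This still shares the usual weaknesses of regex-based extraction, for example unquoted attribute values or a literal "rel=" inside another attribute.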

2 Comments

We don't parse HTML with regEx 'round these parts
I never claimed it to be the best solution. It is, however, the answer to his question ;)
0

As others have said, you should avoid using regex to parse HTML, as it's not a regular language. But if you are sure of the structure of the HTML, you can use a regex. The following program extracts the value you want:

<?php
$a = "<a href='#' rel='i want this text here'></a>";

if(preg_match("{<a href.*?rel='(.*?)'.*?>}",$a,$matches)) {
        echo $matches[1]; // prints i want this text here
}
?>


0

As the other posters have pointed out, it's really a bad idea to use regex for HTML parsing: too many things can go wrong, and you'll end up doing more support. (See Pekka's comment!)

To add some value here, I've posted a full example that gets every rel attribute:

<?php
$html = "<a href='#' rel='i want this text here'></a>";

$dom = new DomDocument();
$dom->loadHtml($html);

$xpath = new DomXPath($dom);
$refAttributes = $xpath->query("//a[@rel]");
// ^^ This means: get every <a ...></a> element that has a rel attribute

foreach($refAttributes as $refAtt) {
    var_dump($refAtt->getAttribute("rel"));
}

For additional reading, see:

http://kore-nordmann.de/blog/do_NOT_parse_using_regexp.html

http://kore-nordmann.de/blog/0081_parse_html_extract_data_from_html.html

