XPath - Get text from parent using php xpath

Question

I am trying to get the text from a specific node's parent. For example:

<td colspan="1" rowspan="1">
  <span>
    <a class="info" shape="rect" 
             rel="empLinkData" href="/employee.htm?id=8468524">
        Jack Johnson
    </a>
  </span>
   (*)&nbsp;
</td>

I am able to successfully process the anchor tag by using:

$xNodes = $xpath->query('//a[@class="info"][@rel="empLinkData"]');

// $xNodes contains employee ids and names
foreach ($xNodes as $xNode)
{
    $sLinktext = @$xNode->firstChild->data;
    $sLinkurl = 'http://www.company.com' . $xNode->getAttribute('href');

    if ($sLinktext != '' && $sLinkurl != '')
    {
        echo '<li><a href="' . $sLinkurl . '">' .
                $sLinktext . '</a></li>';
    }
}

Now I need to retrieve the text that sits directly inside the <td> tag (in this case, the (*) that appears right after the span tag closes), but I can't seem to refer to it properly.

The xpath for this that seems to make the most sense to me is:

$xNodes = $xpath->query('//a[@class="info"]
          [@rel="empLinkData"]/ancestor::*');

but it is retrieving the wrong data from elsewhere nested above this code.

Thanks for the quick response! Assuming this query is correct, how would I actually display the data (see the foreach example above)? $xNode->firstChild->data is not working.
– blearn, Commented Jul 8, 2012 at 23:01
Kimono is a really cool tool for uncovering XPath: kimonolabs.com
– blearn, Commented May 2, 2014 at 23:14

Wayne · Accepted Answer · 2012-07-09 00:45:16Z

It's not necessary to retreat back up the tree. Instead, directly select the td that contains the relevant element:

//td[descendant::a[@class="info"][@rel="empLinkData"]]/text()

Edit: As @Dimitre rightly pointed out, this selects all text children. Your td has two such nodes: the whitespace-only text node that precedes the span and the text node that follows it. If you only want the second text node, then use:

//td[descendant::a[@class="info"][@rel="empLinkData"]]/text()[2]

Or:

//td[descendant::a[@class="info"][@rel="empLinkData"]]/text()[last()]

As you can see, the resulting expressions are essentially the same, but you do need to target the correct text node (if you want only one). Note also that if the target text is truly in a td then it's safer to target that element type directly (without wildcards). As this is HTML, your actual document almost certainly contains several other elements, including multiple other anchors that you may not want to target.

Sample PHP:

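$nodes = $xpath->query(
    '//td[descendant::a[@class="info"][@rel="empLinkData"]]/text()[last()]');
// item(0) returns null when nothing matches, so check the length first
if ($nodes->length > 0) {
    echo "[" . $nodes->item(0)->nodeValue . "]";
}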
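
For anyone who wants to try this end to end, here is a minimal, self-contained sketch. It makes a few assumptions that are not in the question: the markup is wrapped in a table and loaded with loadXML (so &nbsp; is written as &#160;, since plain XML does not define that entity), and cleanText() is a hypothetical helper that trims both ordinary whitespace and the non-breaking space left over from &nbsp;.

```php
<?php
// Assumption: the question's markup as well-formed XML, wrapped in a table.
$markup = <<<XML
<table><tr>
<td colspan="1" rowspan="1">
  <span>
    <a class="info" shape="rect" rel="empLinkData" href="/employee.htm?id=8468524">
        Jack Johnson
    </a>
  </span>
   (*)&#160;
</td>
</tr></table>
XML;

$doc = new DOMDocument();
$doc->loadXML($markup);
$xpath = new DOMXPath($doc);

// Select only the final text child of the td that contains the link.
$last = $xpath->query(
    '//td[descendant::a[@class="info"][@rel="empLinkData"]]/text()[last()]');

// Hypothetical helper: strip leading/trailing ASCII whitespace and U+00A0.
function cleanText(string $raw): string
{
    return preg_replace('/^[\s\x{00A0}]+|[\s\x{00A0}]+$/u', '', $raw);
}

// Guard against an empty result before touching item(0).
$marker = $last->length > 0 ? cleanText($last->item(0)->nodeValue) : '';

echo "[", $marker, "]\n";
```

A plain trim() would leave the non-breaking space in place, which is why the helper names U+00A0 explicitly. If the page is loaded with loadHTML instead, the same XPath should apply, but the HTML parser may treat whitespace-only text nodes differently, so text()[last()] is likely the safer of the two positional forms.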

BeniBela · Accepted Answer · 2012-07-08 22:42:48Z


Select the deepest (nearest) td ancestor of the link:

//a[@class="info"][@rel="empLinkData"]/ancestor::td[1]
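
To connect this to the foreach loop in the question, a rough sketch: DOMXPath::query() takes a context node as its second argument, so the td can be looked up relative to each anchor. The two-row table below is made up for illustration (the second employee and its id are hypothetical), and it is loaded with loadXML, so &nbsp; is written as &#160;.

```php
<?php
// Assumption: sample rows modeled on the question; "Jane Doe" and id=1 are invented.
$markup = <<<XML
<table>
  <tr><td><span><a class="info" rel="empLinkData" href="/employee.htm?id=8468524">Jack Johnson</a></span> (*)&#160;</td></tr>
  <tr><td><span><a class="info" rel="empLinkData" href="/employee.htm?id=1">Jane Doe</a></span></td></tr>
</table>
XML;

$doc = new DOMDocument();
$doc->loadXML($markup);
$xpath = new DOMXPath($doc);

$results = [];
foreach ($xpath->query('//a[@class="info"][@rel="empLinkData"]') as $anchor) {
    // Relative query: the second argument makes $anchor the context node.
    $td = $xpath->query('ancestor::td[1]', $anchor)->item(0);
    if ($td === null) {
        continue; // the anchor is not inside a td
    }

    // Collect the text that sits directly in the td (not inside the span or link).
    $note = '';
    foreach ($xpath->query('text()', $td) as $textNode) {
        $note .= $textNode->nodeValue;
    }
    // Trim ASCII whitespace and the non-breaking space (U+00A0).
    $note = preg_replace('/^[\s\x{00A0}]+|[\s\x{00A0}]+$/u', '', $note);

    $results[trim($anchor->textContent)] = $note;
}

print_r($results);
```

Keeping the anchor query as the outer loop means the link text, the href, and the td's own text all come from the same row, which is harder to guarantee when the anchors and the text nodes are fetched by two separate absolute queries.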


Dimitre Novatchev · Accepted Answer · 2012-07-08 23:52:55Z


Use:

//*[a[@class="info"][@rel="empLinkData"]]/following-sibling::text()[1]

This selects a single text node -- exactly the wanted one.

Do note that an XPath expression like:

//td[descendant::a[@class="info"][@rel="empLinkData"]]/text()

selects more than one text node -- not only the wanted one.
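
The difference is easy to check by counting the nodes each expression returns. This is only a sketch under stated assumptions: the question's markup is wrapped in a table and parsed with loadXML, which keeps whitespace-only text nodes by default, and &nbsp; is written as &#160; because plain XML does not define that entity.

```php
<?php
// Assumption: the question's markup as well-formed XML, wrapped in a table.
$markup = <<<XML
<table><tr>
<td colspan="1" rowspan="1">
  <span>
    <a class="info" shape="rect" rel="empLinkData" href="/employee.htm?id=8468524">
        Jack Johnson
    </a>
  </span>
   (*)&#160;
</td>
</tr></table>
XML;

$doc = new DOMDocument();
$doc->loadXML($markup);
$xpath = new DOMXPath($doc);

// First text node that follows the element whose child is the link (the span).
$viaSibling = $xpath->query(
    '//*[a[@class="info"][@rel="empLinkData"]]/following-sibling::text()[1]');

// Every text child of the td, including the whitespace before the span.
$viaTd = $xpath->query(
    '//td[descendant::a[@class="info"][@rel="empLinkData"]]/text()');

printf("following-sibling::text()[1]: %d node(s)\n", $viaSibling->length);
printf("td/text(): %d node(s)\n", $viaTd->length);

// Show the raw content of each td text node, with whitespace made visible.
foreach ($viaTd as $node) {
    echo "  ", json_encode($node->nodeValue), "\n";
}
```

The sibling form is anchored to the element that directly contains the link, so it does not depend on how many text nodes come before it in the parent.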
