3

I have the following HTML from a curl scrape of a webpage:

<div id="box">
<br>
Your word(s):
<br>
<br>
functionally
<br>
<br>
<br>

I want the text that comes after the third <br> (/html/body/div[2]/div/br[3]), which in this case is "functionally".

@$itemCell = $xpath->query( "/html/body/div[2]/div/br[3]" );
$word = $itemCell->item( 0 );
return $word->nodeValue;

This does not return anything. If I back up to just /div, I of course get the entire contents of box. How do I extract the word after the third <br>? My word is always going to be after the third <br>.

Seems so simple, yet it escapes me.

2 Answers

4

Try something like this query:

$textNodes = $xpath->query('//div[@id="box"]/br[3]/following-sibling::text()[1]');

Working demo here - http://codepad.viper-7.com/00oeZh

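The key here is the following-sibling axis. A <br> element has no content of its own, so selecting the <br> itself yields no text; the word lives in a separate text node that is a sibling of the <br>, and text()[1] picks the first such node after it.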
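For completeness, here is a minimal, self-contained sketch of how the query might be wired up. The sample markup is the snippet from the question wrapped in an assumed html/body skeleton with the div closed, and the variable names ($doc, $word) are only illustrative; the real page will have more around it.

```php
<?php
// Sample markup: the question's snippet, wrapped in an assumed skeleton.
$html = <<<HTML
<html><body>
<div id="box">
<br>
Your word(s):
<br>
<br>
functionally
<br>
<br>
<br>
</div>
</body></html>
HTML;

// Collect parser warnings quietly instead of suppressing them with @.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// First text node that follows the third <br> inside div#box.
$textNodes = $xpath->query('//div[@id="box"]/br[3]/following-sibling::text()[1]');

// The text node carries the surrounding newlines, so trim it.
$word = $textNodes->length > 0 ? trim($textNodes->item(0)->nodeValue) : null;

echo $word, PHP_EOL; // prints "functionally"
```

Anchoring on the id instead of /html/body/div[2]/div also means the query should keep working if the page layout around the box changes.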


1 Comment

Thanks for this. It does work, and you actually just allowed me to clear a big hurdle with xpath and following-sibling.
-1
<dl>
        <dt>info</dt>
        <dd>
            <a>a1</a>b2
            <a>a2</a>
        </dd>
    </dl>

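To get the b2 text after the first <a> tag, the XPath looks like the following: //dl/dd/a[1]/following-sibling::text(). Note that without a trailing [1] this selects every text node that follows the first <a>, not just b2.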
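As a rough sketch of how that expression behaves on the markup above: it can return more than one text node, so the loop below trims each one and drops the whitespace-only ones. The variable names ($found, $node) are just for illustration.

```php
<?php
$html = <<<HTML
<dl>
    <dt>info</dt>
    <dd>
        <a>a1</a>b2
        <a>a2</a>
    </dd>
</dl>
HTML;

libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html); // the parser wraps the fragment in html/body
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Every text node that is a following sibling of the first <a> in the <dd>.
$found = [];
foreach ($xpath->query('//dl/dd/a[1]/following-sibling::text()') as $node) {
    $text = trim($node->nodeValue);
    if ($text !== '') {
        $found[] = $text; // keep only nodes with visible text
    }
}

print_r($found);
```

If only the text directly after the first <a> is wanted, appending [1] to the expression, as in the other answer, avoids the filtering loop.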

1 Comment

KiloJKilo, why didn't you accept my answer? The key is following-sibling::text(). I just didn't spell out the detailed answer for your specific problem.
