PHP Simple HTML Dom: Get childNodes nodeValue?

Question

a.php:

<ul id="ul1">
    <li id="pt1">Point 1
         <ul id="ul2">
             <li id="pt11">Point 1.1</li>
             <li id="pt12">Point 1.2</li>
                <pre class="CodeDisplay">
                some codes
                </pre>
             <li id="ref">Reference: <a href="link.html" target="_blank">link</a></li>
         </ul>
    </li> 
</ul>

I would like to get only the text "Point 1". In JS, this works:

alert(document.getElementsByTagName("li")[0].childNodes[0].nodeValue);

But I would like to get the same value in PHP with Simple HTML Dom. Here is the snippet from another PHP page (b.php):

<?php

include('simple_html_dom.php');
$html = file_get_html('http://lifelearning.net63.net/a.php');

// stuck here:
echo $html->getElementsByTagName('ul',0)->getElementsByTagName('li',0)->nodeValue;
//

?>

I have tried textContent, but it also extracts the content of the descendants under Point 1. This is not what I want. I only want "Point 1". Any help is appreciated!

AKS · Accepted Answer · 2013-02-09 16:01:20Z

1

Try this:

<?php
include('simple_html_dom.php');
$html = file_get_html('http://lifelearning.net63.net/a.php');
echo $html->find('li[id=pt1] li', 0)->innertext;

The snippet above finds the first li tag that is a descendant of li#pt1 and gives you its inner text (the content between the tags, including any HTML inside it).

Have a look at the SimpleHTMLDom docs. They show many ways, with examples, to find content (by ID, class, etc.) in the HTML output. SimpleHTMLDom mostly follows jQuery/CSS selectors.

Note that if you do not read the innertext property, you get a SimpleHTMLDom node object that you need to process before displaying.

If there are no matching elements, find() returns null, and reading innertext from it raises a PHP notice or warning. So make sure your input contains the required elements, or check that the element is present (for example with isset()) before using it.

edited Feb 9, 2013 at 16:01

answered Feb 9, 2013 at 15:55

AKS


1 Comment

Tin Amaranth Over a year ago

Thanks for your reply. But it actually returns "Point 1.1" instead of "Point 1".

Tin Amaranth · Accepted Answer · 2013-02-10 11:05:43Z

1

With the help of others online, I found a simpler solution:

$html = new DOMDocument();
$html->loadHTMLFile('http://lifelearning.net63.net/a.php');
echo $html->getElementsByTagName('li')->item(0)->childNodes->item(0)->textContent; // outputs "Point 1" (followed by the text node's trailing whitespace)

What I've learnt:

First, no external library is required in my case; DOMDocument does the job of loading the HTML DOM of a web page.

Second, use item() and childNodes, much as you would in JS:

document.getElementsByTagName("li")[0].childNodes[0].nodeValue

But thank you for all your replies.
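
The item()/childNodes approach above can be made a little more defensive. The following is a minimal sketch, not a definitive implementation: it assumes the markup from the question is available as an inline string (so no network call is needed), and the helper name firstOwnText is hypothetical. It skips whitespace-only text nodes and trims the result, because the text node under li#pt1 also contains the line break and indentation before the nested ul.

```php
<?php
// Inline copy of the relevant markup, so the sketch runs without a network call.
$markup = <<<'HTML'
<ul id="ul1">
    <li id="pt1">Point 1
         <ul id="ul2">
             <li id="pt11">Point 1.1</li>
             <li id="pt12">Point 1.2</li>
         </ul>
    </li>
</ul>
HTML;

// Hypothetical helper: returns the trimmed text of the first non-empty
// text node that is a direct child of $element, or null if there is none.
function firstOwnText(DOMNode $element): ?string
{
    foreach ($element->childNodes as $child) {
        if ($child->nodeType === XML_TEXT_NODE) {
            $text = trim($child->nodeValue);
            if ($text !== '') {
                return $text;
            }
        }
    }
    return null;
}

$doc = new DOMDocument();
$doc->loadHTML($markup);

// item(0) returns null when the document has no <li>, so guard before use.
$firstLi = $doc->getElementsByTagName('li')->item(0);
$ownText = $firstLi !== null ? firstOwnText($firstLi) : null;

echo $ownText ?? '(no text found)', PHP_EOL;
```

Looping over childNodes instead of hard-coding item(0) keeps the lookup working if the markup later gains leading whitespace or a comment before the text.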

answered Feb 10, 2013 at 11:05

Tin Amaranth


1 Comment

mickmackusa Over a year ago

Frankly, you should accept your own answer because that regex solution is not recommended.

echo_Me · Accepted Answer · 2013-02-09 15:50:24Z

0

You may be looking for this:

 <?php
 // Sample markup built as one string. Note that "Point 1" is wrapped
 // in a <div> here, which differs from the markup in the question.
 $str2  = ' <ul id="ul1"> ';
 $str2 .= '<li id="pt1"><div>Point 1</div> ';
 $str2 .= ' <ul id="ul2"> ';
 $str2 .= '     <li id="pt11">Point 1.1</li>';
 $str2 .= '    <li id="pt12">Point 1.2</li>';
 $str2 .= '     <pre class="CodeDisplay">';
 $str2 .= '     some codes';
 $str2 .= '     </pre>';
 $str2 .= '    <li id="ref">Reference: <a href="link.html" target="_blank">link</a></li>';
 $str2 .= '  </ul>';
 $str2 .= '   </li> ';
 $str2 .= ' </ul>';

 // Returns the text captured between <$tagname ...> and </$tagname>,
 // or null when the pattern does not match.
 function getTextBetweenTags($string, $tagname) {
     $tag = preg_quote($tagname, '/');
     $pattern = "/<$tag ?.*>(.*)<\/$tag>/";
     if (preg_match($pattern, $string, $matches)) {
         return $matches[1];
     }
     return null;
 }

 $txt = getTextBetweenTags($str2, "div");
 echo $txt;
 ?>

This outputs: Point 1

answered Feb 9, 2013 at 15:50

echo_Me


2 Comments

AKS Over a year ago

OP is using SimpleHTMLDom already. [insert "Regex to parse HTML is bad" comment here]

mickmackusa Over a year ago

This is error prone advice. Regex is not DOM -aware.
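
To illustrate the point made in these comments, here is a small sketch under stated assumptions: it applies the same greedy pattern from the answer above to the li tag, using a single-line copy of the question's markup in which "Point 1" is not wrapped in a div, and compares it with a DOM lookup through DOMXPath. The variable names are illustrative only.

```php
<?php
// The question's markup on one line; "Point 1" is NOT wrapped in a <div>.
$markup = '<ul id="ul1"><li id="pt1">Point 1 <ul id="ul2">'
        . '<li id="pt11">Point 1.1</li><li id="pt12">Point 1.2</li>'
        . '</ul></li></ul>';

// Same greedy pattern as in the answer, applied to <li>. Because ".*" is
// greedy and the tags are nested, the capture does not isolate "Point 1".
preg_match('/<li ?.*>(.*)<\/li>/', $markup, $matches);
$viaRegex = $matches[1] ?? null;

// DOM-based lookup: the first text node directly under li#pt1.
$doc = new DOMDocument();
$doc->loadHTML($markup);
$xpath = new DOMXPath($doc);
$node = $xpath->query('//li[@id="pt1"]/text()[1]')->item(0);
$viaDom = $node !== null ? trim($node->nodeValue) : null;

var_dump($viaRegex); // not "Point 1"
var_dump($viaDom);   // "Point 1"
```

The XPath expression addresses the node by its position in the tree, so nesting and attribute order do not affect it, whereas the regex result depends on where the greedy match happens to stop.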
