2

I am trying to parse a folder full of .htm files. Each file contains one specific element that needs to be removed: a td element with class="hide". This is my code so far; $dir . $entry is the full path to the file.

$page = ($dir . $entry);
$this->domDoc->loadHTMLFile($page);
// Use xpath query to find the menu and remove it
$nodeList = $xpath->query('//td[@class="hide"]');

Unfortunately, this is where things already go wrong. If I do a var_dump of the node list, I get the following:

object(DOMNodeList)#5 (0) { } 

Just so you folks get an idea of what I'm trying to select, here's an excerpt:

<td width="160" align="left" valign="top" class="hide">
    lots of other TD's and content here
</td>

Does anybody see anything wrong with what I've come up with so far?

  • It would help us/you if you submitted some sample XML data. Commented Oct 2, 2012 at 13:25
  • Firefox provides Xpath addons which can be useful to double check your paths :) Commented Oct 2, 2012 at 13:27
  • Most of the DOM objects cannot be var_dumped, so this is expected. Commented Oct 2, 2012 at 13:28
  • @ZnArK: It's not actually XML, it's HTML I'm parsing. Still, I'll add it to clarify. Commented Oct 2, 2012 at 14:13
  • Is your initial file xhtml (i.e. with <html xmlns="http://www.w3.org/1999/xhtml">)? If so then your elements will be namespaced and you'll need to set up a prefix mapping using $xpath->registerNamespace and use this prefix in the expression e.g. //xhtml:td Commented Oct 2, 2012 at 14:27

3 Answers

6

Is your initial file XHTML (i.e. with <html xmlns="http://www.w3.org/1999/xhtml">)? If so, your elements will be namespaced, and you'll need to set up a prefix mapping using $xpath->registerNamespace and then use this prefix in the expression:

$xpath->registerNamespace('xhtml', 'http://www.w3.org/1999/xhtml');
$nodeList = $xpath->query('//xhtml:td[@class="hide"]');
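To illustrate the namespace effect, here is a minimal sketch. It assumes the document is parsed as XML (via loadXML), and the sample markup is made up for illustration; the real files may well behave differently depending on how they are loaded.

```php
<?php
// Hypothetical XHTML sample; the xmlns attribute puts every element
// in the XHTML namespace when the document is parsed as XML.
$xhtml = <<<END
<html xmlns="http://www.w3.org/1999/xhtml">
    <body>
        <table><tr><td class="hide">menu</td><td>content</td></tr></table>
    </body>
</html>
END;

$dom = new DOMDocument;
$dom->loadXML($xhtml); // parsed as XML, so namespaces are honoured

$xpath = new DOMXPath($dom);

// Un-prefixed element names only match elements in no namespace.
$plain = $xpath->query('//td[@class="hide"]');

// Map a prefix of our choosing to the XHTML namespace URI
// and use that prefix in the expression.
$xpath->registerNamespace('xhtml', 'http://www.w3.org/1999/xhtml');
$prefixed = $xpath->query('//xhtml:td[@class="hide"]');

echo $plain->length, "\n";    // 0
echo $prefixed->length, "\n"; // 1
```

The prefix name itself ('xhtml' here) is arbitrary; only the namespace URI has to match the one declared in the document.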

2 Comments

this was the issue for me, I actually simply disabled the namespaces entirely while using this library github.com/Masterminds/html5-php
Same for me (processing PHPUnit Coverage xml):

$xml = new DOMDocument;
$xml->preserveWhiteSpace = false;
$xml->load($coverageIndex);
$xml = new DOMXPath($xml);
$xml->registerNamespace('phpunit', 'https://schema.phpunit.de/coverage/1.0');
$items = $xml->query('//phpunit:build');
5

Calling var_dump on an XPath node list object doesn't show its contents, so the empty output alone doesn't mean the query failed. Dump the node list's length instead:

var_dump($nodeList->length);

If the value is over 0, then you can iterate over it using foreach:

foreach ($nodeList as $node) {
    var_dump($node->tagName);
}

Hope this helps.

For further clarification, here is a full working code snippet:

<?php
$html = <<<END
<html>
    <body>
        <td>

        </td>
        <td class="hide"></td>
        <td class="hide"></td>
    </body>
</html>
END;
$dom = new DOMDocument;
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
$nodeList = $xpath->query('//td[@class="hide"]');
// Shows a blank object
var_dump($nodeList);
// Shows 2
var_dump($nodeList->length);
// Echo out all the tag names.
foreach($nodeList as $node){
    echo $node->tagName . "\n";
}
?>
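Since the goal in the question is to remove the matched element rather than just list it, the same approach can be extended as in the sketch below. The sample markup is hypothetical and stands in for one of the .htm files; copying the matches into an array before removing them is a cautious choice, not something the question's code requires.

```php
<?php
// Hypothetical sample markup standing in for one of the .htm files.
$html = <<<END
<html>
    <body>
        <table>
            <tr>
                <td class="hide">menu</td>
                <td>content</td>
            </tr>
        </table>
    </body>
</html>
END;

$dom = new DOMDocument;
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Copy the matches into an array first so removal cannot disturb the iteration.
$matches = iterator_to_array($xpath->query('//td[@class="hide"]'));
foreach ($matches as $node) {
    // A node is removed through its parent.
    $node->parentNode->removeChild($node);
}

// Re-run the query to check how many matching cells are left.
$remaining = $xpath->query('//td[@class="hide"]')->length;
echo $remaining, "\n";

// Serialize the modified document back to an HTML string.
echo $dom->saveHTML();
```

To write the result back to disk, DOMDocument::saveHTMLFile() could be called with the file path instead of echoing the string.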

1 Comment

You're absolutely right about the var_dump. I've changed that now. I've also checked my code yet again, and I see no difference compared to your snippet. Thank you for your reply though. Unfortunately, I still get no output (the length of the node list is int(0)).
3

Maybe you have more than one class in the class attribute of your td element:

<td class="hide anotherclass">

So '//td[@class="hide"]' would only match:

<td class="hide">

Try it like this to see if it contains the hide class you are looking for:

$nodeList = $xpath->query('//td[contains(@class,"hide")]');

Check out this blog post: XPath: Select element by class
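One caveat with contains(@class,"hide") is that it is a plain substring test, so it would also match a class such as "hidden". A commonly used XPath 1.0 idiom pads the attribute with spaces and looks for the whole token. The sketch below compares the three expressions on made-up sample markup:

```php
<?php
// Hypothetical sample: one exact match, one multi-class match, one look-alike.
$html = <<<END
<html>
    <body>
        <table>
            <tr>
                <td class="hide">a</td>
                <td class="hide anotherclass">b</td>
                <td class="hidden">c</td>
            </tr>
        </table>
    </body>
</html>
END;

$dom = new DOMDocument;
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Exact attribute value: only class="hide".
$exact = $xpath->query('//td[@class="hide"]');

// Substring test: also matches "hidden".
$substring = $xpath->query('//td[contains(@class, "hide")]');

// Whole-token test: pad with spaces and search for " hide ".
$token = $xpath->query(
    '//td[contains(concat(" ", normalize-space(@class), " "), " hide ")]'
);

echo $exact->length, "\n";     // 1
echo $substring->length, "\n"; // 3
echo $token->length, "\n";     // 2
```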

1 Comment

Good advice. I double-checked this, but "hide" really is the only class the TD has. Still, thank you, interesting reply.
