I'm trying to parse a well-formed XHTML document, but I'm having problems iterating over its nodes.
My XHTML has a structure like this:

<?xml version="1.0" encoding="UTF-8"?>
<html>
  <head>...</head>
  <body>
   ...
    <form>
    ...
      <div class="AB">    (1 or 2 times)
      ...                       
        <div class="CD">  
        ...
          <table>          
             <tbody>
                <tr>    (1 to N times)
                   <td> XXX </td>
                   <td> YYY </td> ...

The information I need is in the columns (td). Each row (tr) holds, in its columns, the data needed to construct one object, so I want to construct N objects per table.
There are 1 or 2 divs with class="AB", so I'll end up with 1 or 2 AB objects, each containing a list of the objects created from the rows of its table.

First I extract a NodeList of these AB divs:

NodeList ABlist = (NodeList) xpath.evaluate("//div[@class='AB']", document, XPathConstants.NODESET);

Now I'm trying to get a NodeList of all the tr elements inside the first AB div:

NodeList trList = (NodeList) xpath.evaluate("/div/table//tr", ABlist.item(0), XPathConstants.NODESET);

In this case the trList is empty. Do you know what's wrong with my code?
Thank you

2 Answers

The problem with your second (failing) XPath is that it starts with a /:

/div/table//tr

In XPath, just as in file paths, starting a path with a / means "start from the root of the document". But that isn't what you want here: you want to start from your context node. So:

div/table//tr

will do what you want.
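
To illustrate the difference, here is a minimal, self-contained sketch. The sample markup and the names RelativeXPathDemo, SAMPLE and countFromFirstAb are made up for this example; only the overall structure follows the question. It evaluates several expressions with the first AB div as the context node:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class RelativeXPathDemo {

    // Hypothetical sample: two AB divs, with 2 rows and 1 row respectively.
    public static final String SAMPLE =
        "<html><body><form>"
      + "<div class='AB'><div class='CD'><table><tbody>"
      + "<tr><td>a1</td><td>a2</td></tr>"
      + "<tr><td>b1</td><td>b2</td></tr>"
      + "</tbody></table></div></div>"
      + "<div class='AB'><div class='CD'><table><tbody>"
      + "<tr><td>c1</td><td>c2</td></tr>"
      + "</tbody></table></div></div>"
      + "</form></body></html>";

    /** Counts the nodes the expression selects, using the first AB div as context node. */
    public static int countFromFirstAb(String expression) {
        try {
            Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(SAMPLE)));
            XPath xpath = XPathFactory.newInstance().newXPath();

            // Same first step as in the question: collect the AB divs.
            NodeList abList = (NodeList) xpath.evaluate(
                "//div[@class='AB']", document, XPathConstants.NODESET);
            Node firstAb = abList.item(0);

            // The second argument is the context node for the expression.
            NodeList rows = (NodeList) xpath.evaluate(
                expression, firstAb, XPathConstants.NODESET);
            return rows.getLength();
        } catch (Exception e) {
            // Checked exceptions are wrapped to keep the sketch short.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countFromFirstAb("/div/table//tr")); // absolute: starts at the document root
        System.out.println(countFromFirstAb("div/table//tr"));  // relative: starts at the AB div
        System.out.println(countFromFirstAb(".//tr"));          // any tr below the AB div
        System.out.println(countFromFirstAb("//tr"));           // any tr in the whole document
    }
}
```

On this sample the four calls should print 0, 2, 2 and 3. Note that a leading // is anchored at the document root as well, which is why //tr also picks up the row from the second AB div. If your real markup has extra wrapper elements between the AB div and the table, .//tr is the more forgiving choice.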

2 Comments

You're right, Pavel! I thought that, as the 2nd parameter, I was passing the 'context' to the evaluate() method. I think I tried it without the / before posting here, but maybe I had also changed something else in the meantime, so it didn't work at the time. Anyway, it's working now. Thanks a lot for your help!
You are passing the context there. The problem is that by using a leading / in the query, you're telling it to start the path not from the context node, but from the root of the document to which the node belongs.
Are you sure this is XHTML? There's no namespace declared in your sample document, and without that namespace, it's not XHTML. If there is a namespace, and you left it out of your sample for brevity, then your XPath expressions need to reference the namespace as well, otherwise they won't select anything.
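
As a rough sketch of what that would look like: the sample document, the class name XhtmlNamespaceDemo, the count helper and the prefix x are all invented for this example, and it assumes the parser has been made namespace-aware with setNamespaceAware(true) (DocumentBuilderFactory is not namespace-aware by default).

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XhtmlNamespaceDemo {

    public static final String XHTML_NS = "http://www.w3.org/1999/xhtml";

    // Hypothetical document that does declare the XHTML namespace.
    public static final String SAMPLE =
        "<html xmlns='http://www.w3.org/1999/xhtml'><body>"
      + "<div class='AB'/><div class='AB'/>"
      + "</body></html>";

    /** Counts the nodes the expression selects in the namespace-aware sample document. */
    public static int count(String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // assumption: namespace-aware parsing
            Document document = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(SAMPLE)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            // Bind the (arbitrary) prefix "x" to the XHTML namespace URI.
            xpath.setNamespaceContext(new NamespaceContext() {
                @Override
                public String getNamespaceURI(String prefix) {
                    if (prefix == null) {
                        throw new IllegalArgumentException("prefix must not be null");
                    }
                    return "x".equals(prefix) ? XHTML_NS : XMLConstants.NULL_NS_URI;
                }

                @Override
                public String getPrefix(String namespaceURI) {
                    return XHTML_NS.equals(namespaceURI) ? "x" : null;
                }

                @Override
                public Iterator<String> getPrefixes(String namespaceURI) {
                    return XHTML_NS.equals(namespaceURI)
                        ? Collections.singletonList("x").iterator()
                        : Collections.<String>emptyList().iterator();
                }
            });

            NodeList nodes = (NodeList) xpath.evaluate(
                expression, document, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            // Checked exceptions are wrapped to keep the sketch short.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(count("//div[@class='AB']"));   // unprefixed name: no namespace
        System.out.println(count("//x:div[@class='AB']")); // prefixed name: XHTML namespace
    }
}
```

With these assumptions the unprefixed query should print 0 and the prefixed one 2, because in XPath 1.0 an unprefixed element name only matches elements that are in no namespace. The class attribute needs no prefix, since unprefixed attributes are not in any namespace.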

1 Comment

Hi skaffman, I'm correctly retrieving the ABlist of divs. It's only the way I try to extract the trList that is not working. Actually, you're right: the document doesn't specify any namespace, so maybe it can only be called XML. It just conforms to the XML spec without specifying any namespace.
