I am trying to parse a COLLADA (.dae) file in Java using the plain DOM parser. When I try to get an element's value, it returns null. I tried the answers and suggestions from other discussions, but without success. The code I used is below.

for(int k1=0;k1<meshlist.getLength();k1++) {
    Element geometryItr1 = (Element)geometrylist.item(k);

    NodeList trianglelist = geometryItr1.getElementsByTagName("triangles");

    //System.out.println("Triangles length is " + trianglelist.getLength());

    for(int o=0;o<trianglelist.getLength();o++) {

        Element trichildnodes = (Element) trianglelist.item(o);
        NodeList inputs = trichildnodes.getElementsByTagName("input");
        NodeList p = trichildnodes.getElementsByTagName("p");
        Element ppp = (Element) p.item(0);
        System.out.println("Node Value " + ppp.getNodeValue());
        System.out.println(inputs.getLength() + "Input length");

        for(int in=0;in<inputs.getLength();in++) {

            Element inn = (Element) inputs.item(in);
            System.out.println(inn.getAttribute("semantic") + " " + inn.getAttribute("source") + " Attributes");

        }

        //System.out.println(p.getLength() +  " P's length" );
        //System.out.println("P's content " + ppp.getFirstChild().getNodeValue());

    }
}

The XML is very large, so I am posting only the part I am trying to parse.

<mesh>
  <source> </source>
  <source> </source>
  <source> </source>
  <triangles>
    <input />
    <input />
    <input />
    <p> 24 262 2 72 72 72 72 2222 8198219  </p>
  <triangles>
  <triangles>
    <input />
    <input />
    <input />
    <p> 24 262 2 72 72 72 72 2222 8198219  </p>
  <triangles>
  <triangles>
    <input />
    <input />
    <input />
    <p> 24 262 2 72 72 72 72 2222 8198219  </p>
  <triangles>
  <triangles>
    <input />
    <input />
    <input />
    <p> 24 262 2 72 72 72 72 2222 8198219  </p>
  <triangles>
</mesh>

I am trying to get the value of <p>. Everything works fine except getting p's value. When I debug, I can see the values, and they are associated with the first child. I even tried using firstChild, without success. How can I get the value of p?

When I use getTextContent instead I get the output like below:

NodeValue null
NodeValue 24 262 2 72 72 72 72 2222 8198219
NodeValue null

The output is blank for two tags.

3 Answers

3

I would recommend using the javax.xml.xpath APIs available in the JDK/JRE since Java SE 5 to make the processing of your XML document easier:

package forum11688757;

import java.io.File;
import javax.xml.parsers.*;
import javax.xml.xpath.*;
import org.w3c.dom.*;

public class Demo {

    public static void main(String[] args) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        DocumentBuilder db = dbf.newDocumentBuilder();
        Document document = db.parse(new File("src/forum11688757/input.xml"));

        XPathFactory xpf = XPathFactory.newInstance();
        XPath xpath = xpf.newXPath();
        NodeList nodeList = (NodeList) xpath.evaluate("/mesh/triangles/p", document, XPathConstants.NODESET);
        for(int x=0; x<nodeList.getLength(); x++) {
            System.out.println(nodeList.item(x).getTextContent());
        }
    }

}

input.xml

<mesh>
  <source> </source>
  <source> </source>
  <source> </source>
  <triangles>
    <input />
    <input />
    <input />
    <p> 24 262 2 72 72 72 72 2222 8198219  </p>
  </triangles>
  <triangles>
    <input />
    <input />
    <input />
    <p> 24 262 2 72 72 72 72 2222 8198219  </p>
  </triangles>
  <triangles>
    <input />
    <input />
    <input />
    <p> 24 262 2 72 72 72 72 2222 8198219  </p>
  </triangles>
  <triangles>
    <input />
    <input />
    <input />
    <p> 24 262 2 72 72 72 72 2222 8198219  </p>
  </triangles>
</mesh>

Output

 24 262 2 72 72 72 72 2222 8198219  
 24 262 2 72 72 72 72 2222 8198219  
 24 262 2 72 72 72 72 2222 8198219  
 24 262 2 72 72 72 72 2222 8198219  

UPDATE

You could also get the p elements using the following line of code. You need to be careful, though, since it will get all p elements in the document, not just those on the /mesh/triangles/p path:

NodeList nodeList = document.getElementsByTagName("p");

The following approach will always get you the data you are looking for, even if p elements are later added somewhere else in the document:

NodeList nodeList = (NodeList) xpath.evaluate("/mesh/triangles/p", document, XPathConstants.NODESET);
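
To make the difference between the two lookups concrete, here is a minimal sketch. The class name PLookupDemo, its helper methods, and the extra/note elements in the sample document are made up for illustration and are not part of the question's file; the sketch only assumes a document where a p element also appears outside /mesh/triangles.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

class PLookupDemo {

    // Hypothetical helper: parse an XML string into a DOM Document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Counts every <p> in the document, wherever it sits.
    static int countByTagName(Document doc) {
        return doc.getElementsByTagName("p").getLength();
    }

    // Counts only the <p> elements that are children of /mesh/triangles.
    static int countByXPath(Document doc) {
        try {
            NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate("/mesh/triangles/p", doc, XPathConstants.NODESET);
            return hits.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        // Sample document (made up): one <p> under triangles, one elsewhere.
        Document doc = parse(
                "<mesh>"
                + "<triangles><p>1 2 3</p></triangles>"
                + "<extra><note><p>not index data</p></note></extra>"
                + "</mesh>");
        System.out.println("getElementsByTagName: " + countByTagName(doc));
        System.out.println("XPath /mesh/triangles/p: " + countByXPath(doc));
    }
}
```

In this sample the tag-name lookup also picks up the stray p under extra/note, while the XPath expression only matches the one under triangles.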

1

The getNodeValue() of an Element is documented as being null.

Instead, you probably want to call getTextContent(). But beware that it has its own idiosyncrasies (if you call it on the root of a tree, it will concatenate the text of all elements in the tree).
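
A small sketch of that behaviour, assuming a cut-down document with one filled and one empty p element. The class name NodeValueDemo and its helper methods are hypothetical and only serve the illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class NodeValueDemo {

    // Hypothetical helper: parse an XML string into a DOM Document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Hypothetical helper: the n-th <p> element of the document.
    static Element p(Document doc, int n) {
        return (Element) doc.getElementsByTagName("p").item(n);
    }

    public static void main(String[] args) {
        Document doc = parse("<triangles><input/><p> 24 262 2 </p><p/></triangles>");
        Element full = p(doc, 0);
        Element empty = p(doc, 1);

        // An Element has no node value of its own.
        System.out.println(full.getNodeValue() == null);
        // The text lives in a child Text node ...
        System.out.println("[" + full.getFirstChild().getNodeValue() + "]");
        // ... and getTextContent() concatenates the text of all descendants.
        System.out.println("[" + full.getTextContent() + "]");
        // An empty <p/> has no child, so getFirstChild() is null,
        // while getTextContent() returns an empty string.
        System.out.println(empty.getFirstChild() == null);
        System.out.println(empty.getTextContent().isEmpty());
        // Called on the root, getTextContent() gathers the text of the whole tree.
        System.out.println("[" + doc.getDocumentElement().getTextContent() + "]");
    }
}
```

If a p element in the real file is empty, getFirstChild() on it returns null, which would explain missing values when the code dereferences the first child.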

4 Comments

I have tried getTextContent(), but it didn't work. It prints some not all.
"Some not all" is not useful. Show exact output and what you expect for output.
That isn't the exact output from the program you show; for one thing, there should be output related to the "input" elements. If it is in fact output from the same line of code, then take a look at your data: you will find empty "p" elements.
And as a general comment: when you're creating debugging output, be as specific as possible. Rather than saying "NodeValue", you should say something like "<P> content". Even better, keep count of the number of times you've written this, and write "<p> content #1234".
1

You don't have to iterate over the enclosing nodes if you don't need them. For example, here is how to print the text content of all <p> elements:

    File xmlPath = new File("test.xml");

    DocumentBuilderFactory fabrique = DocumentBuilderFactory.newInstance();
    // Merge CDATA sections into the adjacent text nodes.
    fabrique.setCoalescing(true);
    // Only takes effect when the parser is validating.
    fabrique.setIgnoringElementContentWhitespace(true);

    DocumentBuilder constructeur = fabrique.newDocumentBuilder();

    Document document = constructeur.parse(xmlPath);
    document.setXmlVersion("1.0");
    Element racine = document.getDocumentElement();
    // All <p> descendants of the root element, in document order.
    NodeList liste = racine.getElementsByTagName("p");

    for(int i=0; i<liste.getLength(); i++) {
        Element e = (Element)liste.item(i);
        // Assumes every <p> has a child node; for an empty <p/>, getFirstChild() returns null.
        System.out.println(e.getFirstChild().getTextContent());
    }

You can use that as a starting point and build on it to obtain what you want. If you want an attribute value, just use e.getAttribute("att_name").
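
As a rough sketch of the attribute lookup, the following reads the semantic and source attributes that the question's code asks for. The class name InputAttributesDemo, its helpers, and the attribute values in the sample string are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class InputAttributesDemo {

    // Hypothetical helper: parse an XML string into a DOM Document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Returns one "semantic -> source" line per <input> element.
    static List<String> describeInputs(String xml) {
        NodeList inputs = parse(xml).getElementsByTagName("input");
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < inputs.getLength(); i++) {
            Element e = (Element) inputs.item(i);
            // getAttribute() returns "" for a missing attribute, so
            // hasAttribute() is used to tell "missing" from "empty".
            String source = e.hasAttribute("source") ? e.getAttribute("source") : "(none)";
            lines.add(e.getAttribute("semantic") + " -> " + source);
        }
        return lines;
    }

    public static void main(String[] args) {
        // Sample data (made up): the second input has no source attribute.
        String xml = "<triangles>"
                + "<input semantic=\"VERTEX\" source=\"#verts\"/>"
                + "<input semantic=\"NORMAL\"/>"
                + "<p> 24 262 2 </p>"
                + "</triangles>";
        describeInputs(xml).forEach(System.out::println);
    }
}
```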

2 Comments

Thanks, but it doesn't read the first and third <p>; it displays nothing for them.
@VinayKumarjg I used the XML example you provided and it displays the content of all 4 <p> elements correctly. I only replaced the end <triangles> tags with </triangles>. If you are missing entries, it's probably because you kept your nested loop or because the XML file is not as you describe it in the example.
