Null value returned with XML parsing in Java using plain DOM parser

Question

I was trying to parse a COLLADA (.dae) file in Java using the plain DOM parser. When I try to get a value, it returns null. I tried the answers and suggestions from other discussions, without success. The code I used is below.

for(int k1=0;k1<meshlist.getLength();k1++) {
    Element geometryItr1 = (Element)geometrylist.item(k);

    NodeList trianglelist = geometryItr1.getElementsByTagName("triangles");

    //System.out.println("Triangles length is " + trianglelist.getLength());

    for(int o=0;o<trianglelist.getLength();o++) {

        Element trichildnodes = (Element) trianglelist.item(o);
        NodeList inputs = trichildnodes.getElementsByTagName("input");
        NodeList p = trichildnodes.getElementsByTagName("p");
        Element ppp = (Element) p.item(0);
        System.out.println("Node Value " + ppp.getNodeValue());
        System.out.println(inputs.getLength() + "Input length");

        for(int in=0;in<inputs.getLength();in++) {

            Element inn = (Element) inputs.item(in);
            System.out.println(inn.getAttribute("semantic") + " " + inn.getAttribute("source") + " Attributes");

        }

        //System.out.println(p.getLength() +  " P's length" );
        //System.out.println("P's content " + ppp.getFirstChild().getNodeValue());

    }
}

The XML is very large and I am posting a part which I was trying to parse.

<mesh>
  <source> </source>
  <source> </source>
  <source> </source>
  <triangles>
    <input />
    <input />
    <input />
    <p> 24 262 2 72 72 72 72 2222 8198219  </p>
  <triangles>
  <triangles>
    <input />
    <input />
    <input />
    <p> 24 262 2 72 72 72 72 2222 8198219  </p>
  <triangles>
  <triangles>
    <input />
    <input />
    <input />
    <p> 24 262 2 72 72 72 72 2222 8198219  </p>
  <triangles>
  <triangles>
    <input />
    <input />
    <input />
    <p> 24 262 2 72 72 72 72 2222 8198219  </p>
  <triangles>
</mesh>

I was trying to get the value of <p>. Everything works fine except getting p's value. When I debug I can see the value; it is associated with the first child. I even tried using firstChild. I am completely lost trying to find a solution. Can someone please help me find out how to get the value of <p>?

When I use getTextContent instead I get the output like below:

NodeValue null
NodeValue 24 262 2 72 72 72 72 2222 8198219
NodeValue null

The output is blank for two tags.

bdoughan · Accepted Answer · 2012-07-27 15:49:37Z

I would recommend using the javax.xml.xpath APIs available in the JDK/JRE since Java SE 5 to make the processing of your XML document easier:

package forum11688757;

import java.io.File;
import javax.xml.parsers.*;
import javax.xml.xpath.*;
import org.w3c.dom.*;

public class Demo {

    public static void main(String[] args) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        DocumentBuilder db = dbf.newDocumentBuilder();
        Document document = db.parse(new File("src/forum11688757/input.xml"));

        XPathFactory xpf = XPathFactory.newInstance();
        XPath xpath = xpf.newXPath();
        NodeList nodeList = (NodeList) xpath.evaluate("/mesh/triangles/p", document, XPathConstants.NODESET);
        for(int x=0; x<nodeList.getLength(); x++) {
            System.out.println(nodeList.item(x).getTextContent());
        }
    }

}

input.xml

<mesh>
  <source> </source>
  <source> </source>
  <source> </source>
  <triangles>
    <input />
    <input />
    <input />
    <p> 24 262 2 72 72 72 72 2222 8198219  </p>
  </triangles>
  <triangles>
    <input />
    <input />
    <input />
    <p> 24 262 2 72 72 72 72 2222 8198219  </p>
  </triangles>
  <triangles>
    <input />
    <input />
    <input />
    <p> 24 262 2 72 72 72 72 2222 8198219  </p>
  </triangles>
  <triangles>
    <input />
    <input />
    <input />
    <p> 24 262 2 72 72 72 72 2222 8198219  </p>
  </triangles>
</mesh>

Output

 24 262 2 72 72 72 72 2222 8198219  
 24 262 2 72 72 72 72 2222 8198219  
 24 262 2 72 72 72 72 2222 8198219  
 24 262 2 72 72 72 72 2222 8198219

UPDATE

You could also get the p elements using the following line of code. Be careful, though, since it will get all p elements, not just those on the /mesh/triangles/p path.

NodeList nodeList = document.getElementsByTagName("p");

The following approach will always get you the data you are looking for, even if p elements are later added somewhere else in the document.

NodeList nodeList = (NodeList) xpath.evaluate("/mesh/triangles/p", document, XPathConstants.NODESET);
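To make the difference between the two lookups concrete, here is a minimal sketch. The class name PathVsTagName, the helper methods, and the inline XML (with a stray p element under a made-up extra element) are assumptions for illustration only; they are not part of the original input.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

class PathVsTagName {

    // Hypothetical input: one <p> under /mesh/triangles and one stray <p> under <extra>.
    static final String XML =
        "<mesh><triangles><p>1 2 3</p></triangles><extra><p>not a triangle</p></extra></mesh>";

    // Parses an XML string into a DOM Document (helper for this sketch only).
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    // Counts every <p> element anywhere in the document.
    static int countByTagName(Document doc) {
        return doc.getElementsByTagName("p").getLength();
    }

    // Counts only the <p> elements on the /mesh/triangles/p path.
    static int countByXPath(Document doc) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate("/mesh/triangles/p", doc, XPathConstants.NODESET);
        return nodes.getLength();
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(XML);
        System.out.println("getElementsByTagName: " + countByTagName(doc)); // 2
        System.out.println("XPath: " + countByXPath(doc));                  // 1
    }
}
```

With this input, the tag-name lookup also picks up the stray element, while the XPath expression only matches the one under triangles.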

parsifal · Accepted Answer · 2012-07-27 13:52:14Z

The getNodeValue() of an Element is documented as being null.

Instead, you probably want to call getTextContent(). But beware that it has its own idiosyncrasies (if you call it on the root of a tree, it will concatenate the text of all elements in the tree).
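As a rough illustration of the point above, the sketch below prints what each call returns for a single p element. The class name NodeValueDemo, the parse helper, and the shortened inline XML are assumptions made for this example, not taken from the question.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class NodeValueDemo {

    // Parses an XML string into a DOM Document (helper for this sketch only).
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<triangles><input/><p> 24 262 2 </p></triangles>");
        Element p = (Element) doc.getElementsByTagName("p").item(0);

        // An Element node has no value of its own, so this prints null.
        System.out.println("getNodeValue(): " + p.getNodeValue());

        // The text lives in a child Text node, whose node value is the text itself.
        System.out.println("first child: '" + p.getFirstChild().getNodeValue() + "'");

        // getTextContent() concatenates the text of all descendants.
        System.out.println("getTextContent(): '" + p.getTextContent() + "'");

        // Called on the root, it concatenates the text of the whole tree.
        System.out.println("root: '" + doc.getDocumentElement().getTextContent() + "'");
    }
}
```

In this small document the root's text content happens to equal the p text, because p is the only element that contains any text.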

4 Comments

Vinay Over a year ago

I have tried getTextContent() but it didn't work. It prints some not all.

parsifal Over a year ago

"Some not all" is not useful. Show exact output and what you expect for output.

parsifal Over a year ago

That isn't the exact output from the program you show; for one thing, there should be output related to the "input" elements. If it is in fact output from the same line of code, then take a look at your data: you will find empty "p" elements.

parsifal Over a year ago

And as a general comment: when you're creating debugging output, be as specific as possible. Rather than saying "NodeValue", you should say something like "<p> content". Even better, keep count of the number of times you've written this, and write "<p> content #1234".

alain.janinm · Accepted Answer · 2012-07-27 14:23:09Z

You don't have to iterate over the previous nodes if you don't need them. For example, this is how to print all the text content in <p> tags:

    File xmlPath = new File("test.xml");

    DocumentBuilderFactory fabrique = DocumentBuilderFactory.newInstance();
    fabrique.setCoalescing(true);
    fabrique.setIgnoringElementContentWhitespace(true);

    DocumentBuilder constructeur = fabrique.newDocumentBuilder();

    Document document = constructeur.parse(xmlPath);  
    document.setXmlVersion("1.0");
    Element racine = document.getDocumentElement();
    NodeList liste = racine.getElementsByTagName("p");

    for(int i=0; i<liste.getLength(); i++) {
        Element e = (Element)liste.item(i);  
        System.out.println(e.getFirstChild().getTextContent());
    }

You can use that and elaborate to obtain what you want I guess. If you want the attribute value, just use: e.getAttribute("att_name").
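If some p elements in the real file turn out to be empty, getFirstChild() returns null for them and the loop above would throw a NullPointerException. A possible variant, sketched here under that assumption, calls getTextContent() on the element itself, which yields an empty string for an element with no children. The class name SafePText, the readP helper, and the inline XML are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

class SafePText {

    // Parses an XML string into a DOM Document (helper for this sketch only).
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    // Collects the trimmed text of every <p>, tolerating empty elements.
    static List<String> readP(Document doc) {
        List<String> out = new ArrayList<>();
        NodeList list = doc.getDocumentElement().getElementsByTagName("p");
        for (int i = 0; i < list.getLength(); i++) {
            // getTextContent() is "" for an element without children,
            // whereas getFirstChild() would be null in that case.
            out.add(list.item(i).getTextContent().trim());
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical input with one empty and one populated <p>.
        Document doc = parse(
            "<mesh><triangles><p/></triangles><triangles><p> 24 262 </p></triangles></mesh>");
        List<String> values = readP(doc);
        for (int i = 0; i < values.size(); i++) {
            System.out.println("<p> content #" + i + ": '" + values.get(i) + "'");
        }
    }
}
```

Numbering each line of output, as suggested in the comments on another answer, makes it easier to see which p elements are actually empty in the source file.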

2 Comments

Vinay Over a year ago

Thanks. But it doesn't read the first and third; it displays nothing for them.

alain.janinm Over a year ago

@VinayKumarjg I used the XML example you provided and it displays the content of all 4 <p> elements correctly. I just replaced the end <triangles> tags with </triangles>. If you are missing entries, it's probably because you have kept your nested loop or because the XML file is not as you describe it in the example.
