
I have an XML file, "sample.xml", containing 4 records.

    <?xml version='1.0' encoding='UTF-8'?>
<hello xmlns:show="http://www.example.com" xmlns:css="http://www.example.com" xml_version="2.0">


  <entry id="2008-0001">
    <show:id>2008-0001</show:id>
    <show:published-datetime>2008-01-15T15:00:00.000-05:00</show:published-datetime>
    <show:last-modified-datetime>2012-03-19T00:00:00.000-04:00</show:last-modified-datetime>
    <show:css>
      <css:metrics>
        <css:score>3.6</css:score>
        <css:access-vector>LOCAL</css:access-vector>
        <css:authentication>NONE</css:authentication>
        <css:generated-on-datetime>2008-01-15T15:22:00.000-05:00</css:generated-on-datetime>
      </css:metrics>
    </show:css>
    <show:summary>This is first entry.</show:summary>
  </entry>
  <entry id="2008-0002">
    <show:id>2008-0002</show:id>
    <show:published-datetime>2008-02-11T20:00:00.000-05:00</show:published-datetime>
    <show:last-modified-datetime>2014-03-15T23:22:37.303-04:00</show:last-modified-datetime>
    <show:css>
      <css:metrics>
        <css:score>5.8</css:score>
        <css:access-vector>NETWORK</css:access-vector>
        <css:authentication>NONE</css:authentication>
        <css:generated-on-datetime>2008-02-12T10:12:00.000-05:00</css:generated-on-datetime>
      </css:metrics>
    </show:css>
    <show:summary>This is second entry.</show:summary>
  </entry>

  <entry id="2008-0003">
    <show:id>2008-0003</show:id>
    <show:published-datetime>2009-03-26T06:12:08.780-04:00</show:published-datetime>
    <show:last-modified-datetime>2009-03-26T06:12:09.313-04:00</show:last-modified-datetime>
    <show:summary>This is 3rd entry with missing "css" tag and their metrics.</show:summary>
  </entry>

  <entry id="2008-0004">
    <show:id>CVE-2008-0004</show:id>
    <show:published-datetime>2008-01-11T19:46:00.000-05:00</show:published-datetime>
    <show:last-modified-datetime>2011-09-06T22:41:45.753-04:00</show:last-modified-datetime>
    <show:css>
      <css:metrics>
        <css:score>4.3</css:score>
        <css:access-vector>NETWORK</css:access-vector>
        <css:authentication>NONE</css:authentication>
        <css:generated-on-datetime>2008-01-14T09:37:00.000-05:00</css:generated-on-datetime>
      </css:metrics>
    </show:css>
    <show:summary>This is 4th entry.</show:summary>
  </entry>
</hello>

and a Java file, "Test.java":

    import java.io.File;
    import java.util.ArrayList;
    import java.util.List;

    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;
    import javax.xml.xpath.XPath;
    import javax.xml.xpath.XPathConstants;
    import javax.xml.xpath.XPathExpression;
    import javax.xml.xpath.XPathFactory;

    import org.w3c.dom.Document;
    import org.w3c.dom.Node;
    import org.w3c.dom.NodeList;

public class Test {

    public static void main(String[] args) {



        List<String> list = new ArrayList<String>();


        File fXmlFile = new File("/home/ankit/sample.xml");

        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();

        try
        {
            DocumentBuilder dBuilder = factory.newDocumentBuilder();

            Document doc = dBuilder.parse(fXmlFile);

            doc.getDocumentElement().normalize();

            NodeList nList = doc.getElementsByTagName("entry");

            XPathFactory xPathfactory = XPathFactory.newInstance();

            XPath xpath = xPathfactory.newXPath();


            for (int i = 0; i < nList.getLength(); i++)
            {

                XPathExpression expr1 = xpath.compile("//hello/entry/css/metrics/score");

                NodeList nodeList1 = (NodeList) expr1.evaluate(doc, XPathConstants.NODESET);

                if(nodeList1.item(i)!=null)
                {
                    Node currentItem = nodeList1.item(i);

                    if(!currentItem.getTextContent().isEmpty())
                    {
                        list.add(currentItem.getTextContent());
                        }
                }
            }
        }
        catch(Exception e)
        {
            e.printStackTrace();
        }

        System.out.println("size----"+list.size());
        for(int i=0;i<list.size();i++)
        {
            System.out.println("list----"+list.get(i));
        }
    }
}

I need to read each <entry> element from the XML, and I am using XPath for that. The file has 4 <entry> elements, each of which normally contains a <show:css> element, but the 3rd <entry> has no <show:css>. I am collecting the score value under each css element into a list. When I run this Java code, the first 2 values are stored in the list, and the 3rd position holds the score from the 4th entry.

I want a list in which the first, second and fourth elements are "3.6", "5.8" and "4.3", and the 3rd element is an empty string or null. Instead I get only 3 elements, holding the values from entries 1, 2 and 4.

In other words, if the tag is not present, the list should hold an empty string "" at that position (here the 3rd), with the original value following at the 4th.

Current output - ["3.6", "5.8", "4.3"]

I expect - ["3.6", "5.8", "", "4.3"]

Could anyone please help me with how to do this?

  • XPath 1.0 doesn't return sets of strings, only sets of nodes. And it can't include a node that doesn't exist in a node set. I'd say your best bet is to select the set of entries, then iterate through that, selecting the scores for each of them one by one. Commented Mar 11, 2015 at 5:02
  • @JLRishe Suppose there were 1L entries; then it would be a performance issue, and it would also be more complex. Commented Mar 11, 2015 at 5:12
  • So? You're asking to do something that XPath doesn't do. And what does 1L mean? Commented Mar 11, 2015 at 5:22
  • Also note that you can't guarantee that XPath will work as you expect (or even at all) when you abuse namespaces like this. The particular XPath implementation provided by default in this version of Java does what you require, but your code is likely to break if you add a different XML parser or XPath library to the classpath. To play by the rules you need to parse with a namespace-aware XML parser, and declare a NamespaceContext that binds your namespaces to suitable prefixes. This is awkward in javax.xml.xpath; consider an alternative like Dom4J, which handles namespaces more cleanly. Commented Mar 12, 2015 at 10:08
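
The two suggestions in these comments (iterate over the entries one by one, and parse namespace-aware with a NamespaceContext) can be combined in plain javax.xml.xpath. The following is a minimal sketch, not the poster's code: the class name, the prefix "s" and the trimmed-down SAMPLE document are assumptions made for illustration. Only the namespace URI is taken from the question's XML, where both the show and css prefixes are bound to it.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NamespaceAwareScores {

    // Trimmed-down stand-in for sample.xml (assumption): the middle entry has no css element.
    static final String SAMPLE =
        "<hello xmlns:show='http://www.example.com' xmlns:css='http://www.example.com'>"
        + "<entry id='1'><show:css><css:metrics><css:score>3.6</css:score></css:metrics></show:css></entry>"
        + "<entry id='2'/>"
        + "<entry id='3'><show:css><css:metrics><css:score>4.3</css:score></css:metrics></show:css></entry>"
        + "</hello>";

    // Binds the hypothetical prefix "s" to the document's namespace URI.
    // The prefix used in XPath does not have to match the prefixes in the file.
    static final NamespaceContext CONTEXT = new NamespaceContext() {
        @Override public String getNamespaceURI(String prefix) {
            return "s".equals(prefix) ? "http://www.example.com" : XMLConstants.NULL_NS_URI;
        }
        // The reverse lookups are left unimplemented in this sketch.
        @Override public String getPrefix(String uri) { return null; }
        @Override public Iterator<String> getPrefixes(String uri) { return null; }
    };

    static List<String> scores(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // off by default
            Document doc = factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(CONTEXT);
            // hello and entry are in no namespace, so they take no prefix.
            XPathExpression entries = xpath.compile("/hello/entry");
            XPathExpression score = xpath.compile("s:css/s:metrics/s:score");

            List<String> result = new ArrayList<>();
            NodeList nodes = (NodeList) entries.evaluate(doc, XPathConstants.NODESET);
            for (int i = 0; i < nodes.getLength(); i++) {
                // Evaluate relative to this entry; null means the entry has no score.
                Node found = (Node) score.evaluate(nodes.item(i), XPathConstants.NODE);
                result.add(found == null ? "" : found.getTextContent());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Could not extract scores", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(scores(SAMPLE));
    }
}
```

Both expressions are compiled once, outside the loop, so the per-entry cost is a single relative evaluation.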

3 Answers

2

There are probably a few ways to achieve this.

My basic approach is to select all the entry nodes, both those that have a css/metrics/score child node and those that don't. (You could simply select ALL the entry nodes, but this demonstrates the power of the query language.)

Something like...

XPathExpression expr1 = xPath.compile("//hello/entry[css/metrics/score or not(css/metrics/score)]");

I know the conditional expression is meaningless; I wanted the OP to see that they can add further conditions to the query as their requirements grow. Thank you all for pointing it out, even though I had already mentioned it. I hope we can all move on from it.

Then loop through the resulting NodeList and query each entry Node for the css/metrics/score node. If the result is null, add a null value to the list (or whatever else you want). For example...

DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
Document doc = dbf.newDocumentBuilder().parse(JavaApplication908.class.getResourceAsStream("/Hello.xml"));

XPathFactory xf = XPathFactory.newInstance();
XPath xPath = xf.newXPath();

XPathExpression expr1 = xPath.compile("//hello/entry[css/metrics/score or not(css/metrics/score)]");
XPathExpression expr2 = xPath.compile("css/metrics/score");

List<String> values = new ArrayList<>(25);

NodeList nodeList1 = (NodeList) expr1.evaluate(doc, XPathConstants.NODESET);
for (int index = 0; index < nodeList1.getLength(); index++) {
    Node node = nodeList1.item(index);
    System.out.println(node.getAttributes().getNamedItem("id"));

    Node css = (Node) expr2.evaluate(node, XPathConstants.NODE);
    if (css != null) {
        values.add(css.getTextContent());
    } else {
        values.add(null);
    }

}

for (String value : values) {
    System.out.println(value);
}

This outputs...

id="2008-0001"
id="2008-0002"
id="2008-0003"
id="2008-0004"
3.6
5.8
null
4.3

(The first four lines are the entry node ids, the last four are the resulting css/metrics/score values)
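
If an empty string is wanted instead of null, one variation on the loop above is to ask XPath for a string rather than a node: under XPath 1.0 rules the string value of an empty node-set is "", so the null check disappears. This is only a sketch under assumptions: the class name and the inline SAMPLE document are made up for illustration, and it parses namespace-aware and matches on local-name() rather than relying on unprefixed names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ScoreStrings {

    // Trimmed-down stand-in for the question's file (assumption): the middle entry has no css element.
    static final String SAMPLE =
        "<hello xmlns:show='http://www.example.com' xmlns:css='http://www.example.com'>"
        + "<entry id='1'><show:css><css:metrics><css:score>3.6</css:score></css:metrics></show:css></entry>"
        + "<entry id='2'/>"
        + "<entry id='3'><show:css><css:metrics><css:score>4.3</css:score></css:metrics></show:css></entry>"
        + "</hello>";

    static List<String> scores(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // so that local-name() is well defined
            Document doc = factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            XPathExpression entries = xpath.compile("/hello/entry");
            // Matches css/metrics/score by local name, whatever prefix the file uses.
            XPathExpression score = xpath.compile(
                "*[local-name()='css']/*[local-name()='metrics']/*[local-name()='score']");

            List<String> values = new ArrayList<>();
            NodeList nodes = (NodeList) entries.evaluate(doc, XPathConstants.NODESET);
            for (int i = 0; i < nodes.getLength(); i++) {
                // STRING yields the text of the first match, or "" when nothing matches.
                values.add((String) score.evaluate(nodes.item(i), XPathConstants.STRING));
            }
            return values;
        } catch (Exception e) {
            throw new IllegalStateException("Could not extract scores", e);
        }
    }

    public static void main(String[] args) {
        for (String value : scores(SAMPLE)) {
            System.out.println("[" + value + "]");
        }
    }
}
```

The trade-off is that a score element that exists but is empty becomes indistinguishable from a missing one, which matches what the question asks for.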


12 Comments

[css/metrics/score or not(css/metrics/score)]? That's completely meaningless.
@JLRishe As I said, you could just get the entry nodes themselves, but I wanted to demonstrate the use of the query language to select a node with a given child node. Yes, it's meaningless, but I'm not sure what other conditions might need to be met for an entry node to be considered valid. It's simply a demonstration that it's possible to query a node and its sub-nodes.
@MadProgrammer Thanks for the solution. It shows the expected result. This solution uses 2 expressions; is there any way to do the same with only 1 expression?
Interesting, does this work without declaring the namespaces? Precisely because your XPath expression is only there for demonstration purposes, make sure it is a meaningful one, e.g. simply //hello/entry.
@MathiasMüller Yes and maybe. I want the OP to know that they can add additional conditions to the query.
0

I am not an expert in XPath, but from looking at your code I think you are just missing a couple of lines:

if(nodeList1.item(i)!=null)
{
   Node currentItem = nodeList1.item(i);
   if(!currentItem.getTextContent().isEmpty())
   {
     list.add(currentItem.getTextContent());
   }
   else
     list.add("");
}
else
 list.add("");
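
One thing worth checking with this approach is where the padding ends up, because the flat node list from the question holds only the scores that exist and carries no information about which entry each came from. A small check, assuming a trimmed-down three-entry document and a hypothetical class name, and using local-name() with a namespace-aware parser in place of the question's unprefixed path:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class FlatListCheck {

    // Stand-in document (assumption): the middle entry has no css element.
    static final String SAMPLE =
        "<hello xmlns:show='http://www.example.com' xmlns:css='http://www.example.com'>"
        + "<entry id='1'><show:css><css:metrics><css:score>3.6</css:score></css:metrics></show:css></entry>"
        + "<entry id='2'/>"
        + "<entry id='3'><show:css><css:metrics><css:score>4.3</css:score></css:metrics></show:css></entry>"
        + "</hello>";

    static List<String> flatWithPadding(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList entries = (NodeList) xpath.evaluate("/hello/entry", doc, XPathConstants.NODESET);
            // One flat list over the whole document, as in the question.
            NodeList scores = (NodeList) xpath.evaluate(
                "/hello/entry//*[local-name()='score']", doc, XPathConstants.NODESET);

            List<String> list = new ArrayList<>();
            for (int i = 0; i < entries.getLength(); i++) {
                Node current = scores.item(i); // null once i runs past the end of the flat list
                if (current != null && !current.getTextContent().isEmpty()) {
                    list.add(current.getTextContent());
                } else {
                    list.add("");
                }
            }
            return list;
        } catch (Exception e) {
            throw new IllegalStateException("Could not extract scores", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(flatWithPadding(SAMPLE));
    }
}
```

With this input the empty string lands in the last position rather than the second, since the index into the flat list does not line up with the entry index once an entry is missing its score. Querying each entry separately avoids that.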


0

@MathiasMüller could you please let me know how it can be done in 1 expression in XPath 2.0. – ankit

The equivalent XPath 2.0 expression would be

for $x in //entry return (if ($x//*:score) then $x//*:score else '')

which makes heavy use of new constructs introduced in XPath 2.0. The output would then be

3.6
5.8
[Empty string]
4.3

But be aware that currently most XPath implementations support only 1.0, including the javax.xml.xpath implementation bundled with the JDK. Try this XPath 2.0 expression within an XSLT stylesheet online here, a site that uses Saxon 9.5 EE.
