
I have the following XML and I am trying to parse it:

<employee>
    <personal>
        <id>2D61EC47-0F56-5A33-6057-54DB0ABBDBF0</id>
        <name>Lareina</name>
        <age>50</age>
    </personal>
    <contact>
        <dept>Fusce</dept>
        <manager>CB9A0BB76</manager>
    </contact>
</employee>

However, I am not able to do so. My code is below. It does work for the simpler XML in the commented-out "xmlString" line (uncomment it to try).

import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.List;

// JDOM 1.x package names are assumed here; with JDOM 2 the package is org.jdom2.
import org.jdom.Document;
import org.jdom.Element;
import org.jdom.JDOMException;
import org.jdom.input.SAXBuilder;

public class XMLReader {
    public static void main(String[] args) throws JDOMException, IOException {

        //String xmlString = "<employee >\n <firstname xml:space=\"preserve\" >John</firstname>\n <lastname>Watson</lastname>\n <age>30</age>\n <email>[email protected]</email>\n</employee>";
        String xmlString = "<employee>\n" +
                "       <personal><id>2D61EC47-0F56-5A33-6057-54DB0ABBDBF0</id>\n" +
                "       <name>Lareina</name>\n" +
                "       <age>50</age>\n" +
                "       </personal><contact><dept>Fusce</dept>\n" +
                "       <manager>B55E6DA8-76BD-A3C8-2DDF-686CB9A0BB76</manager></contact>\n" +
                "   </employee>";
        System.out.println(xmlString);

        SAXBuilder builder = new SAXBuilder();
        Reader in = new StringReader(xmlString);

        Document doc = builder.build(in);
        Element root = doc.getRootElement();
        // getChildren() returns only the direct children of the root element.
        List children = root.getChildren();
        String value = "";
        for (int i = 0; i < children.size(); i++) {
            Element dataNode = (Element) children.get(i);
            // getText() returns only this element's own text, not that of nested elements.
            value += ", " + dataNode.getText().trim();
            System.out.println(dataNode.getName() + " : " + dataNode.getText());
        }
    }
}
  • I un-commented the code and it works fine for me. Commented Sep 26, 2013 at 22:28
  • @SotiriosDelimanolis: Which code? It works fine with the commented-out "xmlString", but does it work with the XML I have given? Commented Sep 26, 2013 at 22:29
  • Instead of parsing the XML manually, use JAXB or a similar POJO-XML marshaling library. It only takes a few lines of code to effortlessly convert between your Java objects and XML. Commented Sep 26, 2013 at 23:19

1 Answer


Your two XML strings are different. The first is

<employee>
    <firstname xml:space="preserve">John</firstname>
    <lastname>Watson</lastname>
    <age>30</age>
    <email>[email protected]</email>
</employee>

This has four children, each of which contains text, so it prints

firstname : John
lastname : Watson
age : 30
email : [email protected]

And the second is

<employee>
    <personal>
        <id>2D61EC47-0F56-5A33-6057-54DB0ABBDBF0</id>
        <name>Lareina</name>
        <age>50</age>
    </personal>
    <contact>
        <dept>Fusce</dept>
        <manager>B55E6DA8-76BD-A3C8-2DDF-686CB9A0BB76</manager>
    </contact>
</employee>

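In this one, the root has only two children, personal and contact, and neither has text of its own. They contain nested elements, and getText() returns just the whitespace between those nested tags. So you get output like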

personal : 



contact : 

This is the expected output.
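
To reach the actual values you have to descend one more level and read the text of the grandchildren. Here is a minimal sketch of that idea. Note that it uses the JDK's built-in DOM parser rather than JDOM so that it runs without extra libraries; the class name NestedXmlReader and the helpers childElements and leafValues are made up for illustration. With JDOM the same shape should work by calling getChildren() on each child of the root.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class (not from the question); uses the JDK DOM API, not JDOM.
class NestedXmlReader {

    // Returns the direct child elements of a node, skipping whitespace text nodes.
    static List<Element> childElements(Node parent) {
        List<Element> result = new ArrayList<>();
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                result.add((Element) n);
            }
        }
        return result;
    }

    // Builds one "group.leaf : text" line for every grandchild element of the root.
    static List<String> leafValues(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            List<String> lines = new ArrayList<>();
            for (Element group : childElements(doc.getDocumentElement())) {
                for (Element leaf : childElements(group)) {
                    lines.add(group.getTagName() + "." + leaf.getTagName()
                            + " : " + leaf.getTextContent().trim());
                }
            }
            return lines;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<employee>\n"
                + "  <personal><id>2D61EC47-0F56-5A33-6057-54DB0ABBDBF0</id>\n"
                + "  <name>Lareina</name>\n"
                + "  <age>50</age>\n"
                + "  </personal><contact><dept>Fusce</dept>\n"
                + "  <manager>B55E6DA8-76BD-A3C8-2DDF-686CB9A0BB76</manager></contact>\n"
                + "</employee>";
        for (String line : leafValues(xml)) {
            System.out.println(line);
        }
    }
}
```

This only handles exactly two levels of nesting, which matches the XML in the question. For arbitrary depth a recursive walk would be the more general choice.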


7 Comments

So I guess the question is: is there a way to get "<id>2D61EC47-0F56-5A33-6057-54DB0ABBDBF0</id> <name>Lareina</name> <age>50</age>" as the value for personal?
No, this is not HTML and there is no 'inner-xml' capability. After parsing there is only the element tree. Each node contains sub-nodes, which can be elements or text (and some other types like attributes, PIs). If you need to represent a subtree as serialized XML (i.e. the string you showed) you must serialize it yourself.
Of course. The Element class has a getChild(name) method. You can just call getChild("personal") on the root and iterate over its child elements. I suggest you use XPath to parse the XML.
@Jim Did I misunderstand the question? You can very easily get the elements within <personal>.
You can get each node individually, but there is nothing built-in that will produce a string as the OP requested in his first comment. To make that string he would have to serialize the subtree.
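
For what "serialize the subtree" could look like in practice, here is a rough sketch. It is an assumption-laden example, not the commenters' code: it uses the JDK's DOM parser and an identity Transformer instead of JDOM, and the class name InnerXml and method innerXml are invented for illustration. In JDOM itself, XMLOutputter.outputString(Element) should serve the same purpose for a single element.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class (not from the thread); uses the JDK DOM and transform APIs.
class InnerXml {

    // Serializes the child elements of the named direct child of the root,
    // joined by single spaces. Returns null if no such child exists.
    static String innerXml(String xml, String childName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            // Find the first direct child of the root with the requested name.
            Node target = null;
            for (Node n = doc.getDocumentElement().getFirstChild(); n != null; n = n.getNextSibling()) {
                if (n instanceof Element && n.getNodeName().equals(childName)) {
                    target = n;
                    break;
                }
            }
            if (target == null) {
                return null;
            }

            // An identity transformer copies a DOM node to text; omit the <?xml ...?> header.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

            StringBuilder out = new StringBuilder();
            for (Node n = target.getFirstChild(); n != null; n = n.getNextSibling()) {
                if (n instanceof Element) {
                    StringWriter writer = new StringWriter();
                    transformer.transform(new DOMSource(n), new StreamResult(writer));
                    if (out.length() > 0) {
                        out.append(' ');
                    }
                    out.append(writer.toString());
                }
            }
            return out.toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not serialize XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<employee><personal><id>2D61EC47-0F56-5A33-6057-54DB0ABBDBF0</id>"
                + "<name>Lareina</name><age>50</age></personal>"
                + "<contact><dept>Fusce</dept></contact></employee>";
        System.out.println(innerXml(xml, "personal"));
    }
}
```

The point is the same one made in the comments: the parser gives you a tree, and any string form of a subtree has to be produced by writing those nodes back out.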