4

I am reading some XML in a Java application and looking at the variety of XML parsing options. I don't intend to do anything more than traverse the structure and extract values from it. I need to use an API that is built into the Java API (1.5+) without any additional plugins. I don't need to create "events" or transform the XML into anything else. I am not producing XML, merely reading and extracting data. I am not enforcing a schema either.

Sun provides a list here, but it's not obvious which one I should use.

http://java.sun.com/developer/technicalArticles/xml/JavaTechandXML/

What would be the most appropriate XML API to use in this case? JAXP? JDOM? XPath?

4
  • JAXP, JDOM and XPath have not been updated for a while. I think for ease of use, vtd-xml may be worth looking into. Commented Mar 1, 2011 at 21:43
  • @vtd-xml-author - The JDK patch releases often contain updates to the JAXP implementations included. Don't confuse spec updates with implementation updates. Commented Mar 1, 2011 at 21:53
  • @All Thanks for all the rapid responses. I have narrowed it down to DOM or XPath. Sounds like DOM. Would XPath be any easier? Commented Mar 1, 2011 at 22:36
  • 1
    Check out the following question. It has two answers one demonstrates DOM the other XPath: stackoverflow.com/questions/3717215/… Commented Mar 2, 2011 at 0:19

9 Answers 9

5

I think using a DOM parser to parse the XML and load it into memory in a Document sounds sufficient for your needs.

You wouldn't use XPath in that case, just the Document API.
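As a rough illustration of that approach, here is a minimal sketch that uses only the JDK's built-in DOM parser. The sample document, its element names (`library`, `book`, `title`) and the `DomTitles` class are made up for the example; they are not from the question.

```java
import java.io.ByteArrayInputStream;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class DomTitles {
    // Hypothetical sample document; the element names are illustrative only.
    static final String SAMPLE =
        "<library><book id=\"1\"><title>Dune</title></book>"
      + "<book id=\"2\"><title>Emma</title></book></library>";

    // Parses the XML into an in-memory Document and collects "id: title" pairs.
    static List<String> titles(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new ByteArrayInputStream(xml.getBytes("UTF-8")));

        List<String> result = new ArrayList<String>();
        NodeList books = doc.getElementsByTagName("book");
        for (int i = 0; i < books.getLength(); i++) {
            Element book = (Element) books.item(i);
            // Assumes every <book> has at least one <title> child.
            Element title = (Element) book.getElementsByTagName("title").item(0);
            result.add(book.getAttribute("id") + ": " + title.getTextContent());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(titles(SAMPLE)); // prints [1: Dune, 2: Emma]
    }
}
```

Everything used here (`DocumentBuilderFactory`, `getElementsByTagName`, `getTextContent`) ships with Java 5, so no extra libraries are needed.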

JAXP is just a synonym for the XML parsing technology built into the JDK. The term JAXP (P is for Processing) distinguishes it from JAXB (B is for Binding).

Some 3rd party libraries built on top of DOM might make your life easier. Think about JDOM or DOM4J.


5 Comments

DOM sounds like what's required. I need to keep it standard within the JDK. No 3rd party solutions/plugins/addons, unfortunately. I only need to extract the data from the XML, so would XPath be overkill for this purpose?
If you parse XML into a DOM tree, you should use the API provided for you to access data.
Thanks... will give DOM a go.
A belated comment on this: the statement that "JAXP is just a synonym for the XML parsing technology build into the JDK" is absolutely not true. JAXP is an interface (API) supported by multiple XML parsers, and the parser built into the JDK is just one implementation of it. Moreover, the scope of JAXP extends beyond parsing - it also covers validation and transformation.
Probably true nine years after the question was first answered, but at the time it might not have been. I hope this late comment helps somebody.
2

The most classic way of doing this, IMO, would be a combination of JAXP and XPath. Java 5.0 includes JAXP 1.3, so this is standard stuff. Please see this answer to a similar question for a minimalist coding sample.
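For reference, a minimal sketch of the JAXP plus XPath combination using only `javax.xml.xpath` from the JDK. The sample document, the expressions and the `XPathTitles` class are assumptions made for illustration, not taken from the question.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathTitles {
    // Hypothetical sample document; the element names are illustrative only.
    static final String SAMPLE =
        "<library><book id=\"1\"><title>Dune</title></book>"
      + "<book id=\"2\"><title>Emma</title></book></library>";

    // Returns the text of the <title> for the book with the given id,
    // or an empty string if nothing matches.
    static String titleOf(String xml, String id) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Building the expression by concatenation is fine for a sketch,
        // but not for untrusted input.
        String expr = "/library/book[@id='" + id + "']/title";
        return xpath.evaluate(expr, new InputSource(new StringReader(xml)));
    }

    // Counts <book> elements by asking XPath for a node set.
    static int countBooks(String xml) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList books = (NodeList) xpath.evaluate(
            "/library/book", new InputSource(new StringReader(xml)), XPathConstants.NODESET);
        return books.getLength();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(titleOf(SAMPLE, "2")); // prints Emma
        System.out.println(countBooks(SAMPLE));   // prints 2
    }
}
```

Compared with walking the DOM by hand, the expression states the path to the value directly, which tends to be shorter when the values sit deep in the tree.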


1

A DOM parser is what you are looking for, I think. It is easy to implement and makes searching for nodes fast.


1

As the parsing strategy, you can use DOM, which has the advantage that the whole document is kept in memory and you can access it via XPath. I recommend this if you have small XML documents, or if you really NEED all the data to be present and accessible all the time, because it consumes a lot of heap space.

If you have bigger documents, or if you don't need access to all the data all the time, you should use either SAX or StAX (XML pull parsing), if the latter is available in your Java distribution. These methods are event based: they traverse the XML tree and make a kind of callback to a class defined by you, so you can react to events like "element xy starts" and "element xy ends".
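To make the callback idea concrete, here is a small SAX sketch using the parser bundled with the JDK. The sample document, the `title` element and the `SaxTitles` class are hypothetical and only serve the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxTitles {
    // Hypothetical sample document; the element names are illustrative only.
    static final String SAMPLE =
        "<library><book id=\"1\"><title>Dune</title></book>"
      + "<book id=\"2\"><title>Emma</title></book></library>";

    // Streams through the XML and collects the text of every <title> element.
    static List<String> titles(String xml) throws Exception {
        final List<String> result = new ArrayList<String>();

        DefaultHandler handler = new DefaultHandler() {
            private StringBuilder text; // non-null only while inside a <title>

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                if ("title".equals(qName)) {
                    text = new StringBuilder(); // "element title starts"
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Text may arrive in several chunks, so append rather than assign.
                if (text != null) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("title".equals(qName)) {
                    result.add(text.toString()); // "element title ends"
                    text = null;
                }
            }
        };

        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return result;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(titles(SAMPLE)); // prints [Dune, Emma]
    }
}
```

Nothing is kept in memory except what the handler chooses to store, which is the trade-off against DOM described above: lower heap use, but the state tracking is up to you.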

1 Comment

I think DOM Parser is the solution for the project. I found this example that demonstrates the complexity. mkyong.com/java/how-to-read-xml-file-in-java-dom-parser
1

Using the standard DOM Parser is good enough for your purpose. Try out this example.


1

I think that the most practical tool to use is XStream, from ThoughtWorks. Some modern MVC frameworks like VRaptor use it to serve and consume XML. Take a look at: http://x-stream.github.io/

3 Comments

This has absolutely nothing to do with the question. The question is about XML parsing; your answer refers to XML serialisation...
"I am not intending to do anything more than traverse the structure and extract values out of it." Indeed, my bad!
0

XOM.

Use xpath.

1 Comment

Can't use it unless it's built into the standard SDK 1.5+
0

If it is very trivial, do it with a SAX parser.

1 Comment

Thanks for the direction. But it seems that SAX uses events? That is already overkill for my purposes.
0

It seems that SAX is the API you want.

Google "SAX Parsing" and you will find many examples.
