0

I want to parse a large XML file (785 MB) and write the data to CSV. I get a Java heap space error (out of memory) when I try to parse the file. I tried increasing the heap size to 1024 MB, but even then the code could handle a file of at most 50 MB.

Please let me know a solution for parsing a large XML file in Java.

5
  • 2
    One thing that I've learned when parsing data from a file is not to cache it. Ensure that you're not pulling the entire file into an object. Commented Oct 16, 2014 at 14:08
  • 1
    Do you use a SAX or DOM parser? Commented Oct 16, 2014 at 14:09
  • If you use 32-bit Java, you won't be able to go above roughly 1.5 GB of heap space. Commented Oct 16, 2014 at 14:09
  • I am working on a 32-bit machine and using a DOM parser. Commented Oct 16, 2014 at 14:15
  • Use a SAX parser. It takes a streaming approach to loading and parsing the XML file. Whichever parser you are using seems to be loading the entire XML file into memory, and that is likely causing the issue. Search for SAX parsers and you will find one. Commented Oct 16, 2014 at 14:16

2 Answers

1

You should use a SAX parser instead of a DOM parser. The difference is that SAX does not load the complete XML document into memory; it reports elements to your code one at a time as it reads the file.

Look at this tutorial: http://www.mkyong.com/java/how-to-read-xml-file-in-java-sax-parser/
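
For illustration, here is a minimal sketch of the SAX approach using only the JDK's built-in parser. It does not need to know any tag names in advance: every element that contains non-blank text is written as one `elementName,"text"` line. The class name `SaxToCsv`, the methods `convert` and `quote`, and the CSV layout are my own assumptions, not something from the question, so adapt them to your real data.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.Writer;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, for illustration only.
class SaxToCsv {

    // Streams the XML and writes one line per element that has non-blank text.
    // Only the text of the current element is held in memory, never the whole document.
    static void convert(InputStream xml, final Writer csv) throws Exception {
        DefaultHandler handler = new DefaultHandler() {
            private final StringBuilder text = new StringBuilder();

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                // A new element starts: discard any text collected so far.
                text.setLength(0);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // May be called several times per element, so accumulate.
                text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) throws SAXException {
                String value = text.toString().trim();
                text.setLength(0);
                if (!value.isEmpty()) {
                    try {
                        csv.write(qName + "," + quote(value) + "\n");
                    } catch (IOException e) {
                        throw new SAXException(e);
                    }
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser().parse(xml, handler);
    }

    // Wraps a value in double quotes and doubles any embedded quotes.
    static String quote(String s) {
        return "\"" + s.replace("\"", "\"\"") + "\"";
    }
}
```

To run it on a real file you would pass a `FileInputStream` for the XML and a `BufferedWriter` for the CSV. Note that this sketch simplifies mixed content (text interleaved with child elements), which is probably fine for record-style data.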

Regards,

Romain.


2 Comments

I cannot use a SAX parser because the tags in the XML file are not known in advance, so I won't be able to create an error handler for the SAX parser.
OK, so a StAX parser might be a better choice for you, because it uses an iterator (pull) approach.
0

The solution here is to use the Streaming API for XML (StAX). Here is a good tutorial.
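
As a rough sketch of what that could look like with the JDK's `javax.xml.stream` API: the code below pulls events one at a time and never builds a tree. Since the tag names are unknown, it assumes (my assumption, not stated in the question) that every direct child of the root element is one record and that the text of the elements inside it are the columns. `StaxToCsv`, `convert` and `quote` are hypothetical names.

```java
import java.io.InputStream;
import java.io.Writer;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class, for illustration only.
class StaxToCsv {

    // Writes one CSV row per direct child of the root element; returns the row count.
    static int convert(InputStream xml, Writer csv) throws Exception {
        XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(xml);
        List<String> row = new ArrayList<>();
        StringBuilder text = new StringBuilder();
        int depth = 0; // 1 = root, 2 = record (assumed layout), 3+ = fields
        int rows = 0;
        try {
            while (reader.hasNext()) {
                switch (reader.next()) {
                    case XMLStreamConstants.START_ELEMENT:
                        depth++;
                        text.setLength(0);
                        break;
                    case XMLStreamConstants.CHARACTERS:
                    case XMLStreamConstants.CDATA:
                        text.append(reader.getText());
                        break;
                    case XMLStreamConstants.END_ELEMENT:
                        String value = text.toString().trim();
                        text.setLength(0);
                        if (depth >= 2 && !value.isEmpty()) {
                            row.add(quote(value));
                        }
                        if (depth == 2) {
                            // End of a record: flush the row so memory use stays flat.
                            csv.write(String.join(",", row));
                            csv.write("\n");
                            row.clear();
                            rows++;
                        }
                        depth--;
                        break;
                    default:
                        break; // comments, processing instructions, etc. are ignored
                }
            }
        } finally {
            reader.close();
        }
        return rows;
    }

    // Wraps a value in double quotes and doubles any embedded quotes.
    static String quote(String s) {
        return "\"" + s.replace("\"", "\"\"") + "\"";
    }
}
```

With a real file you would pass a `FileInputStream` and a `BufferedWriter`. Memory use then depends on the size of a single record rather than on the size of the whole file. If your records sit at a different nesting level, the `depth == 2` check is the part to change.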
