
I have an InputStream object that contains metadata (name, creation date, author, etc.) for several million files in XML format. I've already tried to convert it to a String using the IOUtils.copy method, but because the data is so large, it throws a java.lang.OutOfMemoryError after running for a few minutes.

Increasing the JVM memory is not an option, since the number of files I collect info from keeps growing. Can someone suggest what I should do to solve this?

  • What's the concrete implementation of the InputStream? Is it a ByteArrayInputStream, for example? Commented Sep 12, 2014 at 10:43
  • If the input is just too massive, your other options are (a) to ETL it into a database first or (b) to use Hadoop or something similar. Commented Sep 12, 2014 at 10:45
  • It just seems to be the wrong approach to convert the huge data into one string. What do you want to do with the string? Commented Sep 12, 2014 at 10:49
  • If you can't fit your object in memory, then avoid cases that require you to store it in memory. Work with it as a stream. Commented Sep 12, 2014 at 10:49
  • If the data is too large to store in memory and you can't increase memory, then your options are limited: either process it as a stream, extracting what you need, or persist the data somewhere for later access. Commented Sep 12, 2014 at 11:35

1 Answer

The problem you are having is the very reason stream-based I/O exists: it is simply not viable to slurp huge amounts of data into memory before consuming it.

Parse your stream as... a stream! See the Oracle tutorials for more information on stream-based XML parsing using SAX.

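import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

// Obtain a SAX reader; it pushes parse events to the handler as it reads.
XMLReader xmlreader =
    SAXParserFactory.newInstance().newSAXParser().getXMLReader();
xmlreader.setContentHandler(new ContentHandler() {
    ...
});

// Parse straight from the stream; the whole document is never held in memory.
xmlreader.parse(new InputSource(myInputStream));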
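
To make the handler part more concrete, here is a minimal sketch of one way to fill it in. It assumes, purely for illustration, that the XML looks like `<files><file><name>…</name><author>…</author></file>…</files>`; the question does not show the real element names. The `FileInfoStreamer` and `RecordHandler` names are hypothetical too. It extends `DefaultHandler` so that only the callbacks it needs are overridden, and it hands each record to a callback as soon as its closing tag is seen, so only one record is in memory at a time.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: streams <file> records to a callback one at a time.
final class FileInfoStreamer {

    // Collects the child elements of the current <file> into a small map.
    static final class RecordHandler extends DefaultHandler {
        private final Consumer<Map<String, String>> sink;
        private final StringBuilder text = new StringBuilder();
        private Map<String, String> current; // null while outside a <file>

        RecordHandler(Consumer<Map<String, String>> sink) {
            this.sink = sink;
        }

        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes atts) {
            text.setLength(0); // discard whitespace seen between tags
            if ("file".equals(qName)) { // assumed record element name
                current = new HashMap<>();
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // May be called several times per text node, so accumulate.
            text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("file".equals(qName)) {
                sink.accept(current); // record complete: hand it off
                current = null;       // and let it be garbage collected
            } else if (current != null) {
                current.put(qName, text.toString().trim());
            }
            text.setLength(0);
        }
    }

    static void parse(InputStream in, Consumer<Map<String, String>> sink)
            throws Exception {
        XMLReader reader =
            SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setContentHandler(new RecordHandler(sink));
        reader.parse(new InputSource(in));
    }

    public static void main(String[] args) throws Exception {
        String xml = "<files>"
            + "<file><name>a.txt</name><author>ann</author></file>"
            + "<file><name>b.txt</name><author>bob</author></file>"
            + "</files>";
        InputStream in =
            new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        // Each record is printed and then dropped; nothing accumulates.
        parse(in, record -> System.out.println(record.get("name")));
    }
}
```

In real use the callback would write each record to wherever it is needed (a database, a file, an aggregate counter) rather than collect the records in a list, since collecting them all would bring the memory problem back.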