I have a fairly large XML file (~280 MB) in which each row has many attributes. I want to extract three of those attributes and store them somewhere, but I run out of memory when I try. My code looks like this:
File xmlFile = new File(xml);
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = null;
try {
    doc = dBuilder.parse(xmlFile);
} catch (SAXException e) {
    e.printStackTrace();
} catch (IOException e) {
    e.printStackTrace();
}
NodeList nList = doc.getElementsByTagName("row");
for (int index = 0; index < nList.getLength(); index++) {
    Node nNode = nList.item(index);
    if (nNode.getNodeType() == Node.ELEMENT_NODE) {
        System.out.print("F1 : " +
                nNode.getAttributes().getNamedItem("F1").getTextContent());
        System.out.print(" F2: " +
                nNode.getAttributes().getNamedItem("F2").getTextContent());
        System.out.println(" F3: " +
                nNode.getAttributes().getNamedItem("F3").getTextContent());
    }
}
This is the error I get:
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at com.sun.org.apache.xerces.internal.dom.DeferredDocumentImpl.getNodeObject(DeferredDocumentImpl.java:974)
at com.sun.org.apache.xerces.internal.dom.DeferredElementImpl.synchronizeData(DeferredElementImpl.java:121)
at com.sun.org.apache.xerces.internal.dom.ElementImpl.getTagName(ElementImpl.java:314)
at com.sun.org.apache.xerces.internal.dom.DeepNodeListImpl.nextMatchingElementAfter(DeepNodeListImpl.java:199)
at com.sun.org.apache.xerces.internal.dom.DeepNodeListImpl.item(DeepNodeListImpl.java:146)
at com.sun.org.apache.xerces.internal.dom.DeepNodeListImpl.getLength(DeepNodeListImpl.java:117)
at Parser.parsePosts(Parser.java:55)
at Parser.main(Parser.java:72)
How can I change this so that it does not use so much memory?
EDIT: I wrote a new parser using SAX, and it seems to get the job done. The code is:
try {
    SAXParserFactory factory = SAXParserFactory.newInstance();
    SAXParser saxParser = factory.newSAXParser();
    DefaultHandler handler = new DefaultHandler() {
        @Override
        public void startElement(String uri, String localName, String qName,
                Attributes attributes) throws SAXException {
            // startElement fires for every element, so skip anything that is not a row
            if (!"row".equals(qName)) {
                return;
            }
            System.out.print(attributes.getValue("F1") + " ");
            System.out.print(attributes.getValue("F2") + " ");
            System.out.println(attributes.getValue("F3"));
        }
    };
    saxParser.parse("file.xml", handler);
} catch (Exception e) {
    e.printStackTrace();
}
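For anyone who wants to store the values instead of printing them, here is a minimal, self-contained sketch of the same SAX approach. The class name `RowAttributeExtractor` and the method `extract` are hypothetical names I made up for illustration. It assumes the elements are called `row`, that the parser is the default non-namespace-aware one, and that keeping three strings per row in memory is acceptable for your file size.

```java
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class; the name is illustrative only.
class RowAttributeExtractor {

    // Streams the XML and returns {F1, F2, F3} for each <row> element.
    // An attribute that is missing on a row is stored as null.
    static List<String[]> extract(InputStream xml) {
        final List<String[]> rows = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName,
                                     Attributes attributes) {
                // Ignore the root element and anything else that is not a row.
                if (!"row".equals(qName)) {
                    return;
                }
                rows.add(new String[] {
                    attributes.getValue("F1"),
                    attributes.getValue("F2"),
                    attributes.getValue("F3")
                });
            }
        };
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(xml, handler);
        } catch (Exception e) {
            // Wrap checked parser and I/O exceptions so callers see one failure type.
            throw new IllegalStateException("Failed to parse XML", e);
        }
        return rows;
    }
}
```

You would call it with something like `RowAttributeExtractor.extract(new FileInputStream("file.xml"))`. Unlike the DOM version, only the three extracted values per row are kept, not the whole document tree. If even that list is too large, the handler could write each row to a file or database as it arrives instead of collecting it.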