I am using the method below to parse an XML file:

package com.kcs.xml;

import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class ParseXMLOld {
    public static void main(String[] args) 
    {
        final String FILE_PATH="C:\\abc.xml";
        File file=new File(FILE_PATH);
        ParseXMLOld pxo=new ParseXMLOld();
        pxo.parseUTFXML(file);
    }

    public Document parseUTFXML(File file) 
    {
        DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
        DocumentBuilder docBuilder=null;
        Document doc=null;
        try {
            docBuilder = docBuilderFactory.newDocumentBuilder();
            InputStream inputStream= new FileInputStream(file);
            Reader reader = new InputStreamReader(inputStream,"UTF-16");
            InputSource is = new InputSource(reader);
            is.setEncoding("UTF-16");
            doc = docBuilder.parse(is);
            System.out.println("Done");
            } 
        catch(Exception e)
            {
            e.printStackTrace();
            }
        finally
            {
            docBuilderFactory=null;
            docBuilder=null;
            }
    return doc;
    }
}

I have two files, one encoded as UTF-8 and the other as UTF-16. If the encoding passed to InputStreamReader and setEncoding in the code above (shown here as "UTF-16") is set to "UTF-8", the UTF-8 file parses fine but the UTF-16 file fails. With "UTF-16" it is the other way round.

One more interesting thing: if I manually create a sample XML file that declares encoding UTF-16, IE7 fails to open it. The UTF-16 file I am trying to parse (which I receive from another system) does open in IE7. But if I edit the first line of that file (changing the encoding to UTF-8 and then back to UTF-16), it no longer opens. I have no idea why this is happening.

Please help.

I don't know how to share these files. If they are needed, please tell me how I can share them.

For example, how can I parse the file below?

<?xml version="1.0" encoding="UTF-16"?>
  <Details>
    <Content>
      <id>1234¥£€$¢</id>
      <Valid_From_Date>2013-01-01</Valid_From_Date>
      <Valid_To_Date>9999-12-31</Valid_To_Date>
      <Company>1210</Company>
      <Description>2nd Life Transaction</Description>
    </Content>
    <Totals>
      <Count>1</Count>
    </Totals>
</Details>

I am getting the error below:

[Fatal Error] :1:1: Content is not allowed in prolog.
org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 1; Content is not allowed in prolog.
    at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:251)
    at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:300)
    at com.kcs.xml.ParseXMLOld.parseUTF8XML(ParseXMLOld.java:34)
    at com.kcs.xml.ParseXMLOld.main(ParseXMLOld.java:19)

1 Answer

The following works for me if the file either has a BOM or specifies its encoding in the XML declaration:

File fXmlFile = … ;
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
// Pass the raw byte stream (not a Reader) so the parser detects the encoding itself.
Document doc = dBuilder.parse(new FileInputStream(fXmlFile));
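If the file has no BOM and its declaration does not match the actual bytes (for example, the posted XML saved as UTF-8 while still declaring UTF-16), one option is to choose the charset yourself and hand the parser a Reader. A parser that reads from a character stream disregards the encoding declaration. Below is a minimal sketch under those assumptions, not a definitive fix: the class name BomAwareXmlParser and its parse method are hypothetical, the no-BOM checks rely on the byte pattern of '<' in UTF-16, UTF-8 is only an assumed fallback, and UTF-32 is not handled.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper: picks the charset from the leading bytes instead of
// trusting the encoding named in the XML declaration.
class BomAwareXmlParser {

    static Document parse(byte[] data) throws Exception {
        Charset charset = StandardCharsets.UTF_8; // assumed fallback when nothing matches
        int skip = 0;                             // number of BOM bytes to drop

        if (startsWith(data, 0xEF, 0xBB, 0xBF)) {        // UTF-8 BOM
            skip = 3;
        } else if (startsWith(data, 0xFE, 0xFF)) {       // UTF-16 big-endian BOM
            charset = StandardCharsets.UTF_16BE;
            skip = 2;
        } else if (startsWith(data, 0xFF, 0xFE)) {       // UTF-16 little-endian BOM
            charset = StandardCharsets.UTF_16LE;
            skip = 2;
        } else if (startsWith(data, 0x00, 0x3C)) {       // '<' as UTF-16BE, no BOM
            charset = StandardCharsets.UTF_16BE;
        } else if (startsWith(data, 0x3C, 0x00)) {       // '<' as UTF-16LE, no BOM
            charset = StandardCharsets.UTF_16LE;
        }

        // The BOM is skipped by hand because the UTF-8, UTF-16BE and UTF-16LE
        // decoders would otherwise pass U+FEFF through to the parser.
        Reader reader = new InputStreamReader(
                new ByteArrayInputStream(data, skip, data.length - skip), charset);

        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(reader));
    }

    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws Exception {
        // The posted sample, encoded as UTF-8 without a BOM although it declares UTF-16.
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-16\"?>"
                + "<Details><Content><id>1234\u00A5\u00A3\u20AC$\u00A2</id></Content></Details>";
        Document doc = parse(xml.getBytes(StandardCharsets.UTF_8));
        System.out.println(doc.getDocumentElement().getNodeName());
    }
}
```

For a file on disk, the bytes could come from java.nio.file.Files.readAllBytes. Reading the whole file into memory keeps the sketch short; for large files a PushbackInputStream that peeks at the first few bytes would avoid that.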

6 Comments

Yes, you are right. But this doesn't work if the XML file contains any non-ASCII character, for example "¥£€$¢₡". Thanks for your reply.
Do you have an example file?
Yes, I have a few files. Can I share them here somehow? You can enter the characters above inside any tag and try to parse the file.
Yes, you are right. But if the file doesn't contain a BOM, it fails. To test it, just copy and paste the posted XML. Thanks for your reply.