
I am trying to implement a small example that converts the content of a text file to an XML file using XSLT. I came across the example "XSL - create well formed xml from text file" on SO and tried to implement the same thing, but I am running into problems.

I am using the same text file as input and the XSL file from the answer to that SO post. This is the Java program I am trying to use:

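import java.io.FileOutputStream;

import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class Parser {
    public static void main(String[] args) {
        String path = "src/";
        String text = path + "input.txt";
        String xslt = path + "input.xsl";
        String output = path + "output.xml";

        // Select Saxon as the JAXP transformer implementation
        System.setProperty("javax.xml.transform.TransformerFactory",
                "net.sf.saxon.TransformerFactoryImpl");
        try {
            TransformerFactory tf = TransformerFactory.newInstance();

            Transformer tr = tf.newTransformer(new StreamSource(xslt));
            // The plain text file is passed as the source document here
            tr.transform(new StreamSource(text), new StreamResult(
                    new FileOutputStream(output)));

            System.out.println("Output to " + output);
        } catch (Exception e) {
            System.out.println(e);
            e.printStackTrace();
        }
    }
}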

I get the following exception:

Error on line 1 column 1 of input.txt:
  SXXP0003: Error reported by XML parser: Content is not allowed in prolog.
net.sf.saxon.trans.XPathException: org.xml.sax.SAXParseException: Content is not allowed in prolog.
net.sf.saxon.trans.XPathException: org.xml.sax.SAXParseException: Content is not allowed in prolog.
    at net.sf.saxon.event.Sender.sendSAXSource(Sender.java:418)
    at net.sf.saxon.event.Sender.send(Sender.java:214)
    at net.sf.saxon.event.Sender.send(Sender.java:50)
    at net.sf.saxon.Controller.transform(Controller.java:1611)
    at three.Parser.main(Parser.java:21)
Caused by: org.xml.sax.SAXParseException: Content is not allowed in prolog.
    at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:195)
    at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(ErrorHandlerWrapper.java:174)
    at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:388)
    at com.sun.org.apache.xerces.internal.impl.XMLScanner.reportFatalError(XMLScanner.java:1427)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(XMLDocumentScannerImpl.java:1036)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:647)
    at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:140)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:511)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:808)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:737)
    at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:119)
    at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1205)
    at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:522)
    at net.sf.saxon.event.Sender.sendSAXSource(Sender.java:404)
    ... 4 more

It seems I cannot use the text file as input in my program. Can someone please help me fix this?

Update:

I solved it using the Saxon s9api (from saxon9he.jar), as Martin suggested in his answer. Here is the Java code that worked:

import java.io.File;

import javax.xml.transform.stream.StreamSource;

import net.sf.saxon.s9api.Processor;
import net.sf.saxon.s9api.QName;
import net.sf.saxon.s9api.SaxonApiException;
import net.sf.saxon.s9api.Serializer;
import net.sf.saxon.s9api.XsltCompiler;
import net.sf.saxon.s9api.XsltExecutable;
import net.sf.saxon.s9api.XsltTransformer;

public class Parser {
    public static void main(String[] args) throws SaxonApiException {
        // false = no licensed (PE/EE) features needed; Saxon-HE is enough
        Processor proc = new Processor(false);
        XsltCompiler comp = proc.newXsltCompiler();
        XsltExecutable exp = comp.compile(new StreamSource(new File(
                "src/input.xsl")));

        // Serialize the result as indented XML into src/output.xml
        Serializer out = new Serializer();
        out.setOutputProperty(Serializer.Property.METHOD, "xml");
        out.setOutputProperty(Serializer.Property.INDENT, "yes");
        out.setOutputFile(new File("src/output.xml"));

        // No source document is set: the transformation starts at the
        // named template "main"
        XsltTransformer trans = exp.load();
        trans.setInitialTemplate(new QName("main"));
        trans.setDestination(out);
        trans.transform();

        System.out.println("Output written to src/output.xml");
    }
}
  • Looking at the SO reference that you cited it is mentioned that you have to convert the text file to a flat XML file before you can feed it into the XSLT processor. I would assume that you have skipped that step so far. You cannot feed a simple text file since it is - as a rule - not a valid XML file. Commented Oct 9, 2014 at 12:24
  • stackoverflow.com/questions/2310926/… Commented Oct 9, 2014 at 12:26

2 Answers

2

The code to transform text to XML depends on XSLT 2.0 and an XSLT 2.0 processor such as Saxon 9. The JAXP API you are trying to use is built around the XSLT 1.0 model, in which an XML input document is the primary source for the XSLT code. So if you want to use that API, you need to pass a dummy input XML document to the transformer and pass the URI of the plain text file as a parameter. I would, however, suggest using the Saxon s9api to start the stylesheet directly at the named template main, also passing the plain text URI as a parameter.
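
The JAXP route described above (dummy XML input, text file URI as a parameter) could be sketched roughly as follows. This is only an illustration under assumptions: the inline stylesheet is a hypothetical stand-in that merely echoes the parameter, the class name DummyInputDemo and the file URI are made up, and the parameter name pathToCSV is taken from the stylesheet in the linked post. The real stylesheet would read the file with unparsed-text(), which needs an XSLT 2.0 processor such as Saxon on the classpath.

```java
import java.io.StringReader;
import java.io.StringWriter;

import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class DummyInputDemo {
    // Hypothetical stand-in stylesheet: it only echoes the parameter value.
    // The stylesheet from the linked post would instead load the file with
    // unparsed-text($pathToCSV, $encoding).
    static final String XSL =
        "<xsl:stylesheet version='1.0' "
      + "xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:param name='pathToCSV'/>"
      + "<xsl:template match='/'>"
      + "<source><xsl:value-of select='$pathToCSV'/></source>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    static String run(String textUri) throws TransformerException {
        Transformer tr = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSL)));
        tr.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        // The text file is NOT the source document; only its URI is passed in.
        tr.setParameter("pathToCSV", textUri);

        // A well-formed dummy document satisfies JAXP's need for an XML source.
        StringWriter out = new StringWriter();
        tr.transform(new StreamSource(new StringReader("<dummy/>")),
                new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        // Hypothetical location of the plain text input
        System.out.println(run("file:///tmp/input.txt"));
    }
}
```

The point of the sketch is that the XML parser only ever sees the dummy document, so the "Content is not allowed in prolog" error cannot occur; reading the text file becomes the stylesheet's job.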

2 Comments

Thanks Martin, I will check the API and see how to implement it in standalone program.
Thanks Martin, I have used the API and created a program, I updated my question with the working program.
1

You can't feed plain text to an XSL transformer. It only accepts well-formed XML as input.

So the code in the linked question starts the transformer with no input, and the XSLT itself then loads the text with:

<xsl:variable name="csv" select="unparsed-text($pathToCSV, $encoding)" />

1 Comment

Thanks Aaron. When I run the command from the answer to that post, java -jar saxon9he.jar -it:main -xsl:sheet.xsl, I can see the generated XML in the console. I am trying to achieve the same thing from standalone Java code, but I am stuck on how to do that.
