I have an XML document built with

org.xmlpull.v1.XmlSerializer

The document contains the following XML prolog:

<?xml version='1.0' encoding='utf-8' standalone='yes' ?>

When I try to parse this document using

import org.xmlpull.v1.XmlPullParser;

with the following configuration code:

XmlPullParser pullParser = Xml.newPullParser();
pullParser.setInput(theInputStream, "utf-8");

I get undecoded UTF-8 strings when I call

String text = pullParser.getText();

So it seems that XmlPullParser on Android (I am using 1.5) doesn't support UTF-8. Did I miss something?

Thank you in advance.

2 Answers

Not sure if it matters, but can you try two things:

  1. Use "UTF-8" instead of the lowercase "utf-8".
  2. Try using pullParser.setInput(theInputStream); and see if the pull parser can determine the encoding on its own.

1 Comment

pullParser.setInput(theInputStream); did the trick for me - it seems that the BOM is correctly handled by XmlPullParser when using an InputStream
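
For anyone who has to pass an explicit encoding and suspects a leading byte order mark is getting in the way, here is a minimal JDK-only sketch of stripping a UTF-8 BOM before the stream reaches the parser. BomSkipper and its methods are hypothetical names made up for this example, not part of XmlPullParser or Android, and whether a given parser needs this at all is an assumption you should test. As far as I know, the stream overload in the XmlPullParser interface is setInput(InputStream, String), and passing null as the encoding asks the parser to detect it on its own.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of XmlPullParser or Android.
final class BomSkipper {

    /** Returns a stream positioned just past a leading UTF-8 BOM (EF BB BF), if present. */
    static InputStream skipUtf8Bom(InputStream in) throws IOException {
        PushbackInputStream pushback = new PushbackInputStream(in, 3);
        byte[] head = new byte[3];
        int read = 0;
        while (read < head.length) {
            int n = pushback.read(head, read, head.length - read);
            if (n < 0) break; // stream is shorter than three bytes
            read += n;
        }
        boolean hasBom = read == 3
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF;
        if (!hasBom && read > 0) {
            pushback.unread(head, 0, read); // not a BOM: put the bytes back
        }
        return pushback;
    }

    /** Demo wrapper: drops the BOM, then decodes the remaining bytes as UTF-8. */
    static String decodeWithoutBom(byte[] xmlBytes) {
        try (InputStream in = skipUtf8Bom(new ByteArrayInputStream(xmlBytes))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[256];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] withBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, '<', 'a', '/', '>'};
        System.out.println(decodeWithoutBom(withBom));
        // The cleaned stream could then be handed to pullParser.setInput(stream, "UTF-8").
    }
}
```

Input without a BOM passes through unchanged, so the wrapper should be safe to apply unconditionally.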

This question is old, but I recently ran into the same issue using XmlPullParser. In my case, I was parsing a stream of UTF-8 encoded XML from an OkHttp ResponseBody. I had to specify the input encoding explicitly for this to work. In case someone else lands here:

override fun convert(response: ResponseBody): ArchNewsFeed? {
    // Pass the charset name explicitly instead of relying on auto-detection.
    val encoding = Charsets.UTF_8.name()
    val factory = XmlPullParserFactory.newInstance()
    factory.isNamespaceAware = true
    val parser = factory.newPullParser()
    parser.setInput(response.byteStream(), encoding)
    ...
}
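
To illustrate why the declared encoding matters, here is a small JDK-only sketch (ExplicitDecodingDemo is a hypothetical name, not from my project). It decodes the same two bytes, the UTF-8 encoding of U+00E9, once as UTF-8 and once as ISO-8859-1. The second result is the kind of "undecoded" text described in the question, assuming that is what was happening there. The parser call is left as a comment because XmlPullParser is not part of the JDK.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; only JDK APIs are used.
final class ExplicitDecodingDemo {

    /** Decodes the given bytes with the given charset by reading through a Reader. */
    static String decode(byte[] bytes, Charset charset) {
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(bytes), charset)) {
            StringBuilder text = new StringBuilder();
            int c;
            while ((c = reader.read()) != -1) {
                text.append((char) c);
            }
            return text.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] eAcute = {(byte) 0xC3, (byte) 0xA9}; // UTF-8 bytes for U+00E9

        // Correct charset: a single character.
        System.out.println(decode(eAcute, StandardCharsets.UTF_8));
        // Wrong charset: each byte becomes its own character.
        System.out.println(decode(eAcute, StandardCharsets.ISO_8859_1));

        // With a real parser, a Reader decoded this way could be passed via
        // parser.setInput(new InputStreamReader(stream, StandardCharsets.UTF_8)).
    }
}
```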
