I am reading pipe-delimited text from a flat file and am getting an error when parsing it. I am an old Java hand, but I haven't touched it for a few years. Here is the code:

        String zipString = tokenizerForOneLine.nextToken();
        System.out.println( "Zip String: -->" + zipString + "<--");
        //zipString = "18103"; <<<This works!!!
        int zipInt = Integer.parseInt( zipString );
        aProvider.setZipCode( zipInt );

Here is the output:

Zip String: -->�1�8�1�0�3�<--
java.lang.NumberFormatException: For input string: "�1�8�1�0�3�"
NumberFormatException while reading file.
Detailed Message: For input string: "�1�8�1�0�3�"

My naive guess is that it is an encoding issue. Is this possible? It makes no sense to me. Or am I doing something really dumb and just not seeing it?

How do I diagnose the encoding issue? (My data vendor claims it is in standard UNICODE).

Thanks-in-advance,

Guido

  • Well, that's weird. After StackOverflow processed it, it showed a whole bunch of weird question marks. Now I am really thinking it is the encoding. Those question marks do not appear in the standard output display (in Netbeans 7.01). Commented Nov 4, 2011 at 17:06
  • Seems like you are correct, sir. Commented Nov 4, 2011 at 17:07
  • @GuidoAnselmi: As ever, to check encoding issues, look at the text file in a binary editor (or hexdump or whatever). My guess is that it's UTF-16. Commented Nov 4, 2011 at 17:10
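
If no hex editor is at hand, the same check can be done from Java. This is only a rough sketch: the class name HexPeek and the helper toHex are made up for illustration. It prints the first bytes of a file as hex. In UTF-16 text, every ASCII character is paired with a 00 byte, which would be consistent with the stray characters around each digit in the output above.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class HexPeek {

    // Formats up to 'limit' bytes as space-separated, two-digit hex values.
    static String toHex(byte[] bytes, int limit) {
        StringBuilder sb = new StringBuilder();
        int n = Math.min(limit, bytes.length);
        for (int i = 0; i < n; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", bytes[i] & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // With a file argument, dump the start of that file (reads the whole
        // file, which is fine for a quick check on a modest-sized file).
        // Without one, dump a built-in UTF-16BE sample to show the pattern.
        byte[] bytes = args.length > 0
                ? Files.readAllBytes(Paths.get(args[0]))
                : "18103|".getBytes(StandardCharsets.UTF_16BE);
        System.out.println(toHex(bytes, 32));
        // For the built-in sample this prints:
        // 00 31 00 38 00 31 00 30 00 33 00 7C
    }
}
```

A file that starts with FE FF or FF FE carries a UTF-16 byte order mark; one that starts with EF BB BF carries a UTF-8 one.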

1 Answer

Make sure you are building a reader with the proper encoding. Your code should look like this:

    // "encoding" is a charset name (a String) that must match how the file was written
    BufferedReader in = new BufferedReader(new InputStreamReader(new FileInputStream("data.csv"), encoding));
    String line;
    while ((line = in.readLine()) != null) {
        StringTokenizer tokenizer = new StringTokenizer(line, "|");

        ...
    }

The encoding is probably UTF-16.
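
To try this end to end, here is a small self-contained sketch. It assumes the file really is UTF-16, which has not been confirmed. The class name PipeFileDemo, the method readFirstFieldAsInt, and the sample data are invented for the demo. It writes a tiny UTF-16 file, reads it back with the matching charset, and parses the first pipe-delimited field of each line.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class PipeFileDemo {

    // Decodes the file with the given charset and parses the first
    // pipe-delimited field of every non-empty line as an int.
    static List<Integer> readFirstFieldAsInt(Path file, Charset charset) throws IOException {
        List<Integer> values = new ArrayList<>();
        try (BufferedReader in = Files.newBufferedReader(file, charset)) {
            String line;
            while ((line = in.readLine()) != null) {
                if (line.isEmpty()) {
                    continue; // nothing to tokenize on a blank line
                }
                StringTokenizer tokenizer = new StringTokenizer(line, "|");
                values.add(Integer.parseInt(tokenizer.nextToken().trim()));
            }
        }
        return values;
    }

    public static void main(String[] args) throws IOException {
        // Build a throwaway UTF-16 file so the demo does not depend on real data.
        Path sample = Files.createTempFile("zips", ".txt");
        try {
            Files.write(sample, "18103|first\n10001|second\n".getBytes(StandardCharsets.UTF_16));
            // Prints [18103, 10001]
            System.out.println(readFirstFieldAsInt(sample, StandardCharsets.UTF_16));
        } finally {
            Files.delete(sample);
        }
    }
}
```

Java's UTF-16 charset reads a leading byte order mark to pick the byte order and assumes big-endian when there is none. If the file turns out to be little-endian without a mark, pass StandardCharsets.UTF_16LE instead.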

Also, if the file has byte order marks you might use the BOMInputStream from Commons IO to detect the encoding automatically.

http://commons.apache.org/io/api-release/org/apache/commons/io/input/BOMInputStream.html
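
If adding a library is not an option, the byte order mark can also be checked by hand. The sketch below is a dependency-free stand-in, not the BOMInputStream API; the class name BomSniffer and the method charsetFromBom are made up. It only recognizes the UTF-8 and UTF-16 marks and ignores UTF-32.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class BomSniffer {

    // Returns the charset implied by a leading byte order mark,
    // or null when the bytes do not start with a recognized mark.
    // Assumption: UTF-32 is out of scope, so FF FE 00 00 is reported as UTF-16LE.
    static Charset charsetFromBom(byte[] head) {
        if (head.length >= 3
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF) {
            return StandardCharsets.UTF_8;
        }
        if (head.length >= 2 && (head[0] & 0xFF) == 0xFE && (head[1] & 0xFF) == 0xFF) {
            return StandardCharsets.UTF_16BE;
        }
        if (head.length >= 2 && (head[0] & 0xFF) == 0xFF && (head[1] & 0xFF) == 0xFE) {
            return StandardCharsets.UTF_16LE;
        }
        return null;
    }

    public static void main(String[] args) {
        // Java's UTF-16 encoder writes a big-endian mark (FE FF) first.
        byte[] withBom = "18103|".getBytes(StandardCharsets.UTF_16);
        System.out.println(charsetFromBom(withBom));    // prints UTF-16BE

        // Plain ASCII bytes carry no mark, so nothing is detected.
        byte[] withoutBom = "18103|".getBytes(StandardCharsets.US_ASCII);
        System.out.println(charsetFromBom(withoutBom)); // prints null
    }
}
```

When a mark is found, the mark itself is still the first thing in the stream. Skip those bytes before decoding, or for UTF-16 simply decode with StandardCharsets.UTF_16, which consumes the mark on its own.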
