I am trying to write a function that reads a text file, splits each line into individual strings, and adds them to an array, after which I will convert the numeric strings into Integers or Doubles. However, it throws a NumberFormatException whenever I call Integer.parseInt() on the first string in the array, which is always an integer.

This code is a simplified version of what I am attempting to do:

import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

public class test {

    public static void main(String[] args) {

        try {

            File file = new File("preprocessed_data.txt");

            Scanner scanner = new Scanner(file);

            while(scanner.hasNextLine()) {

                String line = scanner.nextLine();

                if (line.length() != 0) {

                    // Splitting each line into an array of Strings
                    String[] strings = line.split("\\s+");

                    // Trying to convert the first String into an Integer
                    System.out.println(Integer.parseInt(strings[0]));

                }
            }

        } catch (FileNotFoundException e) {
            e.printStackTrace();
        }

    }

}

This is the text file which I am trying to process:

11259 8111 +2 14 5.9 5.1 2.0 662.8449 1324.6825 1324.6817 0.0008 1 √ CDFEK($1) KLTK($1) [A1:5 P215:218]

10365 4551 +2 28 11.0 9.0 1.7 643.3196 1285.6320 1285.6245 0.0075 1 √ CDFEK($1) K($1)FR [A1:5 P311:313]

16242 4175 +3 23 13.4 7.3 1.6 546.6142 1637.8280 1637.8316 -0.0035 3 √ CDFEK($1)K K($1)GDKAR [A1:6 O448:453]

27030 24226 +3 16 5.4 6.4 1.7 893.4433 2678.3153 2678.3178 -0.0024 2 √ KSFCAWLNVPNGNK($1) IK($1)DNNMR + OxiM(22) 27031 25071 +3 10 4.8 5.1 2.6 893.4530 2678.3445 2678.3178 0.0267 2 √ KSFCAWLNVPNGNK($1) IK($1)DNNMR + OxiM(22) [A6:19 D503:509]

25104 18270 +3 19 6.8 5.8 1.7 805.7773 2415.3173 2415.2965 0.0207 2 √ KSFCAWLNVPNGNK($1) LRNLK($1) [A6:19 I271:275 A6:19 I329:333 A6:19 I369:373]

27761 30048 +3 37 6.0 6.5 1.7 959.4729 2876.4041 2876.3883 0.0158 1 √ KSFCAWLNVPNGNK($1) ELNEQAGESK($1) [A6:19 I469:478]

26769 27493 +3 17 13.0 6.4 1.3 883.4568 2648.3560 2648.3541 0.0019 1 √ KSFCAWLNVPNGNK($1) KPLDFEK($1) 26781 28982 +3 15 9.4 6.6 1.6 883.4586 2648.3611 2648.3541 0.0070 1 √ KSFCAWLNVPNGNK($1) KPLDFEK($1) [A6:19 K1379:1385]

And this is the error which I keep getting:

Exception in thread "main" java.lang.NumberFormatException: For input string: "11259"
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
    at java.lang.Integer.parseInt(Integer.java:580)
    at java.lang.Integer.parseInt(Integer.java:615)
    at com.company.test.main(test.java:25)

  • Does the debugger show any dodgy characters in the string you're trying to parse? Commented Feb 6, 2018 at 22:20
  • Your code works for me with exactly this file content. It seems you do indeed have some extra characters at the beginning of the line. Commented Feb 6, 2018 at 22:31
  • It would help to add this line of code just before the Integer.parseInt() line: IntStream.range(0, strings[0].length()).forEach(i -> System.out.println(strings[0].getBytes()[i])); This will print the byte value of each character, and might shed some light on why the parsing is failing. If any value is not between 48 and 57, you have a problem. Commented Feb 6, 2018 at 23:11
  • @IanMc Thanks, I tried that and it was indeed printing numbers outside that range. While copying the text into a different file, I figured out that the problem was the text file being encoded as UTF-8; when I changed the encoding to ANSI on saving, it worked. If you don't mind, could you explain why the values should be between 48 and 57, and what it means if one is outside that range? Commented Feb 7, 2018 at 0:14
  • There are standard ASCII codes for all characters. The numbers 0-9 have ASCII codes 48-57. rapidtables.com/code/text/ascii-table.html Commented Feb 7, 2018 at 0:17
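
The check suggested in the comments can be sketched as follows. This version prints Unicode code points instead of platform-encoded bytes, so the result does not depend on the default charset. The class name TokenInspector, its two methods, and the sample token with a leading U+FEFF are assumptions made for illustration, not taken from the question.

```java
public class TokenInspector {

    // Returns the code points of a token as space-separated decimal numbers.
    public static String describe(String token) {
        StringBuilder sb = new StringBuilder();
        token.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(cp);
        });
        return sb.toString();
    }

    // True only when every character is an ASCII digit (code points 48 to 57).
    public static boolean isAsciiDigits(String token) {
        return !token.isEmpty() && token.chars().allMatch(c -> c >= '0' && c <= '9');
    }

    public static void main(String[] args) {
        String clean = "11259";
        // Hypothetical token carrying an invisible leading character (U+FEFF).
        String suspect = "\uFEFF11259";

        System.out.println(describe(clean));        // 49 49 50 53 57
        System.out.println(describe(suspect));      // 65279 49 49 50 53 57
        System.out.println(isAsciiDigits(suspect)); // false
    }
}
```

Any value outside 48 to 57 marks a character that Integer.parseInt() will not accept as a digit of a plain unsigned decimal number, even if it is invisible when printed.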

2 Answers

1

The text file I was attempting to process was encoded as UTF-8. Once I re-saved it as ANSI, the invisible characters at the beginning of the file were removed and the code worked.
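
The invisible characters were most likely a UTF-8 byte order mark (U+FEFF), which some editors write at the start of UTF-8 files and which Java's UTF-8 decoder keeps as a character. If re-saving the file is not an option, one possible approach is to strip that character in code. The sketch below is not the fix used above; BomSafeParser, stripBom, and firstInt are hypothetical names, and it assumes the byte order mark is the only stray character.

```java
import java.nio.charset.StandardCharsets;

public class BomSafeParser {

    // Removes a single leading U+FEFF (byte order mark), if present.
    public static String stripBom(String line) {
        return line.startsWith("\uFEFF") ? line.substring(1) : line;
    }

    // Parses the first whitespace-separated token of a line as an int.
    public static int firstInt(String line) {
        String[] tokens = stripBom(line).trim().split("\\s+");
        return Integer.parseInt(tokens[0]);
    }

    public static void main(String[] args) {
        // Simulated start of a UTF-8 file saved with a byte order mark (EF BB BF).
        byte[] raw = {
            (byte) 0xEF, (byte) 0xBB, (byte) 0xBF,
            '1', '1', '2', '5', '9', ' ', '8', '1', '1', '1'
        };

        // Decoding as UTF-8 keeps the mark as the first character of the string.
        String line = new String(raw, StandardCharsets.UTF_8);

        System.out.println(line.charAt(0) == '\uFEFF'); // true
        System.out.println(firstInt(line));             // 11259
    }
}
```

With a real file, the same stripBom call would be applied to each line returned by the Scanner before splitting it.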

0

This looks like a problem with invisible characters, because your code works for me.

I would highly recommend opening the file in an editor such as notepad.exe on Windows and making sure that there are no invisible characters.
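
Many editors do not display such characters, so inspecting the raw bytes may be more reliable. The following is a minimal sketch that tests whether a file begins with the UTF-8 byte order mark (bytes EF BB BF); BomCheck and startsWithUtf8Bom are hypothetical names, and the default file name is the one from the question.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class BomCheck {

    // True when the data begins with the UTF-8 byte order mark: EF BB BF.
    public static boolean startsWithUtf8Bom(byte[] data) {
        return data.length >= 3
                && (data[0] & 0xFF) == 0xEF
                && (data[1] & 0xFF) == 0xBB
                && (data[2] & 0xFF) == 0xBF;
    }

    public static void main(String[] args) throws IOException {
        // Uses the file name from the question unless a path is passed as the first argument.
        Path path = Paths.get(args.length > 0 ? args[0] : "preprocessed_data.txt");
        if (!Files.isReadable(path)) {
            System.out.println("Cannot read " + path);
            return;
        }
        byte[] data = Files.readAllBytes(path);
        System.out.println(startsWithUtf8Bom(data) ? "UTF-8 BOM found" : "No UTF-8 BOM");
    }
}
```

This only detects a byte order mark; other stray characters would need a broader check.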
