1
String[] values = line.split(",");

Long locId = Long.parseLong(replaceQuotes(values[0]));
String country = replaceQuotes(values[1]);
String region = replaceQuotes(values[2]);
String city = replaceQuotes(values[3]);
String postalCode = replaceQuotes(values[4]);
String latitude = replaceQuotes(values[5]);
String longitude = replaceQuotes(values[6]);
String metroCode = replaceQuotes(values[7]);
String areaCode = replaceQuotes(values[8]);

//...

public String replaceQuotes(String txt){
    txt = txt.replaceAll("\"", "");
    return txt;
}

I'm using the code above to parse a CSV with data in this format:

828,"US","IL","Melrose Park","60160",41.9050,-87.8641,602,708

However, when I encounter a line of data such as the following I get java.lang.ArrayIndexOutOfBoundsException: 7

1,"O1","","","",0.0000,0.0000,,

Does this mean that any time I even try to access the value at values[7], an Exception will be thrown?

If so, how do I parse lines that don't contain data in that position of the text line?

2 Answers 2

6

First of all, String.split() is not a great CSV parser: it doesn't know about quotes and will mess up as soon as one of your quoted values contains a comma.

That being said, by default String.split() leaves out empty trailing elements. You can influence that by using the two-argument variant:

String[] values = line.split(",", -1);
  • -1 (or any negative value) means that the array will be as large as necessary.
  • Using a positive value gives a maximum amount of splits to be done (meaning that everything beyond that will be a single value, even if it contains a comma).
  • 0 (the default if you use the one-argument value) means that the array will be as large as necessary, but empty trailing values will be left out of the array (exactly as it happens to you).
Sign up to request clarification or add additional context in comments.

Comments

1

As a general rule you should never, ever hack up your own (faulty) parser if a working one already exists. CSV is not easy to parse correctly, and String.split will not do the job since CSV allows , to be used between "'s without working as separaters.

Consider using OpenCSV. This will solve both the problem you have now and the problem you will face when a user uses a , as part of the data.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.