I'm working with org.apache.commons-csv 1.4. This week I discovered this strange behaviour in one of our JUnit tests:

    CSVReader reader = null;
    List<String[]> linesCsv = new ArrayList<>();
    FileInputStream fileStream = null;
    InputStreamReader inputStreamReader = null;

    try {
        fileStream = new FileInputStream(file);
        inputStreamReader = new InputStreamReader(fileStream, "ISO-8859-1");
        reader = new CSVReader(inputStreamReader, ',', '"', 0);

        String[] record = null;
        while ((record = reader.readNext()) != null) {
            linesCsv.add(record);
        }

    } catch (Exception e) {
        logger.error("Error in ", e);
    } finally {
        if (inputStreamReader != null) {
            inputStreamReader.close();
        }
        if (fileStream != null) {
            fileStream.close();
        }
        if (reader != null) {
            reader.close();
        }
    }

*ERROR CASE

Input .csv

DAR_123451                  ,"XXXXX Hello World "Hello World XXX "
DAR_123452                  ,"XXXXX Hello World "Hello World XXX "

Java KO:

[0.0] DAR_123451
[0.1] XXXXX Hello World "Hello World XXX\nDAR_123456 ,XXXXX Hello World "Hello World XXX


*CORRECT CASE

Input .csv

DAR_123451                  ,"XXXXX Hello World "Hello World" XXX "
DAR_123452                  ,"XXXXX Hello World "Hello World" XXX "

Java OK:

[0.0] DAR_123451 [0.1] XXXXX Hello World "Hello World" XXX

[1.0] DAR_123452 [1.1] XXXXX Hello World "Hello World" XXX

I can't configure the commons-csv library to handle this properly, and it looks like a bug. How can we correctly read a quoted string that contains a lone double-quote character?

  • Check line ending in first line in file input.csv. Commented Feb 8, 2017 at 14:27

1 Answer

The CSV format usually uses two consecutive double-quotes to represent a double-quote inside a value that is itself surrounded by quotes.

When I use the latest version of commons-csv, I even get an exception with your inputs (IOException: (line 1) invalid char between encapsulated token and delimiter).

So to correctly include the double-quotes you need to use the following:

DAR_123451                  ,"XXXXX Hello World ""Hello World"" XXX "
DAR_123452                  ,"XXXXX Hello World ""Hello World"" XXX "

And the test-case then works as expected:

    Reader in = new StringReader(
            "DAR_123451                  ,\"XXXXX Hello World \"\"Hello World XXX\"\" \"\n" +
                    "DAR_123452                  ,\"XXXXX Hello World \"\"Hello World XXX\"\" \"");
    Iterable<CSVRecord> records = CSVFormat.DEFAULT.parse(in);
    for (CSVRecord record : records) {
        for (int i = 0; i < record.size(); i++) {
            System.out.println("At " + i + ": " + record.get(i));
        }
    }

Output:

At 0: DAR_123451                  
At 1: XXXXX Hello World "Hello World XXX" 
At 0: DAR_123452                  
At 1: XXXXX Hello World "Hello World XXX" 

See https://en.wikipedia.org/wiki/Comma-separated_values#General_functionality for details.
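
If the file is produced by your own code, the doubling rule described above can be applied when each field is written. The following is only a minimal sketch using plain Java string handling; the class name `CsvQuoteEscaper` and the method `quoteField` are hypothetical names chosen for illustration and are not part of commons-csv.

```java
public class CsvQuoteEscaper {

    // Hypothetical helper: wraps a value in double quotes and doubles every
    // embedded double quote, which is the usual CSV escaping convention.
    public static String quoteField(String value) {
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }

    public static void main(String[] args) {
        String id = "DAR_123451";
        String text = "XXXXX Hello World \"Hello World\" XXX ";
        // prints: DAR_123451,"XXXXX Hello World ""Hello World"" XXX "
        System.out.println(id + "," + quoteField(text));
    }
}
```

A line written this way should be readable with CSVFormat.DEFAULT as in the test case above. If you do not control how the file is written, the malformed quotes would have to be repaired before parsing, which this sketch does not attempt.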
