I have an EXCEL CSV file with this content:

access_id;logicalgate
123456789;TEST

As you can see, the header fields are plain strings and are not quoted.

With this code:

fileReader = new BufferedReader(new InputStreamReader(inputStream, "UTF-8"));
csvParser = new CSVParser(fileReader, CSVFormat.EXCEL.withNullString("").withFirstRecordAsHeader().withIgnoreHeaderCase().withQuoteMode(QuoteMode.MINIMAL).withIgnoreEmptyLines().withTrim());

The output of this command:

csvParser.getHeaderMap()
    

is a single-element map!

(java.util.TreeMap<K,V>) {access_id;logicalgate=0}

The single key is the concatenated string "access_id;logicalgate".

Why is the parser not splitting the header into separate columns?

  • I can't see anywhere where you specify that a semicolon is the delimiter. How's it supposed to know? Needs .withDelimiter(';'), surely? I imagine the rows themselves are also broken. Commented Jul 19, 2022 at 15:33
  • CSVFormat.EXCEL does specify a delimiter, just the wrong one (,) in this case. Commented Jul 19, 2022 at 15:51
  • Incredible... thanks for the tip! Commented Jul 19, 2022 at 15:54

1 Answer

CSVFormat.EXCEL is defined with a comma (,) as the delimiter, not a semicolon (;), so the whole header line is read as one column. You should add .withDelimiter(';').
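As a minimal sketch of that fix, the snippet below parses the sample data from the question with the semicolon delimiter set. The class name SemicolonHeaderDemo and the helper headerMap are hypothetical names chosen for illustration, and the input is read from a string rather than an InputStream. It keeps the with... methods used in the question; newer Commons CSV releases mark them as deprecated in favour of CSVFormat.Builder.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Map;

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;

// Hypothetical demo class; requires Apache Commons CSV on the classpath.
final class SemicolonHeaderDemo {

    // Parses the CSV text with ';' as the delimiter and returns the header map
    // (column name -> zero-based column index).
    static Map<String, Integer> headerMap(String csv) {
        CSVFormat format = CSVFormat.EXCEL
                .withDelimiter(';')          // the missing piece from the question
                .withFirstRecordAsHeader()
                .withIgnoreHeaderCase()
                .withTrim();
        try (Reader reader = new StringReader(csv);
             CSVParser parser = format.parse(reader)) {
            return parser.getHeaderMap();
        } catch (IOException e) {
            // Rethrow unchecked so callers need no checked-exception handling.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Sample data taken from the question.
        String csv = "access_id;logicalgate\r\n123456789;TEST\r\n";
        System.out.println(headerMap(csv));
    }
}
```

With the delimiter set, the header map should contain two entries, one for access_id and one for logicalgate, instead of a single concatenated key.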

public static final CSVFormat EXCEL

Excel file format (using a comma as the value delimiter). Note that the actual value delimiter used by Excel is locale dependent, it might be necessary to customize this format to accommodate to your regional settings.

For example for parsing or generating a CSV file on a French system the following format will be used:

CSVFormat fmt = CSVFormat.EXCEL.withDelimiter(';');

The CSVFormat.Builder settings are:

setDelimiter(',')
setQuote('"')
setRecordSeparator("\r\n")
setIgnoreEmptyLines(false)
setAllowMissingColumnNames(true)
setAllowDuplicateHeaderNames(true)

-- https://commons.apache.org/proper/commons-csv/apidocs/org/apache/commons/csv/CSVFormat.html#EXCEL
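Since the delimiter Excel writes depends on regional settings, one option is to guess it from the header line before building the format. The sketch below is a naive heuristic and not part of Commons CSV: DelimiterGuess and guessDelimiter are hypothetical names, it only distinguishes commas from semicolons, and it does not account for delimiters that appear inside quoted fields.

```java
// Hypothetical helper, standard library only.
final class DelimiterGuess {

    // Returns ';' when the header line contains more semicolons than commas,
    // otherwise falls back to ',' (the CSVFormat.EXCEL default).
    static char guessDelimiter(String headerLine) {
        long semicolons = headerLine.chars().filter(c -> c == ';').count();
        long commas = headerLine.chars().filter(c -> c == ',').count();
        return semicolons > commas ? ';' : ',';
    }

    public static void main(String[] args) {
        // Header line taken from the question.
        System.out.println(guessDelimiter("access_id;logicalgate"));
    }
}
```

The result could then be passed on, for example as CSVFormat.EXCEL.withDelimiter(DelimiterGuess.guessDelimiter(firstLine)), where firstLine is assumed to hold the first line of the file.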

3 Comments

I'm wondering why it is named "EXCEL" when the format is different...
I think the Excel format refers to permitting missing column names and not ignoring empty lines, rather than to the separator.
Yes, the separator is a detail ;-)
