0

I'm trying to parse a CSV file which contains a JSON object in the last column.
Here is an example with two rows from the input CSV file:

'id','value','createddate','attributes'
524256,CAFE,2018-04-06 16:41:01,{"Att1Numeric": 6, "Att2String": "abc"}
524257,BEBE,2018-04-06 17:00:00,{}

I tried using the parser from csv package:

func processFileAsCSV(f *multipart.Part) (int, error) {
  reader := csv.NewReader(f)
  reader.LazyQuotes = true
  reader.Comma = ','
  lineCount := 0
  for {
    line, err := reader.Read()
    if err == io.EOF {
        break
    } else if err != nil {
        fmt.Println("Error:", err)
        return 0, err
    }

    if lineCount%100000 == 0 {
        fmt.Println(lineCount)
    }
    lineCount++
    fmt.Println(lineCount, line)
    processLine(line) // do something with the line
  }

  fmt.Println("done!", lineCount)
  return lineCount, nil
}

But I got an error:

Error: line 2, column 0: wrong number of fields in line,

probably because the parser ignores the JSON scope which starts with {.

Should I be writing my own CSV parser, or is there a library that can handle this?

1
  • 1
    Please post what you have tried. Commented Apr 9, 2018 at 9:01

1 Answer 1

3

Your CSV input doesn't follow normal CSV convention, by using unquoted fields (for JSON).

I think the best approach would be to pre-process your input, either in your Go program, or in an external script.

If your CSV input is predictable (as indicated in your question), it should be easy to properly quote last element, using a simple strings.Split call, for instance, before passing it to the CSV parser.

Sign up to request clarification or add additional context in comments.

2 Comments

I suspect preprocessing the file would be difficult because the input file (which may be huge) is arriving via network and is being processed while it is incoming (notice multipart.Part).
Not at all. Just create an io.Reader wrapper, that reads each line as it comes in, calls strings.SplitN(input, 4), properly quotes/escapes the last field, then writes the modified line as output.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.