I have the following setup to parse a CSV file:

package main

import (
    "fmt"
    "os"
    "encoding/csv"
)

type CsvLine struct {
    Id     string
    Array1 []string
    Array2 []string
}


func ReadCsv(filename string) ([][]string, error) {

    f, err := os.Open(filename)
    if err != nil {
        return [][]string{}, err
    }
    defer f.Close()

    lines, err := csv.NewReader(f).ReadAll()
    if err != nil {
        return [][]string{}, err
    }
    return lines, nil
}


func main() {

    lines, err := ReadCsv("./data/sample-0.3.csv")
    if err != nil {
        panic(err)
    }

    for _, line := range lines {
        fmt.Println(line)
        data := CsvLine{
            Id: line[0],
            Array1: line[1],
            Array2: line[2],
        }
        fmt.Println(data.Id)
        fmt.Println(data.Array1)
        fmt.Println(data.Array2)
    }
}

And the following contents in my CSV file:

594385903dss,"['fhjdsk', 'dfjdskl', 'fkdsjgooiertio']","['jflkdsjfl', 'fkjdlsfjdslkfjldks']"
87764385903dss,"['cxxc', 'wqeewr', 'opi', 'iy', 'qw']","['cvbvc', 'gf', 'mnb', 'ewr']"

My understanding is that variable-length lists should be parsed into a slice. Is it possible to do this directly via a CSV reader? (The CSV output was generated by a Python project.)

Help/suggestions appreciated.

No, it's not. There is no official definition of an "array" in CSV, so I don't think any library supports this conversion. Commented Dec 16, 2019 at 12:07

1 Answer

CSV does not have a notion of "variable-length arrays"; each record is just a comma-separated list of string values. The format is described in RFC 4180, and that is the format the encoding/csv package implements.
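
To illustrate, here is a minimal sketch that feeds one line of the question's file to encoding/csv. The helper name parseLine is made up for this example, and the input comes from a string rather than a file. The reader returns three plain strings, and the list-looking text stays inside the second and third fields untouched.

```go
package main

import (
    "encoding/csv"
    "fmt"
    "strings"
)

// parseLine is a hypothetical helper: it reads a single CSV record
// from s and returns its fields as plain strings.
func parseLine(s string) ([]string, error) {
    return csv.NewReader(strings.NewReader(s)).Read()
}

func main() {
    // One line taken from the CSV file in the question.
    line := `594385903dss,"['fhjdsk', 'dfjdskl', 'fkdsjgooiertio']","['jflkdsjfl', 'fkjdlsfjdslkfjldks']"`

    fields, err := parseLine(line)
    if err != nil {
        panic(err)
    }

    // Each field is a single string; the brackets and single quotes
    // are ordinary characters as far as the CSV reader is concerned.
    fmt.Println(len(fields))
    for _, f := range fields {
        fmt.Printf("%q\n", f)
    }
}
```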

You can only get a string slice out of a CSV line. How you interpret the values is up to you, so you have to post-process your data if you want to split a field further.

The fields you have can be processed simply with the regexp package, e.g.

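// r matches one single-quoted item, quotes included (requires the regexp package).
var r = regexp.MustCompile(`'[^']*'`)

// split returns the single-quoted items found in s, without their quotes.
func split(s string) []string {
    parts := r.FindAllString(s, -1)
    for i, part := range parts {
        // Strip the leading and trailing quote.
        parts[i] = part[1 : len(part)-1]
    }
    return parts
}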

Testing it:

s := `['one', 'two', 'three']`
fmt.Printf("%q\n", split(s))
s = `[]`
fmt.Printf("%q\n", split(s))
s = `['o,ne', 't,w,o', 't,,hree']`
fmt.Printf("%q\n", split(s))

Output:

["one" "two" "three"]
[]
["o,ne" "t,w,o" "t,,hree"]
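
One caveat, assuming the lists were written with Python's default string representation: Python switches to double quotes for an item that contains an apostrophe, e.g. ["it's", 'ok'], and the pattern above would split such input incorrectly. A possible variation is sketched below. The name splitQuoted is hypothetical, and the sketch still does not handle backslash-escaped quotes inside an item.

```go
package main

import (
    "fmt"
    "regexp"
)

// quoted matches either a single-quoted or a double-quoted item,
// quotes included. Backslash escapes are NOT handled (assumption:
// items never contain both kinds of quote).
var quoted = regexp.MustCompile(`'[^']*'|"[^"]*"`)

// splitQuoted is a hypothetical variant of split() that accepts
// both quote styles and returns the items without their quotes.
func splitQuoted(s string) []string {
    parts := quoted.FindAllString(s, -1)
    for i, part := range parts {
        // Drop the first and last character, whichever quote it is.
        parts[i] = part[1 : len(part)-1]
    }
    return parts
}

func main() {
    fmt.Printf("%q\n", splitQuoted(`["it's", 'ok']`))
    fmt.Printf("%q\n", splitQuoted(`['one', 'two']`))
}
```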

Using this split() function, this is what the processing may look like:

for _, line := range lines {
    data := CsvLine{
        Id:     line[0],
        Array1: split(line[1]),
        Array2: split(line[2]),
    }
    fmt.Printf("%+v\n", data)
}

This outputs:

{Id:594385903dss Array1:[fhjdsk dfjdskl fkdsjgooiertio] Array2:[jflkdsjfl fkjdlsfjdslkfjldks]}
{Id:87764385903dss Array1:[cxxc wqeewr opi iy qw] Array2:[cvbvc gf mnb ewr]}