Given a CSV in the format below, what is the best way to load it into Scala as a Map[String, Array[String]], with the key being the unique values of Col2 and the value being an Array[String] of all co-occurring Col1 values?

a,1,
b,2,m
c,2,
d,1,
e,3,m
f,4,
g,2,
h,3,
I,1,
j,2,n
k,2,n
l,1,
m,5,
n,2,

I have tried the function below, but I get an error when trying to add to the Option type: += is not a member of Option[Array[String]].

In addition, I get an overloaded method value ++ with alternatives: error on the line case None => mapping ++ (linesplit(2) -> Array(linesplit(1))).

def parseCSV(): Map[String, Array[String]] = {
  var mapping = Map[String, Array[String]]()
  val lines = Source.fromFile("test.csv")
  for (line <- lines.getLines) {
    val linesplit = line.split(",")
    mapping.get(linesplit(2)) match {
      case Some(_) => mapping.get(linesplit(2)) += linesplit(1)
      case None => mapping ++ (linesplit(2) -> Array(linesplit(1)))
    }
  }
  mapping
}
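
For reference, here is a minimal illustration of what I believe the two errors come down to (a toy immutable Map m standing in for mapping):

val m = Map("2" -> Array("b"))
m.get("2")                   // Option[Array[String]]: Option has no += method
m ++ Map("3" -> Array("e"))  // builds and returns a new Map, m itself is not updated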

I am hoping for a Map[String, Array[String]] like the following:

(2 -> Array("b", "c", "g", "j", "k", "n"))
(3 -> Array("e", "h"))
(4 -> Array("f"))
(5 -> Array("m"))

3 Answers

You can do the following. First, read the file into a List[List[String]]:

val rows: List[List[String]] = using(io.Source.fromFile("test.csv")) { source =>
  source.getLines.toList map { line =>
    line.split(",").map(_.trim).toList
  }
}
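
Note that using is not a standard-library method in Scala 2.12; the snippet above assumes a small loan-pattern helper along these lines (shown only so the example is self-contained):

// Minimal loan-pattern helper: runs f on the resource and always closes it
def using[A <: java.io.Closeable, B](resource: A)(f: A => B): B =
  try f(resource) finally resource.close()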

Then filter out any malformed rows that don't contain at least two values (the Col1 letter and the Col2 number):

val filteredRows = rows.filter(row => row.size > 1)

And the last step is to groupBy the second value (the Col2 number) and map each group to its co-occurring Col1 values:

filteredRows.groupBy(row => row(1)).mapValues(_.map(_.head))
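
If you specifically need a Map[String, Array[String]] (and want to avoid mapValues, which is deprecated in Scala 2.13), one possible variant, assuming the same filteredRows as above, is:

filteredRows
  .groupBy(_(1))                                             // key: the Col2 value
  .map { case (k, group) => k -> group.map(_.head).toArray } // values: the Col1 letters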

This isn't complete, but it should give you an outline of how it might be done.

io.Source
  .fromFile("so.txt")    //open file
  .getLines()            //line by line
  .map(_.split(","))     //split on commas
  .toArray               //load into memory
  .groupMap(_(1))(_(0))  //Scala 2.13

//res0: Map[String,Array[String]] = Map(4 -> Array(f), 5 -> Array(m), 1 -> Array(a, d, I, l), 2 -> Array(b, c, g, j, k, n), 3 -> Array(e, h))

You'll notice that the file resource isn't closed, and it doesn't handle malformed input. I leave that for the diligent reader.
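
On Scala 2.13 one way to handle the resource is scala.util.Using, which closes the Source even if an exception is thrown. A sketch that also skips lines without a second column (same file name as above):

import scala.util.Using

val grouped: Map[String, Array[String]] =
  Using.resource(io.Source.fromFile("so.txt")) { source =>
    source.getLines()
      .map(_.split(","))
      .filter(_.length >= 2)   // ignore lines without a second column
      .toArray
      .groupMap(_(1))(_(0))    // key: Col2, values: Col1
  }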


To keep the structure of your original code, you can use a mutable Map and an ArrayBuffer, since both can be updated in place:

import scala.collection.mutable
import scala.collection.mutable.ArrayBuffer
import scala.io.Source

def parseCSV(): Map[String, Array[String]] = {
  val mapping = mutable.Map[String, ArrayBuffer[String]]()
  val lines = Source.fromFile("test.csv")
  for (line <- lines.getLines) {
    val linesplit = line.split(",")
    val key = linesplit(1)    // the Col2 value, e.g. "2"
    val value = linesplit(0)  // the co-occurring Col1 value, e.g. "b"
    mapping.get(key) match {
      case Some(buffer) => buffer += value
      case None         => mapping(key) = ArrayBuffer(value)
    }
  }
  lines.close()
  mapping.map { case (k, v) => (k, v.toArray) }.toMap
}
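
As an aside, mutable.Map also has getOrElseUpdate, which would let the update step above drop the explicit Some/None match (using the same mapping, key and value names as in the code above):

// Equivalent update step written with getOrElseUpdate instead of pattern matching
mapping.getOrElseUpdate(key, ArrayBuffer[String]()) += value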
