I want to create a Map[Int,Set[String]] in Scala by reading input from a CSV file.

My file.csv is:

sunny,hot,high,FALSE,no
sunny,hot,high,TRUE,no
overcast,hot,high,FALSE,yes
rainy,mild,high,FALSE,yes
rainy,cool,normal,FALSE,yes
rainy,cool,normal,TRUE,no
overcast,cool,normal,TRUE,yes

I want the output as:

var Attributes: Map[Int,Set[String]] = Map()

Attributes += (0 -> Set("sunny","overcast","rainy"))
Attributes += (1 -> Set("hot","mild","cool"))
Attributes += (2 -> Set("high","normal"))
Attributes += (3 -> Set("FALSE","TRUE"))
Attributes += (4 -> Set("yes","no"))

Here 0, 1, 2, 3, 4 are the column numbers, and each Set contains the distinct values in that column.

I want to add each (Int -> Set[String]) pair to my variable "Attributes", i.e., printing Attributes.size should display 5 in this case.

  • Look at http://stackoverflow.com/questions/1284423/read-entire-file-in-scala to see how to read the lines into memory. To avoid performance issues consider using Streams. zipWithIndex will give you the line number with each line. Iterate over the lines and create your Map. Hope this helps! Commented Dec 5, 2014 at 4:01
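A minimal sketch of the line-by-line approach this comment suggests, assuming the file has no quoted fields or embedded commas. The helper name `columnValues` is hypothetical; it takes an iterator of lines, so the result of `getLines()` could be passed in directly. Note that `zipWithIndex` is applied here to the fields of each line, which yields the column number rather than the line number.

```scala
// Hypothetical helper: folds over the lines once, collecting the distinct
// values seen in each column without keeping all rows in memory.
def columnValues(lines: Iterator[String]): Map[Int, Set[String]] =
  lines.foldLeft(Map.empty[Int, Set[String]]) { (acc, line) =>
    // zipWithIndex pairs each field with its column number
    line.split(",").zipWithIndex.foldLeft(acc) { case (m, (field, col)) =>
      m.updated(col, m.getOrElse(col, Set.empty[String]) + field)
    }
  }

// A few rows from the question, inlined so the sketch runs without a file
val sample = Iterator(
  "sunny,hot,high,FALSE,no",
  "rainy,mild,high,FALSE,yes",
  "overcast,cool,normal,TRUE,yes"
)
val attributes = columnValues(sample) // attributes.size == 5
```

Because each field is added under its own column index, rows of different lengths do not cause an error with this approach; shorter rows simply contribute nothing to the later columns.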

1 Answer

Use one of the existing answers to read in the CSV file. You'll have a two-dimensional array or vector of strings. Then build your map.

// row vectors
val rows = io.Source.fromFile("file.csv").getLines().map(_.split(",")).toVector
// column vectors
val cols = rows.transpose
// convert each vector to a set
val sets = cols.map(_.toSet)
// convert vector of sets to map
val attr = sets.zipWithIndex.map(_.swap).toMap

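The last line is a bit ugly because there is no direct method that turns a sequence into a map keyed by index: zipWithIndex produces (value, index) pairs, so they have to be swapped before calling .toMap. You could also write: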

val attr = Vector.tabulate(sets.size)(i => (i, sets(i))).toMap

Or you could do the last two steps in one go:

val attr = cols.zipWithIndex.map { case (xs, i) => 
  (i, xs.toSet) 
} (collection.breakOut): Map[Int,Set[String]]
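
`collection.breakOut` was removed in Scala 2.13, so the one-step version above compiles only on 2.12 and earlier. A sketch of an equivalent for newer versions, assuming the same shape for `cols`; a few rows from the question are inlined so the snippet runs without a file:

```scala
// A few rows from the question, already split into fields
val rows: Vector[Vector[String]] = Vector(
  "sunny,hot,high,FALSE,no",
  "overcast,hot,high,FALSE,yes",
  "rainy,cool,normal,TRUE,no"
).map(_.split(",").toVector)

val cols = rows.transpose

// Going through an iterator avoids building an intermediate Vector of pairs,
// which is what breakOut was used for before 2.13
val attr: Map[Int, Set[String]] =
  cols.iterator.zipWithIndex.map { case (xs, i) => i -> xs.toSet }.toMap
```

This form also compiles on 2.12, so it can replace the breakOut variant in code that has to build on both versions.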

1 Comment

If I use another dataset, it shows an error: Exception in thread "main" java.lang.IllegalArgumentException: transpose requires all collections have the same size
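
`transpose` throws this exception when the rows do not all have the same number of fields. One common cause, offered as a guess since the other dataset is not shown, is trailing empty fields: `split(",")` drops trailing empty strings, so `"a,b,"` yields two fields instead of three. A sketch that keeps empty fields with `split(",", -1)` and checks row widths before transposing; the sample lines and the names `lines`, `width`, and `bad` are hypothetical:

```scala
// Hypothetical input: the second line ends with an empty field
val lines = Vector("sunny,hot,no", "rainy,mild,", "overcast,cool,yes")

// A limit of -1 keeps trailing empty strings, so every row here has 3 fields
val rows = lines.map(_.split(",", -1).toVector)

// Treat the first row's width as the expected one and collect any mismatches
val width = rows.headOption.map(_.size).getOrElse(0)
val bad = rows.zipWithIndex.collect { case (r, i) if r.size != width => i + 1 }

// Transpose only when all rows agree; otherwise name the offending lines
val attr: Map[Int, Set[String]] =
  if (bad.isEmpty) rows.transpose.map(_.toSet).zipWithIndex.map(_.swap).toMap
  else sys.error(s"unexpected field count at lines: ${bad.mkString(", ")}")
```

With this input the empty string becomes one of the values in column 2. If the rows genuinely differ in length for another reason, the error message at least points to the lines that need fixing.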
