I am trying to read a .txt file with | delimiters as an RDD and return a Map[(String, String), (Double, Double)], but I am running into a ClassCastException:

java.lang.ClassCastException: java.lang.String cannot be cast to java.lang.Double

The input data looks like this:

string1|string2|100.00|200.00
string1|string2|34.98|0.989

This is how I am reading the file as an RDD and parsing it:

val mydata = sc
  .textFile("file")
  .map(line => line.split("|"))
  .map(row =>
    ((row(0), row(1)),
     (row(2).asInstanceOf[Double], row(3).asInstanceOf[Double])))
  .collect
  .toMap

How can I fix this issue?

Expected output:

Map[(String, String),(Double, Double)] = Map((string1,string2) -> (100.0,200.0), (string1,string2) -> (34.98,0.989))

1 Answer

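To be on the safe side, you can trim each field, and you can use collectAsMap instead of collect followed by toMap. Note that the pipe must be escaped in the split pattern, because split takes a regular expression: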

val mydata = sc
  .textFile("file")
  .map(line => line.split("\\|"))
  .map(row =>
    ((row(0), row(1)),
      (row(2).trim.asInstanceOf[Double], row(3).trim.asInstanceOf[Double])))
  .collectAsMap()
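
To see why the escape matters, here is a minimal sketch in plain Scala (no Spark needed). The object name SplitDemo is only illustrative. An unescaped "|" is a regex alternation of two empty patterns, so it matches between every character, which is what the first comment below ran into.

```scala
object SplitDemo {
  val line = "string1|string2|100.00|200.00"

  // Unescaped: "|" is parsed as a regex, matches the empty string,
  // and the line is broken up character by character.
  val perChar: Array[String] = line.split("|")

  // Escaped: matches a literal pipe and yields the four fields.
  val escaped: Array[String] = line.split("\\|")

  // The Char overload treats the separator literally, so no escaping is needed.
  val byChar: Array[String] = line.split('|')
}
```

Either the escaped pattern or the Char overload should give the four expected fields.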

To be safer, you can wrap the conversion in Try and fall back to a default value with getOrElse:

import scala.util.Try

val mydata = sc
  .textFile("file")
  .map(line => line.split("\\|"))
  .map(row =>
    ((row(0), row(1)),
      (Try(row(2).trim.asInstanceOf[Double]).getOrElse(0.0), Try(row(3).trim.asInstanceOf[Double]).getOrElse(0.0))))
  .collectAsMap()
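
A caveat about the version above, shown as a small sketch with illustrative names (CastVsParse, raw): asInstanceOf is a cast, not a conversion, so on a String it always throws ClassCastException. Try catches that exception and getOrElse then supplies the default, which hides the failure instead of fixing it.

```scala
import scala.util.Try

object CastVsParse {
  val raw: String = " 100.00 "

  // Cast: a String is never a Double, so this throws inside Try
  // and the default is returned.
  val viaCast: Double = Try(raw.trim.asInstanceOf[Double]).getOrElse(0.0)

  // Conversion: toDouble parses the text, so the real value comes back.
  val viaParse: Double = Try(raw.trim.toDouble).getOrElse(0.0)
}
```

Here viaCast ends up as the 0.0 default while viaParse holds the parsed number.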

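Moreover, you should use toDouble instead of asInstanceOf[Double]. asInstanceOf only casts, so on a String it always throws, and with Try/getOrElse that failure silently turns into 0.0 for every value. toDouble actually parses the string: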

import scala.util.Try

val mydata = sc
  .textFile("file")
  .map(line => line.split("\\|"))
  .map(row =>
    ((row(0), row(1)), 
      (Try(row(2).trim.toDouble).getOrElse(0.0), Try(row(3).trim.toDouble).getOrElse(0.0)))
  )
  .collectAsMap()

// Print separately so that mydata stays a Map rather than Unit
mydata.foreach(println)
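
If a silent 0.0 default is not what you want, one alternative is to drop malformed lines instead. The following is only a sketch under assumptions: parseLine and PipeParser are hypothetical names, the third sample line is made up to show a bad record, and a plain Seq stands in for the RDD so that it runs without Spark.

```scala
import scala.util.Try

object PipeParser {
  type Key = (String, String)
  type Value = (Double, Double)

  // Hypothetical helper: returns None when a line does not have exactly
  // four fields or when either numeric field fails to parse.
  def parseLine(line: String): Option[(Key, Value)] =
    line.split("\\|").map(_.trim) match {
      case Array(a, b, x, y) =>
        for {
          dx <- Try(x.toDouble).toOption
          dy <- Try(y.toDouble).toOption
        } yield ((a, b), (dx, dy))
      case _ => None
    }

  val lines = Seq(
    "string1|string2|100.00|200.00",
    "string1|string2|34.98|0.989",
    "bad|line|x|1.0" // made-up malformed record, gets dropped
  )

  // With Spark, the same shape would be roughly:
  //   sc.textFile("file").flatMap(line => parseLine(line).toList).collectAsMap()
  val parsed: Seq[(Key, Value)] = lines.flatMap(line => parseLine(line).toList)

  // A Map keeps one value per key, so the two sample lines that share
  // (string1, string2) collapse into a single entry.
  val asMap: Map[Key, Value] = parsed.toMap
}
```

Note that the sample input repeats the key (string1, string2), so a Map cannot hold both rows as the expected output suggests. If both are needed, the values would have to be grouped or combined per key first.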

4 Comments

This is splitting every word into characters. scala.collection.Map[(String, String),(Double, Double)] = Map((s,t) -> (0.0,0.0))
You had to escape the pipe delimiter :)
@Sanjay See my latest update. You should use toDouble.
Well, escaping the pipe got it working, but it's returning (0.0,0.0) for all keys. mydata.get("string1","string2") res32: Option[(Double, Double)] = Some((0.0,0.0))