I am trying to run the following command in Scala 2.2:

   val x_test0 = cn_train.map( { case row => row.toSeq.toArray } )

and I keep getting the following error:

 error: Unable to find encoder for type stored in a Dataset.  Primitive types (Int, String, etc) and Product types (case classes) are supported by importing spark.implicits._  Support for serializing other types will be added in future releases.

I have already imported implicits._ through the following commands:

val spark = SparkSession.builder.getOrCreate()
import spark.implicits._
  • Which line is it exactly? This error is shown when you try to create a Dataset without an encoder defined. Commented Mar 6, 2018 at 3:32
  • Scala 2.2 is a bit old. You probably meant Spark 2.2.x? Commented Mar 6, 2018 at 3:38
  • Yeah, sorry my mistake. It is Spark 2.2.x. Commented Mar 6, 2018 at 3:39
  • The error shows like this: error: Unable to find encoder for type stored in a Dataset. Primitive types (Int, String, etc) and Product types (case classes) are supported by importing spark.implicits._ Support for serializing other types will be added in future releases. val x_test0 = cn_train.map( { case row => row.toSeq.toArray } ) Commented Mar 6, 2018 at 3:40
  • The error message tells you that you can't force an array into a dataframe. Try cn_train.rdd.map{ row => row.toSeq.toArray }, this would at least give you an RDD of arrays. Would that be sufficient? Commented Mar 6, 2018 at 3:40

1 Answer

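The error message tells you that Spark cannot find an Encoder for a heterogeneous array, so it cannot store the result in a Dataset: row.toSeq.toArray produces an Array[Any], and spark.implicits._ provides no Encoder for that type. You can, however, get an RDD of arrays like this: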

cn_train.rdd.map{ row => row.toSeq.toArray }
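
For a fuller picture, here is a minimal sketch of two ways around the missing encoder. It is an illustration under assumptions, not the only approach: the object name RowToArrayExample and the helpers toAnyArrays and toStringArrays are hypothetical names introduced here, and the second helper assumes that converting every column value to a String is acceptable for your use case.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, Dataset}

// Hypothetical helper object, for illustration only.
object RowToArrayExample {

  // Option 1: drop down to the RDD API. RDDs do not need an Encoder,
  // so the heterogeneous Array[Any] is accepted as-is.
  def toAnyArrays(df: DataFrame): RDD[Array[Any]] =
    df.rdd.map(row => row.toSeq.toArray)

  // Option 2 (assumption: string values are good enough): stay in the
  // Dataset API by mapping every value to a String. spark.implicits._
  // supplies an Encoder for Array[String], so this map compiles.
  def toStringArrays(df: DataFrame): Dataset[Array[String]] = {
    import df.sparkSession.implicits._
    df.map { row =>
      val values: Seq[String] =
        row.toSeq.map(v => if (v == null) null else v.toString)
      values.toArray
    }
  }
}
```

With the DataFrame from the question, RowToArrayExample.toAnyArrays(cn_train) should match the one-liner above, while RowToArrayExample.toStringArrays(cn_train) keeps the result as a Dataset at the cost of losing the original column types.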