I have a Sequence of maps, where each map holds column names as keys and column values as values, so one map describes one row. I do not know in advance how many entries each map will have, so I can't build a fixed-length tuple in my code. I want to convert the sequence to a DataFrame. I tried the code below:
val mapRDD = sc.parallelize(Seq(
  Map("col1" -> "10", "col2" -> "Rohan", "col3" -> "201"),
  Map("col1" -> "13", "col2" -> "Ross", "col3" -> "201")
))
val columns = mapRDD.take(1).flatMap(a => a.keys)
val resultantDF = mapRDD.map { value =>  // Exception is thrown from this block
  value.values.toList
}.toDF(columns: _*)
resultantDF.show()
But it throws the exception below:
java.lang.ClassCastException: org.apache.spark.sql.types.ArrayType cannot be cast to org.apache.spark.sql.types.StructType
at org.apache.spark.sql.SQLContext.createDataFrame(SQLContext.scala:414)
at org.apache.spark.sql.SQLImplicits.rddToDataFrameHolder(SQLImplicits.scala:155)
...
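For comparison, the fixed-tuple style of conversion does work when the column count is known up front, but that is exactly what I cannot assume here (the column names below are just from my sample data):

    val fixedDF = sc.parallelize(Seq(
      ("10", "Rohan", "201"),   // one tuple per row, arity fixed at compile time
      ("13", "Ross", "201")
    )).toDF("col1", "col2", "col3")
    fixedDF.show()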
I tried a few other approaches, but nothing worked. How can I convert this sequence of maps to a DataFrame when the set of columns is not known in advance?
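For the sample data above, the output I expect from resultantDF.show() is:

    +----+-----+----+
    |col1| col2|col3|
    +----+-----+----+
    |  10|Rohan| 201|
    |  13| Ross| 201|
    +----+-----+----+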