I have a DataFrame with a column of integer arrays:
case class Testing(name: String, age: Int, salary: Double, array: Array[Int])
val x = sc.parallelize(Array(
  Testing(null, 21, 905.33, Array(1, 2, 3)),
  Testing("Noelia", 26, 1130.60, Array(3, 2, 1)),
  Testing("Pilar", 52, 1890.85, Array(3, 3, 3)),
  Testing("Roberto", 31, 1450.14, Array(1, 0, 0))
))
// Convert RDD to a DataFrame
val df = sqlContext.createDataFrame(x)
// Register the DataFrame as a temp table so it can be queried with SQL
df.registerTempTable("df")
I want to build a single array whose elements are the values of the column "array". How can I do this in Spark SQL?
// Note: "array" is a reserved word in Spark SQL, so the column name
// must be escaped with backticks, not square brackets
sqlContext.sql("SELECT `array` FROM df").show()
The result I want is: [[1,2,3], [3,2,1], [3,3,3], [1,0,0]]
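One way I have considered (a sketch, assuming the goal is a driver-side Scala collection of the column's values; the `getAs[Seq[Int]]` round-trip through the underlying RDD is my own attempt, not something confirmed to be idiomatic):

```scala
// Select only the escaped column, then pull it back to the driver.
// Array columns come back as Seq[Int] (a WrappedArray) in each Row.
val arrays: Array[Seq[Int]] = sqlContext
  .sql("SELECT `array` FROM df")
  .rdd
  .map(row => row.getAs[Seq[Int]]("array"))
  .collect()
// arrays should now hold: Array(List(1,2,3), List(3,2,1), List(3,3,3), List(1,0,0))
```

Is there a way to do this purely in SQL instead of dropping down to the RDD API?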