I'm working in a Zeppelin notebook and trying to load data from a table using SQL.
In the table, each row has one column which is a JSON blob, for example:
[{'timestamp':12345,'value':10},{'timestamp':12346,'value':11},{'timestamp':12347,'value':12}]
I want to select the JSON blob as a string, i.e. as the original text, but Spark automatically loads it as a WrappedArray.
It seems that I have to write a UDF to convert the WrappedArray to a string. The code below first defines a Scala function, then registers it as a UDF, and then applies the registered function to the column.
import scala.collection.mutable.WrappedArray
import org.apache.spark.sql.Row
import org.apache.spark.sql.functions.udf

// Each array element is a Row(value: Long, timeStamp: String); join the pairs with "+"
val unwraparr = udf((x: WrappedArray[Row]) => x.map { case Row(value: Long, ts: String) => s"$ts,$value" }.mkString("+"))
sqlContext.udf.register("fwa", unwraparr)
I can't get this to work, though. I would really appreciate it if anyone can help.
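For context, this is roughly how I call the registered function from the notebook; "mytable" is just a placeholder for my real table name:

// Hypothetical usage; the table name is a placeholder
val flattened = sqlContext.sql("SELECT fwa(targetColumn) AS flattened FROM mytable")
flattened.show(false)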
The following is the schema of the part I'm working on. There will be many value/timeStamp pairs.
|-- targetColumn: array (nullable = true)
|    |-- element: struct (containsNull = true)
|    |    |-- value: long (nullable = true)
|    |    |-- timeStamp: string (nullable = true)
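If it helps to experiment outside the real table, a tiny DataFrame with this exact schema can be built by hand; all names and values below are made up:

import org.apache.spark.sql.Row
import org.apache.spark.sql.types._

val elementType = StructType(Seq(
  StructField("value", LongType),
  StructField("timeStamp", StringType)))
val schema = StructType(Seq(
  StructField("targetColumn", ArrayType(elementType))))

val data = Seq(Row(Seq(Row(10L, "12345"), Row(11L, "12346"), Row(12L, "12347"))))
val df = sqlContext.createDataFrame(sc.parallelize(data), schema)
df.printSchema()  // matches the schema above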
UPDATE: I came up with the following code:
val f = (x: Seq[Row]) => x.map { case Row(value: Long, ts: String) => s"$ts,$value" }.mkString("+")
I need it to concatenate the objects/structs/rows (not sure what to call the struct) into a single string.
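And this is a sketch of how I'm applying it, assuming df is the DataFrame loaded from the table (the "+" separator is an arbitrary choice):

import org.apache.spark.sql.functions.{col, udf}

// Wrap the Scala function above as a UDF and replace the array column with the joined string
val fUdf = udf(f)
val result = df.withColumn("targetColumn", fUdf(col("targetColumn")))
result.show(false)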