I have a JSON string in a column of a Spark DataFrame, as follows:
ID| Text| JSON
------------------------------------------------------------------------------
1| xyz| [{"Hour": 1, "Total": 10, "Fail": 1}, {"Hour": 2, "Total": 40, "Fail": 4}, {"Hour": 3, "Total": 20, "Fail": 2}]
I'm using the following schema:
val schema = StructType(Array(StructField("Hour", IntegerType),
StructField("Total", IntegerType), StructField("Fail", IntegerType)))
I'm using the following code to parse the DataFrame and output the JSON as multiple columns:
val newDF = DF.withColumn("JSON", from_json(col("JSON"), schema)).select(col("JSON.*"))
newDF.show()
The code above parses only a single record from the JSON, but I want it to parse all of the records in the JSON.
The output is as follows:
Hour| Total| Fail|
-------------------------------
1| 10| 1|
-------------------------------
But I want the output to be as follows:
Hour| Total| Fail|
-------------------------------
1| 10| 1|
2| 40| 4|
3| 20| 2|
-------------------------------
Can someone please let me know what I'm missing?
Thanks in advance.
Is JSON an array or just a plain string?
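One possible approach, sketched under the assumption that the JSON column is a plain string holding a JSON array: wrap the per-record schema in ArrayType so that from_json parses every element, then use explode to turn each element into its own row. The names ParseJsonArray, recordSchema and parseRecords are made up for this illustration, and the local SparkSession setup is only there to make the sketch self-contained.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, explode, from_json}
import org.apache.spark.sql.types.{ArrayType, IntegerType, StructField, StructType}

object ParseJsonArray {
  // Schema of ONE record inside the JSON array (same fields as in the question).
  val recordSchema: StructType = StructType(Array(
    StructField("Hour", IntegerType),
    StructField("Total", IntegerType),
    StructField("Fail", IntegerType)))

  // Hypothetical helper: assumes df has a string column named "JSON"
  // that contains a JSON array of such records.
  def parseRecords(df: DataFrame): DataFrame =
    df
      // ArrayType(recordSchema) tells from_json to parse the whole array;
      // explode then emits one row per array element.
      .withColumn("JSON", explode(from_json(col("JSON"), ArrayType(recordSchema))))
      .select(col("JSON.*"))

  def main(args: Array[String]): Unit = {
    // Local session for illustration only.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("ParseJsonArray")
      .getOrCreate()
    import spark.implicits._

    // Sample data copied from the question.
    val DF = Seq(
      (1, "xyz",
        """[{"Hour": 1, "Total": 10, "Fail": 1}, {"Hour": 2, "Total": 40, "Fail": 4}, {"Hour": 3, "Total": 20, "Fail": 2}]""")
    ).toDF("ID", "Text", "JSON")

    parseRecords(DF).show()
    spark.stop()
  }
}
```

If ID and Text are also needed in the result, they could be kept by selecting col("ID"), col("Text") and col("JSON.*") together instead of only col("JSON.*").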