How can I split a column containing an array of JSON strings into multiple rows in Spark (Scala)?
Input DataFrame:
+----------+-------+-----------------------------------------------------------------------------------------------------------------------------+
|item_id   |s_tag  |jsonString                                                                                                                   |
+----------+-------+-----------------------------------------------------------------------------------------------------------------------------+
|Item_12345|S_12345|[{"First":{"Info":"ABCD123","Res":"5.2"}},{"Second":{"Info":"ABCD123","Res":"5.2"}},{"Third":{"Info":"ABCD123","Res":"5.2"}}]|
+----------+-------+-----------------------------------------------------------------------------------------------------------------------------+
Expected output DataFrame:
+----------+-------+-----------------------------------------+
|item_id   |s_tag  |jsonString                               |
+----------+-------+-----------------------------------------+
|Item_12345|S_12345|{"First":{"Info":"ABCD123","Res":"5.2"}} |
|Item_12345|S_12345|{"Second":{"Info":"ABCD123","Res":"5.2"}}|
|Item_12345|S_12345|{"Third":{"Info":"ABCD123","Res":"5.2"}} |
+----------+-------+-----------------------------------------+
This is what I have tried so far, but it does not work, because explode only accepts array or map columns and jsonString is a plain string:
import org.apache.spark.sql.functions.{explode, lit}

val rawDF = sparkSession
  .sql("select 1")
  .withColumn("item_id", lit("Item_12345"))
  .withColumn("s_tag", lit("S_12345"))
  .withColumn("jsonString", lit("""[{"First":{"Info":"ABCD123","Res":"5.2"}},{"Second":{"Info":"ABCD123","Res":"5.2"}},{"Third":{"Info":"ABCD123","Res":"5.2"}}]"""))

// Fails: explode needs an array or map column, but jsonString is a string
val newDF = rawDF.withColumn("splittedJson", explode(rawDF.col("jsonString")))
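
I was thinking of parsing jsonString into an array first and then exploding that, roughly like the sketch below (this assumes Spark 2.4+, where from_json and to_json handle map and array schemas; the parsed and element column names are just placeholders I made up). Would that be the right approach, or is there a cleaner way?

import org.apache.spark.sql.functions.{col, explode, from_json, to_json}
import org.apache.spark.sql.types.{ArrayType, MapType, StringType}

// Each array element is an object with a single key ("First", "Second", "Third")
// whose value is an object of string fields, so a nested map schema should fit.
val elementSchema = MapType(StringType, MapType(StringType, StringType))

val newDF = rawDF
  .withColumn("parsed", from_json(col("jsonString"), ArrayType(elementSchema)))
  .withColumn("element", explode(col("parsed")))      // one row per array element
  .withColumn("jsonString", to_json(col("element")))  // back to a single-object JSON string
  .drop("parsed", "element")

newDF.select("item_id", "s_tag", "jsonString").show(false)

The idea is that from_json turns the string into a real array column that explode can work on, and to_json serializes each exploded element back into the per-row JSON string shown in the expected output.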