I am trying to parse nested json with some sample json. Below is the print schema
|-- batters: struct (nullable = true)
| |-- batter: array (nullable = true)
| | |-- element: struct (containsNull = true)
| | | |-- id: string (nullable = true)
| | | |-- type: string (nullable = true)
|-- id: string (nullable = true)
|-- name: string (nullable = true)
|-- ppu: double (nullable = true)
|-- topping: array (nullable = true)
| |-- element: struct (containsNull = true)
| | |-- id: string (nullable = true)
| | |-- type: string (nullable = true)
|-- type: string (nullable = true)
Trying to explode batters,topping separately and combine them.
df_batter = df_json.select("batters.*")
df_explode1= df_batter.withColumn("batter", explode("batter")).select("batter.*")
df_explode2= df_json.withColumn("topping", explode("topping")).select("id",
"type","name","ppu","topping.*")
Unable to combine the two data frame.
Tried using single query
exploded1 = df_json.withColumn("batter", df_batter.withColumn("batter",
explode("batter"))).withColumn("topping", explode("topping")).select("id",
"type","name","ppu","topping.*","batter.*")
But getting error.Kindly help me to solve it. Thanks