I have a DataFrame column whose value is an array of objects.
Each object has the following structure:

    [
      {
        "key": "param1",
        "val": "value1"
      },
      {
        "key": "param2",
        "val": "value2"
      },
      {
        "key": "param3",
        "val": "value3"
      }
    ]

| someColumn | colName |
|---|---|
| text | [{key: "param1", val: "value1"}, {key: "param2", val: "value2"}, {key: "param3", val: "value3"}] |
When I explode the array:

    df.withColumn("exploded", explode(col("colName")))

I get:
| someColumn | exploded |
|---|---|
| text | {key: "param1", val: "value1"} |
| text | {key: "param2", val: "value2"} |
| text | {key: "param3", val: "value3"} |
Then I expand the struct fields into columns on that result:

    df.select("*", "exploded.*").drop("exploded")

I get this:
| someColumn | key | val |
|---|---|---|
| text | param1 | value1 |
| text | param2 | value2 |
| text | param3 | value3 |
I understand why I get this result, but I need a different structure: one column per `key`, with the matching `val` as the cell value.
This is the result I want:
| someColumn | param1 | param2 | param3 |
|---|---|---|---|
| text | value1 | value2 | value3 |
Do I have to transform the array of {key, val} objects into a Map and then turn the Map into columns? What sequence of transformations do I need?
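
To make the target shape concrete, here is the reshaping I have in mind, written against a single row as plain Python data. This is not Spark code, only a sketch of the logic; `entries_to_columns` is a hypothetical helper name used for illustration. I assume the Spark equivalent would involve something like `map_from_entries` or `groupBy(...).pivot(...)`, but I have not verified that.

```python
# Plain-Python illustration of the reshaping only; no Spark involved.
row = {
    "someColumn": "text",
    "colName": [
        {"key": "param1", "val": "value1"},
        {"key": "param2", "val": "value2"},
        {"key": "param3", "val": "value3"},
    ],
}


def entries_to_columns(row, array_field="colName"):
    """Replace the array of {key, val} objects with one field per key."""
    # Step 1: array of entries -> map (key -> val).
    as_map = {entry["key"]: entry["val"] for entry in row[array_field]}
    # Step 2: keep the other fields and spread the map into top-level fields.
    rest = {name: value for name, value in row.items() if name != array_field}
    return {**rest, **as_map}


result = entries_to_columns(row)
print(result)
```

The two steps in the helper mirror the two transformations I am asking about: array to map, then map to columns.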