I have a few JSON arrays stored as strings in an address column:
[{"key":"country","value":"aaa"},{"key":"region","value":"a"},{"key":"city","value":"a1"}]
[{"key":"city","value":"b"},{"key":"street","value":"1"}]
I need to extract the city and street values into separate columns.
Using get_json_object($"address", "$[2].value").as("city") to get an element by its index doesn't work, because some arrays are missing fields, so the position of a given key varies from row to row.
Instead, I decided to turn each array into a map of key -> value pairs, but I'm having trouble doing it. So far I have only managed to get an array of key/value structs (displayed as an array of arrays):
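import org.apache.spark.sql.functions.from_json
import org.apache.spark.sql.types._

// Each array element is a struct with a string key and a string value
val schema = ArrayType(StructType(Array(
  StructField("key", StringType),
  StructField("value", StringType)
)))
from_json($"address", schema)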
This returns:
[[country, aaa],[region, a],[city, a1]]
[[city, b],[street, 1]]
I'm not sure where to go from here. I also tried parsing each element directly as a map:
val schema = ArrayType(MapType(StringType, StringType))
but that fails with:
cannot resolve 'jsontostructs(`address`)' due to data type mismatch: Input schema array<map<string,string>> must be a struct or an array of structs.;;
I'm using Spark 2.2.
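For reference, here is a rough sketch of the map-based approach described above, not a definitive solution. It keeps the array-of-structs schema that already parses, then collapses the parsed array into a map with a UDF. The names AddressToColumns, toMap and extract are hypothetical, and it assumes the column is called address and that a local SparkSession is acceptable for trying it out.

```scala
import org.apache.spark.sql.{DataFrame, Row, SparkSession}
import org.apache.spark.sql.functions.{col, from_json, udf}
import org.apache.spark.sql.types.{ArrayType, StringType, StructField, StructType}

// Hypothetical wrapper object, used only to keep the sketch self-contained.
object AddressToColumns {
  // Schema from the question: an array of {key, value} structs.
  val schema = ArrayType(StructType(Array(
    StructField("key", StringType),
    StructField("value", StringType)
  )))

  // Hypothetical helper: collapse the parsed array of (key, value) rows into a Map.
  // A null array (unparseable JSON) becomes an empty map; entries with a null key are skipped.
  val toMap = udf { (pairs: Seq[Row]) =>
    Option(pairs).getOrElse(Seq.empty[Row])
      .collect { case r if r != null && !r.isNullAt(0) => r.getString(0) -> r.getString(1) }
      .toMap
  }

  // Parse the assumed "address" column, build the map, then look up keys by name.
  // A key that is absent from a row's map yields null in that column.
  def extract(df: DataFrame): DataFrame =
    df.withColumn("m", toMap(from_json(col("address"), schema)))
      .select(col("m").getItem("city").as("city"), col("m").getItem("street").as("street"))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("address-sketch").getOrCreate()
    import spark.implicits._

    // The two sample arrays from the question.
    val df = Seq(
      """[{"key":"country","value":"aaa"},{"key":"region","value":"a"},{"key":"city","value":"a1"}]""",
      """[{"key":"city","value":"b"},{"key":"street","value":"1"}]"""
    ).toDF("address")

    extract(df).show()
    spark.stop()
  }
}
```

For the two sample rows this should give city = a1 with a null street, and city = b with street = 1. If the same key appears twice in one array, the last occurrence wins. The built-in map_from_entries could probably replace the UDF on Spark 2.4 or later, but it is not available in 2.2.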