I followed the Spark Streaming guide and was able to get a DataFrame of my JSON data using sqlContext.read.json(rdd). The problem is that one of the JSON fields is itself a JSON string, and I would like that string parsed as well.
Is there a way to accomplish this within Spark SQL, or would it be easier to use Jackson's ObjectMapper to parse the string and join the result back to the rest of the data?
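For reference, here is a simplified, self-contained version of what I'm doing now (in the real job the RDD comes out of a DStream's foreachRDD, but the parsing step is the same):

import org.apache.spark.SparkContext
import org.apache.spark.sql.SQLContext

val sc = new SparkContext("local[2]", "json-example")
val sqlContext = new SQLContext(sc)

// One record shaped like the example below
val rdd = sc.parallelize(Seq(
  """{ "key": "val", "jsonString": "{ \"too\": \"bad\" }", "jsonObj": { "ok": "great" } }"""
))

val df = sqlContext.read.json(rdd)
df.printSchema()  // jsonString comes back as a plain string, not a struct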
To clarify, one of the values in the JSON is a string containing JSON data with the inner quotes escaped. I'm looking for a way to tell the parser to treat that value as stringified JSON and parse it into a struct.
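The ObjectMapper fallback I have in mind would look roughly like this (a minimal Jackson sketch; in the actual job the mapper would need to be created per partition and the parsed values joined back to the original rows):

import com.fasterxml.jackson.databind.ObjectMapper

val mapper = new ObjectMapper()

// Parse the escaped inner payload by hand
val escaped = "{ \"too\": \"bad\" }"
val node = mapper.readTree(escaped)
println(node.get("too").asText())  // prints "bad"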
Example JSON
{
"key": "val",
"jsonString": "{ \"too\": \"bad\" }",
"jsonObj": { "ok": "great" }
}
How SQLContext Parses it
root
|-- key: string (nullable = true)
|-- jsonString: string (nullable = true)
|-- jsonObj: struct (nullable = true)
| |-- ok: string (nullable = true)
How I would like it
root
|-- key: string (nullable = true)
|-- jsonString: struct (nullable = true)
| |-- too: string (nullable = true)
|-- jsonObj: struct (nullable = true)
| |-- ok: string (nullable = true)
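With that schema I could query the nested field directly. Assuming parsed is a DataFrame with the desired schema (hypothetical name):

parsed.select("key", "jsonString.too", "jsonObj.ok").show()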