I have a JSON file with the following schema:
root
|-- demo: boolean (nullable = true)
|-- person: struct (nullable = true)
| |-- dateOfBirth: string (nullable = true)
| |-- email: array (nullable = true)
| | |-- element: string (containsNull = true)
| |-- emergencyContacts: array (nullable = true)
| | |-- element: struct (containsNull = true)
| | | |-- name: string (nullable = true)
| | | |-- phone: string (nullable = true)
| | | |-- relationship: string (nullable = true)
| |-- id: long (nullable = true)
| |-- name: string (nullable = true)
| |-- phones: struct (nullable = true)
| | |-- home: string (nullable = true)
| | |-- mobile: string (nullable = true)
| |-- registered: boolean (nullable = true)
|-- product: string (nullable = true)
|-- releaseDate: string (nullable = true)
I want to parse the emergencyContacts array to get the names of the contacts.
I have got as far as the person struct using:
val df = sqlContext.read.json("file:///home/training211/test/cjson1.json")
df.registerTempTable("df")
df.printSchema()
val person = df.select("person")
person.registerTempTable("person")
person.printSchema()
person.show()
If I try to go any further, it always fails with an error like: org.apache.spark.sql.AnalysisException: cannot resolve 'persons.emergencyContacts' given input columns: [person];
I also tried:
val arrayFlatten = df.select($"person.emergencyContacts".getItem(0))
which gives me:
+---------------------------+
|person.emergencyContacts[0]|
+---------------------------+
| [Jane Doe,888-555...|
+---------------------------+
But this is not the result I want.
Any help is appreciated.
What do you get from df.select($"person.emergencyContacts")? Can you update your question with the output?
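For what it's worth, the error message mentions 'persons' while the only column is 'person', so a misspelled column name is one possible cause. The failing call is not shown, so that is only a guess. Also note that after df.select("person") the DataFrame still has a single column named person, so nested paths have to start with person. Below is a minimal sketch of two ways to get the contact names, assuming Spark 2.2 or later (SparkSession and read.json on a Dataset[String]). The object name, the helper names, and the sample record are hypothetical and only mirror the schema in the question; the two select expressions themselves should also work on the question's df in an older spark-shell.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, explode}

// Hypothetical helper object; the names here are for illustration only.
object EmergencyContactNames {

  // Made-up one-record sample shaped like the schema in the question.
  val sampleJson: String =
    """{"person":{"name":"John Doe","emergencyContacts":[""" +
    """{"name":"Jane Doe","phone":"888-555-0100","relationship":"spouse"},""" +
    """{"name":"Jim Doe","phone":"888-555-0101","relationship":"brother"}]}}"""

  // Option 1: selecting a field through an array of structs
  // yields one array of that field per input row.
  def namesAsArray(df: DataFrame): DataFrame =
    df.select(col("person.emergencyContacts.name").as("contactNames"))

  // Option 2: explode the array into one row per contact,
  // then pick the name field out of each struct.
  def namesAsRows(df: DataFrame): DataFrame =
    df.select(explode(col("person.emergencyContacts")).as("contact"))
      .select(col("contact.name").as("contactName"))

  def main(args: Array[String]): Unit = {
    // Local session for illustration; spark-shell already provides one.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("emergency-contact-names")
      .getOrCreate()
    import spark.implicits._

    val df = spark.read.json(Seq(sampleJson).toDS())

    namesAsArray(df).show(false) // one row holding an array of names
    namesAsRows(df).show(false)  // one row per contact name

    spark.stop()
  }
}
```

Option 1 keeps one row per person, which is handy if the names need to stay attached to their record. Option 2 is probably closer to what the question asks for if the goal is a flat list of names.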