I have a schema like the one below. What is the best way in Spark to select the elements seat and drive and cast them to strings? I am reading this into a DataFrame with Spark 1.6.
|-- cars: array (nullable = true)
| |-- element: struct (containsNull = true)
| | |-- carId: string (nullable = true)
| | |-- carCode: string (nullable = true)
| | |-- carNumber: string (nullable = true)
| | |-- features: array (nullable = true)
| | | |-- element: struct (containsNull = true)
| | | | |-- seat: string (nullable = true)
| | | | |-- drive: string (nullable = true)
The output of cars.features, aliased as car_features, in JSON:
"cars_features":[[{"seat":"Auto","drive":"Manual"}]]
I am trying to select "Auto" into one DataFrame column and "Manual" into another.
My current attempt returns the whole structure:
+-------------------+
|car_features       |
+-------------------+
|[[Auto,Manual]]    |
+-------------------+
col("car.features").getItem(0).as("car_features_seat")
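One possible way to reach the individual fields by position is to chain getItem and getField. The sketch below assumes the DataFrame still has the top-level cars column from the schema above (not an already-exploded car column), and it only reads the first feature of the first car. The case classes, the object name FirstSeatAndDrive and the sample values for carId, carCode and carNumber are hypothetical, made up so the example can run on its own.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{DataFrame, SQLContext}
import org.apache.spark.sql.functions.col

// Hypothetical case classes mirroring the schema in the question.
case class Feature(seat: String, drive: String)
case class Car(carId: String, carCode: String, carNumber: String, features: Seq[Feature])
case class Record(cars: Seq[Car])

object FirstSeatAndDrive {
  // Builds a one-row DataFrame with made-up ids and the seat/drive values from the question.
  def sample(sqlContext: SQLContext): DataFrame =
    sqlContext.createDataFrame(Seq(
      Record(Seq(Car("1", "A", "100", Seq(Feature("Auto", "Manual")))))
    ))

  // cars.features is an array of arrays of structs:
  //   first getItem(0)  -> features array of the first car
  //   second getItem(0) -> first feature struct
  //   getField(...)     -> one field of that struct
  def firstSeatAndDrive(df: DataFrame): DataFrame = {
    val firstFeature = col("cars.features").getItem(0).getItem(0)
    df.select(
      firstFeature.getField("seat").cast("string").as("car_features_seat"),
      firstFeature.getField("drive").cast("string").as("car_features_drive")
    )
  }

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("first-seat-and-drive").setMaster("local[*]"))
    val sqlContext = new SQLContext(sc)
    firstSeatAndDrive(sample(sqlContext)).show()
    sc.stop()
  }
}
```

This keeps one output row per input row, but any cars or features beyond the first are ignored.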
seat and drive as arrays of arrays or just as an array or as rows?
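If rows are acceptable, one option is to explode both array levels so that each feature becomes its own row with separate seat and drive columns. This is a minimal sketch, assuming Spark 1.6 and the schema shown in the question; the case classes, the object name ExplodeCarFeatures and the sample id values are hypothetical and exist only to make the example self-contained.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{DataFrame, SQLContext}
import org.apache.spark.sql.functions.{col, explode}

// Hypothetical case classes mirroring the schema in the question.
case class Feature(seat: String, drive: String)
case class Car(carId: String, carCode: String, carNumber: String, features: Seq[Feature])
case class Record(cars: Seq[Car])

object ExplodeCarFeatures {
  // Builds a one-row DataFrame with made-up ids and the seat/drive values from the question.
  def sample(sqlContext: SQLContext): DataFrame =
    sqlContext.createDataFrame(Seq(
      Record(Seq(Car("1", "A", "100", Seq(Feature("Auto", "Manual")))))
    ))

  // Explode cars into one row per car, then features into one row per feature,
  // and finally pull seat and drive out of the struct as string columns.
  def seatAndDriveRows(df: DataFrame): DataFrame =
    df.select(explode(col("cars")).as("car"))
      .select(explode(col("car.features")).as("feature"))
      .select(
        col("feature.seat").cast("string").as("seat"),
        col("feature.drive").cast("string").as("drive")
      )

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("explode-car-features").setMaster("local[*]"))
    val sqlContext = new SQLContext(sc)
    seatAndDriveRows(sample(sqlContext)).show()
    sc.stop()
  }
}
```

The trade-off is that the result has one row per feature rather than one row per original record, so other columns have to be carried through each select if they are still needed.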