I'm trying to parse the following sample JSON list and count how often each fruit appears:
FruitJson = [
('{"num":100, "fruit" : ["apple", "peach", "grape", "melon"]}',),
('{"num":101, "fruit" : ["melon", "apple", "mango", "banana"]}',),
]
Ideal Output:
| fruit | count |
|---|---|
| apple | 2 |
| melon | 2 |
| peach | 1 |
| grape | 1 |
| mango | 1 |
| banana | 1 |
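To make the target concrete, the counting behind this table can be sketched in plain Python. This is only an illustration of the intended logic, not the Spark solution being asked for. It assumes each tuple holds exactly one JSON string, and the helper name `count_fruit` is hypothetical.

```python
import json
from collections import Counter

# Sample data copied from the question: each row is a 1-tuple holding a JSON string.
FruitJson = [
    ('{"num":100, "fruit" : ["apple", "peach", "grape", "melon"]}',),
    ('{"num":101, "fruit" : ["melon", "apple", "mango", "banana"]}',),
]


def count_fruit(rows):
    """Hypothetical helper: parse each JSON string and tally every fruit name."""
    counts = Counter()
    for (raw,) in rows:
        # Flatten the "fruit" array of this row into the running tally.
        counts.update(json.loads(raw)["fruit"])
    return counts


fruit_counts = count_fruit(FruitJson)

# Print the tally as table rows, highest count first.
for fruit, count in fruit_counts.most_common():
    print(f"| {fruit} | {count} |")
```

In Spark terms, the loop corresponds to flattening the `fruit` array into one row per fruit and then grouping by fruit name.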
I managed to get the first row of the list into a DataFrame, but I'm unable to progress further from here:
dbutils.fs.put("/temp/test.json",'{"num":100, "fruit" : ["apple", "peach", "grape", "melon"]}'\
'{"num":101, "fruit" : ["melon", "apple", "mango", "banana"]}',True)
df = spark.read.option("multiline","true").json('/temp/test.json')
display(df)
Your advice is much appreciated.