I have two columns of data restaurant name and reviewer grades:
name grades
0 Honey'S Thai Pavilion [{u'date': 2014-08-12 00:00:00, u'grade'..
1 Siam Sqaure Thai Cuisine [{u'date': 2014-11-06 00:00:00, u'grade'...
The problem is one column is a list of multiple 'date,' 'grade' and 'score' pairings in JSON (well technically BSON since this is the sample data set from the MongoDB tutorial). I need to break out the grades column so I have a resulting data frame like below:
name Date Grade Score
Honey'S Thai Pavilion 2014-08-12 00:00:00 A 6
Honey'S Thai Pavilion 2015-03-14 00:00:00 B 5
Honey'S Thai Pavilion 2013-07-15 00:00:00 C 6
Siam Sqaure Thai Cuisine 2014-11-06 00:00:00 A 3
Siam Sqaure Thai Cuisine 2015-06-06 00:00:00 B 2
So I need to split out one column but retain the restaurant name. The code below achieves getting the grades column into a nice looking data frame, but I can't figure out how to keep the restaurant name.
from pymongo import MongoClient
import pymongo
import pandas as pd
client = MongoClient()
db = client.test
)
cursor2 = db.restaurants.find().sort([
("borough", pymongo.ASCENDING),
("cuisine", pymongo.DESCENDING)
])
#cursor.sort("cuisine",pymongo.ASCENDING)
data = pd.DataFrame(list(cursor2))[['name', 'grades']]
data_list= []
for i in range(0, len(data.grades)):
g_data = pd.DataFrame(data.grades[i])
data_list.append(g_data)
result = pd.concat(data_list)
print result.head(100)
datain your code? If it is, I think it would be more or less easy to solve withapply, there's many questions around but probably this one will help you. If you find issues please post them. Hope it helps.