Here are 3 rows from my sample JSON file (each row is a separate JSON object):
{"customer": 10, "date": "2017.04.06 12:09:32", "itemList": [{"item": "20126907_EA", "price": 1.88, "quantity": 1.0}, {"item": "20185742_EA", "price": 0.99, "quantity": 1.0}, {"item": "20138681_EA", "price": 1.79, "quantity": 1.0}, {"item": "20049778001_EA", "price": 2.47, "quantity": 1.0}, {"item": "20419715007_EA", "price": 3.33, "quantity": 1.0}, {"item": "20321434_EA", "price": 2.47, "quantity": 1.0}, {"item": "20068076_KG", "price": 28.24, "quantity": 10.086}, {"item": "20022893002_EA", "price": 1.77, "quantity": 1.0}, {"item": "20299328003_EA", "price": 1.25, "quantity": 1.0}], "store": "825f9cd5f0390bc77c1fed3c94885c87"}
{"customer": 100, "date": "2017.01.10 12:59:09", "itemList": [{"item": "20132638_KG", "price": 3.33, "quantity": 0.28}, {"item": "20320042001_EA", "price": 2.99, "quantity": 1.0}, {"item": "20320832003_EA", "price": 2.58, "quantity": 2.0}, {"item": "20128148_KG", "price": 4.85, "quantity": 0.256}, {"item": "20027478_KG", "price": 4.58, "quantity": 0.135}, {"item": "20653232_EA", "price": 5.99, "quantity": 1.0}, {"item": "20317755_EA", "price": 3.69, "quantity": 1.0}, {"item": "20519704_KG", "price": 4.24, "quantity": 0.214}, {"item": "20591843_KG", "price": 5.56, "quantity": 0.286}], "store": "a666587afda6e89aec274a3657558a27"}
{"customer": 1000, "date": "2017.04.17 18:53:40", "itemList": [{"item": "20788909_EA", "price": 3.49, "quantity": 1.0}, {"item": "20975073_EA", "price": 5.0, "quantity": 1.0}, {"item": "20868904_EA", "price": 5.0, "quantity": 1.0}, {"item": "20189092_EA", "price": 0.05, "quantity": 1.0}], "store": "ebb71045453f38676c40deb9864f811d"}
I would like to flatten this so that every entry in the nested itemList becomes its own row, with the top-level fields (customer, date, store) repeated on each row. Below is the code I am trying, but it raises an error:
from pathlib import Path
import pandas as pd

def data_load():
    p = Path(r'C:\Users\rohgorthy\Downloads\LBD_Assignemtn\sample_tag.json')
    with p.open('r', encoding='utf-8') as f:
        # f.read() returns the whole file as a single string
        data = f.read()
    df = pd.json_normalize(data, record_path='itemList', meta=['customer', 'date', 'store'])
    return df
It fails with the following error:
result = result[spec]
TypeError: string indices must be integers
Can anyone please help me get the output in the format below?
df columns:
customer date item price quantity store
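For reference, here is a minimal sketch of one way that layout could be produced. It assumes the file holds one JSON object per line, as in the sample rows. The TypeError is consistent with `pd.json_normalize` being given the raw text from `f.read()` rather than parsed dictionaries, so the sketch parses each line with `json.loads` first. The helper name `flatten_records` and the trimmed `sample_lines` (two items taken from the third sample row) are only illustrative.

```python
import json

import pandas as pd


def flatten_records(lines):
    """Parse one JSON object per line and expand itemList into rows."""
    # Parse each non-empty line into a dict; json_normalize expects
    # dicts (or a list of dicts), not the raw text of the file.
    records = [json.loads(line) for line in lines if line.strip()]
    df = pd.json_normalize(
        records,
        record_path="itemList",               # one output row per nested item
        meta=["customer", "date", "store"],   # top-level fields repeated per row
    )
    # Reorder the columns to match the requested layout.
    return df[["customer", "date", "item", "price", "quantity", "store"]]


# Illustrative input: the third sample row, trimmed to two items.
sample_lines = [
    '{"customer": 1000, "date": "2017.04.17 18:53:40", "itemList": '
    '[{"item": "20788909_EA", "price": 3.49, "quantity": 1.0}, '
    '{"item": "20975073_EA", "price": 5.0, "quantity": 1.0}], '
    '"store": "ebb71045453f38676c40deb9864f811d"}',
]

df = flatten_records(sample_lines)
print(df)
```

For the real file, the open file handle could be passed directly, since iterating over it yields one line at a time: `with p.open('r', encoding='utf-8') as f: df = flatten_records(f)`.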
Thank you in advance.