I have a dataframe that contains json style dictionaries per row, and I need to extract the fourth and eigth fields, or values from the second and fourth pairs to a new column i.e. 'a' for row 1, 'a' for 2 (corresponding to the Group) and '10786' for row 1, '38971' for row 2 (corresponding to Code). The expected output is below.
dat = pd.DataFrame({ 'col1': ['{"Number":1,"Group":"a","First":"Yes","Code":"10786","Desc":true,"Labs":"["a","b","c"]"}',
'{"Number":2,"Group":"a","First":"Yes","Code":"38971","Desc":true,"Labs":"["a","b","c"]"}']})
expected output
a Group Code
0 {"Number":1,"Group":"a","First":"Yes","Second"... a 10786
1 {"Number":2,"Group":"a","First":"Yes","Second"... a 38971
I have tried indexing by location but its printing only characters rather than fields e.g.
tuple(dat['a'].items())[0][1][4]
I also cannot appear to normalize the data with json_normalize, which I'm not sure why - perhaps the json string is stored suboptimally. So I am quite confused, and any tips would be great. Thanks so much