Similar to this question, but my CSV has a slightly different format. Here is an example:
id,employee,details,createdAt
1,John,"{"Country":"USA","Salary":5000,"Review":null}","2018-09-01"
2,Sarah,"{"Country":"Australia", "Salary":6000,"Review":"Hardworking"}","2018-09-05"
I think the double quotation mark at the beginning of the JSON column might be causing the errors. Using df = pandas.read_csv('file.csv'), this is the dataframe that I got:
id employee details createdAt Unnamed: 1 Unnamed: 2
1 John {Country":"USA" Salary:5000 Review:null}" 2018-09-01
2 Sarah {Country":"Australia" Salary:6000 Review:"Hardworking"}" 2018-09-05
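For reference, the same split can be reproduced without pandas. This is only a sketch: it uses Python's standard csv module as a stand-in for the pandas parser (an assumption, since the two are separate implementations), but for this row it yields the same six fields shown above.

```python
import csv
import io

# First data row from the example file, copied verbatim.
line = '1,John,"{"Country":"USA","Salary":5000,"Review":null}","2018-09-01"\n'

# The default dialect uses quotechar='"' and is non-strict, so a quote that
# closes a field early is tolerated and the rest is read as unquoted text.
fields = next(csv.reader(io.StringIO(line)))

for i, field in enumerate(fields):
    print(i, repr(field))

# The row has 6 fields instead of the 4 declared in the header,
# because every comma inside the JSON ends a field.
print(len(fields))
```

The JSON is cut at each of its inner commas, which is where the two extra columns come from.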
My desired output:
id employee details createdAt
1 John {"Country":"USA","Salary":5000,"Review":null} 2018-09-01
2 Sarah {"Country":"Australia","Salary":6000,"Review":"Hardworking"} 2018-09-05
I've tried adding quotechar='"' as a parameter and it still doesn't give me the result that I want. Is there a way to tell pandas to ignore the first and the last quotation mark surrounding the JSON value?
The problem is the " on the string, which causes the commas to be interpreted as new columns rather than as part of a dictionary structure. The parser reads "{" as a quoted value, then Country without quotes, then ":" in quotes, then USA"; when it encounters the comma, it interprets what follows as the next column value.
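Given that quoting problem, one possible workaround is to skip read_csv for the row parsing and split each line with a regular expression that treats everything between the first { and the last } as the details column. This is a minimal sketch, not a pandas feature: the function name parse_quoted_json_csv and the pattern ROW are made up for illustration, and it assumes the four-column layout from the example, with no commas in id, employee, or createdAt and exactly one JSON object per line.

```python
import json
import re

import pandas as pd

# Sample data copied from the question; in practice, read the file's text instead.
RAW = '''id,employee,details,createdAt
1,John,"{"Country":"USA","Salary":5000,"Review":null}","2018-09-01"
2,Sarah,"{"Country":"Australia", "Salary":6000,"Review":"Hardworking"}","2018-09-05"
'''

# Hypothetical pattern: id, employee, "<json object>", optionally quoted date.
ROW = re.compile(
    r'^(?P<id>[^,]*),(?P<employee>[^,]*),'
    r'"(?P<details>\{.*\})","?(?P<createdAt>[^",]*)"?$'
)


def parse_quoted_json_csv(text):
    """Build a DataFrame from CSV text whose JSON column is wrapped in bare quotes."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split(',')
    rows = []
    for line in lines[1:]:
        match = ROW.match(line)
        if match is None:
            raise ValueError(f'Unexpected row format: {line!r}')
        rows.append(match.groupdict())
    df = pd.DataFrame(rows, columns=header)
    df['id'] = df['id'].astype(int)
    # Decode the JSON text into dicts; drop this line to keep the raw strings.
    df['details'] = df['details'].map(json.loads)
    return df


df = parse_quoted_json_csv(RAW)
print(df)
```

If the file can be regenerated, a sturdier fix is to have the producer escape the inner quotes by doubling them (""), which is the usual CSV convention and lets read_csv handle the file without any custom parsing.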