I am using Python to parse a CSV file in which one column contains JSON-like strings:
| cast | id |
|---|---|
| {"user":"a","character":"AA"},{"user":"b","character":"BB"} | 1 |
| {"user":"c","character":"CC"} | 2 |
How can I transform this CSV into the following?
| cast | id |
|---|---|
| ["AA","BB"] | 1 |
| ["CC"] | 2 |

    import pandas as pd
    import json

    df = pd.read_csv('your_file.csv')
    # Wrap each cell in brackets so it parses as a JSON array, then keep each 'character' value.
    df.cast = df.cast.apply(lambda r: [x.get('character') for x in json.loads('[' + r + ']')])

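The apply method applies a function to every value in a column, so r in the lambda is the string from a single cell. You can use the json module to parse that string. A cell holds comma-separated JSON objects, which is not valid JSON on its own, so I had to add the brackets to turn it into a valid JSON array. json.loads() then returns a list of dictionaries, and the list comprehension picks the 'character' value out of each one.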
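To show what happens to one cell, here is a minimal sketch that uses only the json module. The helper name characters and the sample string are illustrative assumptions, not part of the original code; the logic mirrors the lambda above.

```python
import json

def characters(cell):
    # Hypothetical helper: same logic as the lambda passed to apply.
    # Wrap the comma-separated objects in brackets to form a valid JSON array.
    records = json.loads('[' + cell + ']')
    # Each record is a dict; .get returns None if 'character' is missing.
    return [record.get('character') for record in records]

# One cell from the 'cast' column, as a plain string (sample data).
cell = '{"user":"a","character":"AA"},{"user":"b","character":"BB"}'

try:
    json.loads(cell)  # without the brackets the string is not valid JSON
except json.JSONDecodeError as err:
    print('invalid without brackets:', err.msg)

print(characters(cell))  # ['AA', 'BB']
```

With apply, pandas calls the function once per cell, so the argument is a single string rather than the whole column.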
What does json.loads('[' + r + ']') mean? Why not put df['cast'] inside?