I have a list nested inside a pandas DataFrame and I want to filter on it. For example, my data looks like this:
{
  "examples": [
    {
      "website": "info",
      "df": [
        {
          "Question": "What?",
          "Answers": []
        },
        {
          "Question": "how?",
          "Answers": []
        },
        {
          "Question": "Why?",
          "Answers": []
        }
      ],
      "whitelisted_url": true,
      "exResponse": {
        "pb_sentence": "",
        "solution_sentence": "",
        "why_sentence": ""
      }
    },
    {
      "website": "info2",
      "df": [
        {
          "Question": "What?",
          "Answers": ["example answer1"]
        },
        {
          "Question": "how?",
          "Answers": ["example answer1"]
        },
        {
          "Question": "Why?",
          "Answers": []
        }
      ],
      "whitelisted_url": true,
      "exResponse": {
        "pb_sentence": ""
      }
    }
  ]
}
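My filter function:
import numpy as np
import pandas as pd

def filter(data, name):
    # Flatten the top-level 'examples' records into columns
    resp = pd.concat([pd.DataFrame(data),
                      pd.json_normalize(data['examples'])],
                     axis=1)
    # Expand the nested 'df' lists
    resp = pd.concat([pd.DataFrame(resp),
                      pd.json_normalize(resp['df'])],
                     axis=1)
    # Treat empty strings as missing, then drop those rows
    resp['exResponse.pb_sentence'].replace(
        '', np.nan, inplace=True)
    resp.dropna(
        subset=['exResponse.pb_sentence'], inplace=True)
    # Attempt to drop rows whose 'Answers' list is empty
    resp.drop(resp[resp['df.Answers'].apply(len) == 0].index, inplace=True)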
I want to remove the rows with empty 'Answers' lists from this DataFrame. I have already filtered out the empty 'problem_summary' values (the 'exResponse.pb_sentence' column) using the following code:
resp['exResponse.pb_sentence'].replace(
    '', np.nan, inplace=True)
resp.dropna(
    subset=['exResponse.pb_sentence'], inplace=True)
How can I do the same for the 'answers' elements?
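One possible approach is sketched below, assuming the frame has already been flattened to one row per question and that 'Answers' holds plain Python lists. The small `resp` frame here is a hypothetical stand-in built from the sample data above, not the real output of the filter function. Because an empty list is not equal to `''`, the replace/dropna trick does not carry over directly; a boolean mask on the list length does the equivalent job.

```python
import pandas as pd

# Hypothetical miniature of the flattened frame: one row per question,
# with 'Answers' holding a Python list that may be empty.
resp = pd.DataFrame({
    "Question": ["What?", "how?", "Why?"],
    "Answers": [["example answer1"], ["example answer1"], []],
})

# map(len) returns the length of each list; rows with length 0 are dropped.
has_answers = resp["Answers"].map(len) > 0
filtered = resp[has_answers].reset_index(drop=True)

print(filtered)
```

If the column can also contain missing values (NaN) rather than lists, `len` would fail on them, so those rows would need to be dropped or filled first.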
I don't actually expect a specific output. The following part of my code throws the error "AttributeError: 'list' object has no attribute 'keys'". I think this is caused by the empty 'Answers' arrays, so I want to remove those rows.
resp.rename(
    columns={0: 'Challenge', 1: 'Solution', 2: 'Importance'}, inplace=True)
# challenge deserializing
resp = pd.concat([pd.DataFrame(df_resp),
                  pd.json_normalize(resp['Challenge'])],
                 axis=1)
resp = pd.concat([pd.DataFrame(resp),
                  pd.json_normalize(resp['Answers'])],
                 axis=1)
Error line:
     29 resp = pd.concat([pd.DataFrame(resp),
---> 30                   pd.json_normalize(resp['Answers'])],
     31                  axis=1)
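For comparison, here is a hedged sketch of an alternative flattening that avoids passing the 'Answers' column to `pd.json_normalize` at all. It assumes the input has the shape of the sample above (a shortened copy is inlined as `data`) and uses the `record_path` and `meta` parameters to get one row per question in a single call. Whether this removes the AttributeError depends on what the real 'Answers' values contain, which the sample does not fully show.

```python
import pandas as pd

# Shortened copy of the sample input (assumed shape of the real data).
data = {
    "examples": [
        {
            "website": "info",
            "df": [
                {"Question": "What?", "Answers": []},
                {"Question": "how?", "Answers": []},
                {"Question": "Why?", "Answers": []},
            ],
            "whitelisted_url": True,
            "exResponse": {"pb_sentence": ""},
        },
        {
            "website": "info2",
            "df": [
                {"Question": "What?", "Answers": ["example answer1"]},
                {"Question": "how?", "Answers": ["example answer1"]},
                {"Question": "Why?", "Answers": []},
            ],
            "whitelisted_url": True,
            "exResponse": {"pb_sentence": ""},
        },
    ]
}

# One row per entry of the nested 'df' list; the parent fields are
# repeated on every row via 'meta'.
flat = pd.json_normalize(
    data["examples"],
    record_path="df",
    meta=["website", "whitelisted_url", ["exResponse", "pb_sentence"]],
)

# Keep only the questions that actually have answers.
answered = flat[flat["Answers"].map(len) > 0].reset_index(drop=True)

# Optionally expand to one row per individual answer string.
per_answer = answered.explode("Answers").reset_index(drop=True)

print(per_answer)
```

Note that `meta` raises a KeyError when a listed key is missing from a record, unless `errors="ignore"` is passed, so keys that only exist in some examples (such as 'solution_sentence' in the sample) are left out here.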