0

df has the following columns:

Index(

['level_id', 'description', 'level_type', 'difficulty', 'solution', 'modifier', 'last_modified', 'time_limit', 'tags', 'level_options', 'extra_info', 'hint']

, dtype='object')

column that i am converting is ="level_options"

Down below is the data inside first row of level_options:

{"marking": "normal", "options": {"1": "Carbon dioxide", "2": "Oxygen", "3": "Urine", "4": "Sweat"}, "max_score": 1, "random_options": "false", "correct_options": ["2"]}

Then in the below code: I manually pass the above first row, got the expected output is well.

>     j2=[{"marking": "normal", "options": {"1": "Yes", "2": "No"}, "max_score": 1, "random_options": "false", "correct_options": ["1"]}]
>     pd.json_normalize(j2)

output:

marking   max_score     random_options   correct_options      options.1   options.2     options.3   options.4  
normal      1             false            [2]       Carbon dioxide Oxygen         Urine        Sweat

But while iterating through all rows I don't get the output.
I have used this below code where I was supposed to get 6 lists but got empty lists.PFA of expected output- final output

Option1=[] 
Option2=[] 
Option3=[] 
Option4=[] 
Option5=[]
Correct_option=[]
 try:
>     for index, row in df.iterrows():
>         l=row["level_options"]
>         print(l) except:
>     print(index,row)
>     l = json.loads(l) 
>     print(l)
>     for i in l:
>         i = pd.normalize_json(i) 
>         print(i)
>         if i['options'] == '1':
>             Option1.append(i['value'])
>         elif i['options'] == '2':
>             Option2.append(i['value'])
>         elif i['options'] == '3':
>             Option3.append(i['value'])
>         elif i['options'] == '4':
>             Option4.append(i['value'])
>         elif i['options'] == '5':
>             Option5.append(i['value'])
>         elif i['options'] == 'correct_options':
>             Correct_option.append(i['value'])

i even used the json.loads instead of json_normalize but error occurs

below are 2 rows of dataframe= df["level_options"]

{"marking": "normal", "options": {"1": "Carbon dioxide", "2": "Oxygen", "3": "Urine", "4": "Sweat"}, "max_score": 1, "random_options": "false", "correct_options": ["2"]} {"marking": "normal", "options": {"1": "Yes", "2": "No"}, "max_score": 1, "random_options": "false", "correct_options": ["1"]}

The Output shown below is what I am expecting after merging each list as a single dataframe, the problem I am getting in is while retrieving the lists...

Thanks and regards

options.1   options.2     options.3 options.4  options.5    correct_options    
Carbon dioxide  Oxygen      Urine      Sweat       NaN               [2]    
Yes              No           NaN        NaN        NaN               [1]  

1 Answer 1

1

Just try pd.json_normalize on column

out = (pd.json_normalize(df['level_options'])
       .filter(like='option')
       .drop('random_options', axis=1))
print(out)

  correct_options       options.1 options.2 options.3 options.4
0             [2]  Carbon dioxide    Oxygen     Urine     Sweat
1             [1]             Yes        No       NaN       NaN
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.