I have project.json file, which contains data like this :
{"student_id": "ST0001", "project": [{"subject_id": "S003", "date_of_submission": "2021-05-23 20:03:05"}, {"subject_id": "S004", "date_of_submission": "2021-05-24 21:03:05"}, {"subject_id": "S005", "date_of_submission": "2021-05-30 05:09:30"}], "project_year": "Second"}
{"student_id": "ST0002", "project": [{"subject_id": "S003", "date_of_submission": "2021-06-02 15:05:05"}, {"subject_id": "S007", "date_of_submission": "2021-04-28 21:03:01"}], "project_year": "Second"}
{"student_id": "ST0002", "project": [{"subject_id": "S0018", "date_of_submission": "2020-06-03 08:15:21"}], "project_year": "First"}
I need to extract project_subject_id and project_date_of_submission into a separate column like :
| student_id | project_subject_id | project_date_of_submission | project_year |
|---|---|---|---|
| ST0001 | S003 | 23/05/2021 20:03 | Second |
| ST0001 | S004 | 24/05/2021 21:03 | Second |
| ST0001 | S005 | 30/05/2021 05:09 | Second |
| ST0002 | S003 | 02/06/2021 15:05 | Second |
| ST0002 | S007 | 28/04/2021 21:03 | Second |
| ST0002 | S0018 | 03/06/2020 08:15 | First |
Here's what I have tired :
import pandas as pd
df_pr = pd.read_json('project.json', lines=True)