1

I have tried to use the online Jsonify It tool which can create nested JSON data from my data but I can't seem to get that to work. I have also tried to use the Python code from other posts on but they do not seem to work either. If you know an easier method than using Python, that would be good.

Here is my .CSV data:

ID,Name,Date,Subject,Start,Finish
0,Ladybridge High School,01/11/2019,Maths,05:28,0
0,Ladybridge High School,02/11/2019,Maths,05:30,06:45
0,Ladybridge High School,01/11/2019,Economics,11:58,12:40
0,Ladybridge High School,02/11/2019,Economics,11:58,12:40
1,Loreto Sixth Form,01/11/2019,Maths,05:28,06:45
1,Loreto Sixth Form,02/11/2019,Maths,05:30,06:45
1,Loreto Sixth Form,01/11/2019,Economics,11:58,12:40
1,Loreto Sixth Form,02/11/2019,Economics,11:58,12:40

This is the nested JSON structure I would like:

{
  "Timetable" : [ {
    "Date" : {
      "01-11-2019" : {
    "Maths" : {
      "Start" : "05:28",
      "Finish" : "06:45"
    },
    "Economics" : {
      "Start" : "11:58",
      "Finish" : "12:40"
    }
      },
      "02-11-2019" : {
    "Maths" : {
      "Start" : "05:30",
      "Finish" : "06:45"
    },
    "Economics" : {
      "Start" : "11:58",
      "Finish" : "12:40"
    }
      }
    },
    "Name" : "Ladybridge High School"
  }, {
    "Date" : {
      "01-11-2019" : {
    "Maths" : {
      "Start" : "05:28",
      "Finish" : "06:45"
    },
    "Economics" : {
      "Start" : "11:58",
      "Finish" : "12:40"
    }
      },
      "02-11-2019" : {
    "Maths" : {
      "Start" : "05:30",
      "Finish" : "06:45"
    },
    "Economics" : {
      "Start" : "11:58",
      "Finish" : "12:40"
    }
      }
    },
    "Name" : "Loreto Sixth From"
  } ]
}
1
  • try this: import json then x = x.to_json() then x= json.loads(x) Commented Nov 20, 2019 at 7:50

2 Answers 2

2

Something like this?

[EDIT]

I refactored it to handle arbitrary top-level keys for each entry in the timetable. I also made it first create a dict and then convert the dict to a list so that it can run in O(N) time, in case the input is very large.

import csv
timetable = {}
with open('data.csv') as f:
    csv_data = [{k: v for k, v in row.items()} for row in csv.DictReader(f, skipinitialspace=True)]
for row in csv_data:
    if not timetable.get(row["ID"]):
        timetable[row["ID"]] = {"ID": row["ID"], "Date": {}}
    for k in row.keys():
        # Date has to be handled as a special case
        if k == "Date":
            timetable[row["ID"]]["Date"][row["Date"]] = {}
            timetable[row["ID"]]["Date"][row["Date"]][row["Subject"]] = {
                "Start": row["Start"],
                "Finish": row["Finish"]
            }
        # Ignore these keys because they are only for 'Date'
        elif k == "Start" or k == "Finish" or k == "Subject":
            continue
        # Use everything else
        else:
            timetable[row["ID"]][k] = row[k]
timetable = {"Timetable": [v for k, v in timetable.items()]}
Sign up to request clarification or add additional context in comments.

4 Comments

Any idea how I would add additional data (no nested data) that is on the same level as "Name" (no nested data) and "Date" (contains nested data)?
You mean, if there are N additional columns in the CSV and it is assumed that they contain no nested data, how do we add them without explicitly naming them?
Yes, spot on! I've tried making various alterations to your code but I didn't get very far.
Looks good! Just one small issue - structure of the output is now Timetable/0, 1, etc./Name, Date, ID. Ideally, it should be Timetable/ID/Name, Date, etc. I'll try and play around with your code once I'm free to see if I can get it to work.
0

An improvement to the above answer to nest the ID before the name and date:

import csv timetable = {"Timetable": []} print(timetable) with open("C:/Users/kspv914/Downloads/data.csv") as f: csv_data = [{k: v for k, v in row.items()} for row in csv.DictReader(f, skipinitialspace=True)] name_array = [] for name in [row["Name"] for row in csv_data]: name_array.append(name) name_set = set(name_array) for name in name_set: timetable["Timetable"].append({"Name": name, "Date": {}}) for row in csv_data: for entry in timetable["Timetable"]: if entry["Name"] == row["Name"]: entry["Date"][row["Date"]] = {} entry["Date"][row["Date"]][row["Subject"]] = { "Start": row["Start"], "Finish": row["Finish"] } print(timetable)

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.