
I have a nested JSON file. I flattened it and got back a list that looks like this:

[{patient_0_order: 1234,
   patient_0_id: a1,
   patient_0_time: 01/01/2016,
   patient_0_desc: xyz,
   patient_1_order: 2313,
   patient_1_id: b1,
   patient_1_time: 02/01/2016,
   patient_1_desc: def,
   patient_2_order: 9876,
   patient_2_id: c1,
   patient_2_time: 03/01/2016,
   patient_2_desc: ghi,
   patient_3_order: 0075,
   patient_3_id: d1,
   patient_3_time: 04/01/2016,
   patient_3_desc: klm,
   patient_4_order: 6268,
   patient_4_id: e1,
   patient_4_time: 05/01/2016,
   patient_4_desc: pqr}]

Now I want to convert the list into a data frame so that each row holds one patient, like below:

       patient_order    patient_id       patient_time    patient_desc 
  0      1234                a1          01/01/2016        xyz
  1      2313                b1          02/01/2016        def
  2      9876                c1          03/01/2016        ghi
  3      0075                d1          04/01/2016        klm
  4      6268                e1          05/01/2016        pqr 

I tried pandas.DataFrame(list), but it gave me a data frame with 1 row and 20 columns, which is not what I want.
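A minimal sketch that reproduces the problem, using a shortened version of the list (two patients, two fields each). The variable name `flat_list` and the choice to store every value as a string are assumptions made for illustration:

```python
import pandas as pd

# Shortened, hypothetical version of the flattened list shown above.
flat_list = [{
    "patient_0_order": "1234", "patient_0_id": "a1",
    "patient_1_order": "2313", "patient_1_id": "b1",
}]

# A list containing a single dict becomes a single row,
# with one column per flattened key.
df = pd.DataFrame(flat_list)
print(df.shape)
```

With the full data this gives the 1 row by 20 columns shape described above, because the patient number is part of each column name rather than a row index.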

Any help and suggestions would be greatly appreciated.

2 Answers


Here we go, this works. Probably not the prettiest it could be, but it works and I'll probably come back to clean this later.

import pandas as pd

original = [{"patient_0_order": 1234, "patient_0_id": 123, "patient_1_id": 12, "patient_1_order": 1255}]
original = original[0]

elems = []

current_patient = 0
current_d = {}
total_elems = len(original.keys())

for index, i in enumerate(sorted(original.keys(), key=lambda x: int(x.split("_")[1]))):
   key_details = i.split("_")
   # This will be used in the dataframe as a column name
   key_name = key_details[2]
   # The number specific to this patient
   patient_num = int(key_details[1])
   # If we've moved on to the next patient, store the finished one and start fresh
   if patient_num != current_patient:
      elems.append(current_d)
      current_d = {}
      current_patient = patient_num
   current_d[key_name] = original[i]
   # After the last key, store the patient we were still building
   # (checked last so that a final patient with a single key is not dropped)
   if index == total_elems-1:
      elems.append(current_d)

df = pd.DataFrame(elems)

Feel free to change how key_name is built to adjust the column names. Prefixing it with 'patient_' gives the column names shown in the question.
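For comparison, here is a shorter sketch of the same idea that groups keys by patient number with a `defaultdict` instead of tracking the current patient by hand. It assumes every key has the form `patient_<number>_<field>`; the names `flat` and `rows` are made up for this example, and the sample values are taken from the question:

```python
from collections import defaultdict

import pandas as pd

# Hypothetical shortened input: the single dict taken out of the list.
flat = {
    "patient_0_order": "1234", "patient_0_id": "a1",
    "patient_1_order": "2313", "patient_1_id": "b1",
}

# Group the values by patient number, rebuilding the column name
# without the number (for example "patient_order").
rows = defaultdict(dict)
for key, value in flat.items():
    prefix, num, field = key.split("_", 2)
    rows[int(num)][f"{prefix}_{field}"] = value

# One dict per patient, in patient order, becomes one row per patient.
df = pd.DataFrame([rows[n] for n in sorted(rows)])
print(df)
```

Because the grouping is keyed on the patient number, the keys do not need to be sorted first, and a patient with only one field is handled the same way as any other.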


2 Comments

It's giving me an error: `'list' object has no attribute 'keys'`.
Make sure you're running the second line. That takes the first object from the list (since you only had one item in the list in the question).

Here's how you can convert the JSON object (a dictionary):

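import json
import pandas as pd

old_dict = json.loads('YOUR JSON STRING')[0]
col_names = ['order', 'id', 'time', 'desc']
# Reorganize the dictionary: one inner dict per column, keyed by patient number
# so that the values for one patient line up in the same row.
# (On Python 2, use old_dict.iteritems() instead of old_dict.items().)
new_dict = {col: {int(k.split('_')[1]): v for k, v in old_dict.items() if k.endswith('_' + col)} for col in col_names}
df = pd.DataFrame(new_dict)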

This should return what you want.
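The same reshaping can also be sketched with a pandas MultiIndex and `unstack`, which pivots the field names into columns. This is an illustration under the assumption that every key looks like `patient_<number>_<field>`; the names `flat` and `index` are hypothetical and the sample values come from the question:

```python
import pandas as pd

# Hypothetical shortened input dictionary.
flat = {
    "patient_0_order": "1234", "patient_0_id": "a1",
    "patient_1_order": "2313", "patient_1_id": "b1",
}

# Split each key into (patient number, column name without the number).
index = pd.MultiIndex.from_tuples(
    [
        (int(num), f"{prefix}_{field}")
        for prefix, num, field in (key.split("_", 2) for key in flat)
    ],
    names=["patient", "field"],
)

# A Series indexed by (patient, field) pivots into one row per patient.
s = pd.Series(list(flat.values()), index=index)
df = s.unstack("field")
print(df)
```

Note that `unstack` sorts the resulting columns alphabetically, so reorder them afterwards if the original field order matters.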

4 Comments

This is giving me Too many values to unpack in the inner dict comprehension.
@WiggyA. I forgot the .iteritems() method, it should work now. If you use python3, you can just use items()
It's not returning any values. I think that's because it's a nested JSON.
@AkashBachu Yes I'm targeting the dictionary inside that json. Read json into a list and get the dictionary as the only element to that list.
