Python: parsing a JSON file that is a list with embedded dictionaries

I can get the values from it, but only for the last record in the list.

Here is the code I've tried:

import json

access_json = open('shortdocuments.json', 'r')
read_content = json.load(access_json)
# Make list a dictionary
for question_access in read_content:
    print(type(question_access))

replies_access = question_access['Attachment']


for i in replies_access:
    print(i,"|",replies_access[i])

I want output like this for all records:

CreateDate | 2019-10-16T09:13:33
Description |
CreateUserID | 1
Path | 201910\10489_AParker_T0231056_13.pdf
Name | 10489_AParker_T0231056_13
FileID | 765
IsDepotUsed | True
Extension | .pdf
UpdateDate | 2019-10-16T09:13:33
UpdateUserID | 1

My JSON file:

[
  {
    "UserDefinedValueID": 872,
    "UserDefinedFieldID": 56,
    "ParentID": 355,
    "Name": "PDM_Application",
    "Value": "763",
    "UpdateDate": "2019-10-27T14:29:18",
    "FieldType": "File",
    "Attachment": {
      "FileID": 763,
      "Name": "03981-00117",
      "Description": "",
      "IsDepotUsed": true,
      "Path": "201910\\03981-00117.pdf",
      "Extension": ".pdf",
      "CreateDate": "2019-10-16T09:13:32",
      "UpdateDate": "2019-10-27T14:29:18",
      "CreateUserID": 1,
      "UpdateUserID": 1
    },
    "UpdateUserID": 1,
    "CreateUserID": 1  },
  {
    "UserDefinedValueID": 873,
    "UserDefinedFieldID": 57,
    "ParentID": 355,
    "Name": "PDM_LeaseDoc",
    "Value": "764",
    "UpdateDate": "2019-10-16T09:13:33",
    "FieldType": "File",
    "Attachment": {
      "FileID": 764,
      "Name": "09658-00060_t0007192_Application",
      "Description": "",
      "IsDepotUsed": true,
      "Path": "201910\\09658-00060_t0007192_Application.pdf",
      "Extension": ".pdf",
      "CreateDate": "2019-10-16T09:13:33",
      "UpdateDate": "2019-10-16T09:13:33",
      "CreateUserID": 1,
      "UpdateUserID": 1
    },
    "UpdateUserID": 1,
    "CreateUserID": 1  },
  {
    "UserDefinedValueID": 875,
    "UserDefinedFieldID": 59,
    "ParentID": 355,
    "Name": "PDM_FAS/SODA",
    "Value": "765",
    "UpdateDate": "2019-10-16T09:13:33",
    "FieldType": "File",
    "Attachment": {
      "FileID": 765,
      "Name": "10489_AParker_T0231056_13",
      "Description": "",
      "IsDepotUsed": true,
      "Path": "201910\\10489_AParker_T0231056_13.pdf",
      "Extension": ".pdf",
      "CreateDate": "2019-10-16T09:13:33",
      "UpdateDate": "2019-10-16T09:13:33",
      "CreateUserID": 1,
      "UpdateUserID": 1
    },
    "UpdateUserID": 1,
    "CreateUserID": 1 
 }
]

1 Answer

After a for loop finishes, its loop variable is still in scope and still refers to the last item that was iterated over.

for question_access in read_content:
    print(type(question_access))

# question_access is still in scope, and the last item in the list
replies_access = question_access['Attachment']
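
To see this behaviour in isolation, here is a minimal sketch. The `records` list is a made-up stand-in for the parsed file, not the asker's actual data:

```python
# Hypothetical stand-in for the list returned by json.load().
records = [
    {"Attachment": {"FileID": 763}},
    {"Attachment": {"FileID": 764}},
    {"Attachment": {"FileID": 765}},
]

for record in records:
    pass  # the loop body runs once per item

# After the loop, `record` still refers to the final item only.
last_file_id = record["Attachment"]["FileID"]
print(last_file_id)  # 765
```

Anything written after the loop at the outer indentation level therefore only ever sees that final record.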

You need to indent the code so that it sits inside the loop and runs once for each item:

for question_access in read_content:
    replies_access = question_access['Attachment']
    for i in replies_access:
        print(i,"|",replies_access[i])
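
If each record should end up on a single line of key|value pairs rather than one pair per line, something along these lines may work. This is only a sketch: `format_record` is a hypothetical helper name, and the inline sample is trimmed to two keys per attachment to keep it short.

```python
import json

def format_record(attachment):
    """Join every key and value of one attachment dict into a single line."""
    return "|".join(f"{key}|{value}" for key, value in attachment.items())

# Inline sample shaped like the question's file (trimmed for brevity).
sample = json.loads(
    '[{"Attachment": {"FileID": 764, "UpdateUserID": 1}},'
    ' {"Attachment": {"FileID": 765, "UpdateUserID": 1}}]'
)

lines = [format_record(item["Attachment"]) for item in sample]
for line in lines:
    print(line)
```

With the real file, `sample` would be replaced by the list loaded from 'shortdocuments.json', giving one printed line per record.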

Edit: If you want a CSV-style format, you can try this:

import json

with open('shortdocuments.json') as f:
    data = json.load(f)

if data:
    # Collect the nested 'Attachment' dictionary from every record
    attachments = [d['Attachment'] for d in data]
    # Header row; comment out this line to print only the values
    print('|'.join(attachments[0].keys()))
    for a in attachments:
        print('|'.join(map(str, a.values())))

Outputs (the column order follows the dictionary's key order, so it may differ on your Python version):

CreateDate|CreateUserID|Description|Extension|FileID|IsDepotUsed|Name|Path|UpdateDate|UpdateUserID
2019-10-16T09:13:32|1||.pdf|763|True|03981-00117|201910\03981-00117.pdf|2019-10-27T14:29:18|1
2019-10-16T09:13:33|1||.pdf|764|True|09658-00060_t0007192_Application|201910\09658-00060_t0007192_Application.pdf|2019-10-16T09:13:33|1
2019-10-16T09:13:33|1||.pdf|765|True|10489_AParker_T0231056_13|201910\10489_AParker_T0231056_13.pdf|2019-10-16T09:13:33|1
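
The standard library's csv module can produce the same kind of output and also takes care of quoting if a value ever contains the delimiter. The sketch below assumes every 'Attachment' has the same keys as the first one; `attachments_to_csv` is a hypothetical helper name, and the inline sample is trimmed to three keys.

```python
import csv
import io
import json

def attachments_to_csv(records, delimiter="|"):
    """Return a header line plus one delimited row per record's 'Attachment'."""
    attachments = [record["Attachment"] for record in records]
    if not attachments:
        return ""
    buffer = io.StringIO()
    # Assumption: all attachments share the keys of the first one;
    # DictWriter raises ValueError if a later row has extra keys.
    writer = csv.DictWriter(
        buffer,
        fieldnames=list(attachments[0]),
        delimiter=delimiter,
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(attachments)
    return buffer.getvalue()

# Inline sample shaped like the question's file (trimmed for brevity).
sample = json.loads(
    '[{"Attachment": {"FileID": 763, "Extension": ".pdf", "IsDepotUsed": true}},'
    ' {"Attachment": {"FileID": 764, "Extension": ".pdf", "IsDepotUsed": true}}]'
)

result = attachments_to_csv(sample)
print(result, end="")
```

To write to a file instead of building a string, the same writer can be given a file opened with `newline=''`.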

Or using pandas:

import json
from pandas import DataFrame

with open('shortdocuments.json') as f:
    data = json.load(f)

# Build one row per record from its nested 'Attachment' dictionary
attachments = [d['Attachment'] for d in data]
print(DataFrame(attachments).to_csv(sep='|', index=False))

10 Comments

Oh geez, big mistake there. Thanks a ton!
I added the check mark for you. On a side note, is there a way to get the print statement to print everything as one value instead of having a line feed after each pair? For example: File ID|764|UpdateUserID|1|Description| etc.
You can disregard that, I figured it out. Thanks again.
Oops, I thought I did. I want each record to be on one line, so 12 key/pairs per line and then a line feed. In my sample above I want 3 lines total.
Thanks!! I originally tried to get this to work with pandas and I got close. I truly appreciate the help.