Before you judge and think to yourself 'I've seen this question a thousand times', please read on and give me a chance. I've scoured the internet for answers but can't seem to find any help.

So I am trying to parse json from an API which I obtain using the following code:

import requests

# companyno is the company number, held as a string
url = 'https://api.companieshouse.gov.uk/company/' + companyno + '/charges'
r = requests.get(url, auth=('API KEY', ''))
data = r.json()

I can get specific values from the json pretty easily, for example using this code:

for each in data['items']:
    print(each['created_on'])

which returns:

2016-11-08
2016-11-08
2016-11-08
2015-03-27
2015-03-27
2015-03-27
2015-03-27
2015-03-27
2007-10-10
2007-09-28
2007-09-19

This is exactly what I want, and it works for other keys too.

However, there is one bit of the JSON which I just can't seem to access. I have slightly edited the JSON to avoid releasing sensitive data (it is publicly available anyway, but just to be cautious); it is otherwise largely unchanged:

{  
   'items':[  
      {  
         'created_on':'2016-11-08',
         'etag':'DELETED',
         'classification':{  
            'type':'charge-description',
            'description':'A registered charge'
         },
         'particulars':{  
            'contains_negative_pledge':True,
            'description':'DELETED',
            'type':'brief-description'
         },
         'transactions':[  
            {  
               'links':{  
                  'filing':'DELETED'
               },
               'filing_type':'create-charge-with-deed',
               'delivered_on':'2016-11-21'
            }
         ],
         'links':{  
            'self':'DELETED'
         },
         'charge_code':'DELETED',
         'delivered_on':'2016-11-21',
         'status':'outstanding',
         'persons_entitled':[  
            {  
               'name':'DELETED'
            },
            {  
               'name':'DELETED'
            }
         ],
         'charge_number':59
      },
      {  
         'transactions':[  
            {  
               'delivered_on':'2016-11-10',
               'links':{  
                  'filing':'DELETED'
               },
               'filing_type':'create-charge-with-deed'
            }
         ],
         'particulars':{  
            'contains_negative_pledge':True,
            'contains_fixed_charge':True,
            'floating_charge_covers_all':True,
            'contains_floating_charge':True
         },
         'persons_entitled':[  
            {  
               'name':'DELETED'
            }
         ],
         'charge_number':58,
         'status':'outstanding',
         'charge_code':'DELETED',
         'links':{  
            'self':'DELETED'
         },

The value I can't reach is the one at items -> links -> self.

If I use the following code to try and access it:

for each in data['items']['links']:
    print(each['self'])  

I get the following error:

  File "<stdin>", line 1, in <module>
TypeError: list indices must be integers or slices, not str

And if I try to access it using the following code:

for each in data['items'][0]['links']:
    print(each['self'])

I get this error:

  File "<stdin>", line 2, in <module>
TypeError: string indices must be integers

I just can't understand why it's throwing this error at me. All the other responses on the internet suggest that I am trying to parse a string and not an actual JSON object, but whenever I run the command:

type(data)

it returns:

<class 'dict'>

So I know it is a dictionary and that I should be able to iterate through its keys.

I'm sure I'm making a very stupid mistake and if so I apologize but I just can't figure out what I'm doing wrong, can anyone help?

p.s. sorry for the long post.

EDIT:

Thanks so much for all your replies, things seem to make sense now. We were all learners at one point! :)

  • Can you post a working json? The one you've posted seems incomplete, so it's hard to test it. Commented Jan 24, 2017 at 18:15

3 Answers

8

The 'links' key is pointing to a dictionary value, so iterating over it gives you the key(s) of the dictionary which in this case is 'self'. You should do:

for k, v in data['items'][0]['links'].items():
    if k == 'self':
        print(v)

Or you can simply access the value at key 'self' without iterating:

print(data['items'][0]['links']['self'])
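To see where the two errors in the question come from, here is a small sketch that reproduces them on a made-up dictionary shaped like the API response. The link values and the describe_failure helper are hypothetical and only there for illustration.

```python
# Hypothetical sample mirroring the shape of the response in the question.
data = {
    "items": [
        {"created_on": "2016-11-08", "links": {"self": "link-1"}},
        {"created_on": "2015-03-27", "links": {"self": "link-2"}},
    ]
}


def describe_failure(func):
    """Call func and return the name of the exception it raises, or None."""
    try:
        func()
    except Exception as exc:
        return type(exc).__name__
    return None


# data['items'] is a list, so indexing it with a string fails.
first_error = describe_failure(lambda: data["items"]["links"])

# data['items'][0]['links'] is a dict; iterating over it yields its keys.
keys_seen = list(data["items"][0]["links"])

# Each key is the string 'self', so each['self'] indexes a string with a string.
second_error = describe_failure(lambda: keys_seen[0]["self"])

# Indexing the dict directly gives the value.
first_link = data["items"][0]["links"]["self"]

print(first_error, keys_seen, second_error, first_link)
```

Both failing expressions raise TypeError, for the two different reasons noted in the comments.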

2 Comments

Or print(data['items'][0]['links'].get('self')) in case this key doesn't exist.
Thank you so much, feel like something just clicked about parsing json :)
3

In case there's more than one item in that response, do

for item in data['items']:
    print(item['links']['self'])

otherwise print(data['items'][0]['links']['self']) is sufficient.

Alternatively, you could use JSONPath (which is similar to XPath):

import jsonpath_rw as jp

for match in jp.parse('items[*].links.self').find(data):
    print(match.value)
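If some items might lack a 'links' or 'self' key, the plain loop above would raise a KeyError. A possible sketch that combines the loop with dict.get() is below; collect_self_links and the sample data are made up for illustration, and whether missing links should be skipped silently depends on your use case.

```python
def collect_self_links(data):
    """Return the 'self' link of every item, skipping items that have none."""
    links = []
    for item in data.get("items", []):
        # .get() returns a default instead of raising KeyError on a missing key.
        self_link = item.get("links", {}).get("self")
        if self_link is not None:
            links.append(self_link)
    return links


# Hypothetical sample: one complete item, one without 'links', one with empty 'links'.
sample = {
    "items": [
        {"links": {"self": "link-1"}},
        {"charge_number": 58},
        {"links": {}},
    ]
}

print(collect_self_links(sample))
```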

3

data['items'][0]['links'] is a dict: { 'self':'DELETED' }

Iterating over a dict yields its keys: in this case, just one, 'self'. Evaluating 'self'['self'] accounts for your error.

What you want is possibly just data['items'][0]['links']['self'], or to iterate over one of the dict methods .keys(), .values() or .items():

for key, value in data['items'][0]['links'].items():
    print(key, ":", value)
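For comparison, a quick sketch of what each of those dict methods yields, using a one-entry dict with a placeholder value standing in for the real link:

```python
# Placeholder dict standing in for data['items'][0]['links'].
links = {"self": "link-1"}

plain = list(links)            # iterating the dict itself yields keys
keys = list(links.keys())      # same keys, stated explicitly
values = list(links.values())  # the values only
pairs = list(links.items())    # (key, value) tuples

print(plain, keys, values, pairs)
```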

1 Comment

Beaten to it. Sigh.
