Parsing nested JSON from API in Python

Question

I'm working on JSON data from this API call: https://api.nfz.gov.pl/app-umw-api/agreements?year=2022&branch=01&productCode=01.0010.094.01&page=1&limit=10&format=json&api-version=1.2

This is page 1, but there are 49 pages in total, therefore a part of my code deals (successfully) with pagination. I don't want to save this JSON in a file and, if I can avoid it, don't really want to import the 'json' package - but will do if necessary.

A variation of this code works correctly if I'm pulling entire ['data']['agreements'] dictionary (or is it a list...). But I don't want that, I want individual parameters for all the 'attributes' of each 'agreement'. In my code below I'm trying to pull the 'provider-name' attribute, and would like to get a list of all the provider names, without any other data there.

But I keep getting the "list indices must be integers or slices, not str" error in line 18. I've tried many ways to get this data which is nested within a list nested within a dictionary, etc. like splitting it further into another 'for' loop, but no success.

import requests
import math
import pandas as pd


baseurl = 'https://api.nfz.gov.pl/app-umw-api/agreements?year=2022&branch=01&productCode=01.0010.094.01&page=1&limit=10&format=json&api-version=1.2'

def main_request(baseurl, x):
    r = requests.get(baseurl + f'&page={x}')
    return r.json()

def get_pages(response):
    return math.ceil(response['meta']['count'] / 10)

def get_names(response):
    providerlist = []
    all_data = response['data']['agreements']
    for attributes1 in all_data ['data']['agreements']:
        item = attributes1['attributes']['provider-name']
        providers = {
            'page1': item,
        }

    providerlist.append(providers)
    return providerlist

mainlist = []
data = main_request(baseurl, 1)
for x in range(1,get_pages(data)+1):
    mainlist.extend(get_names(main_request(baseurl, x)))

mydataframe = pd.DataFrame(mainlist)

print(mydataframe)

The simple solution is that you need to use integers to index lists. If you index with something other than an integer because you expected to be indexing something other than a list, you need to figure out why that object is a list and not what you expected it to be.
– mkrieger1, Commented Jan 14, 2023 at 20:54
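
To make that comment concrete: response['data']['agreements'] is already a list, so indexing it a second time with ['data'] is what raises the TypeError. Below is a minimal sketch of a corrected get_names. It runs against a hand-made sample payload that assumes only the nesting shown in the question; the ids and provider names are placeholders, not real API output.

```python
def get_names(response):
    """Return the provider name of every agreement on one page of results."""
    providerlist = []
    # response['data']['agreements'] is a list of dicts, so iterate over it
    # directly instead of indexing it with string keys a second time.
    for agreement in response['data']['agreements']:
        # Append inside the loop so every agreement is collected.
        providerlist.append(agreement['attributes']['provider-name'])
    return providerlist


# Hypothetical payload that mimics the structure of the API response.
sample = {
    'meta': {'count': 2},
    'data': {
        'agreements': [
            {'id': 'a1', 'attributes': {'provider-name': 'Provider A'}},
            {'id': 'a2', 'attributes': {'provider-name': 'Provider B'}},
        ]
    },
}

names = get_names(sample)
print(names)
```

Note that the append now sits inside the loop. In the question's version it came after the loop, so at most one name per page would have been kept even without the indexing error.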

Andrej Kesely · Accepted Answer · 2023-01-14 21:00:49Z


To get the data from the JSON into a dataframe, you can use the following example:

import requests
import pandas as pd


api_url = "https://api.nfz.gov.pl/app-umw-api/agreements?year=2022&branch=01&productCode=01.0010.094.01&page={}&limit=10&format=json&api-version=1.2"

all_data = []
for page in range(1, 5): # <-- increase page numbers here
    data = requests.get(api_url.format(page)).json()

    for a in data["data"]["agreements"]:
        all_data.append({"id": a["id"], **a["attributes"], "link": a["links"]['related']})

df = pd.DataFrame(all_data)
print(df.head().to_markdown(index=False))

Prints:

id	code	technical-code	origin-code	service-type	service-name	amount	updated-at	provider-code	provider-nip	provider-regon	provider-registry-number	provider-name	provider-place	year	branch	link
75f1b5a0-34d1-d827-8970-89b6b593be86	0113/3202010/01/2022/01	0113/3202010/01/2022/01	0113/3202010/01/2022/01	01	Podstawowa Opieka Zdrowotna	14583.7	2022-07-11T20:04:39	3202010	8851039259	89019398100026	000000001951-W-02	NZOZ PRAKTYKA LEKARZA RODZINNEGO JAN WOLAŃCZYK	JEDLINA-ZDRÓJ	2022	01	https://api.nfz.gov.pl/app-umw-api/agreements/75f1b5a0-34d1-d827-8970-89b6b593be86?format=json&api-version=1.2
1840cf6e-10ba-33a1-81f1-9f58c613d705	0113/3302665/01/2022/01	0113/3302665/01/2022/01	0113/3302665/01/2022/01	01	Podstawowa Opieka Zdrowotna	1479	2022-08-03T20:00:22	3302665	9281731555	390737391	000000023969-W-02	NZOZ "MEDICA"	PĘCŁAW	2022	01	https://api.nfz.gov.pl/app-umw-api/agreements/1840cf6e-10ba-33a1-81f1-9f58c613d705?format=json&api-version=1.2
954eb365-e232-fd29-10f7-c8af21c07470	0113/3402005/01/2022/01	0113/3402005/01/2022/01	0113/3402005/01/2022/01	01	Podstawowa Opieka Zdrowotna	1936	2022-09-02T20:01:17	3402005	6121368883	23106871400021	000000002014-W-02	PRZYCHODNIA OGÓLNA TSARAKHOV OLEG	BOLESŁAWIEC	2022	01	https://api.nfz.gov.pl/app-umw-api/agreements/954eb365-e232-fd29-10f7-c8af21c07470?format=json&api-version=1.2
7dd72607-ab9f-7217-87b9-8e4ed2bc5537	0113/3202025/01/2022/01	0113/3202025/01/2022/01	0113/3202025/01/2022/01	01	Podstawowa Opieka Zdrowotna	0	2022-04-14T20:01:42	3202025	8851557014	891487450	000000002063-W-02	"PRZYCHODNIA LEKARSKA ZDROWIE BIELAK, PIEC I SZYMANIAK SPÓŁKA PARTNERSKA"	NOWA RUDA	2022	01	https://api.nfz.gov.pl/app-umw-api/agreements/7dd72607-ab9f-7217-87b9-8e4ed2bc5537?format=json&api-version=1.2
bb60b21d-38da-1f2e-a7fd-5a45453e7370	0113/3102115/01/2022/01	0113/3102115/01/2022/01	0113/3102115/01/2022/01	01	Podstawowa Opieka Zdrowotna	414	2022-10-18T20:01:17	3102115	8941504470	93009444900038	000000001154-W-02	PRAKTYKA LEKARZA RODZINNEGO WALDEMAR CHRYSTOWSKI	WROCŁAW	2022	01	https://api.nfz.gov.pl/app-umw-api/agreements/bb60b21d-38da-1f2e-a7fd-5a45453e7370?format=json&api-version=1.2
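
The hard-coded range(1, 5) can be replaced by a page count derived from the response itself, as the question's get_pages does. This is a rough sketch, assuming that meta.count holds the total number of agreements (as the question's code implies) and that the page size is 10. The names page_count, collect_all and fake_fetch are hypothetical, and fake_fetch stands in for the real HTTP call so that the example runs offline.

```python
import math


def page_count(total_items, limit=10):
    """Number of pages needed to hold total_items at limit items per page."""
    return math.ceil(total_items / limit)


def collect_all(fetch_page, limit=10):
    """Fetch every page via fetch_page(page_number) and flatten the agreements."""
    first = fetch_page(1)
    pages = page_count(first["meta"]["count"], limit)
    rows = []
    for page in range(1, pages + 1):
        # Reuse the first response instead of requesting page 1 twice.
        data = first if page == 1 else fetch_page(page)
        for a in data["data"]["agreements"]:
            rows.append({"id": a["id"], **a["attributes"]})
    return rows


def fake_fetch(page, limit=10, total=25):
    """Offline stand-in for the API call; returns made-up agreements."""
    start = (page - 1) * limit
    stop = min(start + limit, total)
    agreements = [
        {"id": str(i), "attributes": {"provider-name": f"Provider {i}"}}
        for i in range(start, stop)
    ]
    return {"meta": {"count": total}, "data": {"agreements": agreements}}


rows = collect_all(fake_fetch)
print(len(rows))
```

With the real API, fetch_page could be something like lambda page: requests.get(api_url.format(page)).json(), using the api_url template from the answer.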

3 Comments

Michael Wiz Over a year ago

Andrej, thank you very much for your help. That's neat code; I'll learn from it. But I'm still stuck on part of my initial question: what if I don't want all the 'attributes' but only a couple of them, e.g. only 'amount' and 'provider-code'? I'm trying things like all_data.append({**a["attributes"]['amount']}) and that's not working...

Andrej Kesely Over a year ago

@MichaelWiz Then construct the dataframe as I showed in the answer, and filter it for the columns you want. For example, df = df[['id', 'code']] will give you a dataframe with only those two columns.

Michael Wiz Over a year ago

Great stuff. Haven't thought of that. Works beautifully. Thank you!
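
For reference, the two approaches from this comment thread can be sketched as follows. The attempt {**a["attributes"]['amount']} fails because ** unpacks a mapping, whereas a["attributes"]['amount'] is a single value. Picking the keys explicitly while building the rows, or filtering the columns afterwards, both work. The agreements list below is a made-up stand-in for data["data"]["agreements"], and wanted is a hypothetical name.

```python
import pandas as pd

# Made-up stand-in for data["data"]["agreements"] from the API response.
agreements = [
    {"id": "a1", "attributes": {"amount": 100.5, "provider-code": "111", "year": 2022}},
    {"id": "a2", "attributes": {"amount": 0, "provider-code": "222", "year": 2022}},
]

wanted = ["amount", "provider-code"]

# Option 1: pick only the wanted keys while building the rows.
rows = [{key: a["attributes"][key] for key in wanted} for a in agreements]
df_picked = pd.DataFrame(rows)

# Option 2: build the full dataframe, then keep only the wanted columns.
df_full = pd.DataFrame([{"id": a["id"], **a["attributes"]} for a in agreements])
df_filtered = df_full[wanted]

print(df_picked)
```

Option 2 is what the answer's author suggests. Option 1 avoids carrying unused columns through the loop, which may matter when collecting many pages.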
