I'm trying to scrape data from the chart on this website: https://www.spglobal.com/spdji/en/indices/equity/sp-bmv-ipc/#overview

I found the JSON file behind the chart and tried this code to import it into pandas:

import json
import urllib.request
import pandas as pd
url = "https://www.spglobal.com/spdji/en/util/redesign/index-data/get-performance-data-for-datawidget-redesign.dot?indexId=92330739&getchildindex=true&returntype=T-&currencycode=MXN&currencyChangeFlag=false&language_id=1"

with urllib.request.urlopen(url) as url:
    data = json.loads(url.read().decode())

df = pd.DataFrame(data, columns=['indexLevelsHolder'])
Data=df.iloc[3 , 0]

Doing so gives me the "Data" object, which is a list of dictionaries holding the time series data:

[{'effectiveDate': 1309406400000, 'indexId': 92330714, 'effectiveDateInEst': 1309392000000, 'indexValue': 43405.82, 'monthToDateFlag': 'N', 'quarterToDateFlag': 'N', 'yearToDateFlag': 'N', 'oneYearFlag': 'N', 'threeYearFlag': 'N', 'fiveYearFlag': 'N', 'tenYearFlag': 'Y', 'allYearFlag': 'Y', 'fetchedDate': 1626573344000, 'formattedEffectiveDate': '30-Jun-2011'}, .........

The problem is that I cannot find a way to read this JSON data and grab the columns I need (effectiveDate and indexValue).

Is there a way to do this? Thanks.

1 Answer

You can use pd.json_normalize to load the JSON into columns:

import json
import urllib.request
import pandas as pd

url = "https://www.spglobal.com/spdji/en/util/redesign/index-data/get-performance-data-for-datawidget-redesign.dot?indexId=92330739&getchildindex=true&returntype=T-&currencycode=MXN&currencyChangeFlag=false&language_id=1"

with urllib.request.urlopen(url) as url:
    data = json.loads(url.read().decode())

df = pd.json_normalize(data["indexLevelsHolder"]["indexLevels"])
print(df)

Prints:

      effectiveDate   indexId  effectiveDateInEst    indexValue monthToDateFlag quarterToDateFlag yearToDateFlag oneYearFlag threeYearFlag fiveYearFlag tenYearFlag allYearFlag    fetchedDate formattedEffectiveDate
0     1309406400000  92330714       1309392000000  43405.820000               N                 N              N           N             N            N           Y           Y  1626574897000            30-Jun-2011
1     1309492800000  92330714       1309478400000  43693.930000               N                 N              N           N             N            N           Y           Y  1626574897000            01-Jul-2011
2     1309752000000  92330714       1309737600000  43758.130000               N                 N              N           N             N            N           Y           Y  1626574897000            04-Jul-2011
3     1309838400000  92330714       1309824000000  43513.290000               N                 N              N           N             N            N           Y           Y  1626574897000            05-Jul-2011

...and so on.
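
Since the question asks for only effectiveDate and indexValue, here is a minimal sketch of that last step. To stay runnable offline it uses two records copied from the output above (reduced to three fields) in place of the live response; the names `records` and `series` are made up for illustration. It assumes effectiveDate is a Unix timestamp in milliseconds, which matches formattedEffectiveDate for these rows.

```python
import pandas as pd

# Sample records taken from the output above, standing in for
# data["indexLevelsHolder"]["indexLevels"] (illustrative subset of fields).
records = [
    {"effectiveDate": 1309406400000, "indexValue": 43405.82,
     "formattedEffectiveDate": "30-Jun-2011"},
    {"effectiveDate": 1309492800000, "indexValue": 43693.93,
     "formattedEffectiveDate": "01-Jul-2011"},
]

df = pd.json_normalize(records)

# Keep only the two columns of interest; .copy() avoids modifying a view of df.
series = df[["effectiveDate", "indexValue"]].copy()

# Assumption: effectiveDate is epoch time in milliseconds.
series["effectiveDate"] = pd.to_datetime(series["effectiveDate"], unit="ms")

print(series)
```

With the real response, passing data["indexLevelsHolder"]["indexLevels"] instead of `records` should work the same way. The converted timestamps land at 04:00 UTC rather than midnight, so if only the calendar date matters, parsing formattedEffectiveDate may be the simpler choice.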