Hello, I need to transform the COVID-19 hospitalisation JSON data from this government webpage: https://onemocneni-aktualne.mzcr.cz/covid-19#panel3-hospitalization

I inspected the webpage and identified the table in its HTML: the data sits in the data-table attribute of the div with id js-hospitalization-table-data.

I used the following Python code to read it:

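import bs4 as bs
import urllib.request
import json

# Download the page and parse it; naming the parser explicitly avoids
# BeautifulSoup's "no parser was explicitly specified" warning.
source = urllib.request.urlopen("https://onemocneni-aktualne.mzcr.cz/covid-19#panel3-hospitalization")
soup = bs.BeautifulSoup(source, "html.parser")
js_test = soup.find("div", id="js-hospitalization-table-data")

# The data-table attribute holds a JSON string; decode it into a dict
jsonData = json.loads(js_test.attrs["data-table"])
print(jsonData["body"])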

Thank you.

  • What's a "data-table"? Commented Dec 21, 2020 at 18:10
  • I was thinking of a .csv file ... Commented Dec 21, 2020 at 18:20
  • From the output you're getting, it looks like data-table is in JSON format, so you would need to convert data in that format into CSV — which may or may not be possible because the latter doesn't support nested data structures while the other does. Commented Dec 21, 2020 at 18:31
  • I tried to import this JSON data into XLS or Power BI, but without any success. So my idea was to extract the text behind "body": and transform the data in [ xxx ] into a .csv file. Any idea? Thank you. Commented Dec 21, 2020 at 18:59
  • Sorry, I don't know how to use beautifulsoup, but if you can get just the value of data-table from it or what it's returning, then I might be able to help convert it into CSV format. Commented Dec 21, 2020 at 19:36

1 Answer

The data you want is in JSON format. Using the built-in json module, you can convert it to a Python dictionary (dict) and then read the rows under the body key.

import json
import bs4 as bs
import urllib.request

source = urllib.request.urlopen(
    "https://onemocneni-aktualne.mzcr.cz/covid-19#panel3-hospitalization"
)
soup = bs.BeautifulSoup(source, "html.parser")

json_data = json.loads(
    soup.find("div", id="js-hospitalization-table-data")["data-table"]
)

print(type(json_data))
print(*json_data["body"])

Output (partial):

<class 'dict'>
['01.03.2020', 0, 0, 0, 0, 0] ['02.03.2020', 0, 0, 0, 0, 0] ... ['20.12.2020', 4398, 588, 0.1337, 34796, 0.7152]
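
If the goal is a .csv file, the rows under body can be written out with the standard csv module. The following is only a minimal sketch: it assumes every row is a flat list like the ones in the output above, it uses three sample rows copied from that output instead of the live page, and the column names are hypothetical placeholders because the real header is not shown here.

```python
import csv
import io

# Sample rows copied from the partial output above; in the real script
# this would be json_data["body"].
rows = [
    ["01.03.2020", 0, 0, 0, 0, 0],
    ["02.03.2020", 0, 0, 0, 0, 0],
    ["20.12.2020", 4398, 588, 0.1337, 34796, 0.7152],
]

# Hypothetical column names: the real header is not shown in the output.
columns = ["date", "col_1", "col_2", "col_3", "col_4", "col_5"]


def rows_to_csv(rows, columns):
    """Return the rows as CSV text, preceded by one header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(columns)
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = rows_to_csv(rows, columns)
print(csv_text)
```

To save a file instead of printing, the same writer calls can be pointed at a file opened with open("hospitalization.csv", "w", newline="", encoding="utf-8"); that file name is also just an example.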

2 Comments

Thank you very much. However, I have not found out how to transform this dictionary into a table or DataFrame. Could you please add this for me? Thank you.
@Jara That's a different question. Please see "How to convert JSON File to Dataframe". If you are still stuck, consider asking a new question here on Stack Overflow.
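
For readers with the same follow-up, a rough sketch of a table-like structure is given here using only the standard library. It assumes each row is a flat list whose first item is a date in DD.MM.YYYY form, as in the answer's output; the sample rows are copied from that output and the column names are hypothetical. A list of dicts like this is also a shape that pandas.DataFrame accepts, if pandas is available.

```python
from datetime import datetime

# Sample rows copied from the answer's partial output; in the real script
# this would be json_data["body"].
rows = [
    ["01.03.2020", 0, 0, 0, 0, 0],
    ["20.12.2020", 4398, 588, 0.1337, 34796, 0.7152],
]

# Hypothetical column names: the page's real header is not shown above.
columns = ["date", "col_1", "col_2", "col_3", "col_4", "col_5"]


def rows_to_records(rows, columns):
    """Turn each row into a dict keyed by column name, parsing the date."""
    records = []
    for row in rows:
        record = dict(zip(columns, row))
        # Dates on the page use the DD.MM.YYYY format.
        record["date"] = datetime.strptime(record["date"], "%d.%m.%Y").date()
        records.append(record)
    return records


records = rows_to_records(rows, columns)
for record in records:
    print(record)
```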
