0

hopefully there is a kind soul out there which will offer to graciously answer my question as I continue my learning experience.

I have written the following simple python code to bring back a sporting web site I would like to scrape.

I want to be able to scrape from race-1 to race X without having to manually change the url. So I was wondering is it possible to include a loop to scrape all races from 1 to end and consolidate in one print file.

import requests

url = "https://s3-ap-southeast-2.amazonaws.com/racevic.static/2018-01-01/flemington/sectionaltimes/race-1.json?callback=sectionaltimes_callback"

payload={}
headers = {}

response = requests.request("GET", url, headers=headers, data=payload)

print(response.text)

1 Answer 1

1

Use string formatting to insert the current race number into the URL.

import requests
payload = {}
headers = {}

for race in range(1, X+1):
    url = f"https://s3-ap-southeast-2.amazonaws.com/racevic.static/2018-01-01/flemington/sectionaltimes/race-{race}.json?callback=sectionaltimes_callback"
    response = requests.request("GET", url, headers=headers, data=payload)
    json_string = re.sub(r'^sectionaltimes_callback\((.*)\)$', r'\1', response.text)
    data = json.loads(json_string)
    print(data)
Sign up to request clarification or add additional context in comments.

5 Comments

Thank you Barmar I have made your suggested changes and get the following error msg "name 'X' is not defined". So I am not sure whether you have placed "X" as just a generic symbol for me to include the relevant field or there is something missing.
You said you wanted to go up to race X. I assumed you had defined X. Use whatever variable you want for the limit.
Haha sorry my mistake - works a treat. Thank you
Would it be possible to ask for your further help in getting this into a JSON format. You helped answer a similar question because the output comes back as JSON with padding. Now that I am calling back multiple events that padding appears throughout the string just not at the start. So I don't think the previous re.sub() coding solution fits anymore. The string starts with: sectionaltimes_callback({"Horses":[{"Comment" and the sectionaltimes_callback function is embedded at the start of each race result.
Yes, the same solution works here.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.