I'm scraping the F1 website with BeautifulSoup. A for loop pulls the data I need from each matching element, but I only get one result back instead of every result within the class.

My code is below:

import requests
from bs4 import BeautifulSoup
from csv import writer

page = requests.get("https://www.formula1.com/")

soup = BeautifulSoup(page.content, 'html.parser')
data = soup.find_all("div", class_="race-list")

for container in data:
    countryname = container.find_all("span", class_="name")
    country = countryname[0].text
    racetype = container.find_all("span", class_="race-type")
    rtype = racetype[0].text
    racetime = container.find_all("time", class_="day")
    racetimename = racetime[0].text.replace("\n", "").strip()
    print(country)

My Current Output -

Australia

Expected Output -

Australia

Bahrain

China

etc

Thanks in advance!

  • Are you sure the HTML elements you expect are actually received? Try finding them manually in the parsed tree, because it could easily be a dynamically loaded page, where you're better off with Selenium than BeautifulSoup+Requests. Commented Jan 30, 2019 at 14:24
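
A rough sketch of the check suggested in the comment above: parse the HTML that was actually received and count how many elements match each tag and class the loop relies on. The helper name `count_matches` and the sample markup are hypothetical, used so the snippet runs without a network request; with the real page, `page.content` would be passed in instead.

```python
from bs4 import BeautifulSoup

def count_matches(html, selectors):
    """Return how many elements each (tag, class) pair matches in html."""
    soup = BeautifulSoup(html, "html.parser")
    return {
        "{}.{}".format(tag, cls): len(soup.find_all(tag, class_=cls))
        for tag, cls in selectors
    }

# Hypothetical stand-in for page.content; the real markup may differ
sample_html = """
<div class="race-list">
  <span class="name">Australia</span>
  <span class="name">Bahrain</span>
</div>
"""

counts = count_matches(
    sample_html,
    [("div", "race-list"), ("span", "name"), ("time", "day")],
)
print(counts)
```

A count of zero for a class that is visible in the browser suggests the element is added by JavaScript after the page loads, in which case Requests alone will not see it.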

1 Answer

The culprit:

country = countryname[0].text

The reason:

find_all returns 21 country elements, but you're only reading the first one, at index 0, i.e.

country = countryname[0].text

The answer:

Loop through the 'countryname' to find all the elements:

import requests
from bs4 import BeautifulSoup

page = requests.get("https://www.formula1.com/")

soup = BeautifulSoup(page.content, 'html.parser')
data = soup.find_all("div", class_="race-list")

for container in data:
    # Collect every matching element in the container, not just the first
    countryname = container.find_all("span", class_="name")
    racetype = container.find_all("span", class_="race-type")
    racetime = container.find_all("time", class_="day")
    # Walk the three lists in step; this assumes each race has one name,
    # one race type and one day, listed in the same order
    for name_tag, type_tag, time_tag in zip(countryname, racetype, racetime):
        country = name_tag.text
        rtype = type_tag.text
        racetimename = time_tag.text.replace("\n", "").strip()
        print(country)

OUTPUT:

Australia
Bahrain
China
Azerbaijan
Spain
Monaco
Canada
France
Austria
Great Britain
Germany
Hungary
Belgium
Italy
Singapore
Russia
Japan
Mexico
United States
Brazil
Abu Dhabi
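
To collect the race type and day alongside each country, one option is to walk the three result lists together with zip. This is a minimal sketch, assuming the three kinds of elements appear once per race and in the same order. The markup is a made-up stand-in that only mimics the class names used above, and `extract_races` is a hypothetical helper name.

```python
from bs4 import BeautifulSoup

# Made-up markup that mimics the class names used above; the real page may differ
sample_html = """
<div class="race-list">
  <span class="name">Australia</span>
  <span class="race-type">Qualifying</span>
  <time class="day">
    Sat
  </time>
  <span class="name">Bahrain</span>
  <span class="race-type">Race</span>
  <time class="day">
    Sun
  </time>
</div>
"""

def extract_races(html):
    """Return (country, race type, day) tuples for every race found in html."""
    soup = BeautifulSoup(html, "html.parser")
    races = []
    for container in soup.find_all("div", class_="race-list"):
        names = container.find_all("span", class_="name")
        types = container.find_all("span", class_="race-type")
        days = container.find_all("time", class_="day")
        # zip stops at the shortest list, so a missing element drops trailing rows
        for name, rtype, day in zip(names, types, days):
            races.append((name.text.strip(), rtype.text.strip(), day.text.strip()))
    return races

races = extract_races(sample_html)
for country, rtype, day in races:
    print("{}, {}, {}".format(country, rtype, day))
```

Returning tuples rather than printing inside the loop keeps the parsing separate from the output, so the same rows could later be written to a CSV file.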

5 Comments

No need to use range() nor len() nor indexed access, just iterate over countryname (docs.python.org/3/tutorial/controlflow.html#for-statements)
@brunodesthuilliers Ah, Good catch. Thank you, fixed.
Thanks, how would I add in the racetype and racetime, so that the data would look like Australia, Qualifying, Sat?
Add them to the print statement, like: print(country, ", ", rtype)
Or use string formatting: print("{}, {}, {}".format(country, rtype, racetimename))
