Web scraping in Python for loop issue doesn't return expected data

Question

I'm scraping the F1 website with BeautifulSoup. I use a for loop to pull the data I need from the page, but I only get one result back instead of every result within the class.

My code is below:

import requests
from bs4 import BeautifulSoup
from csv import writer

page = requests.get("https://www.formula1.com/")

soup = BeautifulSoup(page.content, 'html.parser')
data = soup.find_all("div", class_="race-list")

for container in data:
    countryname = container.find_all("span", class_="name")
    country = countryname[0].text
    racetype = container.find_all("span", class_="race-type")
    rtype = racetype[0].text
    racetime = container.find_all("time", class_="day")
    racetimename = racetime[0].text.replace("\n", "").strip()
    print(country)

My Current Output -

Australia

Expected Output -

Australia

Bahrain

China

etc

Thanks in advance!

Are you sure the HTML elements you expect are actually received? Try finding them manually in the parsed tree, because it could easily be a dynamically loaded page, where you're better off with Selenium than BeautifulSoup+Requests.
– Nemanja Radojković, Commented Jan 30, 2019 at 14:24
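
One way to act on this comment is to count how many matching elements the parser actually finds in the HTML that was received. The following is a minimal sketch, not the commenter's own code: `count_matches` is a hypothetical helper and `sample_html` is made-up markup standing in for `page.content`.

```python
from bs4 import BeautifulSoup


def count_matches(html, tag, css_class):
    """Return how many <tag class="css_class"> elements the parsed HTML contains."""
    soup = BeautifulSoup(html, "html.parser")
    return len(soup.find_all(tag, class_=css_class))


# Hypothetical static snippet standing in for page.content
sample_html = """
<div class="race-list">
  <span class="name">Australia</span>
  <span class="name">Bahrain</span>
</div>
"""

print(count_matches(sample_html, "span", "name"))       # elements present
print(count_matches(sample_html, "span", "race-type"))  # elements absent
```

With the real response you would call `count_matches(page.content, "div", "race-list")`. A count of 0 for elements that are visible in the browser suggests the markup is filled in by JavaScript after the page loads, which Requests alone will not see.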

DirtyBit · Accepted Answer · 2019-01-30 15:27:55Z

The culprit:

country = countryname[0].text

The reason:

There are 21 countries, but you only fetch the first one, at index 0:

country = countryname[0].text

The answer:

Loop over countryname to get every element instead of only the first:

import requests
from bs4 import BeautifulSoup
from csv import writer

page = requests.get("https://www.formula1.com/")

soup = BeautifulSoup(page.content, 'html.parser')
data = soup.find_all("div", class_="race-list")

for container in data:
    # find_all returns every matching span inside this container
    countryname = container.find_all("span", class_="name")
    # Iterate over all matches instead of taking only countryname[0]
    for count in countryname:
        country = count.text
        print(country)

OUTPUT:

Australia
Bahrain
China
Azerbaijan
Spain
Monaco
Canada
France
Austria
Great Britain
Germany
Hungary
Belgium
Italy
Singapore
Russia
Japan
Mexico
United States
Brazil
Abu Dhabi

edited Jan 30, 2019 at 15:27

answered Jan 30, 2019 at 14:23

DirtyBit


5 Comments

bruno desthuilliers Over a year ago

No need to use range() nor len() nor indexed access, just iterate over countryname (docs.python.org/3/tutorial/controlflow.html#for-statements)

DirtyBit Over a year ago

@brunodesthuilliers Ah, Good catch. Thank you, fixed.

Kronos Over a year ago

Thanks, how would I add in the racetype and racename, so that the data would look like Australia, Qualifying, Sat ?

DirtyBit Over a year ago

Append them in the print statement, like: print(country, ", ", rtype)

bruno desthuilliers Over a year ago

Or use string formatting: print("{}, {}, {}".format(country, rtype, racetimename))
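
To get one line per race in the form asked for above (Australia, Qualifying, Sat), the three lists can be walked in parallel. This is a sketch under assumptions: `sample_html` is invented markup, the real page may be laid out differently, and `extract_rows` is a hypothetical helper. It also assumes the i-th name, race type and day belong to the same race.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; the real page structure may differ
sample_html = """
<div class="race-list">
  <span class="name">Australia</span>
  <span class="race-type">Qualifying</span>
  <time class="day">
    Sat
  </time>
  <span class="name">Bahrain</span>
  <span class="race-type">Race</span>
  <time class="day">
    Sun
  </time>
</div>
"""


def extract_rows(html):
    """Return a list of (country, race type, day) tuples found in the HTML."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for container in soup.find_all("div", class_="race-list"):
        names = container.find_all("span", class_="name")
        types = container.find_all("span", class_="race-type")
        days = container.find_all("time", class_="day")
        # zip pairs the i-th name, type and day; it stops at the shortest list
        for name, rtype, day in zip(names, types, days):
            rows.append((name.text.strip(), rtype.text.strip(), day.text.strip()))
    return rows


for country, rtype, day in extract_rows(sample_html):
    print("{}, {}, {}".format(country, rtype, day))
```

Because `zip` silently stops at the shortest list, it is worth comparing the three list lengths first if some races might be missing a field.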
