I'm learning to scrape data with Selenium for Python. I'm writing a bot that collects the table data from a racecar website. The project is split into several files, structured as follows:

RUN.PY:

from CarSearch.methods import Car

with Car() as bot:
    bot.land_first_page()
    bot.language_selection(language=input("Which language ? FRA, ENG or ESP? "))  #FRA #ENG #ESP
    bot.season_selection(year=input("What year ? "))
    bot.datacollection()

INIT.PY:

print('The bot is running now...') 

CONSTANTS.PY:

url = "https://www.statsf1.com/es/default.aspx"

METHODS.PY:

from selenium import webdriver
import CarSearch.constants as const
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions
import pandas as pd

class Car(webdriver.Chrome):
    def __init__(self, driver_path=r"C:\Users\ADMIN\Desktop\PROJECTS\Python\WebScraping_cars_1", teardown=False):
        self.driver_path = driver_path
        self.teardown = teardown
        super(Car, self).__init__()
        self.implicitly_wait(15)
        self.maximize_window()

    def __exit__(self, exc_type, exc_val, exc_tb):
        if self.teardown:  # only close the browser when teardown is True
            self.quit()  # shuts down the driver and closes every window

    def land_first_page(self):
        self.get(const.url)

    def language_selection(self, language):

        if language == 'FRA':
            french = self.find_element_by_class_name(
                'lang1'
            )
            french.click()

        if language == 'ENG':
            english = self.find_element_by_class_name(
                'lang2'
            )
            english.click()

        if language == 'ESP':
            español = self.find_element_by_class_name(
                'lang3'
            )
            español.click()

    def season_selection(self, year):
        season_menu = self.find_element_by_id('ctl00_HL_SeasonH')
        season_menu.click()

        year_season = self.find_element_by_link_text(f'{year}')
        year_season.click()


    def datacollection(self):

        tbl= self.find_element_by_xpath("/html/body/form/div[3]/div[2]/div[3]/div[3]/div[1]/table").get_attribute('outerHTML')
        df=pd.read_html(tbl)
        print(df)
        df.to_csv(r'C:\Users\ADMIN\Desktop\PROJECTS\Python\Test.csv', index=False)

The problem is in the last part of the code: when it tries to create the CSV file, the following error is raised:

AttributeError: 'list' object has no attribute 'to_csv'

Why does this happen? Do I need to make the df variable inherit from the Car class in a different way?

  • First print tbl and check whether it contains anything. Commented Mar 12, 2022 at 16:22
  • The point at issue is that pd.read_html reads "HTML tables into a list of DataFrame objects." (1 DataFrame per html table). So df=pd.read_html(tbl) makes df a list (which doesn't have a to_csv function). You can loop over the list to write the DataFrames to csv like this answer. If you're sure there's only 1 table you can just access the first element and write it out like this answer Commented Mar 12, 2022 at 16:43
  • Thanks, it worked. I used df[0].to_csv(........). Could you share a link on how to arrange the values of df[0] in the CSV, so that each value ends up in its own cell? Commented Mar 12, 2022 at 18:11
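
To make the point from the comments concrete, here is a minimal sketch of what pd.read_html returns and how to_csv can be called on the result. The HTML string, the column names and the output file names are made up for illustration (they stand in for the outerHTML that Selenium returns), and the sketch assumes an HTML parser such as lxml is installed for pandas to use.

```python
import tempfile
from io import StringIO
from pathlib import Path

import pandas as pd

# Hypothetical stand-in for the outerHTML string returned by Selenium.
html = """
<table>
  <tr><th>Driver</th><th>Points</th></tr>
  <tr><td>Driver A</td><td>25</td></tr>
  <tr><td>Driver B</td><td>18</td></tr>
</table>
"""

# read_html returns a list of DataFrames, one per <table> it finds,
# even when the HTML contains only a single table.
tables = pd.read_html(StringIO(html))

# Single table: index into the list first, then call to_csv on the DataFrame.
first = tables[0]
csv_text = first.to_csv(index=False)  # with no path, to_csv returns a string

# Several tables: write each DataFrame to its own file.
# The directory and file names here are illustrative only.
out_dir = Path(tempfile.mkdtemp())
for i, table in enumerate(tables):
    table.to_csv(out_dir / f"table_{i}.csv", index=False)
```

Each DataFrame cell becomes one comma-separated field in the output, so a spreadsheet program that reads commas as the delimiter should show every value in its own cell.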