
I'm trying to extract data from a website. I need to enter a value in the search box and run the search, which generates a table. Once the table is generated, I need to write its details to a text file or insert them into a database. This is what I have tried so far.

Website: https://commtech.byu.edu/noauth/classSchedule/index.php Search text: "C S 142"

Sample Code

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.keys import Keys

from selenium.webdriver.chrome.service import Service

from selenium.webdriver.chrome.options import Options
c_options = Options()
c_options.add_experimental_option("detach", True)

s = Service('C:/Users/sidat/OneDrive/Desktop/python/WebDriver/chromedriver.exe')



URL = "http://saasta.byu.edu/noauth/classSchedule/index.php"
driver = webdriver.Chrome(service=s, options=c_options)
driver.get(URL)
element = driver.find_element("id", "searchBar")
element.send_keys("C S 142", Keys.RETURN)
search_button = driver.find_element("id", "searchBtn")
search_button.click()

table = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.XPATH, "//*[@id='sectionTable']")))

rows = table.find_elements("xpath", "//tr")

for row in rows:
    cells = row.find_elements(By.TAG_NAME, "td")
    for cell in cells:
        print(cell.text)

I'm using PyCharm 2022.3 to write and test the code. My code prints nothing. Please help me solve this problem and extract the data to a text file and to an SQL database table.

2 Answers


The following code prints the content of the table you asked for.
Wait for an element to be clickable before you click it or send text to it, and wait for it to be visible before you read its text content.

from selenium import webdriver
from selenium.webdriver import Keys
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

options = Options()
options.add_argument("start-maximized")

webdriver_service = Service(r'C:\webdrivers\chromedriver.exe')  # raw string so the backslashes are not read as escapes
driver = webdriver.Chrome(options=options, service=webdriver_service)
wait = WebDriverWait(driver, 30)

url = "http://saasta.byu.edu/noauth/classSchedule/index.php"
driver.get(url)

wait.until(EC.element_to_be_clickable((By.ID, "searchBar"))).send_keys("C S 142", Keys.RETURN)
wait.until(EC.element_to_be_clickable((By.ID, "searchBtn"))).click()

table = WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.XPATH, "//*[@id='sectionTable']")))
headers = table.find_elements("xpath", ".//thead//th")
cells = table.find_elements("xpath", ".//tbody//td")

headers_text = ""
for header in headers:
    cell_text = header.text
    headers_text = headers_text + cell_text.ljust(10)

cells_text = ""
for cell in cells:
    c_text = cell.text
    cells_text = cells_text + c_text.ljust(10)

print(headers_text)
print(cells_text)

The output is:

Section   Type      Mode      InstructorCredits   Term      Days      Start     End       Location  Available Waitlist  
002       DAY       Classroom           3.00                                              TBA       0/0       0
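
The question also asks about saving the table to a text file and to an SQL table. Below is a minimal sketch of one way to do that, assuming SQLite via Python's built-in `sqlite3` module and a tab-separated text file. The helper names (`group_cells`, `save_text`, `save_sqlite`), the table name `sections` and the output file name are all hypothetical, and the sample lists are simply copied from the output shown above as stand-ins for the scraped values.

```python
import csv
import os
import sqlite3
import tempfile


def group_cells(cell_texts, n_cols):
    """Split a flat list of cell texts into rows of n_cols cells each."""
    return [cell_texts[i:i + n_cols] for i in range(0, len(cell_texts), n_cols)]


def save_text(path, headers, rows):
    """Write the header line and the rows as tab-separated text."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f, delimiter="\t")
        writer.writerow(headers)
        writer.writerows(rows)


def save_sqlite(conn, table, headers, rows):
    """Create a table with one TEXT column per header and insert the rows."""
    cols = ", ".join(f'"{h}" TEXT' for h in headers)
    placeholders = ", ".join("?" for _ in headers)
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({cols})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)
    conn.commit()


# Stand-in data copied from the printed output above (not scraped here).
headers = ["Section", "Type", "Mode", "Instructor", "Credits", "Term",
           "Days", "Start", "End", "Location", "Available", "Waitlist"]
cell_texts = ["002", "DAY", "Classroom", "", "3.00", "",
              "", "", "", "TBA", "0/0", "0"]

rows = group_cells(cell_texts, len(headers))

# Hypothetical output location; any writable path works.
out_path = os.path.join(tempfile.gettempdir(), "sections.txt")
save_text(out_path, headers, rows)

# In-memory database for the demo; pass a file name to keep the data.
conn = sqlite3.connect(":memory:")
save_sqlite(conn, "sections", headers, rows)
stored = conn.execute(
    'SELECT "Section", "Credits", "Location" FROM sections'
).fetchall()
print(stored)
```

With the Selenium code above, the two lists would come from `[h.text for h in headers]` and `[c.text for c in cells]` rather than from literals. Every column is stored as TEXT to keep the sketch short; a real schema would probably use proper types and column names chosen by hand.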

Try this:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options

c_options = Options()
c_options.add_experimental_option("detach", True) 
s = Service('C:/Users/sidat/OneDrive/Desktop/python/WebDriver/chromedriver.exe')

driver = webdriver.Chrome(service=s, options=c_options)  # pass the service and options defined above
URL = "http://saasta.byu.edu/noauth/classSchedule/index.php"
driver.get(URL)
driver.maximize_window()
element = driver.find_element("id", "searchBar")
element.send_keys("C S 142", Keys.RETURN)
search_button = driver.find_element("id", "searchBtn")
search_button.click()

header = WebDriverWait(driver, 10).until(EC.presence_of_all_elements_located((By.XPATH, "//table[@id='sectionTable']/thead/tr/th")))
for th in header:
    print(f"{th.get_attribute('textContent')}")


rows = WebDriverWait(driver, 10).until(EC.presence_of_all_elements_located((By.XPATH, "//table[@id='sectionTable']/tbody/tr")))
for i in range(0, len(rows)):
    cells = WebDriverWait(driver, 10).until(EC.presence_of_all_elements_located((By.XPATH, f"(//table[@id='sectionTable']/tbody/tr)[{i+1}]//td")))
    for cell in cells:
        print(cell.get_attribute('textContent'))

You are right to wait for the table, but when that wait returns the table is not fully loaded yet (its td cells are still missing). Wait for a cell instead:

    WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.XPATH, "//*[@id='sectionTable']//td")))

That way the wait only returns once at least one td element is present in the table.
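
To show why waiting for the right condition matters, here is a small pure-Python sketch of the polling idea behind an explicit wait. It is not Selenium's implementation; `wait_until` and `FakeTable` are hypothetical names, and `FakeTable` only imitates a table whose cells show up a little after the table itself.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Raises TimeoutError if nothing truthy is returned within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(poll)


class FakeTable:
    """Stand-in for a table that exists at once but gets its cells later."""

    def __init__(self, ready_after):
        self._ready_at = time.monotonic() + ready_after

    def cells(self):
        # An empty list is falsy, so the wait keeps polling until cells exist.
        if time.monotonic() >= self._ready_at:
            return ["002", "DAY"]
        return []


table = FakeTable(ready_after=0.2)

immediate = table.cells()  # read straight away: the cells are not there yet
waited = wait_until(table.cells, timeout=2.0, poll=0.05)

print(immediate)
print(waited)
```

Reading the cells immediately gives an empty list, which matches the original symptom of nothing being printed, while polling on the cells themselves returns them once they exist.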

3 Comments

Thanks for your support. This now prints all of the information, but I only need the table that appears below the string "Sections that match your search".
Also, do you know how to write this table data into an SQL table?
I updated my code to get the data you want: first the header of the table, then the content of the rows. Writing this information to an SQL database is a totally different topic. Can you mark the answer as accepted if it helped you?
