2

This URL contains a table that spans multiple pages. I need to get the data from every row and column of the table, across all pages, but I don't understand how to read data from the table. Below is the code I have so far:

from selenium import webdriver
import os
import time
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as ec
from selenium.webdriver.support.ui import Select
from pynput.keyboard import Key, Controller

curr_path = os.path.dirname(os.path.abspath(__file__))

keyboard = Controller()

driver = webdriver.Firefox()
driver.get("http://silk.dephut.go.id/index.php/info/iuiphhk")
driver.maximize_window()

The code above opens Firefox and loads the URL. I use the code below to click through to the next page:

next_btn = (By.XPATH, "//div[@id='silk_content_wrapper']//ul[1]//li[4]//a[1]")
WebDriverWait(driver, 30).until(ec.element_to_be_clickable(next_btn)).click() 

But I don't understand how to get the data from the table. I am not from a web development background, so I can't make sense of the website's code. I referred to the accepted answer to this question and extracted the ID of the table:

table_id = driver.find_element(By.ID, 'diviuiphhk')

But I couldn't find IDs for the rows to get their values. To find the ID or XPath of any element on a page, I use ChroPath. Can anyone please help me understand how to get the data from the table? Thanks.

2 Answers

6

I was able to solve it. Below is the code:

table_id = driver.find_element(By.XPATH, "//table[@class='table']")

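for row in range(1, 11):
    # The leading "." keeps the search relative to the table element;
    # an XPath starting with "//" searches the whole document instead.
    rows = table_id.find_elements(By.XPATH, ".//tbody/tr[" + str(row) + "]")
    for row_data in rows:
        # Each <td> inside the row is one column value.
        cols = row_data.find_elements(By.TAG_NAME, "td")
        for cell in cols:
            print(cell.text)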

First I used ChroPath to get the XPath of the table. Then I got the XPath of a row. This XPath was the same for every row of the table; only the index changes, from 1 to 10. The columns inside each row are td elements, so I used that tag name to get the column values.
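If you want the values collected rather than printed, and repeated over every page, something like the following might work. It is only a sketch: `table_to_rows`, `scrape_all_pages`, `read_page`, `go_to_next_page` and `max_pages` are names I made up for illustration, and it selects all rows at once instead of indexing them from 1 to 10. The helpers only rely on `find_elements` and `.text`, so they work on any object that offers those.

```python
def table_to_rows(table):
    """Return the text of every <td> in `table`, grouped by row."""
    rows = []
    # "xpath" and "tag name" are the string values of By.XPATH and By.TAG_NAME.
    for tr in table.find_elements("xpath", ".//tbody/tr"):
        cells = [td.text for td in tr.find_elements("tag name", "td")]
        if cells:  # skip rows without any <td>, e.g. header rows
            rows.append(cells)
    return rows


def scrape_all_pages(read_page, go_to_next_page, max_pages=1000):
    """Collect rows page by page until go_to_next_page() returns False.

    read_page and go_to_next_page are caller-supplied callables
    (hypothetical names); max_pages is a safety cap against endless loops.
    """
    all_rows = []
    for _ in range(max_pages):
        all_rows.extend(read_page())
        if not go_to_next_page():
            break
    return all_rows
```

With Selenium, `read_page` could look up the table again on each page (elements found before a page change may go stale) and return `table_to_rows(table)`, while `go_to_next_page` could click the next button from the question and return `False` when there is no further page. How to detect the last page on this particular site is something I have not verified, so treat that part as an assumption.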

Thanks

1

This will give you all the cells of the table, from which you can extract the data:

driver.find_elements(By.XPATH, "//table[@class='table']/tbody/tr/td")
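Note that this returns one flat list, so the row boundaries are lost. If you need rows, one option is to split the cell texts into chunks. This is a minimal sketch, assuming every row has the same number of cells (no colspan); `group_cells` and `columns` are names made up for illustration, and the sample values are invented.

```python
def group_cells(cell_texts, columns):
    """Split a flat list of cell texts into rows of `columns` cells each."""
    if columns <= 0:
        raise ValueError("columns must be positive")
    return [cell_texts[i:i + columns]
            for i in range(0, len(cell_texts), columns)]


# Made-up sample data standing in for [cell.text for cell in cells].
sample = ["1", "A", "x", "2", "B", "y"]
print(group_cells(sample, 3))  # [['1', 'A', 'x'], ['2', 'B', 'y']]
```

With Selenium you would first build the flat list with `[cell.text for cell in cells]`, where `cells` is the result of the `find_elements` call above.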
