My Python script does not print a table from HTML

Question

I am trying to extract table data with the code below, but the script prints None for the table even though I can clearly see the table in the HTML document. Any help would be appreciated.

from urllib2 import urlopen, Request
from bs4 import BeautifulSoup
site = 'http://www.altrankarlstad.com/wisp'
hdr = {'User-Agent': 'Chrome/78.0.3904.108'}
req = Request(site, headers=hdr)
res = urlopen(req)
rawpage = res.read()
page = rawpage.replace("<!-->", "")
soup = BeautifulSoup(page, "html.parser")
table = soup.find("table", {"class":"table workitems-table mt-2"})
print (table)

Here is my Selenium script, written as suggested:

import time
from bs4 import BeautifulSoup
from selenium import webdriver

url = 'http://www.altrankarlstad.com/wisp'

driver = webdriver.Chrome('C:\\Users\\rugupta\\AppData\\Roaming\\Microsoft\\Windows\\Start Menu\\Programs\\Python 3.7\\chromedriver.exe') 

driver.get(url)
driver.find_element_by_id('root').click() #click on search button to fetch list of bus schedule

time.sleep(10) #depends on how long it will take to go to next page after button click

for i in range(1,50):
    url = "http://www.altrankarlstad.com/wisp".format(pagenum = i)

text_field = driver.find_elements_by_xpath('//*[@id="root"]/div/div/div/div[2]/table')
for h3Tag in text_field:
    print(h3Tag.text)

Hùng Nguyễn · Accepted Answer · 2019-12-09 10:20:24Z

1

The page is not fully loaded when you fetch it with Request; you can check this by printing what res.read() returns. The page appears to use JavaScript to load the table, so the table is not in the HTML that urlopen receives.
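
To confirm that the table really is missing from what urlopen receives, you can count the `<table>` tags in the raw response and compare that with what the browser shows. Below is a minimal sketch using only the standard library. The two HTML strings are made-up stand-ins (including the `app.js` name), not the real site's markup, and `count_tables` is a helper name chosen for this example:

```python
from html.parser import HTMLParser


class TableFinder(HTMLParser):
    """Counts <table> start tags in a piece of HTML."""

    def __init__(self):
        super().__init__()
        self.table_count = 0

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.table_count += 1


def count_tables(html):
    """Return how many <table> start tags the given HTML string contains."""
    finder = TableFinder()
    finder.feed(html)
    return finder.table_count


# Stand-in for what a JavaScript-driven page may return to urlopen:
# an empty mount point plus a script tag (hypothetical markup).
raw_html = '<html><body><div id="root"></div><script src="app.js"></script></body></html>'

# Stand-in for the DOM after the script has run (hypothetical markup).
rendered_html = (
    '<div id="root"><table class="table workitems-table mt-2">'
    "<tr><td>1</td></tr></table></div>"
)

print(count_tables(raw_html))       # 0
print(count_tables(rendered_html))  # 1
```

With the real page you would pass the decoded response body to `count_tables` (in Python 3, for example, `res.read().decode("utf-8")`, assuming the page is UTF-8). A count of 0 suggests the table is built later by JavaScript.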

Use Selenium instead: load the page with a driver (e.g. ChromeDriver or the Firefox driver), then sleep until the page has finished loading (you choose the delay; this page takes quite a while to load fully), and then get the table through Selenium:

import time
from bs4 import BeautifulSoup
from selenium import webdriver

url = 'http://www.altrankarlstad.com/wisp'

driver = webdriver.Chrome('/path/to/chromedriver')  # path to your chromedriver executable

driver.get(url)
# No button click is needed; I don't see what clicking it was for.
time.sleep(100)  # wait for the JavaScript to finish rendering the table

text_field = driver.find_elements_by_xpath('//*[@id="root"]/div/div/div/div[2]/table')
print(text_field[0].text)

Your code works fine with a small modification; the script above prints all the text from the table. From there, debug and adapt it to get what you want.

This is the output I got from running the script above (posted as a screenshot in the original answer).
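
A fixed `time.sleep(100)` either waits much longer than needed or, on a slow connection, not long enough. A common alternative is to poll until the element shows up. Selenium provides `WebDriverWait` in `selenium.webdriver.support.ui` for this; the sketch below is a dependency-free version of the same idea, so it can be tried without a browser. `wait_until`, its parameters, and `fake_find_tables` are names made up for illustration and are not part of Selenium:

```python
import time


def wait_until(condition, timeout=30.0, poll=0.5):
    """Call condition() until it returns something truthy or `timeout` seconds pass.

    Returns the truthy result, or raises TimeoutError if the deadline is reached.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1f s" % timeout)
        time.sleep(poll)


# Stand-in for a page that needs a few polls before the table appears.
attempts = []


def fake_find_tables():
    attempts.append(1)
    return ["<table element>"] if len(attempts) >= 3 else []


tables = wait_until(fake_find_tables, timeout=5.0, poll=0.1)
print(tables[0], "found after", len(attempts), "polls")
```

With a real driver, the condition could be `lambda: driver.find_elements_by_xpath('//*[@id="root"]/div/div/div/div[2]/table')`, which returns an empty list until the table exists. If only the headings come back, a condition that checks for data rows rather than for the table itself may be the better thing to wait on.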

edited Dec 9, 2019 at 10:20

answered Dec 4, 2019 at 13:46

Hùng Nguyễn


10 Comments

Zygote Over a year ago

OK, so here is my Selenium script:

Zygote Over a year ago

Hi Hung, as you suggested, I have put it in the code window :) but it is not working. Am I missing anything?

Zygote Over a year ago

Hi Hung, thanks again for your reply, though the script you suggested only prints the table headings. I am also trying to debug it. I have attached an output image for your reference.

Hùng Nguyễn Over a year ago

Please check that you are running it properly; I have added my output to the answer. @Zygote

Zygote Over a year ago

Hey Hung, thanks so much for the pointer. Yes, I am seeing the output now. I flagged your answer as useful, though I do not have enough reputation to vote! (: Now I am trying to save the data in a data frame and export it to CSV!
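
For the CSV step mentioned in the last comment, one possible approach is to parse the rendered table HTML into rows and write them out with the `csv` module. This is a minimal standard-library sketch, not the approach used in the thread; `TableRows`, `table_to_csv`, and the sample markup are all made up for illustration:

```python
import csv
import io
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collects the text of each <td>/<th> cell, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def table_to_csv(html, out):
    """Parse table rows from `html`, write them to file-like `out`, return the rows."""
    parser = TableRows()
    parser.feed(html)
    csv.writer(out).writerows(parser.rows)
    return parser.rows


# Hypothetical markup; the real table's columns are not shown in the thread.
sample = (
    "<table><tr><th>Id</th><th>Title</th></tr>"
    "<tr><td>1</td><td>Fix login, again</td></tr></table>"
)
buffer = io.StringIO()
rows = table_to_csv(sample, buffer)
print(rows)
print(buffer.getvalue())
```

With Selenium, the HTML could come from `driver.page_source` once the table has loaded, and `out` could be a file opened with `open("table.csv", "w", newline="")` (the file name is just an example).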
