
I am trying to scrape data from the following website:

http://mozo.com.au/credit-cards/search#fetch/680

Using Chrome's "Inspect Element" feature, I located the XPath of the element I want:

//*[@id="p-40"]/div[4]/table/tbody/tr/td[1]/text()

I was hoping that the following code would return the text "9.99%":

import requests
from lxml import html

page = requests.get('http://mozo.com.au/credit-cards/search#fetch/680')
tree = html.fromstring(page.text)

tree.xpath('//*[@id="p-40"]/div[4]/table/tbody/tr/td[1]/text()')

However, the output is an empty list. Where am I going wrong?

  • The problem is that the content of the page is loaded dynamically. You should read up on how to scrape dynamic web pages. Commented Aug 11, 2015 at 11:31
  • Any resources you can suggest? Commented Aug 11, 2015 at 12:04
  • It's nothing so spectacular... just know that pages can load content dynamically, and that can make scraping messy. You'll need a scraper that can handle JavaScript, e.g. Selenium. Commented Aug 11, 2015 at 12:05
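
One visible hint that the content is loaded client-side is the `#fetch/680` part of the URL. A URL fragment is never sent to the server in an HTTP request, so requests receives the same base page whatever follows the `#`. The minimal sketch below uses only the standard library to split the URL from the question into those two parts. It only illustrates the fragment and does not show how the site itself loads its data.

```python
from urllib.parse import urldefrag

# URL taken from the question
url = 'http://mozo.com.au/credit-cards/search#fetch/680'

# urldefrag separates the part that is sent to the server
# from the fragment, which only the browser (and its JavaScript) sees
base_url, fragment = urldefrag(url)

print(base_url)   # the part an HTTP client actually requests
print(fragment)   # the part handled client-side
```

Here `base_url` is `http://mozo.com.au/credit-cards/search` and `fragment` is `fetch/680`.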

1 Answer


As tobifasc said, the page is loaded dynamically. Try Selenium, for example.

First install:

pip3 install selenium

Then:

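import lxml.html
from selenium import webdriver

url = 'http://mozo.com.au/credit-cards/search#fetch/680'
driver = webdriver.Firefox()
driver.get(url)

# Parse the HTML as rendered by the browser, after JavaScript has run
tree = lxml.html.fromstring(driver.page_source)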

Now you can query:

# With your xpath there are 2 results...
results = tree.xpath('//*[@id="p-40"]/div[4]/table/tbody/tr/td[1]/text()')   
results[1].strip()
'9.99%'
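
If the table appears some time after the initial page load, reading `driver.page_source` straight away may still miss it. The sketch below adds an explicit wait before parsing. It assumes the extra XPath result is whitespace-only, which is not confirmed above. The names `clean_cells` and `fetch_rendered_cells` are hypothetical helpers, and the 10-second timeout is arbitrary.

```python
def clean_cells(raw_texts):
    """Strip whitespace from text nodes and drop the ones that end up empty."""
    return [text.strip() for text in raw_texts if text.strip()]


def fetch_rendered_cells(url, element_xpath, timeout=10):
    """Load url in Firefox, wait for element_xpath to exist, return its text nodes.

    element_xpath must select elements (no trailing /text()), because
    Selenium locates elements rather than text nodes.
    """
    # Imported here so clean_cells() can be used without Selenium or lxml installed
    import lxml.html
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Firefox()
    try:
        driver.get(url)
        # Block until the element is in the DOM; raises TimeoutException otherwise
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.XPATH, element_xpath))
        )
        tree = lxml.html.fromstring(driver.page_source)
    finally:
        # Always close the browser, even if the wait times out
        driver.quit()

    return clean_cells(tree.xpath(element_xpath + '/text()'))


if __name__ == '__main__':
    cells = fetch_rendered_cells(
        'http://mozo.com.au/credit-cards/search#fetch/680',
        '//*[@id="p-40"]/div[4]/table/tbody/tr/td[1]',
    )
    print(cells)
```

Filtering out empty strings avoids relying on the position `results[1]`, which would break if the page's whitespace changed.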