
I have been practicing web scraping in Python. I have gotten pretty good, but I have come across a few sites that have me stumped. They use Ajax to find nearby locations, and several sites are built the same way. One of them is www.applebees.com. Even using Firebug I cannot find the answer.

How can Python request the locations via the Ajax call?

The page is www.applebees.com. There is a form on the right-hand side where you enter a zip code, and it pulls up the locations closest to that zip code. However, if I view the page source after entering the zip code, the locations still don't show up in it. The request and response happen entirely through Ajax and never appear in the HTML source. I have never seen anything like it, and I am researching a solution now.

  • Ajax is just a popular way of making HTTP requests, and Python is a programming language. The only correct answer is "use your favorite HTTP library." Commented Feb 27, 2012 at 0:09
  • Could you provide a more specific example? For example, a particular page within Applebee's? Commented Feb 27, 2012 at 0:12
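
To make the "use your favorite HTTP library" suggestion concrete: the Ajax call is an ordinary HTTP request, so once Firebug's Net panel shows which URL the page calls, Python can send the same request directly. The sketch below uses only the standard library. The endpoint URL, the `zip` parameter name, and the JSON shape (a list of objects with `name` and `address` keys) are all placeholders I made up for illustration, not what Applebee's actually uses; substitute whatever the Net panel shows.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Hypothetical endpoint: replace with the URL seen in the browser's Net panel.
ENDPOINT = "https://www.example.com/locations/search"


def build_request(zipcode, endpoint=ENDPOINT):
    """Build a GET request that mimics a browser Ajax (XHR) call."""
    query = urlencode({"zip": zipcode})  # "zip" is an assumed parameter name
    return Request(
        f"{endpoint}?{query}",
        headers={
            # Many Ajax endpoints check this header to detect XHR calls.
            "X-Requested-With": "XMLHttpRequest",
            "Accept": "application/json",
        },
    )


def parse_locations(body):
    """Parse an assumed JSON payload: a list of {"name", "address"} objects."""
    return [(item["name"], item["address"]) for item in json.loads(body)]


def fetch_locations(zipcode):
    """Send the request and return (name, address) tuples."""
    with urlopen(build_request(zipcode), timeout=10) as response:
        return parse_locations(response.read().decode("utf-8"))
```

If the real call turns out to be a POST or returns HTML fragments instead of JSON, only `build_request` and `parse_locations` need to change.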

1 Answer

Scraping programmatically with an HTTP library can be difficult for some sites. If you are trying to simulate user interaction on a JavaScript-heavy site (Ajax or otherwise), you might consider driving a real browser with something like Selenium. There are Python client bindings, and you get access to the page DOM as the browser has rendered it.

http://pypi.python.org/pypi/selenium
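
Here is a rough sketch of what that could look like with the Selenium Python bindings (Selenium 4 style API). The two CSS selectors are guesses for illustration only; inspect the page to find the real zip code field and the element that wraps each result. It also assumes Firefox and its driver are available on the machine. The small `extract_by_class` helper uses only the standard library to pull the text out of the rendered HTML.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class ClassTextExtractor(HTMLParser):
    """Collect the text of every element carrying a given CSS class."""

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self.depth = 0      # > 0 while inside a matching element
        self.results = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.depth:
            self.depth += 1
        elif self.css_class in classes:
            self.depth = 1
            self.results.append("")

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.results[-1] += data


def extract_by_class(html, css_class):
    """Return the stripped text of each element with the given class."""
    parser = ClassTextExtractor(css_class)
    parser.feed(html)
    parser.close()
    return [text.strip() for text in parser.results]


def fetch_rendered_html(zipcode, url="https://www.applebees.com",
                        zip_selector="input[name='zip']",   # hypothetical
                        result_selector=".location"):       # hypothetical
    """Submit the zip code in a real browser and return the rendered DOM."""
    # Imported here so the parsing helper above works without Selenium.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Firefox()
    try:
        driver.get(url)
        field = driver.find_element(By.CSS_SELECTOR, zip_selector)
        field.send_keys(zipcode)
        field.submit()
        # Wait until the Ajax response has been inserted into the page.
        WebDriverWait(driver, 10).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, result_selector))
        )
        return driver.page_source
    finally:
        driver.quit()
```

Usage would be something like `extract_by_class(fetch_rendered_html("90210"), "location")`. Unlike "view source", `driver.page_source` reflects the DOM after the JavaScript has run, which is why the locations show up there.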


3 Comments

I might add a link to my own library, dryscrape, which uses QtWebkit to scrape JavaScript-heavy web pages using an in-memory (headless) browser instance. It is both more lightweight and faster than Selenium and the like.
Even with the suggestions you have given, I don't understand how I can extract the location names and addresses when they are not in the HTML source that gets saved, even after manually submitting the zip code. I can see the information on my screen and in Firebug, but it is not there when I download the source.
After researching this some more, Selenium seems to be the only route. Niklas's dryscrape is also an option. Thanks for all the replies.
