I've been teaching myself how to crawl and scrape different websites. I'm comfortable with crawling and scraping, but only on websites that are mostly static HTML. Now I'm working with this link: https://intel.taleo.net/careersection/10000/jobsearch.ftl

I'm using Perl (with WWW::Mechanize) for the following task: I want to write a crawler/scraper that clicks the "United States" checkbox on the left (filtering the results) and then collects the titles of all jobs. However, I couldn't find a way to reach this checkbox from Perl. Can someone get me started on this? (Example code would be helpful.)

  • 2
    Have you considered using a headless browser like PhantomJS? It takes more setup, but it supports full JavaScript. You could then hook into the page's events and execute JS code once the page has loaded, the form is displayed, or the results are fetched. Commented Feb 11, 2016 at 10:00

1 Answer

You need to analyze the page and see how this checkbox is implemented. WWW-Mechanize does not execute JavaScript, so if the checkbox is driven by JavaScript you have to emulate that code yourself.
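
As a starting point for that analysis, here is a minimal sketch that lists every form field WWW::Mechanize can see in the raw HTML. The helper name `describe_inputs` is made up for illustration, and the script assumes the page URL is passed on the command line. I have not run it against the Taleo page, so treat it as a way to look, not as a finished scraper.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use WWW::Mechanize;

# Hypothetical helper: return one descriptive line per form field, so it is
# easy to see whether the filter exists as a plain HTML input.
sub describe_inputs {
    my @forms = @_;
    my @rows;
    my $n = 0;
    for my $form (@forms) {
        $n++;
        for my $input ($form->inputs) {
            push @rows, sprintf('form %d: type=%s name=%s value=%s',
                $n,
                $input->type,
                $input->name  // '(none)',
                $input->value // '(none)');
        }
    }
    return @rows;
}

# Only touch the network when a URL is given on the command line.
if (my $url = shift @ARGV) {
    my $mech = WWW::Mechanize->new(autocheck => 1);
    $mech->get($url);
    print "$_\n" for describe_inputs($mech->forms);
}
```

If the "United States" checkbox shows up in this listing as an ordinary input, `$mech->tick($name, $value)` followed by `$mech->submit` may be enough. If it does not show up at all, that suggests the control is built by JavaScript after the page loads, and one of the browser-backed modules is probably the easier route.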

Perl also has easier options for handling JavaScript. These crawling modules handle JavaScript out of the box:

1. WWW-Mechanize-Firefox, which automates Firefox
2. WWW-Mechanize-PhantomJS, which is based on the PhantomJS browser and can handle JavaScript
3. WWW::Selenium, which uses Selenium
4. WWW::HtmlUnit, which is based on Java's HtmlUnit and can handle JavaScript