1

I have been building a web scraper for an internal application in PHP, but one of the pages has a JavaScript login. Is there any way to log in automatically so I can scrape the data as usual?

(I am using curl to log in to the other two sites)

1
  • 1
    Please define "JavaScript login". Curl does not interpret the returned HTML file, so it does not interpret any JS. What does the JS do if the password is entered correctly? Does it make an AJAX request to fetch the data? Or is the data already in the HTML in an encrypted form and decoded through JS? Commented Jul 23, 2010 at 14:06

2 Answers

2

Use Firebug to check what the browser sends to the server. After that, you can make the same requests with curl.

1 Comment

Thanks buddy, I have only been doing PHP for a day, so it's all a bit new :)
0

There are many ways to implement a JavaScript login interface. Your question does not provide enough information to answer definitively.

Most JavaScript login interfaces simply log in over AJAX, so the login is just an asynchronous POST request that contains the login info. That request can be reproduced by sending the same headers and form fields the browser sends. Install a browser plugin that lets you monitor HTTP(S) requests and you'll be able to see which headers and form data to send.
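To make that concrete, here is a minimal sketch of replaying such a login, written in Python with only the standard library. The URLs, the field names `username` and `password`, and the helper names are assumptions made up for illustration; the real values are whatever the network monitor shows for your page. In PHP the same steps would likely map onto curl options such as `CURLOPT_POSTFIELDS`, `CURLOPT_HTTPHEADER`, `CURLOPT_COOKIEJAR` and `CURLOPT_COOKIEFILE`.

```python
import http.cookiejar
import urllib.parse
import urllib.request

# Hypothetical URLs: replace with what the browser's network monitor shows.
LOGIN_URL = "https://intranet.example/login"
DATA_URL = "https://intranet.example/report"


def build_login_request(url, fields):
    """Build a POST request shaped like a typical AJAX login call."""
    body = urllib.parse.urlencode(fields).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            # Many JavaScript libraries add this header to AJAX calls;
            # whether the server checks it is an assumption.
            "X-Requested-With": "XMLHttpRequest",
        },
    )


def make_session():
    """Return an opener that remembers cookies between requests."""
    jar = http.cookiejar.CookieJar()
    return urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))


def scrape(username, password):
    """Log in, then fetch the protected page with the same cookies."""
    opener = make_session()
    # Field names are assumed; copy the real ones from the captured request.
    login = build_login_request(
        LOGIN_URL, {"username": username, "password": password}
    )
    opener.open(login).close()  # any session cookie is stored in the jar
    with opener.open(DATA_URL) as resp:  # same opener, so cookies are sent back
        return resp.read().decode("utf-8", errors="replace")
```

Calling `scrape("me", "secret")` would only work against a server that accepts exactly this request; if the captured request carries JSON or extra headers, the body and headers need to be adjusted to match.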

2 Comments

The other guy seemed to answer it easily enough, but thanks anyway
That will not always work. Some login scripts use a security token specifically so that just repeating the request will not work. There could also be other interactions designed to prevent (or at least make more difficult) web scraping.
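When the token is delivered as a hidden form field in the login page, one possible workaround is to fetch that page first, read the token out of the HTML, and send it along with the credentials. Below is a minimal Python sketch of the extraction step. The field name `csrf_token`, the sample HTML and the helper names are assumptions for illustration only. It will not help if the token is computed by JavaScript at runtime; in that case a real or headless browser is probably needed.

```python
from html.parser import HTMLParser


class HiddenInputParser(HTMLParser):
    """Collect name/value pairs from <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        is_hidden = (attr.get("type") or "").lower() == "hidden"
        if is_hidden and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def hidden_fields(html):
    """Return a dict of all hidden form fields found in the HTML text."""
    parser = HiddenInputParser()
    parser.feed(html)
    parser.close()
    return parser.fields


# Hypothetical login page; a real one has to be fetched first, using the
# same cookie-keeping session that will later send the login POST.
sample = (
    '<form><input type="hidden" name="csrf_token" value="abc123">'
    '<input name="username"></form>'
)

# Start from the hidden fields, then add the credentials (assumed names).
payload = hidden_fields(sample)
payload.update({"username": "me", "password": "secret"})
print(payload)
```

The resulting `payload` is what would be posted as the form data, so the server sees the fresh token together with the login details.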
