I'm trying to wrap my mind around some PHP web scraping using cURL. I recently picked up a short book on the topic, but am stuck on one of the tutorials and can't seem to find where the error is. The cookie.txt file is created, so I know that at least one portion of the function is executing properly.
I've tried using both the id and name attributes of the name and password input fields, without any luck. As far as I can tell, I'm also using the correct POST URL.
<?php
// Function to submit a form using the cURL POST method
function curlPost($postUrl, $postFields, $successString) {
    $useragent = 'Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3'; // Setting the user agent of a very old, yet popular browser
    $cookie = 'cookie.txt'; // Setting a cookie file to store cookies
    $ch = curl_init(); // Initializing cURL session
    // Setting cURL options
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE); // Prevent cURL from verifying SSL certificate
    curl_setopt($ch, CURLOPT_FAILONERROR, TRUE); // Fail on HTTP status codes >= 400 instead of returning the error page
    curl_setopt($ch, CURLOPT_COOKIESESSION, TRUE); // Start a new cookie session (ignore session cookies from earlier runs)
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE); // Follow Location: headers
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE); // Returning transfer as a string
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie); // Setting cookiefile
    curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie); // Setting cookiejar
    curl_setopt($ch, CURLOPT_USERAGENT, $useragent); // Setting useragent
    curl_setopt($ch, CURLOPT_URL, $postUrl); // Setting URL to POST
    curl_setopt($ch, CURLOPT_POST, TRUE); // Setting method as POST
    curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($postFields)); // Setting POST fields as a URL-encoded string
    $results = curl_exec($ch); // Executing cURL session
    curl_close($ch); // Closing cURL session
    // Checking if login was successful by checking existence of string.
    // strpos() returns 0 for a match at the very start, so compare against FALSE explicitly.
    if ($results !== FALSE && strpos($results, $successString) !== FALSE) {
        return $results;
    } else {
        return FALSE;
    }
}
$userEmail = 'you@example.com'; // Setting your email address for site login (placeholder)
$userPass = 'password'; // Setting your password for site login
$postUrl = 'https://www.packtpub.com/'; // Setting URL to POST to
// Setting form input fields as 'name' => 'value'
$postFields = array(
    'name' => $userEmail,
    'password' => $userPass,
    'form_id' => 'packt-login-form-header'
);
$successString = 'You are logged in as';
$loggedIn = curlPost($postUrl, $postFields, $successString); // Executing curlPost login and storing results page in $loggedIn
?>
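The CURLOPT_COOKIEFILE and CURLOPT_COOKIEJAR options should be set to an absolute path. "cookie.txt" is a relative path, so cURL resolves it against the current working directory, which may not be the directory the script lives in.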
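One way to get an absolute path is to build it from `__DIR__`, the directory of the current script. The snippet below is only a minimal sketch: `cookiePath()` and `useCookieFile()` are hypothetical helper names that do not appear in the question, and it assumes the cookie file should sit next to the script and that this directory is writable.

```php
<?php
// Hypothetical helper: absolute path of a cookie file stored in the
// same directory as this script (assumes that directory is writable).
function cookiePath(string $fileName): string {
    return __DIR__ . DIRECTORY_SEPARATOR . $fileName;
}

// Hypothetical helper: point both cookie options of an existing
// cURL handle at the same absolute path.
function useCookieFile($ch, string $path): void {
    curl_setopt($ch, CURLOPT_COOKIEFILE, $path); // cookies are read from this file
    curl_setopt($ch, CURLOPT_COOKIEJAR, $path);  // cookies are written here when the handle is closed or freed
}

// 'cookie.txt' is the file name used in the question.
$cookie = cookiePath('cookie.txt');
```

Inside `curlPost()`, the same idea amounts to replacing `$cookie = 'cookie.txt';` with `$cookie = __DIR__ . '/cookie.txt';` and leaving the two `curl_setopt()` calls as they are.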