I tried using file_exists('URL/robots.txt') to check whether the file exists on randomly chosen websites, and I get a false response every time.
How do I check whether the robots.txt file exists?
I don't want to start the download before checking.
Will fopen() do the trick? The manual says it returns a file pointer resource on success, or FALSE on error, so I guess I can write something like:
$f=@fopen($url,"r");
if($f) ...
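
For context, a minimal sketch of that fopen() approach, assuming allow_url_fopen is enabled in php.ini (the URL here is just a placeholder). PHP's http:// wrapper makes fopen() fail on a 4xx response, which is why the FALSE check works, but note this opens a real GET request rather than a pure existence check:

$url = 'http://www.example.com/robots.txt'; // placeholder URL
$f = @fopen($url, 'r');   // FALSE when the server answers 404 (the HTTP wrapper rejects 4xx)
if ($f) {
    echo 'exists' . PHP_EOL;
    fclose($f);           // close the stream without reading the body
} else {
    echo 'maybe it\'s not there' . PHP_EOL;
}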
My code prints:

http://www1.macys.com/robots.txt maybe it's not there
http://www.intend.ro/robots.txt maybe it's not there
http://www.emag.ro/robots.txt maybe it's not there
http://www1.bloomingdales.com/robots.txt maybe it's not there

The code:
try {
    if (file_exists($file)) {
        echo 'exists' . PHP_EOL;
        $curl_tool = new CurlTool();
        $content = $curl_tool->fetchContent($file);
        // if the file already exists on the local disk, delete it
        if (file_exists(CRAWLER_FILES . 'robots_' . $website_id . '.txt')) {
            unlink(CRAWLER_FILES . 'robots_' . $website_id . '.txt');
        }
        echo CRAWLER_FILES . 'robots_' . $website_id . '.txt', $content . PHP_EOL;
        file_put_contents(CRAWLER_FILES . 'robots_' . $website_id . '.txt', $content);
    } else {
        echo 'maybe it\'s not there' . PHP_EOL;
    }
} catch (Exception $e) {
    echo 'EXCEPTION ' . $e . PHP_EOL;
}
I already have error_reporting(E_ALL); ini_set('display_errors', 1); at the top of the script.

Answer: file_exists() only works on the local filesystem (the http:// stream wrapper does not support stat()), which is why it always returns false for URLs. Instead, send a HEAD request and examine the response code, which you can indeed do with cURL.
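
A minimal sketch of that HEAD-request check; robotsTxtExists() is a hypothetical helper name and the timeout value is an arbitrary choice:

function robotsTxtExists($url)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_NOBODY, true);         // send a HEAD request, skip the body
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // don't echo the (empty) response
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects to the final URL
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);          // don't hang on slow hosts
    curl_exec($ch);
    $code = curl_getinfo($ch, CURLINFO_HTTP_CODE);  // status code of the final response
    curl_close($ch);

    return $code == 200; // 200 means the file is there; 404 etc. mean it is not
}

var_dump(robotsTxtExists('http://www1.macys.com/robots.txt'));

Keep in mind that some misconfigured servers reject HEAD requests (answering 405, for example), so you may want to treat any 2xx status as success and fall back to a GET for hosts that refuse HEAD.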