I use a script from here to generate my sitemaps.

I can call it from the browser at http://www.example.com/sitemap.php?update=pages and it works fine.

I need to call it as a shell script so that I can automate it with the Windows Task Scheduler. For that, the script has to be changed so that it can receive the ?update=pages variable, but I haven't managed to change it correctly.

Could anybody help me so that I can execute the script from the command line with

...\php C:\path\to\script\sitemap.php update=pages. It would also be fine for me to hardcode the variables into the script, since I won't change them anyway.

define("BASE_URL", "http://www.example.com/");
define ('BASE_URI', $_SERVER['DOCUMENT_ROOT'] . '/');


class Sitemap {

  private $compress;
  private $page = 'index';
  private $index = 1;
  private $count = 1;
  private $urls = array();

  public function __construct ($compress=true) {
    ini_set('memory_limit', '75M'); // 50M required per tests
    $this->compress = ($compress) ? '.gz' : '';
  }

  public function page ($name) {
    $this->save();
    $this->page = $name;
    $this->index = 1;
  }

  public function url ($url, $lastmod='', $changefreq='', $priority='') {
    $url = htmlspecialchars(BASE_URL . 'xx' . $url);
    $lastmod = (!empty($lastmod)) ? date('Y-m-d', strtotime($lastmod)) : false;
    $changefreq = (!empty($changefreq) && in_array(strtolower($changefreq), array('always', 'hourly', 'daily', 'weekly', 'monthly', 'yearly', 'never'))) ? strtolower($changefreq) : false;
    $priority = (!empty($priority) && is_numeric($priority) && abs($priority) <= 1) ? round(abs($priority), 1) : false;
    if (!$lastmod && !$changefreq && !$priority) {
      $this->urls[] = $url;
    } else {
      $url = array('loc'=>$url);
      if ($lastmod !== false) $url['lastmod'] = $lastmod;
      if ($changefreq !== false) $url['changefreq'] = $changefreq;
      if ($priority !== false) $url['priority'] = ($priority < 1) ? $priority : '1.0';
      $this->urls[] = $url;
    }
    if ($this->count == 50000) {
      $this->save();
    } else {
      $this->count++;
    }
  }

  public function close() {
    $this->save();
  }

  private function save () {
    if (empty($this->urls)) return;
    $file = "sitemaps/xx-sitemap-{$this->page}-{$this->index}.xml{$this->compress}";
    $xml = '<?xml version="1.0" encoding="UTF-8"?>' . "\n";
    $xml .= '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' . "\n";
    foreach ($this->urls as $url) {
      $xml .= '  <url>' . "\n";
      if (is_array($url)) {
        foreach ($url as $key => $value) $xml .= "    <{$key}>{$value}</{$key}>\n";
      } else {
        $xml .= "    <loc>{$url}</loc>\n";
      }
      $xml .= '  </url>' . "\n";
    }
    $xml .= '</urlset>' . "\n";
    $this->urls = array();
    if (!empty($this->compress)) $xml = gzencode($xml, 9);
    $fp = fopen(BASE_URI . $file, 'wb');
    fwrite($fp, $xml);
    fclose($fp);
    $this->index++;
    $this->count = 1;
    $num = $this->index; // should have already been incremented
    while (file_exists(BASE_URI . "sitemaps/xx-sitemap-{$this->page}-{$num}.xml{$this->compress}")) {
      unlink(BASE_URI . "sitemaps/xx-sitemap-{$this->page}-{$num}.xml{$this->compress}");
      $num++;
    }
    $this->index($file);
  }

  private function index ($file) {
    $sitemaps = array();
    $index = "sitemaps/xx-sitemap-index.xml{$this->compress}";
    if (file_exists(BASE_URI . $index)) {
      $xml = (!empty($this->compress)) ? gzfile(BASE_URI . $index) : file(BASE_URI . $index);
      $tags = $this->xml_tag(implode('', $xml), array('sitemap'));
      foreach ($tags as $xml) {
        $loc = str_replace(BASE_URL, '', $this->xml_tag($xml, 'loc'));
        $lastmod = $this->xml_tag($xml, 'lastmod');
        $lastmod = ($lastmod) ? date('Y-m-d', strtotime($lastmod)) : date('Y-m-d');
        if (file_exists(BASE_URI . $loc)) $sitemaps[$loc] = $lastmod;
      }
    }
    $sitemaps[$file] = date('Y-m-d');
    $xml = '<?xml version="1.0" encoding="UTF-8"?>' . "\n";
    $xml .= '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' . "\n";
    foreach ($sitemaps as $loc => $lastmod) {
      $xml .= '  <sitemap>' . "\n";
      $xml .= '    <loc>' . BASE_URL . $loc . '</loc>' . "\n";
      $xml .= '    <lastmod>' . $lastmod . '</lastmod>' . "\n";
      $xml .= '  </sitemap>' . "\n";
    }
    $xml .= '</sitemapindex>' . "\n";
    if (!empty($this->compress)) $xml = gzencode($xml, 9);
    $fp = fopen(BASE_URI . $index, 'wb');
    fwrite($fp, $xml);
    fclose($fp);
  }

  private function xml_tag ($xml, $tag, &$end='') {
    if (is_array($tag)) {
      $tags = array();
      while ($value = $this->xml_tag($xml, $tag[0], $end)) {
        $tags[] = $value;
        $xml = substr($xml, $end);
      }
      return $tags;
    }
    $pos = strpos($xml, "<{$tag}>");
    if ($pos === false) return false;
    $start = strpos($xml, '>', $pos) + 1;
    $length = strpos($xml, "</{$tag}>", $start) - $start;
    $end = strpos($xml, '>', $start + $length) + 1;
    return ($end !== false) ? substr($xml, $start, $length) : false;
  }



  public function __destruct () {
    $this->save();
  }

}
// start part 2

$sitemap = new Sitemap;

if (get('pages')) {
  $sitemap->page('pages');
  $result = mysql_query("SELECT uri FROM app_uri"); 
  while (list($url, $created) = mysql_fetch_row($result)) {
    $sitemap->url($url, $created, 'monthly');
  }
}



$sitemap->close();
unset ($sitemap);

function get ($name) {
  return (isset($_GET['update']) && strpos($_GET['update'], $name) !== false) ? true : false;
}

?>

2 Answers

You could install wget (it's available for Windows as well) and then call the URL via localhost in the Task Scheduler script:

wget.exe "http://localhost/path/to/script.php?update=pages"

This way you wouldn't have to rewrite the php script.


Otherwise, if the script is meant for shell usage only, then pass variables via command line:

php yourscript.php variable1 variable2 ...

In the PHP script you can then access those variables using the $argv variable:

$variable1 = $argv[1];
$variable2 = $argv[2];
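
For the call from the question (php sitemap.php update=pages), a minimal sketch of how the existing get() check could keep working is shown below. It assumes the arguments are passed as key=value pairs; cli_params() is a hypothetical helper name, not part of the original script. It copies the command-line arguments into $_GET only when PHP runs in CLI mode, so the browser call is unaffected.

```php
<?php
// Hypothetical helper: turn "key=value" command-line arguments into an array.
function cli_params(array $argv) {
    $params = array();
    // $argv[0] is the script name, so skip it.
    foreach (array_slice($argv, 1) as $arg) {
        parse_str($arg, $pair); // "update=pages" becomes array('update' => 'pages')
        $params = array_merge($params, $pair);
    }
    return $params;
}

// Only on the command line: fill $_GET from the arguments so get() works unchanged.
if (PHP_SAPI === 'cli' && isset($argv)) {
    $_GET = cli_params($argv);
}

// Same check as in the question, shortened.
function get($name) {
    return isset($_GET['update']) && strpos($_GET['update'], $name) !== false;
}
```

Note that $_SERVER['DOCUMENT_ROOT'] is typically empty when the script runs from the command line, so BASE_URI would probably have to be set to a fixed path as well.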

5 Comments

Ok :)... additionally, you should make sure that the script executes only if it is called from localhost.
Somehow the script is not processing the whole MySQL table. It is supposed to create 19 XML files with 50k entries each, but it is only creating 13 XML files. I have no idea how to troubleshoot this, since executing it from the command line and monitoring the MySQL query does not show any errors. The command line ends with: 2013-08-07 11:14:45 (1,35 MB/s) - »sitemap_fr.php@update=pages« saved [93100199]
From the SQL side I think there is no error, since the slow query log says the complete table is sent: # Time: 130807 11:40:29 # User@Host: example[example] @ localhost [::1] Id: 1101010 # Query_time: 20.405540 Lock_time: 0.000000 Rows_sent: 952382 Rows_examined: 952382 SET timestamp=1375868429; SELECT uri FROM app_uri; It might be that I have some kind of runtime limitation somewhere. It could be something in php.ini or, since I go over localhost, wget probably goes through the Apache web server, so it might be a setting in httpd.conf.
It's max_execution_time and probably memory_limit.
I was blind -.- In the script the memory limit was set to 75M. Raising this solved it. Thanks again for the hint.
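
As a rough sketch of the fix described in these comments: the limits can be raised from inside the script, for example in place of the ini_set() call in the Sitemap constructor. The value 256M is an assumption for illustration, not a figure from the thread; the right value depends on the size of the table.

```php
<?php
// Assumed value; the thread only says that 75M was too low for this table.
ini_set('memory_limit', '256M');

// 0 removes the execution time limit (relevant when the script runs through the web server).
set_time_limit(0);

// Print the limits that are actually in effect, plus peak memory use so far,
// to help spot a run that stops early because a limit was hit.
printf("memory_limit=%s max_execution_time=%s\n",
    ini_get('memory_limit'), ini_get('max_execution_time'));
printf("peak memory: %.1f MB\n", memory_get_peak_usage(true) / 1048576);
```

When PHP is run from the command line, max_execution_time already defaults to 0, so there the memory limit is the more likely culprit.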
0

Have a look at:

How to pass GET variables to php file with Shell?

which already answers the same question :).
