
I have a Python script that takes two parameters when it runs, [url] and [keyword]. When it finishes, it writes a .csv file with my results: [position], [url], [keyword], [date].

I'm trying to schedule this script to run once every day using crontab.

So, in other words, I'm trying to schedule the following to be run every day:

python3 script.py [url] [keyword]

I added the following to my crontab (I'm testing whether it works by running it every minute):

* * * * * * /usr/bin/python3 /path-to-my-script/rank.py

but nothing happens. I don't see the expected .csv files in the /path-to-my-script/ folder, and when I check the mail I get the following error:

/bin/sh: file-name.csv: command not found

My Python script looks like this:


import sys
import re
import random
from robobrowser import RoboBrowser
import datetime
import csv


sitename = sys.argv[1]
keyword = "+".join(sys.argv[2:])

print("site: %s keyword: %s" % (sitename, keyword))

agent = ['Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:54.0) Gecko/20100101 Firefox/54.0',
         'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36',
         'Mozilla/5.0 (Windows NT 10.0; WOW64; rv:54.0) Gecko/20100101 Firefox/54.0',
         'Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36',
         'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:54.0) Gecko/20100101 Firefox/54.0']

parser = 'html.parser'

browser = RoboBrowser(history=False,
                      user_agent=random.choice(agent),
                      parser=parser)

browser.open('https://www.google.com/search?num=100&q=' + keyword)

links = browser.find_all("div", {"class": "g"})

counter = 0



d=[]
for i in links:
    counter = counter + 1
    if sitename in str(i):
        url = i.find_all('a', href=True)
        position = "%d" % (counter)
        rank = "%s" % (url[0]['href'])
        now = datetime.date.today().strftime("%d-%m-%Y")
        keyword = keyword
        d.append(keyword)
        d.append(position)
        d.append(rank)
        d.append(now)
        print(keyword, position, rank, now)



file =datetime.date.today().strftime("%d-%m-%Y")+'-' +keyword + '.csv'
with open(file, 'w') as f:
    writer = csv.writer(f)
    writer.writerow(['Keyword' , 'Rank', 'URL' , 'Date'])
    writer.writerows(zip( d[0::4], d[1::4] , d[2::4], d[3::4]))

I want the following command to run every day and store my .csv files in a specific folder:

python3 script.py [url] [keyword]
  • It would be better to write a shell script that activates a virtual env (if you use one) and passes the input parameters to the Python script, and then put the cron job on the shell script. Commented May 5, 2019 at 11:43
  • I'm not using a virtual environment. My Python script checks the URL and the keyword in Google and counts the position, so it's like a day-to-day rank checker. I'm currently running it myself every day, but I would like it to run by itself. Commented May 5, 2019 at 11:46
  • * * * * * /usr/bin/python3 /path-to-my-script/rank.py [url] [keyword] > /path-to-my-script/execution_logger.log 2>&1 - Just add the URL and keyword at the end, and log stdout and stderr to see whether the Python script is executing normally. Commented May 5, 2019 at 11:51
  • Also, you might want to change the schedule if you want it to run once a day. * * * * * runs it every minute. Commented May 5, 2019 at 11:56
  • You might also want to inspect your Python script once. Is it running successfully without cron? Commented May 5, 2019 at 12:20

1 Answer

Here's a simple example that demonstrates how you can schedule Python scripts with arguments using a shell script and a cron job.

hello_world.py

import sys

def main():
    print(sys.argv[1])
    print(sys.argv[2])


if __name__ == '__main__':
    main()

hello_world_scheduler.sh -- using a shell script like this has several advantages that may come in handy in the future.

#! /bin/bash

cd /path_to_my_script
/usr/bin/python3 hello_world.py hello world! > execution_logger.log

Run

chmod +x hello_world_scheduler.sh ## to make the script executable
./hello_world_scheduler.sh ## to run the shell script
cat execution_logger.log

The output should be

hello
world!

Then add the scheduler script to your crontab:

* * * * * /path_to_script/hello_world_scheduler.sh
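Note that this entry has five schedule fields (minute, hour, day of month, month, day of week) before the command, whereas the entry in the question has six asterisks. A likely explanation for the "command not found" mail is that cron treats everything after the fifth field as the command, so the extra * is handed to the shell, which expands it to the file names in the job's working directory and tries to run the first one. The sketch below is a simplified model of that split, not cron's real parser (it ignores things like @daily shortcuts and environment lines), and split_crontab_entry is a hypothetical helper name.

```python
def split_crontab_entry(line):
    """Split a crontab line into its five schedule fields and the command.

    Simplified model: whitespace-separated, five fields, rest is the command.
    """
    parts = line.split(None, 5)  # at most five splits: 5 fields + the rest
    if len(parts) < 6:
        raise ValueError("expected five schedule fields followed by a command")
    return parts[:5], parts[5]


# Six asterisks: the sixth one is no longer a schedule field,
# it becomes the first word of the command.
fields, command = split_crontab_entry(
    "* * * * * * /usr/bin/python3 /path-to-my-script/rank.py"
)
print(fields)   # ['*', '*', '*', '*', '*']
print(command)  # * /usr/bin/python3 /path-to-my-script/rank.py
```

For a job that runs once a day rather than every minute, the first two fields would be fixed values, for example 0 6 * * * to run at 06:00 (the hour here is only an illustration).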

This should work
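If the .csv files should end up in a specific folder no matter which directory cron starts the job in, one option is to build the output path explicitly instead of relying on the current working directory. The following is a minimal sketch under that assumption: write_results and its output_dir parameter are hypothetical names, while the file name pattern and header row follow the script in the question.

```python
import csv
import datetime
from pathlib import Path


def write_results(rows, keyword, output_dir):
    """Write rank rows to <output_dir>/<dd-mm-YYYY>-<keyword>.csv and return the path."""
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)  # create the folder if needed
    stamp = datetime.date.today().strftime("%d-%m-%Y")
    path = output_dir / f"{stamp}-{keyword}.csv"
    # newline="" is what the csv module recommends for files it writes
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["Keyword", "Rank", "URL", "Date"])
        writer.writerows(rows)
    return path


# Example call (the folder name is an assumption, use any absolute path):
# write_results(rows, keyword, "/path-to-my-script/results")
```

With an absolute output_dir, the cd in the shell script is no longer what decides where the results are written.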


8 Comments

Thank you. This way it worked perfectly. I have a question: is execution_logger.log just there to check whether my output is correct?
Yes. Apart from the logs in your script, you can use this file because it captures all standard output (print statements, warnings).
Hey @skybunk, sorry for the confusion, but the .sh file works perfectly if I call it in the terminal. What still doesn't work is scheduling it in crontab. When I check the mail, it keeps showing me this error: /bin/bash: myfilename.csv: command not found. In my Python script I'm using writer to export my output to a csv file. Do I need to set my PATH or SHELL somewhere in my bash script or crontab?
Could you add your python script in the question?
I added my python script.