0

My project involves a Django website using data from a .csv file generated from a web scraping script, which needs to be hosted on Heroku. My development OS is Windows 10. When my development server is run, it initially executes the script under the main application's views.py file:

exec(open('homepage/scrape.py').read())

where homepage is the name of the main application of the project and scrape.py is the web scraping script.

What I need to occur is for this scrape.py to run every hour and be able to work on both a Heroku dyno and my Windows development environment.

Thanks.

1
  • Heroku has a scheduler that I've found works consistently well. I have an app that's been running a daily scheduler with this for several years. Commented Jun 28, 2020 at 1:42

1 Answer 1

1

I recently built an app with very similar functionality. The solution turned out to be quite simple, thankfully.

First, I created a clock.py file that had my actual scheduling functionality in it.

from apscheduler.schedulers.blocking import BlockingScheduler
from django import setup
from scrape import scrape #this is the package you referred to in your question, theoretically

setup() #got to make sure everything is running before this kicks in

@sched.scheduledc_job('interval', hours=1)
def hourly_scrape():
    update = scrape()

sched.start()

Then I added a separate dyno named clock in my Procfile to do the work.

clock: python clock.py --log-file -

As long as you update your requirements.txt and get another dyno online you'll be good to hook. Also, don't forget that you have to scale your dyno up. From the command line that'll look something like this:

$ heroku ps:scale clock=1
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.