
I have a script that downloads large amounts of data from an API. The script takes around two hours to run. I would like to run it on GCP and schedule it to run once a week on Sundays, so that we have the newest data in our SQL database (also on GCP) by the next day.

I am aware of cron jobs, but I would rather not run an entire server just for this single script. I have looked at Cloud Functions and Cloud Scheduler, but because the script takes so long to execute, I cannot run it on Cloud Functions, where the maximum execution time is 9 minutes. Is there another way to schedule the Python script?

Thank you in advance!

1 Answer


To run a script for more than 1 hour, you need a Compute Engine VM (a Cloud Run request can run for at most 1 hour).

However, you can still use Cloud Scheduler to trigger it. Here is how:

  • Create a Cloud Scheduler job with the frequency that you want
    • As the job's target, use the Compute Engine instance start API
    • In the advanced settings, select a service account (create one or reuse one) that has permission to start a VM instance
    • Select OAuth token as the authentication mode (not OIDC)
  • Create a Compute Engine VM (the one that Cloud Scheduler will start)
    • Add a startup script that triggers your long job
    • At the end of the script, add a line that shuts down the VM (with gcloud, for example)
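The scheduler half of these steps can be sketched as a small Python helper that builds the job definition. This is only an illustration: the field names follow the Cloud Scheduler REST `Job` resource as I understand it, the project, zone, instance and service account values are placeholders, and the 02:00 UTC start time is an assumption (the question only says "Sundays").

```python
def build_start_vm_job(project: str, zone: str, instance: str,
                       service_account_email: str) -> dict:
    """Build a Cloud Scheduler job body that starts a Compute Engine VM.

    All argument values are placeholders supplied by the caller.
    """
    # Compute Engine REST endpoint for starting an existing instance (called with POST).
    uri = (
        "https://compute.googleapis.com/compute/v1/"
        f"projects/{project}/zones/{zone}/instances/{instance}/start"
    )
    return {
        # unix-cron format: minute hour day-of-month month day-of-week (0 = Sunday).
        # 02:00 is an assumed time, not something the question specifies.
        "schedule": "0 2 * * 0",
        "timeZone": "Etc/UTC",
        "httpTarget": {
            "uri": uri,
            "httpMethod": "POST",
            # OAuth token (not OIDC), because the target is a Google API.
            "oauthToken": {
                "serviceAccountEmail": service_account_email,
                "scope": "https://www.googleapis.com/auth/cloud-platform",
            },
        },
    }
```

The resulting dictionary mirrors what you would enter in the Cloud Scheduler console form: the frequency, the target URL with the POST method, and the service account used for the OAuth token.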

Note: the startup script runs as the root user. Pay attention to the default home directory and to the permissions of the files it creates.
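As a rough sketch of the VM side, the startup script could call a small Python wrapper like the one below. The function name, the `/opt/job` working directory and the `/root` home directory are hypothetical choices for illustration. The wrapper sets `HOME` and the working directory explicitly, since the script runs as root, and it shuts the VM down even if the job fails, so a crashed run does not leave the instance billing all week.

```python
import os
import subprocess


def run_job_then_shutdown(job_cmd, shutdown_cmd=("shutdown", "-h", "now"),
                          home="/root", workdir="/opt/job"):
    """Run the long job, then always run the shutdown command.

    `home` and `workdir` are assumed paths; adjust them to your VM layout.
    Returns the exit code of the job.
    """
    # Startup scripts run as root, so do not rely on an inherited HOME.
    env = dict(os.environ, HOME=home)
    try:
        result = subprocess.run(list(job_cmd), cwd=workdir, env=env)
        return result.returncode
    finally:
        # Runs whether the job succeeded, failed, or raised an exception.
        subprocess.run(list(shutdown_cmd))


# Example call from the startup script (paths are placeholders):
# run_job_then_shutdown(["python3", "/opt/job/download.py"])
```

Powering off from inside the guest stops the instance. If you prefer to follow the answer's suggestion, pass a `gcloud compute instances stop` command as `shutdown_cmd` instead; that requires the VM's service account to have permission to stop instances.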
