bigquery python client.run_async_query gives error: 409 Already Exists

Question

I'm writing a Python script that runs a SELECT query asynchronously. The first run succeeds, but every run after that fails with the following error:

google.cloud.exceptions.Conflict: 409 Already Exists: Job ps-bigdata:vci-temp-sales-query-job (POST https://www.googleapis.com/bigquery/v2/projects/ps-bigdata/jobs)

Here is a code snippet:

from google.cloud import bigquery

google_auth_json_file = './myprojectauth.json'
client = bigquery.Client.from_service_account_json( google_auth_json_file )

project = 'myProject'
dataset = 'myDataset'
ds = client.dataset(dataset)
query = "SELECT X,y,z FROM mytable;"

#--- Clear/create temp table
temp_table_name = 'myTempTable'
temp_tbl = myCreateTempTableFunction( client, project, dataset, temp_table_name )

#--- Create an async query job
job_name = 'vci-temp-sales-query-job'
job = client.run_async_query(job_name, query)
job.destination = temp_tbl
job.write_disposition = 'WRITE_TRUNCATE'
job.begin()

This script fails at the "job.begin()" line. I didn't know that named jobs live on beyond the end of the session or the execution of the job. How do I check whether a named job already exists and, if it does, how do I delete it so that I can create a new one? Or do I have to create a random or unique job name every time I run an async job?

You can check if a job exists with job.exists(). If it exists, then you can cancel it with job.cancel(). You may want to check job.ended before you cancel it.
– Abdou, Commented Jul 3, 2017 at 22:31

Elliott Brossard · Accepted Answer · 2017-07-03 22:34:15Z

You need to use a unique job ID for each job, since the ID is what the metadata for the operation is associated with. Referring to the querying data example, your code could be something like this:

import uuid
job_name = 'vci-temp-sales-query-job_{}'.format(uuid.uuid4())
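As a rough sketch of the same idea using only the standard library, the ID generation can be wrapped in a small helper. The name make_job_id is hypothetical and not part of the BigQuery client; the resulting string would be passed as the first argument to client.run_async_query(), as in the question's script.

```python
import uuid


def make_job_id(prefix):
    """Return a job ID that differs on every call (hypothetical helper)."""
    # uuid4() is random, so two runs are extremely unlikely to collide.
    return '{}_{}'.format(prefix, uuid.uuid4())


job_name = make_job_id('vci-temp-sales-query-job')
# job = client.run_async_query(job_name, query)  # as in the question's script
```

Keeping a readable prefix in front of the random part makes the job easier to spot in the job list later, while the UUID suffix avoids the 409 conflict on repeated runs.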

3 Comments

phil_scott_a_person Over a year ago

I just found that answer in an example code snippet somewhere too. Thank you! The job ID passed to the client.run_async_query() method must be unique, so adding "import uuid" and calling "uuid.uuid4()" to get a unique ID is the best option.

user4279562 Over a year ago

Is there a specific reason why BigQuery is designed to require a unique job ID every time?

Elliott Brossard Over a year ago

You can interact with or retrieve information about the job using this ID. If there's an active job with the same ID, as in the OP's question, then there would be no way to, for example, get the results of the job or cancel it.
