
I have a Flask application that starts long-running Celery tasks (~10-120 min per task, sometimes with slow queries). I use Flask-SQLAlchemy for the ORM and connection management. My app looks like this:

from flask import Flask
from flask_sqlalchemy import SQLAlchemy

app = Flask(__name__)
db = SQLAlchemy(app)
celery = make_celery(app)  # the usual Flask/Celery factory helper, defined elsewhere

@app.route('/start_job')
def start_job():
    task = job.delay()
    return 'Async job started', 202

@celery.task(bind=True)
def job(self):
    db.session.query(... something ...)
    ... do something for hours ...
    db.session.add(... something ...)
    db.session.commit()
    return

Unfortunately, the MySQL server I have to use likes to close connections after a few minutes of inactivity, and the Celery tasks can't handle that, so after a lot of waiting I get (2006, 'MySQL server has gone away') errors. AFAIK connection pooling should take care of closed connections. I read the docs, but they only mention the SQLALCHEMY_POOL_TIMEOUT and SQLALCHEMY_POOL_RECYCLE parameters, so based on some random internet article I tried setting the recycle time to 3 minutes, but that didn't help.
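
For reference, the recycle change I tried looked roughly like this (the connection string is just a placeholder):

app.config['SQLALCHEMY_DATABASE_URI'] = 'mysql://user:password@host/dbname'  # placeholder URI
app.config['SQLALCHEMY_POOL_RECYCLE'] = 180  # recycle pooled connections after 3 minutes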

How does the connection (session?) handling work with this configuration? What should I do to avoid such errors?

1 Answer


I am not entirely sure how good the solution below is, but it seems to solve the problem.

The session acquires a connection just before the first query (or insert) statement and starts a transaction. It then holds that connection until a rollback or commit, but because of the inactivity the MySQL server closes the connection after a few minutes. The solution is to close the session when you will not need it for a long time; SQLAlchemy will check out a fresh connection for the next transaction.

@celery.task(bind=True)
def job(self):
    db.session.query(... something ...)
    db.session.close()  # return the connection to the pool before the long idle period; the next statement checks out a fresh one
    ... do something for hours ...
    db.session.add(... something ...)
    db.session.commit()
    return
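
If you have several tasks like this, the same idea can be wrapped in a small helper so the close is never forgotten. This is only a sketch of the same pattern, and the helper name is just for illustration:

from contextlib import contextmanager

@contextmanager
def without_db_session(session):
    # Close the session before a long stretch of work that does not touch
    # the database; the next statement on the session will check out a
    # fresh connection from the pool and begin a new transaction.
    session.close()
    yield

Inside the task, the hours-long section would then run under with without_db_session(db.session): instead of calling close() by hand.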

2 Comments

I have a similar problem on Azure: it drops the DB connection behind the scenes and then the Celery job fails. I work around this much like you do, by explicitly closing the connections. I thought I could use CONN_MAX_AGE=0, which closes connections at the end of each request. That mechanism relies on signal handlers attached to request_started and request_finished, which check the connection status and close the connection as necessary (see the sketch after these comments). It appears, though, that Celery doesn't go through the same request logic and never fires these hooks. Thus, still looking for a solution.
That was a long time ago. I don't recall anything from that project. Sorry.
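
For context, the request-signal wiring referred to above is roughly what Django itself does to expire CONN_MAX_AGE connections; a Celery worker never sends these request signals, which is why the hook never runs there. A simplified sketch:

from django.core.signals import request_started, request_finished
from django.db import close_old_connections

# Django closes stale or obsolete connections around each HTTP request.
# A Celery task sends no request signals, so this cleanup never happens there.
request_started.connect(close_old_connections)
request_finished.connect(close_old_connections)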
