I'm trying to write a function that calculates the average of a nested JSON value in PostgreSQL via SQLAlchemy. The value I want to average lives in a Statistics table, whose scores column holds a JSON dictionary like this (reduced to the relevant structure): {1: {'score': 0.0}, 2: {'score': 0.0} ...}.
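To make the target concrete, here is a plain-Python sketch of what should be computed for a single row, assuming the scores value has already been decoded into a dict. The function name average_score and the sample data are made up for illustration; note that decoded JSON object keys are strings.

```python
from statistics import mean
from typing import Optional


def average_score(scores: Optional[dict]) -> Optional[float]:
    """Average the nested 'score' values of one decoded scores dictionary."""
    # A null or empty scores value has nothing to average.
    if not scores:
        return None
    return mean(float(entry["score"]) for entry in scores.values())


sample = {"1": {"score": 0.0}, "2": {"score": 1.0}}
print(average_score(sample))
```

The SQL query is meant to do the same thing on the database side, once per row of the table.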

Written in postgres, the query looks like this:

SELECT *, avg((v->>'score')::float) AS average_score
FROM lms.statistics, jsonb_each(statistics.scores) js(k, v)
WHERE jsonb_typeof(scores) != 'null'
GROUP BY statistics.id

I've translated most of it into the following SQLAlchemy code:

(
  session.query(Statistics)
  .add_columns(literal_column("avg((v->>'score')::float)").label('average_score'))
  .filter(text("jsonb_typeof(statistics.scores) != 'null'"))
  .group_by(Statistics.id)
).all()

However, no matter what I try, SQLAlchemy won't let me include the jsonb_each call this query depends on. I've even tried restructuring the query to use an explicit join, but SQLAlchemy's .join won't accept literal_column, text, or any trickery with outer joins or fake join conditions. I'm at the end of my rope trying to force this in, when there must be a standard SQLAlchemy way to insert plain-text SQL into FROM or JOIN clauses.

1 Answer

With functions that return scalars or sets of single columns, you can simply use func.something(...).alias('x') and column('x'). Unfortunately, SQLAlchemy does not support aliasing the columns explicitly, so handling functions that return multi-column composites is a bit trickier. In the case of jsonb_each the default column names are key and value, so you can use those:

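from sqlalchemy import Float, column, func
from sqlalchemy.dialects.postgresql import JSONB

# jsonb_each() names its output columns "key" and "value" by default
v = column('value', type_=JSONB)
score = v['score'].astext.cast(Float)

session.query(Statistics,
              func.avg(score).label('average_score')).\
    select_from(Statistics,
                func.jsonb_each(Statistics.scores).alias()).\
    filter(func.jsonb_typeof(Statistics.scores) != 'null').\
    group_by(Statistics.id)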
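
Newer releases relax the column-naming limitation mentioned above. The following is a minimal sketch, assuming SQLAlchemy 1.4 or later, where functions have a table_valued() method that names the output columns and lets them be referenced through the alias. The Statistics model here is a hypothetical stand-in for the real one, and the query is only compiled against the PostgreSQL dialect, not executed.

```python
from sqlalchemy import Column, Float, Integer, column, func
from sqlalchemy.dialects import postgresql
from sqlalchemy.dialects.postgresql import JSONB
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Statistics(Base):
    # Hypothetical stand-in for the asker's model.
    __tablename__ = "statistics"
    id = Column(Integer, primary_key=True)
    scores = Column(JSONB)


# Name the function's output columns; "js" mirrors the alias in the raw SQL.
js = func.jsonb_each(Statistics.scores).table_valued(
    column("key"), column("value", JSONB), name="js"
)
score = js.c.value["score"].astext.cast(Float)

session = Session()  # unbound: enough to build and compile, not to execute
query = (
    session.query(Statistics, func.avg(score).label("average_score"))
    .select_from(Statistics, js)  # statistics first, so the function can see it
    .filter(func.jsonb_typeof(Statistics.scores) != "null")
    .group_by(Statistics.id)
)

sql = str(query.statement.compile(dialect=postgresql.dialect()))
print(sql)
```

Printing the compiled statement is a quick way to check that jsonb_each ends up in the FROM clause before running anything against a real database.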

4 Comments

This works great! Unfortunately, with this error out of the way, I realized I actually need this in a join, because I'm adding to a previously defined query and otherwise get an error that FROM is already defined. But the join then causes SQLAlchemy to throw this: "sqlalchemy.exc.NotSupportedError: (psycopg2.NotSupportedError) set-returning functions are not allowed in JOIN conditions". However, I know this can be done, because I've written the query in PostgreSQL with a full outer join and it works just fine. Any advice on this?
My first thought: are you trying to put a set-returning function in the ON clause (instead of using a lateral join)? What do you mean by "hoping to use this in a join condition"? Could you wrap the existing query as a subquery and join against that? It sounds like you might have the makings of a new question :P
Ah. Got it. For future readers, I was able to move it into a join like so: .outerjoin(func.jsonb_each(Statistics.scores).alias(), text('true'))
thanks so much for this thread; really helped me out!
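
The join described in the comments can be sketched roughly as below. This assumes SQLAlchemy 1.4 or later and uses a hypothetical Statistics model; base_query stands in for the previously defined query. It swaps text('true') for the true() construct, which should render the same ON true condition on PostgreSQL. The statement is only compiled, not executed.

```python
from sqlalchemy import Column, Float, Integer, column, func, true
from sqlalchemy.dialects import postgresql
from sqlalchemy.dialects.postgresql import JSONB
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Statistics(Base):
    # Hypothetical stand-in for the asker's model.
    __tablename__ = "statistics"
    id = Column(Integer, primary_key=True)
    scores = Column(JSONB)


session = Session()  # unbound: enough to build and compile, not to execute
base_query = session.query(Statistics)  # stands in for the existing query

# The set-returning function is the join target; the ON clause is just "true".
each = func.jsonb_each(Statistics.scores).alias("js")
score = column("value", type_=JSONB)["score"].astext.cast(Float)

query = (
    base_query
    .outerjoin(each, true())
    .add_columns(func.avg(score).label("average_score"))
    .group_by(Statistics.id)
)

sql = str(query.statement.compile(dialect=postgresql.dialect()))
print(sql)
```

Keeping the function on the right-hand side of the join, rather than inside the ON condition, is what avoids the "set-returning functions are not allowed in JOIN conditions" error quoted above.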
