I've read a postgresql database with multiple tables into SQLAlchemy.

I'm trying to write a query that does a join and sums each "summable" column. My method works, but it requires me to type out every column I want summed, and I have ~40.

query = (
    session.query(
        Table1.c.column1,
        Table2.c.columnB,
        func.sum(Table2.c.columnC),
        func.sum(Table2.c.columnD),
        ...
        # Continue to write columns here
    )
    .join(Table2)
    .filter(Table1.c.column1 == Table2.c.columnA)
    .group_by(Table1.c.column1, Table2.c.columnB)
    .order_by(Table1.c.column1)
)

I've searched all over and can't find an answer to what seems like it should be a common pattern, which makes me think I'm off track. Is there a way to do this without typing out every column?

    Even in raw SQL this is not possible. Having over 40 quantity fields might be a design issue, such as separate month-named or category-named fields. These should be indicators in one column, with the quantity in a value column: 2 fields instead of ~40! Then run an aggregate query. In a database schema, columns are expensive and rows are cheap. Commented Feb 10, 2018 at 21:24
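
    To make the comment's suggestion concrete, here is a minimal sketch of the "indicator column plus value column" layout, using Python's built-in sqlite3 module. The table name `measurements` and the columns `item`, `metric`, and `quantity` are hypothetical and are not taken from the question; the point is only that one SUM with a GROUP BY covers every metric, however many there are.

```python
import sqlite3

# In-memory database, used only for this illustration
conn = sqlite3.connect(":memory:")

# Hypothetical long-format table: one row per (item, metric) observation
# instead of one wide row with ~40 quantity columns
conn.execute(
    "CREATE TABLE measurements (item TEXT, metric TEXT, quantity INTEGER)"
)
conn.executemany(
    "INSERT INTO measurements VALUES (?, ?, ?)",
    [
        ("a", "jan", 1),
        ("a", "jan", 2),
        ("a", "feb", 5),
        ("b", "jan", 4),
    ],
)

# A single aggregate handles every metric; adding a new metric
# means adding rows, not changing the query
totals = conn.execute(
    "SELECT item, metric, SUM(quantity) "
    "FROM measurements "
    "GROUP BY item, metric "
    "ORDER BY item, metric"
).fetchall()

for row in totals:
    print(row)
```

    Whether this layout fits depends on what the ~40 columns actually represent, which the question does not say.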

1 Answer

You definitely can, as long as you can define what makes a column "summable". You could decide by the column's data type or by a naming convention.

In any event, the snippet below uses a simple list of column names per table to achieve that.

Models:

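import sqlalchemy as sa
from sqlalchemy.orm import declarative_base  # SQLAlchemy 1.4+; older versions: sqlalchemy.ext.declarative

Base = declarative_base()

# Define ORM classes; the queries below use the underlying Core `Table` objects.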

class Model1(Base):
    __tablename__ = 'table1'

    id = sa.Column(sa.Integer, primary_key=True)

    column1 = sa.Column(sa.String)
    column2 = sa.Column(sa.Numeric)
    column3 = sa.Column(sa.Numeric)

class Model2(Base):
    __tablename__ = 'table2'

    id = sa.Column(sa.Integer, primary_key=True)

    columnA = sa.Column(sa.String)
    columnB = sa.Column(sa.Numeric)
    columnC = sa.Column(sa.Numeric)
    columnD = sa.Column(sa.Numeric)


Table1 = Model1.__table__
Table2 = Model2.__table__

Base query:

q = (
    session.query(
        Table1.c.column1,
        Table2.c.columnA,  # WHY need this when it is identical to `column1`?
    )
    .join(Table2, Table1.c.column1 == Table2.c.columnA)
    .group_by(
        Table1.c.column1,
        Table2.c.columnB,
    )
    .order_by(
        Table1.c.column1,
    )
)

Additional columns:

summable_columns = {
    Table1: [
        'column2',
        'column3',
    ],
    Table2: [
        'columnC',
        'columnD',
    ],
}

for table, column_names in summable_columns.items():
    for column_name in column_names:
        column = table.c[column_name]
        label = 'sum_{table_name}_{column_name}'.format(table_name=table.name, column_name=column_name)
        q = q.add_columns(sa.func.sum(column).label(label))

for r in q:
    print(r)

Additional columns (alternative):

Of course, you can use other logic to extend the query. The example below selects columns by type and uses an exclude list (which is hopefully much shorter than the include list):

def is_summable(column):
    exclude_columns = [
        'id',
        'column1',
        'columnA',
        'columnB',
    ]
    return (
        isinstance(column.type, (sa.Numeric, sa.Float, sa.Integer))
        and column.name not in exclude_columns
    )

for table in (Table1, Table2):
    for column in table.c:
        # Skip keys, grouping columns, and non-numeric columns
        if not is_summable(column):
            continue
        label = 'sum_{table_name}_{column_name}'.format(table_name=table.name, column_name=column.name)
        q = q.add_columns(sa.func.sum(column).label(label))
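
The naming-convention option mentioned at the top can be sketched the same way. The example below is self-contained and runs against an in-memory SQLite database rather than PostgreSQL. The `qty_` prefix, the simplified table layout, and the sample rows are assumptions made up for the illustration, so treat it as a sketch rather than a drop-in solution:

```python
import sqlalchemy as sa
from sqlalchemy.orm import Session

metadata = sa.MetaData()

# Hypothetical tables: summable columns share the made-up prefix "qty_"
table1 = sa.Table(
    "table1", metadata,
    sa.Column("id", sa.Integer, primary_key=True),
    sa.Column("column1", sa.String),
)
table2 = sa.Table(
    "table2", metadata,
    sa.Column("id", sa.Integer, primary_key=True),
    sa.Column("columnA", sa.String),
    sa.Column("columnB", sa.String),
    sa.Column("qty_c", sa.Integer),
    sa.Column("qty_d", sa.Integer),
)

engine = sa.create_engine("sqlite://")  # in-memory database for the demo
metadata.create_all(engine)

# Sample rows, invented for the demo
with engine.begin() as conn:
    conn.execute(table1.insert(), [{"column1": "x"}, {"column1": "y"}])
    conn.execute(table2.insert(), [
        {"columnA": "x", "columnB": "p", "qty_c": 1, "qty_d": 10},
        {"columnA": "x", "columnB": "p", "qty_c": 2, "qty_d": 20},
        {"columnA": "y", "columnB": "q", "qty_c": 5, "qty_d": 50},
    ])

# Build one labelled SUM() per column whose name matches the convention
sums = [
    sa.func.sum(column).label("sum_" + column.name)
    for column in table2.c
    if column.name.startswith("qty_")
]

session = Session(bind=engine)
q = (
    session.query(table1.c.column1, table2.c.columnB, *sums)
    .join(table2, table1.c.column1 == table2.c.columnA)
    .group_by(table1.c.column1, table2.c.columnB)
    .order_by(table1.c.column1)
)

rows = [tuple(r) for r in q]
for row in rows:
    print(row)
```

Building the list of `SUM` expressions first and unpacking it into `session.query(...)` is equivalent to extending the query column by column; it just keeps the selection rule in one place.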

1 Comment

Thanks that makes sense. Just an FYI for the comment in your Base Query section: # WHY need this when it is identical to column1? That was a mistake when I was generalizing my actual code for this post, I've edited my post to reflect it.
