Comparing changes

base repository: postgresql-cfbot/postgresql
base: 185e304
head repository: postgresql-cfbot/postgresql
compare: 8e11859
  • 1 commit
  • 26 files changed
  • 1 contributor

Commits on Oct 8, 2025

  1. Implement Eager Aggregation

    Eager aggregation is a query optimization technique that partially
    pushes aggregation past a join, and finalizes it once all the
    relations are joined.  Eager aggregation may reduce the number of
    input rows to the join and thus could result in a better overall plan.
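
As a rough illustration of the idea (not the planner's implementation), the sketch below computes SUM(amount) grouped by region in two ways: joining first and aggregating afterwards, and partially aggregating below the join before finalizing. The tables `orders` and `customers`, their columns, and both function names are hypothetical, and `customers.customer_id` is assumed to be unique.

```python
from collections import defaultdict

# Hypothetical toy data, invented for illustration only.
orders = [(1, 10), (1, 5), (2, 7), (2, 3), (3, 1)]      # (customer_id, amount)
customers = [(1, "east"), (2, "east"), (3, "west")]     # (customer_id, region)


def aggregate_after_join(orders, customers):
    """Join every orders row to customers, then SUM(amount) GROUP BY region."""
    region_of = dict(customers)  # assumes customer_id is unique in customers
    totals = defaultdict(int)
    for customer_id, amount in orders:
        if customer_id in region_of:
            totals[region_of[customer_id]] += amount
    return dict(totals)


def eager_aggregate(orders, customers):
    """Partially aggregate orders by the join key, join, then finalize."""
    # Partial aggregation below the join: one row per customer_id.
    partial = defaultdict(int)
    for customer_id, amount in orders:
        partial[customer_id] += amount

    # Join the (smaller) partially aggregated input and combine partial sums.
    region_of = dict(customers)
    totals = defaultdict(int)
    for customer_id, partial_sum in partial.items():
        if customer_id in region_of:
            totals[region_of[customer_id]] += partial_sum
    return dict(totals), len(partial)


if __name__ == "__main__":
    print(aggregate_after_join(orders, customers))
    print(eager_aggregate(orders, customers))
```

Both strategies give the same totals, but the join in the second one sees one row per `customer_id` rather than one row per order, which is the row-count reduction the commit message refers to.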
    
    In the current planner architecture, the separation between the
    scan/join planning phase and the post-scan/join phase means that
    aggregation steps are not visible when constructing the join tree,
    limiting the planner's ability to exploit aggregation-aware
    optimizations.  To implement eager aggregation, we collect information
    about aggregate functions in the targetlist and HAVING clause, along
    with grouping expressions from the GROUP BY clause, and store it in
    the PlannerInfo node.  During the scan/join planning phase, this
    information is used to evaluate each base or join relation to
    determine whether eager aggregation can be applied.  If applicable, we
    create a separate RelOptInfo, referred to as a grouped relation, to
    represent the partially-aggregated version of the relation and
    generate grouped paths for it.
    
    Grouped relation paths can be generated in two ways.  The first method
    involves adding sorted and hashed partial aggregation paths on top of
    the non-grouped paths.  To limit planning time, we only consider the
    cheapest or suitably-sorted non-grouped paths in this step.
    Alternatively, grouped paths can be generated by joining a grouped
    relation with a non-grouped relation.  Joining two grouped relations
    is currently not supported.
    
    To further limit planning time, we currently adopt a strategy where
    partial aggregation is pushed only to the lowest feasible level in the
    join tree where it provides a significant reduction in row count.
    This strategy also helps ensure that all grouped paths for the same
    grouped relation produce the same set of rows, which is important to
    support a fundamental assumption of the planner.
    
    For the partial aggregation that is pushed down to a non-aggregated
    relation, we need to consider all expressions from this relation that
    are involved in upper join clauses and include them in the grouping
    keys, using compatible operators.  This is essential to ensure that an
    aggregated row from the partial aggregation matches the other side of
    the join if and only if each row in the partial group does.  This
    ensures that all rows within the same partial group share the same
    "destiny", which is crucial for maintaining correctness.
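
The grouping-key requirement can be sketched with a toy model, assuming a hypothetical relation `t1(a, b, v)` that is joined to `t2` on `b` while the query groups by `t1.a`. Because `b` appears in an upper join clause, the partial aggregation groups by `(a, b)`, so every row in a partial group either matches `t2` or does not. All names and data here are invented for illustration; this is not the planner's code.

```python
from collections import defaultdict

# Hypothetical toy data, invented for illustration only.
t1 = [(1, "x", 10), (1, "x", 20), (1, "y", 5), (2, "y", 7)]   # (a, b, v)
t2 = ["x", "x", "z"]                                          # values of t2.b


def plain_sum(t1, t2):
    """SELECT a, SUM(v) FROM t1 JOIN t2 USING (b) GROUP BY a, join first."""
    totals = defaultdict(int)
    for a, b, v in t1:
        for b2 in t2:
            if b == b2:
                totals[a] += v
    return dict(totals)


def eager_sum(t1, t2):
    """Same query, with partial aggregation pushed below the join."""
    # The partial grouping key is (a, b): the query's grouping column plus
    # the column used by the upper join clause.
    partial = defaultdict(int)
    for a, b, v in t1:
        partial[(a, b)] += v

    # Each partial row joins to t2 exactly as its member rows would have.
    totals = defaultdict(int)
    for (a, b), partial_sum in partial.items():
        for b2 in t2:
            if b == b2:
                totals[a] += partial_sum
    return dict(totals)


if __name__ == "__main__":
    print(plain_sum(t1, t2))
    print(eager_sum(t1, t2))
```

Grouping the partial aggregation by `a` alone would merge the `b = 'x'` and `b = 'y'` rows for `a = 1` into a single row, leaving no way to apply the join clause to only the rows that should match.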
    
    One restriction is that we cannot push partial aggregation down to a
    relation that is on the nullable side of an outer join, because the
    NULL-extended rows produced by the outer join would not be available
    when we perform the partial aggregation, while with a
    non-eager-aggregation plan these rows are available for the top-level
    aggregation.  Pushing partial aggregation in this case may result in
    the rows being grouped differently than expected, or produce incorrect
    values from the aggregate functions.
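
A toy model of this failure mode, assuming the hypothetical query `SELECT t2.x, count(*) FROM t1 LEFT JOIN t2 ON t1.k = t2.k GROUP BY t2.x`. NULL is modeled as `None`, and the final step is assumed to ignore NULL partial counts when combining them. The tables, data, and function names are invented for illustration.

```python
from collections import defaultdict

# Hypothetical toy data, invented for illustration only.
t1 = [1, 2, 3, 3]                        # values of t1.k
t2 = [(1, "p"), (1, "p"), (2, "q")]      # (k, x)


def plain_left_join_count(t1, t2):
    """Perform the left join first, then count(*) per t2.x."""
    counts = defaultdict(int)
    for k in t1:
        matches = [x for (k2, x) in t2 if k2 == k]
        if matches:
            for x in matches:
                counts[x] += 1
        else:
            counts[None] += 1            # NULL-extended row is counted
    return dict(counts)


def pushed_down_count(t1, t2):
    """Incorrectly push the partial count(*) below the nullable side."""
    # Partial aggregation on t2, grouped by (k, x).
    partial = defaultdict(int)
    for k, x in t2:
        partial[(k, x)] += 1

    # Left join t1 to the partial rows; unmatched rows get a NULL count.
    joined = []
    for k in t1:
        matches = [(x, n) for (k2, x), n in partial.items() if k2 == k]
        joined.extend(matches if matches else [(None, None)])

    # Finalize: add up partial counts, skipping NULLs.
    counts = defaultdict(int)
    for x, n in joined:
        counts[x] += n if n is not None else 0
    return dict(counts)


if __name__ == "__main__":
    print(plain_left_join_count(t1, t2))
    print(pushed_down_count(t1, t2))
```

The two unmatched `t1` rows are counted in the NULL group by the plain plan, but the pushed-down version never saw them during partial aggregation, so its NULL group ends up with a count of zero.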
    
    If we have generated a grouped relation for the topmost join relation,
    we finalize its paths at the end.  The final paths will compete in the
    usual way with paths built from regular planning.
    
    The patch was originally proposed by Antonin Houska in 2017.  This
    commit reworks various important aspects and rewrites most of the
    current code.  However, the original patch and reviews were very
    useful.
    
    Author: Richard Guo <guofenglinux@gmail.com>
    Author: Antonin Houska <ah@cybertec.at> (in an older version)
    Reviewed-by: Robert Haas <robertmhaas@gmail.com>
    Reviewed-by: Jian He <jian.universality@gmail.com>
    Reviewed-by: Tender Wang <tndrwang@gmail.com>
    Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com>
    Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
    Reviewed-by: David Rowley <dgrowleyml@gmail.com>
    Reviewed-by: Tomas Vondra <tomas@vondra.me> (in an older version)
    Reviewed-by: Andy Fan <zhihuifan1213@163.com> (in an older version)
    Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> (in an older version)
    Discussion: https://postgr.es/m/CAMbWs48jzLrPt1J_00ZcPZXWUQKawQOFE8ROc-ADiYqsqrpBNw@mail.gmail.com
    Richard Guo committed Oct 8, 2025