I have a query with a group by that either performs a sort followed by group aggregate or a hash aggregate. I noticed that when adding an array_agg function, the query never tries to perform a hash aggregate increasing the query time. I have set my work_mem for the transaction to a really high number so there is no memory constraint.
Does postgres never use hashaggregate if there is an array_agg function in the aggregation?
This is the format of the query in question:
SELECT
"test"."db"."a",
"test"."db"."b",
"test"."db"."c",
"test"."db"."d",
"test"."db"."e",
"test"."db"."f",
"test"."db"."g",
array_agg(ARRAY["test"."db"."resource_type", "test"."db"."resource_name", "test"."db"."pattern_type"]) FILTER (WHERE "test"."db"."deleted" is FALSE) AS "resources"
FROM
"test"."db"
WHERE ("test"."db"."last_change_id" BETWEEN 0 AND val3
AND "test"."db"."c" = 'val1'
AND "test"."db"."d" = 'val2')
GROUP BY
"test"."db"."a",
"test"."db"."b",
"test"."db"."c",
"test"."db"."d",
"test"."db"."e",
"test"."db"."f",
"test"."db"."g"
The plan executed is the following:
GroupAggregate (cost=180461.63..183740.70 rows=84788 width=147) (actual time=3310.049..3764.216 rows=119173 loops=1)
Group Key: a, b, c, d, e, f, g
-> Sort (cost=180461.63..180708.21 rows=98632 width=151) (actual time=3310.030..3394.925 rows=513793 loops=1)
Sort Key: a, b, e, f, g
Sort Method: quicksort Memory: 148765kB
-> Gather (cost=1000.00..172280.22 rows=98632 width=151) (actual time=2.590..478.026 rows=513793 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Parallel Seq Scan on role_binding (cost=0.00..161417.02 rows=41097 width=151) (actual time=0.897..414.638 rows=171264 loops=3)
Filter: ((last_change_id >= 0) AND (last_change_id <= val3) AND (c = 'val1'::text) AND (d = 'val2'::text))
Rows Removed by Filter: 826426
Planning Time: 0.181 ms
Execution Time: 3782.427 ms
ORDER BYclause in the aggregate function?