I have a PostgreSQL table that looks like this:
a | b | c
-----+-----+-----
3 | 2 | 1
1 | 5 | 1
8 | 4 | 1
2 | 5 | 1
4 | 4 | 2
2 | 5 | 2
9 | 3 | 2
3 | 5 | 3
2 | 5 | 3
4 | 4 | 3
5 | 6 | 3
9 | 7 | 3
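To reproduce the sample data, a setup script along these lines should work. The integer column types are an assumption; the real table may be defined differently.

```sql
-- Assumption: all three columns are plain integers.
CREATE TABLE mytable (
    a integer,
    b integer,
    c integer
);

-- Same twelve rows as in the listing above.
INSERT INTO mytable (a, b, c) VALUES
    (3, 2, 1), (1, 5, 1), (8, 4, 1), (2, 5, 1),
    (4, 4, 2), (2, 5, 2), (9, 3, 2),
    (3, 5, 3), (2, 5, 3), (4, 4, 3), (5, 6, 3), (9, 7, 3);
```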
I want to compute the average of a for each value of c, using only the rows where b is below a given threshold, e.g. the average of b within the same group of c.
Here is my query:
SELECT avg(a) FROM mytable t WHERE b<(SELECT avg(b) FROM mytable WHERE c=t.c) GROUP BY c;
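To make the intent concrete, here is the same query with c added to the output, along with what I would expect on the sample rows listed above (worked out by hand, so treat the comments as a sanity check rather than actual server output):

```sql
SELECT c, avg(a) AS avg_a
FROM mytable t
WHERE b < (SELECT avg(b) FROM mytable WHERE c = t.c)
GROUP BY c
ORDER BY c;
-- avg(b) per group:  c=1 -> 4,  c=2 -> 4,  c=3 -> 5.4
-- rows kept (a, b):  c=1: (3,2);  c=2: (9,3);  c=3: (3,5), (2,5), (4,4)
-- expected avg(a):   c=1 -> 3,  c=2 -> 9,  c=3 -> 3
```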
I actually have two questions, but I believe they belong together (the answer to the first one will let me fix the title):
Is there a particular name for this kind of query (an operation on a sub-selection whose result is fed back into the main query, or something like that)? I didn't even know what to search for online. => OK: window functions.
This query is very slow. How can I optimize it? I'm using PostgreSQL 9.3.5, and the b values are already sorted in numerical order.
Thanks.
Update: my edit to user17130's answer was rejected, but that answer does not work as written, so here is the working code:
explain select
    avg(a)
from
(
    select
        avg(b) over (partition by c) as b_avg,
        a,
        b,
        c
    from mytable
) as t
where b < b_avg
group by c;
                                     QUERY PLAN
------------------------------------------------------------------------------------
 GroupAggregate  (cost=135.34..202.46 rows=67 width=8)
   ->  Subquery Scan on t  (cost=135.34..198.39 rows=647 width=8)
         Filter: ((t.b)::numeric < t.b_avg)
         ->  WindowAgg  (cost=135.34..169.29 rows=1940 width=12)
               ->  Sort  (cost=135.34..140.19 rows=1940 width=12)
                     Sort Key: mytable.c
                     ->  Seq Scan on mytable  (cost=0.00..29.40 rows=1940 width=12)