
I have a PostgreSQL table constructed as

  a  |  b  |  c
-----+-----+-----
   3 |   2 |   1
   1 |   5 |   1
   8 |   4 |   1
   2 |   5 |   1
   4 |   4 |   2
   2 |   5 |   2
   9 |   3 |   2
   3 |   5 |   3
   2 |   5 |   3
   4 |   4 |   3
   5 |   6 |   3
   9 |   7 |   3

I want to compute the average value of a for each value of c, using only the rows where b is below a given value, e.g. the average of b for that same c.
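
For anyone who wants to reproduce this, here is a minimal setup sketch. The integer column types are an assumption: the original post does not show the DDL, although the query plans further down cast b to numeric before comparing it with avg(b), which is consistent with integer columns. The table name mytable is taken from the query.

```sql
-- Assumption: all three columns are plain integers (the post does not show the DDL).
CREATE TABLE mytable (
    a integer,
    b integer,
    c integer
);

-- The twelve sample rows from the table above, grouped by c for readability.
INSERT INTO mytable (a, b, c) VALUES
    (3, 2, 1), (1, 5, 1), (8, 4, 1), (2, 5, 1),
    (4, 4, 2), (2, 5, 2), (9, 3, 2),
    (3, 5, 3), (2, 5, 3), (4, 4, 3), (5, 6, 3), (9, 7, 3);
```

On these rows the per-group averages of b are 4, 4 and 5.4. The rows that pass the filter are therefore (a=3, b=2) for c=1, (a=9, b=3) for c=2, and (a=3, b=5), (a=2, b=5), (a=4, b=4) for c=3, so the expected averages of a are 3, 9 and 3.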

Here is my query :

SELECT avg(a) FROM mytable t WHERE b<(SELECT avg(b) FROM mytable WHERE c=t.c) GROUP BY c;

I actually have two issues, but I believe they both belong to this single question (the first one will actually allow me to update the title) :

  1. Is there a particular name or expression for this kind of query (I mean, operations on subselections and reintegration in the main query, or something like that) ? I couldn't find how to even search for a solution online… => ok, window functions.

  2. This query is very slow, how can I optimize it ? I'm using 9.3.5, and b's are already sorted in numerical order.

Thanks.

Update: my edit to user17130's answer was rejected, but that answer doesn't work as posted, so here is the working code:

explain select
    avg(a)
from
(
    select
        avg(b) over (partition by c) as b_avg,
        a,
        b,
        c
    from mytable
) as t
where b < b_avg
group by c;
                                     QUERY PLAN                                     
------------------------------------------------------------------------------------
 GroupAggregate  (cost=135.34..202.46 rows=67 width=8)
   Subquery Scan on t  (cost=135.34..198.39 rows=647 width=8)
     Filter: ((t.b)::numeric < t.b_avg)
     ->  WindowAgg  (cost=135.34..169.29 rows=1940 width=12)
           ->  Sort  (cost=135.34..140.19 rows=1940 width=12)
                 Sort Key: mytable.c
                 ->  Seq Scan on mytable  (cost=0.00..29.40 rows=1940 width=12)
  • Look into window functions and the HAVING clause. Commented Sep 17, 2014 at 16:50
  • It's much easier to search for the right terms indeed, thanks ! I'll update the question accordingly. Commented Sep 17, 2014 at 16:55
  • Post the explain analyze output (preferably using explain.depesz.com). "Very slow" is a bit too vague. Commented Sep 17, 2014 at 17:02

1 Answer


I think this is what you mean to do. This version uses window functions and needs only one table scan. As the plans below show, your original query is estimated to cost far more than this one. Without any selective conditions, you're going to have to scan the table at least once.

 explain select                                                                     
    a_avg
    from
    (
        select
         avg(a) over (partition by c) as a_avg  
        ,avg(b) over (partition by c) as b_avg
        ,c
        ,b
        from mytable
    ) as t
    where b < b_avg
;
                                  QUERY PLAN                                  
──────────────────────────────────────────────────────────────────────────────
 Subquery Scan on t  (cost=135.34..203.24 rows=647 width=32)
   Filter: ((t.b)::numeric < t.b_avg)
   ->  WindowAgg  (cost=135.34..174.14 rows=1940 width=12)
         ->  Sort  (cost=135.34..140.19 rows=1940 width=12)
               Sort Key: mytable.c
               ->  Seq Scan on mytable  (cost=0.00..29.40 rows=1940 width=12)
 Planning time: 0.128 ms
(7 rows)

...

crow@test=# explain SELECT avg(a) FROM mytable t WHERE b<(SELECT avg(b) FROM mytable WHERE c=t.c) GROUP BY c;
                                 QUERY PLAN                                  
─────────────────────────────────────────────────────────────────────────────
 HashAggregate  (cost=66560.08..66560.92 rows=67 width=8)
   Group Key: t.c
   ->  Seq Scan on mytable t  (cost=0.00..66556.85 rows=647 width=8)
         Filter: ((b)::numeric < (SubPlan 1))
         SubPlan 1
           ->  Aggregate  (cost=34.28..34.29 rows=1 width=4)
                 ->  Seq Scan on mytable  (cost=0.00..34.25 rows=10 width=4)
                       Filter: (c = t.c)
 Planning time: 0.191 ms
(9 rows)
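
Another way to express the same filter, offered as a sketch rather than a measured improvement, is to compute each group's avg(b) once in a derived table and join it back. The alias g and the column names b_avg and a_avg are illustrative, c is added to the select list so that each average is labelled with its group, and c is assumed to contain no NULLs.

```sql
-- Sketch: aggregate avg(b) once per c, then join back to filter the rows.
-- g, b_avg and a_avg are illustrative names, not taken from the original post.
SELECT t.c,
       avg(t.a) AS a_avg
FROM mytable AS t
JOIN (
    SELECT c, avg(b) AS b_avg
    FROM mytable
    GROUP BY c
) AS g ON g.c = t.c
WHERE t.b < g.b_avg   -- keep only rows below their own group's average
GROUP BY t.c
ORDER BY t.c;
```

Whether this beats the window-function version depends on the data and on the planner, so running both under EXPLAIN ANALYZE against the real table is the only reliable comparison.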

2 Comments

Your code didn't do exactly what I expected from scratch, but your answer helped dramatically, so I edited it with the corrected code.
Edit was rejected, see updated question for the working code.
