postgresql select query

Question

create temp table tmp_apps (
  id integer
);

create temp table tmp_pos (
     tmp_apps_id integer,
     position integer
);

insert into tmp_apps
select 1 id union
select 2 id 
;

insert into tmp_pos (tmp_apps_id, position)
select 1 tmp_apps_id, 1 as position union all
select 1 tmp_apps_id, 1 as position union all
select 1 tmp_apps_id, 2 as position union all
select 1 tmp_apps_id, 3 as position union all
select 1 tmp_apps_id, 3 as position union all
select 2 tmp_apps_id, 1 as position
;
/*
Expected result:
tmp_apps_id tmp_pos_position
1           1,2
2           1
*/

How to get first 2 comma separated, distinct tmp_pos.position for each tmp_apps.id
It is possible ?

+1 for a working test case. That's how you make it easier to help you. — Erwin Brandstetter
– Erwin Brandstetter, Commented Jan 29, 2012 at 0:05

Erwin Brandstetter · Accepted Answer · 2012-01-28 22:16:51Z

1

WITH x AS (
    SELECT tmp_apps_id
         , position
         , row_number() OVER (PARTITION BY tmp_apps_id ORDER BY position) AS rn
    FROM   tmp_pos
    GROUP  BY 1, 2
    ORDER  BY 1, 2
    )
SELECT tmp_apps_id, string_agg(position::text, ', ')
FROM   x
WHERE  rn < 3
GROUP  BY 1;

This happens to be much like the solution @araqnid posted a bit faster than me.
CTE or subquery, that's just two ways for doing the same in this case.

My version is different in one important aspect:
By using GROUP BY instead of DISTINCT to get distinct values, you can apply the window function row_number() (key element for the solution) in the same query level and do not need another subquery (or CTE).

The reason for this is that aggregation (GROUP BY) is applied before window functions while DISTINCT is applied after. In many situations DISTINCT and GROUP BY offer equally good solutions. In a case like this you can put the subtle difference to good use if you know it. I expect this to be quite a bit faster.

edited Jan 28, 2012 at 22:16

answered Jan 28, 2012 at 22:06

Erwin Brandstetter

668k159 gold badges1.2k silver badges1.3k bronze badges

Sign up to request clarification or add additional context in comments.

2 Comments

cetver Over a year ago

First I would like to thank you for the excellent answer. I really surprised by grouping by column number and i want to ask: In general case, this way gives speed increase in comparison with grouping by column name ?

Erwin Brandstetter Over a year ago

@cetver: If you mean the ordinal numbers in GROUP BY and ORDER BY, then no. That's just notational convenience. No effect on performance.

araqnid · Accepted Answer · 2012-01-28 22:01:17Z

1

select tmp_apps_id, string_agg(position::text,',')
from (
 select tmp_apps_id, position,
        row_number() over (partition by tmp_apps_id order by position)
 from (
  select distinct tmp_apps_id, tmp_pos.position from tmp_pos
 ) x
) x
where row_number <= 2
group by tmp_apps_id;

answered Jan 28, 2012 at 22:01

araqnid

135k25 gold badges163 silver badges138 bronze badges

Comments

p.campbell · Accepted Answer · 2012-01-28 21:32:02Z

0

Try this using array_agg or string_agg, depending on your version of Postgres:

SELECT tmp_apps_id, array_agg(tmp_pos_position)
FROM tmp_pos_position
GROUP BY tmp_apps_id

In 9.0, use the string_agg function:

SELECT tmp_apps_id, string_agg(tmp_pos_position, ',')
FROM tmp_pos_position
GROUP BY tmp_apps_id

answered Jan 28, 2012 at 21:32

p.campbell

101k70 gold badges263 silver badges326 bronze badges

1 Comment

cetver Over a year ago

string_agg processes all records, but i need only 2 with distinct values

Collectives™ on Stack Overflow

postgresql select query

3 Answers 3

2 Comments

Comments

1 Comment

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

3 Answers 3

2 Comments

Comments

1 Comment

Your Answer

Sign up or log in

Post as a guest

Related