LIMIT by distinct values in PostgreSQL

Question

I have a table of contacts with phone numbers similar to this:

Name    Phone
Alice   11
Alice   33
Bob     22
Bob     44
Charlie 12
Charlie 55

I can't figure out how to query such a table while limiting the rows not by plain row count but by the number of distinct names. For example, if I had a magic LIMIT_BY clause, it would work like this:

SELECT * FROM "Contacts" ORDER BY "Phone" LIMIT_BY("Name") 1

Alice 11
Alice 33
-- ^ only the first contact


SELECT * FROM "Contacts" ORDER BY "Phone" LIMIT_BY("Name") 2

Alice   11
Charlie 12
Alice   33
Charlie 55
-- ^ now with Charlie because his phone 12 goes right after 11. Bob isn't here because he's third, beyond the limit

How could I achieve this result?

In other words, select all rows containing the top N distinct Names ordered by Phone.

jjanes · Accepted Answer · 2021-04-25 01:07:42Z

1

I don't think that PostgreSQL provides any particularly efficient way to do this, but for 6 rows it doesn't need to be very efficient. You could do a subquery to compute which people you want to see, then join that subquery back against the full table.

select *
from "Contacts"
join (select name from "Contacts" group by name order by min(phone) limit 2) as limited
using (name)

You could put the subquery in an IN-list rather than a JOIN, but that often performs worse.
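For a self-contained check, the join above can be run against a scratch copy of the question's data. This is only a sketch under assumptions: the temp table name contacts_demo, the lowercase column names, and the integer type for phone are chosen for illustration and do not come from the question. A name tiebreaker and a final ORDER BY are added so that the output is deterministic.

```sql
-- Hypothetical scratch table; its name and column types are assumptions.
CREATE TEMP TABLE contacts_demo (name text, phone integer);

INSERT INTO contacts_demo (name, phone) VALUES
    ('Alice', 11), ('Alice', 33),
    ('Bob', 22), ('Bob', 44),
    ('Charlie', 12), ('Charlie', 55);

-- Pick the 2 names with the smallest minimum phone,
-- then join back to fetch every row for those names.
SELECT name, phone
FROM contacts_demo
JOIN (
    SELECT name
    FROM contacts_demo
    GROUP BY name
    ORDER BY min(phone), name   -- name breaks ties on min(phone)
    LIMIT 2
) AS limited USING (name)
ORDER BY phone;
```

On this data the query returns Alice 11, Charlie 12, Alice 33, Charlie 55, which matches the LIMIT_BY("Name") 2 example in the question.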

6 Comments

thesame Over a year ago

Regarding order by min(phone): what if two contacts share the minimal phone number? Is there a way to perform a full set comparison?

jjanes Over a year ago

I don't know what the outcome of a 'full set comparison' would be expected to be. Could you edit the question to include the example data that has this condition, and the desired result?

thesame Over a year ago

sqlfiddle.com/#!17/97234/2/0 here I want it to return Alice but it returns Bob

jjanes Over a year ago

If you prefer Alice because a sorts before b, then you could use order by min(phone), name limit 1 in the subquery to break the tie.

jjanes Over a year ago

I think you can then order by an array_agg, see sqlfiddle.com/#!17/273182/1
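The array_agg idea can be sketched as follows. PostgreSQL compares arrays element by element, so ordering by each name's sorted array of phones falls through to the second-smallest phone when the minimums are equal. The table name contacts_ties and its rows are made up for illustration, and the linked fiddle may differ from this.

```sql
-- Hypothetical data in which Alice and Bob share the minimum phone 11.
CREATE TEMP TABLE contacts_ties (name text, phone integer);

INSERT INTO contacts_ties (name, phone) VALUES
    ('Alice', 11), ('Alice', 33),
    ('Bob', 11), ('Bob', 44),
    ('Charlie', 12), ('Charlie', 55);

SELECT name, phone
FROM contacts_ties
JOIN (
    SELECT name
    FROM contacts_ties
    GROUP BY name
    -- Compare the full sorted phone list per name: {11,33} sorts before {11,44}.
    -- name is a last-resort tiebreaker for identical phone lists.
    ORDER BY array_agg(phone ORDER BY phone), name
    LIMIT 1
) AS limited USING (name)
ORDER BY phone;
```

With these rows the subquery picks Alice, because her second phone (33) is smaller than Bob's (44).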

Gordon Linoff · Accepted Answer · 2021-04-25 01:12:47Z

1

If you want all names that are in the first n rows, you can use in:

select t.*
from t
where t.name in (select t2.name
                 from t t2
                 order by t2.phone
                 limit 2
                );

If you want the first n names by phone:

select t.*
from t
where t.name in (select t2.name
                 from t t2
                 group by t2.name
                 order by min(t2.phone)
                 limit 2
                );

2 Comments

thesame Over a year ago

order by t2.phone limit 2 basically leaves me with the names from the two top phone numbers, which is not what I need. As for order by min(t2.phone): what if two contacts have equal min(phone)? Can I perform a full set comparison somehow?

Gordon Linoff Over a year ago

@thesame . . . No. These return all rows that have the same name based on the subquery.
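The difference between the two forms shows up when one name owns several of the smallest phones. The sketch below uses a made-up temp table, contacts_rows, whose name and data are assumptions chosen so that both of the first two rows belong to Alice.

```sql
-- Hypothetical data: the two smallest phones both belong to Alice.
CREATE TEMP TABLE contacts_rows (name text, phone integer);

INSERT INTO contacts_rows (name, phone) VALUES
    ('Alice', 11), ('Alice', 12),
    ('Bob', 22), ('Bob', 44);

-- First form: names appearing in the first 2 rows. Only Alice qualifies.
SELECT count(DISTINCT t.name) AS names_returned
FROM contacts_rows AS t
WHERE t.name IN (SELECT t2.name
                 FROM contacts_rows AS t2
                 ORDER BY t2.phone
                 LIMIT 2);

-- Second form: the first 2 names by minimum phone. Alice and Bob qualify.
SELECT count(DISTINCT t.name) AS names_returned
FROM contacts_rows AS t
WHERE t.name IN (SELECT t2.name
                 FROM contacts_rows AS t2
                 GROUP BY t2.name
                 ORDER BY min(t2.phone)
                 LIMIT 2);
```

The first query reports 1 distinct name and the second reports 2, which is why only the grouped form limits by distinct names.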

Zeeshan Arif · Accepted Answer · 2021-04-25 20:40:23Z

0

Try this:

SELECT DISTINCT X.name
    ,X.phone
FROM (
    SELECT *
    FROM (
        SELECT name
            ,rn
        FROM (
            SELECT name
                ,phone
                ,row_number() OVER (
                    ORDER BY phone
                    ) rn
            FROM "Contacts"
            ) AA
        ) DD
    WHERE rn <= 2 --rn is the "limit" variable
    ) EE
    ,"Contacts" X
WHERE EE.name = X.name

The above seems to work correctly on the following dataset:

create table "Contacts" (name text, phone text);
insert into "Contacts" (name, phone) VALUES
 ('Alice', '11'),
 ('Alice', '33'),
 ('Bob', '22'),
 ('Bob', '44'),
 ('Charlie', '13'),
 ('Charlie', '55'),
 ('Dennis', '12'),
 ('Dennis', '66');

edited Apr 25, 2021 at 20:40

answered Apr 25, 2021 at 3:19

3 Comments

Zeeshan Arif Over a year ago

Try now. It would also be helpful if you could state the complete requirements or provide a complete test dataset.

thesame Over a year ago

There are no complete test datasets; I'm making them up on the fly. But I'll try to formulate the requirement: select all rows containing the top N distinct Names ordered by Phone.

thesame Over a year ago

Your query still returns one Name when I ask for two: sqlfiddle.com/#!17/3a9ab/3/0
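One way to meet the stated requirement with window functions is sketched below. It is not the answerer's query: row_number() over all rows counts rows rather than names, so this sketch instead ranks names by their minimum phone with dense_rank(). The temp table name contacts_ranked and the integer phone type are assumptions; with text phones, as in the dataset above, values would compare as strings.

```sql
-- Hypothetical scratch table mirroring the answer's dataset, with integer phones.
CREATE TEMP TABLE contacts_ranked (name text, phone integer);

INSERT INTO contacts_ranked (name, phone) VALUES
    ('Alice', 11), ('Alice', 33),
    ('Bob', 22), ('Bob', 44),
    ('Charlie', 13), ('Charlie', 55),
    ('Dennis', 12), ('Dennis', 66);

SELECT name, phone
FROM (
    SELECT name, phone,
           -- All rows of one name share (min_phone, name), so they share a rank.
           dense_rank() OVER (ORDER BY min_phone, name) AS name_rank
    FROM (
        SELECT name, phone,
               min(phone) OVER (PARTITION BY name) AS min_phone
        FROM contacts_ranked
    ) AS with_min
) AS ranked
WHERE name_rank <= 2   -- the "limit" on distinct names
ORDER BY phone;
```

On this data the result is Alice 11, Dennis 12, Alice 33, Dennis 66: two distinct names, each with all of its rows.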
