3

I'm trying to write a query in PostgreSQL and I'm getting a little frustrated because it works in other database engines. I need to select the top 5 users from a given joins table like this:

SELECT users.*, 
       COUNT(deals.id) AS num_deals 
FROM users, deals 
WHERE deals.users_id = users.id 
GROUP BY users.id 
ORDER BY num_deals LIMIT 5;

I need the top 5 users. This code works in SQLite, MySQL, etc., yet PostgreSQL refuses to select additional fields that aren't used in aggregate functions. I'm getting the following error:

PGError: ERROR:  column "users.id" must appear in the GROUP BY clause or be used in an aggregate function

How can I do this in PostgreSQL??

6
  • 2
    I do believe it works in MySQL and SQLite but the "etc" is wrong. No other database allows this. Those are the only two. Commented Jan 21, 2011 at 8:02
  • 1
    Actually, assuming that users.id is a PRIMARY KEY, it is not wrong. (Though MySQL for example does it both when it's right and when it's wrong). PostgreSQL 9.1 will support running this query the way it is written - since the GROUP BY is on the PRIMARY KEY, we can infer that all the other columns are functionally dependent on it. Commented Jan 21, 2011 at 9:58
  • @Magnus: I know that 9.1 will support this, but 9.1 is currently not available Commented Jan 21, 2011 at 14:26
  • @horse: absolutely true. But the statement that they are wrong is partially (though only partially) incorrect. Commented Jan 21, 2011 at 15:29
  • @MagnusHagander: Do you know why a PRIMARY KEY is required and not mere UNIQUE-ness? I cannot imagine a case where uniqueness would not be good enough. Commented Mar 31, 2012 at 22:33

4 Answers

9

You could try:

SELECT users.*, a.num_deals FROM users, (
    SELECT deals.users_id AS users_id, COUNT(deals.id) AS num_deals 
    FROM deals 
    GROUP BY deals.users_id
) a WHERE users.id = a.users_id
ORDER BY a.num_deals DESC
LIMIT 5;

6 Comments

+1 for a cross-dbms solution. But the comma after users in the first line is wrong and you should order by num_deals DESC
can't you remove the reference to the 'users' table in the subquery? that way it will only look at each table once.
@a_horse_with_no_name: You're correct about the ORDER BY. ...and at first I thought you were correct about the comma in the first line, but I think it is actually correct (it's separating the users table from the subquery/a table)
@araqnid: I've implemented your suggestion. Thanks.
@Gerrat: ah, you are right about the comma. I'm not used to those old-style joins, I always use JOIN ... ON syntax. Sorry for the noise
2

Assuming that users.id IS a PK, then you can either

wait for 9.1

group by all fields

use an aggregate (e.g. max()) on all the other fields
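As a rough sketch of the aggregate option, only users.id is grouped and every other selected column is wrapped in max(). The users.name column is an assumption borrowed from another answer in this thread; substitute your real column list.

```sql
-- Sketch only: users.name is an assumed column, not confirmed by the question.
-- Because users.id is unique, each group holds a single users row, so max()
-- simply returns that row's value for the wrapped column.
SELECT users.id,
       max(users.name) AS name,
       COUNT(deals.id) AS num_deals
FROM users
JOIN deals ON deals.users_id = users.id
GROUP BY users.id
ORDER BY num_deals DESC
LIMIT 5;
```

This avoids repeating every column in GROUP BY, at the cost of one aggregate call per column.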

2 Comments

Using max() on every field is slow.
Then use DISTINCT ON: select distinct on (col1, col2) col1, col2, col3, col4 from yada;
2

Another solution that works is to list all the attributes explicitly in GROUP BY.

The final query would then be:

SELECT users.*, 
       COUNT(deals.id) AS num_deals 
FROM users, deals 
WHERE deals.users_id = users.id 
GROUP BY users.id, users.name, users.attrib1, ..., users.attribN
ORDER BY num_deals DESC LIMIT 5;

If you are using a framework like Rails, you can build this column list easily with the Model.column_names method.


0

Just in case somebody wants an ANSI-92 standard solution and doesn't like the 'Oracle' way of joining tables...

SELECT users.*, num_deals
FROM users
JOIN
  (SELECT deals.users_id as users_id, count(deals.users_id) as num_deals
   FROM deals
   GROUP BY deals.users_id) grouped_user_deals
ON grouped_user_deals.users_id = users.id
ORDER BY num_deals DESC
LIMIT 5;
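To try the grouped-subquery approach without touching real data, something like the following throwaway script might help. The table definitions, the name column, and the sample rows are all hypothetical; only id and users_id come from the question. Note that the subquery groups on deals.users_id, so each row carries one user's deal count.

```sql
-- Hypothetical minimal schema and data, for experimentation only.
CREATE TEMP TABLE users (id integer PRIMARY KEY, name text);
CREATE TEMP TABLE deals (id integer PRIMARY KEY,
                         users_id integer REFERENCES users (id));

INSERT INTO users VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
INSERT INTO deals VALUES (10, 1), (11, 1), (12, 1), (13, 2), (14, 2), (15, 3);

-- Count deals per user in a derived table, then join back to users.
SELECT users.*, num_deals
FROM users
JOIN (SELECT users_id, count(*) AS num_deals
      FROM deals
      GROUP BY users_id) grouped_user_deals
  ON grouped_user_deals.users_id = users.id
ORDER BY num_deals DESC
LIMIT 5;
```

With this sample data the user with three deals sorts first. Users with no deals do not appear at all, because the join is an inner join.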

