60

i have table storing product price information, the table looks similar to, (no is the primary key)

no   name    price    date
1    paper   1.99     3-23
2    paper   2.99     5-25
3    paper   1.99     5-29
4    orange  4.56     4-23
5    apple   3.43     3-11

right now I want to select all the rows where the "name" field appeared more than once in the table. Basically, i want my query to return the first three rows.

I tried:

SELECT * FROM product_price_info GROUP BY name HAVING COUNT(*) > 1  

but i get an error saying:

column "product_price_info.no" must appear in the GROUP BY clause or be used in an aggregate function

5 Answers 5

93
SELECT * 
FROM product_price_info 
WHERE name IN (SELECT name 
               FROM product_price_info 
               GROUP BY name HAVING COUNT(*) > 1)
Sign up to request clarification or add additional context in comments.

3 Comments

Better (as in faster) to use e.g. COUNT(id) per the official docs.
You could join "name" inside the subquery with the group by, and use the "exists" clause instead of "in". I think it's faster
@Madbreaks, where exactly in documentation?
13

Try this:

SELECT no, name, price, "date"
FROM (
  SELECT no, name, price, "date",
         COUNT(*) OVER (PARTITION BY name) AS cnt 
  FROM product_price_info ) AS t
WHERE t.cnt > 1

You can use the window version of COUNT to get the population of each name partition. Then, in an outer query, filter out name partitions having a population that is less than 2.

1 Comment

I have tested the versions from Juan Carlos Oropeza, Giorgos Betsos, and jarlh in SQLite3. This version is the fastest one. It's 26% faster than the other two.
13

Window Functions are really nice for this.

SELECT p.*, count(*) OVER (PARTITION BY name) FROM product p;

For a full example:

CREATE TABLE product (no SERIAL, name text, price NUMERIC(8,2), date DATE);

INSERT INTO product(name, price, date) values
('paper', 1.99, '2017-03-23'),
('paper', 2.99, '2017-05-25'),
('paper', 1.99, '2017-05-29'),
('orange', 4.56, '2017-04-23'),
('apple', 3.43, '2017-03-11')
;

WITH report AS (
  SELECT p.*, count(*) OVER (PARTITION BY name) as count FROM product p
)
SELECT * FROM report WHERE count > 1;

Gives:

 no |  name  | price |    date    | count
----+--------+-------+------------+-------
  1 | paper  |  1.99 | 2017-03-23 |     3
  2 | paper  |  2.99 | 2017-05-25 |     3
  3 | paper  |  1.99 | 2017-05-29 |     3
(3 rows)

2 Comments

Nice but this doesn't fully answer op's question
@Madbreaks thanks for catching that. I updated the answer.
3

Self join version, use a sub-query that returns the name's that appears more than once.

select t1.*
from tablename t1
join (select name from tablename group by name having count(*) > 1) t2
  on t1.name = t2.name

Basically the same as IN/EXISTS versions, but probably a bit faster.

Comments

1
SELECT name, count(name)
FROM product_price_info
GROUP BY name
HAVING COUNT(name) > 1
LIMIT 3

2 Comments

This doesn't do what the question asks for. It should select all columns for each row where the name value appears more than once. It should not add aggregation or new columns.
Funny enough this is the answer to the question I asked google and it gave me this SO

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.