0

Well friends, I have a query that works but takes a very long time to execute. Is there a faster way to achieve this?

SELECT id, email FROM member WHERE email IN 
(SELECT email FROM member GROUP BY email HAVING count( * ) >1 ) 
ORDER BY `email` ASC

Basically, some entries have the same email appearing more than once, and I want only the rows whose 'email' is duplicated to be returned.

The above query does that, but it is painfully slow.

Cheers.

5 Answers

2

You can group the results first, then join them to the member table to ensure that only rows with duplicate emails are returned.

SELECT m.id, m.email
FROM member m JOIN (
    SELECT email 
    FROM member 
    GROUP BY email 
    HAVING COUNT(*) > 1
  ) g ON m.email = g.email
ORDER BY m.email ASC

3 Comments

Excellent work. Let me tell you something. In my query for my existing table, it was taking 27,9810 secs. Your query is giving me the same thing in 0,0306 secs. Now that's called optimization, isn't it? Cool! Thanks a lot.
One question here: I see you always write FROM member m, whereas I have always written FROM member AS m. Is there any difference between the two, or are they the same?
The word AS is optional. :) Good only for clarity.
1

Your query is slow because of the nested select, which gets recomputed for every row. The best solution is to rewrite your algorithm a bit so that you can use a query like this:

SELECT id, email 
FROM member GROUP BY email
HAVING count( * ) >1
ORDER BY `email`

Unfortunately, the id you get back will be an arbitrary one from each group. This may be a more useful query:

SELECT GROUP_CONCAT(id), email 
FROM member GROUP BY email
HAVING count( * ) >1
ORDER BY `email`

5 Comments

Thanks awm, it looks great. However, it is removing the duplicate entries, I want them as well :) A way out?
Yes, you'll only get one row per email this way. Using the group_concat, you can see all of the ids of the duplicate emails, though. Or use the Scrum Meister's solution.
Yes, I have just found that Scrum Meister's solution works. group_concat is returning a blob. :)
Group_concat does return a blob. And in the blob is a comma-separated list of ids. If you're not seeing the ids, test it by casting the result to char: SELECT CAST(GROUP_CONCAT(id) AS CHAR) ...
This is great awm, honestly, I didn't see the comma separated list of ids. I'm sure I'm going to use this more in the future. Great to learn that, thanks again.
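For anyone following the comments above, here is a sketch of the full GROUP_CONCAT query with the cast spelled out. The column alias ids is only an illustrative name, not something from the original answer, and this assumes MySQL.

```sql
-- One row per duplicated email, with all matching ids as readable text.
-- Assumption: "ids" is a made-up alias for illustration.
SELECT CAST(GROUP_CONCAT(id) AS CHAR) AS ids, email
FROM member
GROUP BY email
HAVING COUNT(*) > 1
ORDER BY email;
```

Note that MySQL truncates the GROUP_CONCAT result at group_concat_max_len (1024 by default, as far as I know), so an email with a very large number of duplicates may return a cut-off list.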
0

Can you do this in two stages? First create a temporary table containing all the emails with more than one occurrence, then join the main table to the temp table on the email field...

If your member table has an index on the email field, this should be pretty fast.


CREATE TEMPORARY TABLE ddd
SELECT email, count(*) as cnt FROM member GROUP BY email HAVING cnt>1;

SELECT * FROM ddd
INNER JOIN member USING (email);
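If the table does not have that index yet, something along these lines should add it. This is only a sketch assuming MySQL; the index name idx_member_email is a hypothetical name chosen for illustration.

```sql
-- Assumption: idx_member_email is a made-up index name.
CREATE INDEX idx_member_email ON member (email);

-- Inspect the plan for the join against the temp table created above;
-- the member row should list the new index as a candidate key.
EXPLAIN
SELECT *
FROM ddd
INNER JOIN member USING (email);
```

Whether the optimizer actually picks the index depends on the table size and data distribution, so it is worth checking the EXPLAIN output rather than assuming.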

2 Comments

Thanks again Dave. Nope, member table has no index on the email field. Scrum's solution was pretty fast, though :)
Looking at @Scrum's solution, it's essentially the same as mine, the nested query will create a temp table internally. You may find adding the index will help further, especially when the table gets /really/ large!
-1

You are doing two queries when you only need to do one:

SELECT id, email 
FROM member
GROUP BY email 
HAVING count( * ) > 1
ORDER BY `email` ASC

2 Comments

Will only return 1 row per email.
Thanks Chris. Your previous solution took longer; count(*) is faster than count(id). This solution removes the duplicate rows, like awm's first solution, and I need those as well. Thanks anyway :)
-1

SELECT id, email, COUNT(*) AS n FROM member GROUP BY email HAVING n > 1;

1 Comment

Will only return 1 row per email.
