I want to send bulk email to my site's users, to let's say 100k+ users at a time. I want to keep a record of my mail() function calls: as soon as a mail is sent to a user, an entry for that user is added to a temporary table. That way, if my server crashes, I can resume and send the email to the remaining users.

Here's my problem:

I select the records from the users table whose IDs are NOT IN (select sent_ids from temp_table).

If there are too many sent IDs, let's say 70% of the total users, the query becomes relatively slow.

What can I do to solve my problem?

  • I see most of you prefer joins, but my colleague does not agree. He says emails will be sent at most 10 times a day, so he thinks we should add 10 columns to the users table for this. As soon as an email is sent to a specific user, that user's is_sent field for that email (the email number of the day, let's say email no. 1) would be set to 1. Do you think this is faster than all the other solutions? And should I follow it or not? Commented Jun 20, 2013 at 9:59
  • This smells like bad design to me; I certainly would not follow it. IMHO it badly violates database design principles. And even if it were faster, it introduces a lot of other difficulties and maintainability issues, so it's not worth it. Commented Jun 20, 2013 at 10:02
  • Well, he says the problem with IN is that he has faced server crashes: once there are too many records, the query takes so long that the server crashes. I'm confused :s Commented Jun 20, 2013 at 10:09
  • With such a critical amount of data, the best way is to test it. But if you have a proper table setup, perhaps with indexes as well, the LEFT JOIN should be very fast even with millions of records. Commented Jun 20, 2013 at 10:18
  • Thanks, looks good. I'll inform and select the answer after I try it. Commented Jun 20, 2013 at 10:21

4 Answers

1

Have a look at the EXISTS/NOT EXISTS optimizations described in the MySQL docs.
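
As a rough illustration of what a NOT EXISTS version of the query could look like, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs anywhere; the column names (users.id, temp_table.sent_id) and the sample rows are assumptions based on the question, and MySQL's optimizer may plan the same query differently than SQLite does.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL);
    CREATE TABLE temp_table (sent_id INTEGER PRIMARY KEY);
""")

# Hypothetical sample data: five users, two of whom were already mailed.
conn.executemany(
    "INSERT INTO users (id, email) VALUES (?, ?)",
    [(i, f"user{i}@example.com") for i in range(1, 6)],
)
conn.executemany("INSERT INTO temp_table (sent_id) VALUES (?)", [(1,), (3,)])

# Correlated NOT EXISTS: keep each user that has no matching row in temp_table.
NOT_EXISTS_QUERY = """
    SELECT u.id, u.email
    FROM users u
    WHERE NOT EXISTS (
        SELECT 1 FROM temp_table t WHERE t.sent_id = u.id
    )
    ORDER BY u.id
"""

unsent_ids = [row[0] for row in conn.execute(NOT_EXISTS_QUERY)]
print(unsent_ids)
```

The subquery only has to find one matching row per user, so with an index on the sent ID column (here the primary key) each check is a single index lookup.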

1

It should not be slower than the other variants, because in most cases MySQL (at least in later versions) will optimize IN clauses as much as possible. However, you could try LEFT JOINing the temp table on the ID and then checking for sent_id IS NULL to get the users you haven't sent the mail to yet.

1

Two options:

  1. Newer versions of MySQL (5.6) or MariaDB (5.5) should handle this query much better: https://blog.mozilla.org/it/2013/01/29/in-subqueries-in-mysql-5-6-are-optimized-away/
  2. You can use a LEFT JOIN and keep only the rows without a match: SELECT users.* FROM users LEFT JOIN temptable ON sent_id = user_id WHERE sent_id IS NULL

1

Sounds like a job for an outer join:

SELECT * FROM users u 
LEFT JOIN temp_table t
    ON u.id = t.id
WHERE t.id IS NULL

This will list all the users that have not been sent an email.
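
To show how this query might fit into the crash-safe sending described in the question, here is a minimal sketch of a resumable loop. It again uses Python's built-in sqlite3 module as a stand-in for MySQL; the helper names fetch_unsent and send_pending, the sent_id column, the batch size, and the send callable (standing in for PHP's mail()) are all assumptions made for illustration.

```python
import sqlite3

def fetch_unsent(conn, batch_size):
    """Return up to batch_size (id, email) rows that have no entry in temp_table."""
    return conn.execute(
        """
        SELECT u.id, u.email
        FROM users u
        LEFT JOIN temp_table t ON t.sent_id = u.id
        WHERE t.sent_id IS NULL
        ORDER BY u.id
        LIMIT ?
        """,
        (batch_size,),
    ).fetchall()

def send_pending(conn, send, batch_size=2):
    """Mail every unsent user in batches, recording each send immediately."""
    total = 0
    while True:
        batch = fetch_unsent(conn, batch_size)
        if not batch:
            return total
        for user_id, email in batch:
            send(email)  # hypothetical callable standing in for PHP's mail()
            conn.execute("INSERT INTO temp_table (sent_id) VALUES (?)", (user_id,))
            conn.commit()  # commit per mail so a crash loses at most one record
            total += 1

# Demo setup: five users; users 1 and 2 were mailed before a simulated crash.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL);
    CREATE TABLE temp_table (sent_id INTEGER PRIMARY KEY);
""")
conn.executemany(
    "INSERT INTO users (id, email) VALUES (?, ?)",
    [(i, f"user{i}@example.com") for i in range(1, 6)],
)
conn.executemany("INSERT INTO temp_table (sent_id) VALUES (?)", [(1,), (2,)])

delivered = []  # collects addresses instead of really sending mail
sent_count = send_pending(conn, delivered.append)
remaining = fetch_unsent(conn, 10)
print(sent_count, delivered, remaining)
```

Declaring sent_id as the primary key gives the join an index to work with, which is the kind of table setup the comments above recommend. Fetching in batches keeps each query small, and rerunning send_pending after a crash picks up only the users that have no row in temp_table.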
