
Table 1 contains a unique ID column with millions of rows.

Table 2 contains two columns, matchId1 and matchId2, each of which can hold an ID from Table 1. There can be many rows in Table 2 referencing a given ID.

How can I list the IDs from Table 1 that are not contained in Table 2 (in either column) in an efficient way?
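
For concreteness, the setup described above could look roughly like the following. This is only a sketch: the names table1 and table2 match the answers, but the column types, the surrogate id key on table2, and the sample rows are assumptions made for illustration.

```sql
-- Assumed schema; column types and the surrogate key on table2 are illustrative.
CREATE TABLE table1 (
    id INT UNSIGNED NOT NULL PRIMARY KEY
);

CREATE TABLE table2 (
    id       INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    matchId1 INT UNSIGNED NULL,   -- may reference table1.id
    matchId2 INT UNSIGNED NULL    -- may reference table1.id
);

-- Made-up sample rows.
INSERT INTO table1 (id) VALUES (1), (2), (3), (4);
INSERT INTO table2 (matchId1, matchId2) VALUES (1, NULL), (NULL, 2), (1, 3);

-- Desired result for this sample: only id 4,
-- the one ID that appears in neither matchId1 nor matchId2.
```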

2
  • NOT IN vs. NOT EXISTS vs. LEFT JOIN / IS NULL MySQL Commented Dec 17, 2013 at 10:56
  • @valex: It's nice but how to apply it to two columns? I need to remove all rows that have either matchid1 or matchid2 set. A join on matchid1 will ignore matchid2. Commented Dec 17, 2013 at 11:44

3 Answers

5
SELECT x.*
  FROM table1 x
  LEFT
  JOIN table2 y
    ON y.id = x.id
 WHERE y.id IS NULL;

or, more specifically...

SELECT x.*
  FROM table1 x
  LEFT
  JOIN table2 y
    ON x.id IN(y.id1,y.id2)
 WHERE y.id IS NULL;

This assumes that table2 has a single-column primary key (named id in the queries above).


4 Comments

I was looking for something similar yesterday and didn't find much. Stumbled upon this today, awesome. Cheers from me anyway.
OK, but this is just about the easiest question in RDBMS-land, so don't go overboard! ;-)
Someone's having a laugh.
This is not what I need: y.id1 and y.id2 both have to matter. I already found answers like this on SO. If y.id1 is NULL but y.id2 is not, then I do not want this row. So it seems to me there have to be two joins (maybe through a union). I have no idea how to do this in a single query, or how to do it efficiently at all.
2

A couple of suggestions

Using jons

SELECT x.*
FROM table1 x
LEFT JOIN table2 y ON y.matchId1 = x.id
LEFT JOIN table2 z ON z.matchId2 = x.id
WHERE y.matchId1 IS NULL AND z.matchId2 IS NULL

Using IN

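SELECT x.*
FROM table1 x
WHERE x.id NOT IN
(
    -- Exclude NULLs: a single NULL in the list makes NOT IN match no rows at all.
    SELECT matchId1 FROM table2 WHERE matchId1 IS NOT NULL
    UNION
    SELECT matchId2 FROM table2 WHERE matchId2 IS NOT NULL
);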
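
A third variant in the same spirit, for comparison, is NOT EXISTS with one correlated subquery per column. This is a sketch rather than a benchmarked recommendation: the index names are hypothetical, and it assumes table1.id is already indexed (for example as the primary key).

```sql
-- Hypothetical index names; one index per lookup column.
CREATE INDEX idx_table2_matchid1 ON table2 (matchId1);
CREATE INDEX idx_table2_matchid2 ON table2 (matchId2);

-- Keep a table1 row only if no table2 row references it in either column.
SELECT x.*
  FROM table1 x
 WHERE NOT EXISTS (SELECT 1 FROM table2 y WHERE y.matchId1 = x.id)
   AND NOT EXISTS (SELECT 1 FROM table2 z WHERE z.matchId2 = x.id);
```

Unlike NOT IN, NOT EXISTS is not affected by NULL values in matchId1 or matchId2, because a NULL simply never satisfies the equality inside the subquery.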

2 Comments

Which do you think is the fastest? I had doubts that the first would work for my case, since I didn't know whether common rows from the joins would be interpreted as one row. (BTW, you have a typo: "jons" instead of "joins".)
I suspect the first query is more efficient, since MySQL will probably struggle to use any indexes on matchId1 / matchId2 in the second query.
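
One way to check this rather than guess is to compare the plans MySQL picks for both queries with EXPLAIN. The statements below are a sketch that reuses the two queries from this answer (with a NULL filter added to the subqueries); the actual plans depend on the MySQL version, the data, and which indexes exist, so no output is shown.

```sql
-- Plan for the double LEFT JOIN variant.
EXPLAIN
SELECT x.*
FROM table1 x
LEFT JOIN table2 y ON y.matchId1 = x.id
LEFT JOIN table2 z ON z.matchId2 = x.id
WHERE y.matchId1 IS NULL AND z.matchId2 IS NULL;

-- Plan for the NOT IN variant.
EXPLAIN
SELECT x.*
FROM table1 x
WHERE x.id NOT IN
(
    SELECT matchId1 FROM table2 WHERE matchId1 IS NOT NULL
    UNION
    SELECT matchId2 FROM table2 WHERE matchId2 IS NOT NULL
);
```

In the output, the key column shows which index (if any) is used for each table, and rows gives the estimated number of rows examined.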
1

Try this:

SELECT table1.*
FROM table1
LEFT JOIN table2 ON table1.id IN (table2.matchId1, table2.matchId2)
WHERE table2.matchId1 IS NULL AND table2.matchId2 IS NULL;

1 Comment

From the looks of it, it does exactly what I mean and is easy to understand. Thanks
