[sql] SQL: ... WHERE X IN (SELECT Y FROM ...)

Is the following the most efficient in SQL to achieve its result:

SELECT * 
  FROM Customers 
 WHERE Customer_ID NOT IN (SELECT Cust_ID FROM SUBSCRIBERS)

Could some use of joins be better and achieve the same result?

This question is related to sql

The answer is


One reason why you might prefer to use a JOIN rather than NOT IN is that if the Values in the NOT IN clause contain any NULLs you will always get back no results. If you do use NOT IN remember to always consider whether the sub query might bring back a NULL value!

RE: Question in Comments

'x' NOT IN (NULL,'a','b')

= 'x' <> NULL and 'x' <> 'a' and 'x' <> 'b'

= Unknown and True and True

= Unknown


Maybe try this

Select cust.*

From dbo.Customers cust
Left Join dbo.Subscribers subs on cust.Customer_ID = subs.Customer_ID
Where subs.Customer_Id Is Null

SELECT Customers.* 
  FROM Customers 
 WHERE NOT EXISTS (
       SELECT *
         FROM SUBSCRIBERS AS s
         JOIN s.Cust_ID = Customers.Customer_ID) 

When using “NOT IN”, the query performs nested full table scans, whereas for “NOT EXISTS”, the query can use an index within the sub-query.


If you want to know which is more effective, you should try looking at the estimated query plans, or the actual query plans after execution. It'll tell you the costs of the queries (I find CPU and IO cost to be interesting). I wouldn't be surprised much if there's little to no difference, but you never know. I've seen certain queries use multiple cores on our database server, while a rewritten version of that same query would only use one core (needless to say, the query that used all 4 cores was a good 3 times faster). Never really quite put my finger on why that is, but if you're working with large result sets, such differences can occur without your knowing about it.


Any mature enough SQL database should be able to execute that just as effectively as the equivalent JOIN. Use whatever is more readable to you.