[sql-server] SQL Server IN vs. EXISTS Performance

So, IN is not the same as EXISTS nor it will produce the same execution plan.

Usually EXISTS is used in a correlated subquery, that means you will JOIN the EXISTS inner query with your outer query. That will add more steps to produce a result as you need to solve the outer query joins and the inner query joins then match their where clauses to join both.

Usually IN is used without correlating the inner query with the outer query, and that can be solved in only one step (in the best case scenario).

Consider this:

  1. If you use IN and the inner query result is millions of rows of distinct values, it will probably perform SLOWER than EXISTS given that the EXISTS query is performant (has the right indexes to join with the outer query).

  2. If you use EXISTS and the join with your outer query is complex (takes more time to perform, no suitable indexes) it will slow the query by the number of rows in the outer table, sometimes the estimated time to complete can be in days. If the number of rows is acceptable for your given hardware, or the cardinality of data is correct (for example fewer DISTINCT values in a large data set) IN can perform faster than EXISTS.

  3. All of the above will be noted when you have a fair amount of rows on each table (by fair I mean something that exceeds your CPU processing and/or ram thresholds for caching).

So the ANSWER is it DEPENDS. You can write a complex query inside IN or EXISTS, but as a rule of thumb, you should try to use IN with a limited set of distinct values and EXISTS when you have a lot of rows with a lot of distinct values.

The trick is to limit the number of rows to be scanned.

Regards,

MarianoC

Examples related to sql-server

Passing multiple values for same variable in stored procedure SQL permissions for roles Count the Number of Tables in a SQL Server Database Visual Studio 2017 does not have Business Intelligence Integration Services/Projects ALTER TABLE DROP COLUMN failed because one or more objects access this column Create Local SQL Server database How to create temp table using Create statement in SQL Server? SQL Query Where Date = Today Minus 7 Days How do I pass a list as a parameter in a stored procedure? SQL Server date format yyyymmdd

Examples related to sql-server-2005

Add a row number to result set of a SQL query SQL Server : Transpose rows to columns Select info from table where row has max date How to query for Xml values and attributes from table in SQL Server? How to restore SQL Server 2014 backup in SQL Server 2008 SQL Server 2005 Using CHARINDEX() To split a string Is it necessary to use # for creating temp tables in SQL server? SQL Query to find the last day of the month JDBC connection to MSSQL server in windows authentication mode How to convert the system date format to dd/mm/yy in SQL Server 2008 R2?

Examples related to exists

Select rows which are not present in other table How to fix "Only one expression can be specified in the select list when the subquery is not introduced with EXISTS" error? Check if record exists from controller in Rails sql: check if entry in table A exists in table B How to exclude records with certain values in sql select SQL - IF EXISTS UPDATE ELSE INSERT INTO Linq select objects in list where exists IN (A,B,C) PL/pgSQL checking if a row exists How to Check if value exists in a MySQL database MongoDB: How to query for records where field is null or not set?

Examples related to query-performance

Select distinct values from a table field SQL Server IN vs. EXISTS Performance

Examples related to sql-in

IN vs ANY operator in PostgreSQL SQL Server IN vs. EXISTS Performance Passing a varchar full of comma delimited values to a SQL Server IN function ORDER BY the IN value list