OR
makes sense (from readability point of view), when there are less values to be compared.
IN
is useful esp. when you have a dynamic source, with which you want values to be compared.
Another alternative is to use a JOIN
with a temporary table.
I don't think performance should be a problem, provided you have necessary indexes.
The OR operator needs a much more complex evaluation process than the IN construct because it allows many conditions, not only equals like IN.
Here is a like of what you can use with OR but that are not compatible with IN: greater. greater or equal, less, less or equal, LIKE and some more like the oracle REGEXP_LIKE. In addition consider that the conditions may not always compare the same value.
For the query optimizer it's easier to to manage the IN operator because is only a construct that defines the OR operator on multiple conditions with = operator on the same value. If you use the OR operator the optimizer may not consider that you're always using the = operator on the same value and, if it doesn't perform a deeper and very much more complex elaboration, it could probably exclude that there may be only = operators for the same values on all the involved conditions, with a consequent preclusion of optimized search methods like the already mentioned binary search.
[EDIT] Probably an optimizer may not implement optimized IN evaluation process, but this doesn't exclude that one time it could happen(with a database version upgrade). So if you use the OR operator that optimized elaboration will not be used in your case.
I think oracle is smart enough to convert the less efficient one (whichever that is) into the other. So I think the answer should rather depend on the readability of each (where I think that IN
clearly wins)
I did a SQL query in a large number of OR (350). Postgres do it 437.80ms.
Now use IN:
23.18ms
The best way to find out is looking at the Execution Plan.
I tried it with Oracle, and it was exactly the same.
CREATE TABLE performance_test AS ( SELECT * FROM dba_objects );
SELECT * FROM performance_test
WHERE object_name IN ('DBMS_STANDARD', 'DBMS_REGISTRY', 'DBMS_LOB' );
Even though the query uses IN
, the Execution Plan says that it uses OR
:
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 8 | 1416 | 163 (2)| 00:00:02 |
|* 1 | TABLE ACCESS FULL| PERFORMANCE_TEST | 8 | 1416 | 163 (2)| 00:00:02 |
--------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("OBJECT_NAME"='DBMS_LOB' OR "OBJECT_NAME"='DBMS_REGISTRY' OR
"OBJECT_NAME"='DBMS_STANDARD')
Source: Stackoverflow.com