PostgreSQL - max number of parameters in IN clause

Question

In Postgres, you can specify an IN clause like this:

    SELECT * FROM user WHERE id IN (1000, 1001, 1002)

Does anyone know what the maximum number of parameters you can pass into IN is?

User · Answer

According to the source code located here, starting at line 850, PostgreSQL doesn't explicitly limit the number of arguments.

The following is a code comment from line 870:

/*
 * We try to generate a ScalarArrayOpExpr from IN/NOT IN, but this is only
 * possible if the inputs are all scalars (no RowExprs) and there is a
 * suitable array type available.  If not, we fall back to a boolean
 * condition tree with multiple copies of the lefthand expression.
 * Also, any IN-list items that contain Vars are handled as separate
 * boolean conditions, because that gives the planner more scope for
 * optimization on such clauses.
 *
 * First step: transform all the inputs, and detect whether any are
 * RowExprs or contain Vars.
 */

User · Answer

As someone more experienced with Oracle DB, I was concerned about this limit too. I carried out a performance test for a query with roughly 10,000 parameters in an IN-list, fetching the prime numbers up to 100,000 from a table holding the first 100,000 integers, by actually listing all the prime numbers as query parameters.

My results indicate that you need not worry about overloading the query plan optimizer or getting plans without index usage, since it will transform the query to use = ANY('{...}'::integer[]), where it can leverage indices as expected:

    -- prepare statement, runs instantaneously:
    PREPARE hugeplan (integer, integer, integer, ...) AS
    SELECT *
    FROM primes
    WHERE n IN ($1, $2, $3, ..., $9592);

    -- fetch the prime numbers:
    EXECUTE hugeplan(2, 3, 5, ..., 99991);

    -- EXPLAIN ANALYZE output for the EXECUTE:
    Index Scan using n_idx on primes  (cost=0.42..9750.77 rows=9592 width=5) (actual time=0.024..15.268 rows=9592 loops=1)
      Index Cond: (n = ANY ('{2,3,5,7,...
    Execution time: 16.063 ms

    -- setup, should you care:
    CREATE TABLE public.primes
    (
      n integer NOT NULL,
      prime boolean,
      CONSTRAINT n_idx PRIMARY KEY (n)
    )
    WITH (
      OIDS=FALSE
    );
    ALTER TABLE public.primes
      OWNER TO postgres;

    INSERT INTO public.primes
    SELECT generate_series(1,100000);

However, this (rather old) thread on the pgsql-hackers mailing list indicates that there is still a non-negligible cost in planning such queries, so take my word with a grain of salt.
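The two figures in the PREPARE/EXECUTE statements of this answer (9592 parameters, with 99991 as the last value) can be checked independently. The following is a minimal sketch, not the author's script: the helper name `primes_up_to` is hypothetical, and it only rebuilds the list of values and the `$1, $2, ...` placeholder list without talking to a database.

```python
from math import isqrt
from typing import List


def primes_up_to(limit: int) -> List[int]:
    """Sieve of Eratosthenes: return all primes <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"  # 0 and 1 are not prime
    for i in range(2, isqrt(limit) + 1):
        if sieve[i]:
            # Cross out multiples of i, starting at i*i.
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [n for n, is_prime in enumerate(sieve) if is_prime]


primes = primes_up_to(100_000)

# Placeholder list in the style used by PREPARE: $1, $2, ..., $N
placeholders = ", ".join(f"${i}" for i in range(1, len(primes) + 1))

print(len(primes), primes[-1])
```

The printed count and last prime should match the `$9592` and `99991` seen in the statements.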

User · Answer

You might want to consider refactoring that query instead of adding an arbitrarily long list of ids.

You could use a range if the ids indeed follow the pattern in your example:

    SELECT * FROM user WHERE id >= minValue AND id <= maxValue;

Another option is to add an inner select:

    SELECT *
    FROM user
    WHERE id IN (
        SELECT userId
        FROM ForumThreads ft
        WHERE ft.id = X
    )

User · Answer

Just tried it. The answer is: "out-of-range integer as a 2-byte value: 32768".

User · Answer

This is not really an answer to the present question, however it might help others too. At least I can tell there is a technical limit of 32767 values (= Short.MAX_VALUE) passable to the PostgreSQL backend, using PostgreSQL's JDBC driver 9.1.

This is a test of "delete from x where id in (... 100k values ...)" with the PostgreSQL JDBC driver:

    Caused by: java.io.IOException: Tried to send an out-of-range integer as a 2-byte value: 100000
        at org.postgresql.core.PGStream.SendInteger2(PGStream.java:201)
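One common way to stay under such a bind-parameter limit is to split the id list into batches and issue one statement per batch. The sketch below is an illustration under assumptions, not part of the original test: the helper name `in_clause_batches` is hypothetical, the `%s` placeholder style is the one used by Python DB-API drivers such as psycopg2 (JDBC uses `?`), and the 32767 figure is simply the limit reported in this answer. It only builds the statements and parameter lists; nothing is executed.

```python
from typing import Iterator, List, Sequence, Tuple

# Limit reported in the answer (Short.MAX_VALUE); an assumption about the
# driver/protocol in use, not a universal constant.
MAX_BIND_PARAMS = 32767


def in_clause_batches(
    ids: Sequence[int], batch_size: int = MAX_BIND_PARAMS
) -> Iterator[Tuple[str, List[int]]]:
    """Yield (sql, params) pairs, each with at most batch_size placeholders."""
    if batch_size < 1:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(ids), batch_size):
        chunk = list(ids[start:start + batch_size])
        placeholders = ", ".join(["%s"] * len(chunk))
        # Table "x" and column "id" are taken from the test quoted in the answer.
        yield f"DELETE FROM x WHERE id IN ({placeholders})", chunk


batches = list(in_clause_batches(range(100_000)))
print(len(batches))                  # number of statements needed
print([len(p) for _, p in batches])  # parameters per statement
```

Each `(sql, params)` pair could then be handed to a cursor's `execute` call, ideally inside a single transaction so the batches succeed or fail together.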

User · Answer

If you have a query like:

    SELECT * FROM user WHERE id IN (1, 2, 3, 4 -- and thousands of other keys)

you may improve performance by rewriting it as:

    SELECT * FROM user WHERE id = ANY(VALUES (1), (2), (3), (4) -- and thousands of other keys)

User · Answer

    explain select * from test where id in (values (1), (2));

    QUERY PLAN
    Seq Scan on test  (cost=0.00..1.38 rows=2 width=208)
      Filter: (id = ANY ('{1,2}'::bigint[]))

But if we try the second query:

    explain select * from test where id = any (values (1), (2));

    QUERY PLAN
    Hash Semi Join  (cost=0.05..1.45 rows=2 width=208)
      Hash Cond: (test.id = "*VALUES*".column1)
      ->  Seq Scan on test  (cost=0.00..1.30 rows=30 width=208)
      ->  Hash  (cost=0.03..0.03 rows=2 width=4)
            ->  Values Scan on "*VALUES*"  (cost=0.00..0.03 rows=2 width=4)

We can see that Postgres builds a temporary table and joins with it.

User · Answer

There is no limit to the number of elements that you can pass to an IN clause. If there are many elements, it will treat them as an array, and then for each row scanned in the database it will check whether the value is contained in the array or not. This approach is not very scalable. Instead of using an IN clause, try using an INNER JOIN with a temp table. Refer to http://www.xaprb.com/blog/2006/06/28/why-large-in-clauses-are-problematic/ for more info. Using an INNER JOIN scales well, as the query optimizer can make use of a hash join and other optimizations, whereas with an IN clause there is no way for the optimizer to optimize the query. I have noticed a speedup of at least 2x with this change.
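The shape of the temp-table approach can be sketched as follows. To keep the example runnable without a server, SQLite's in-memory engine (Python's standard `sqlite3` module) is swapped in for PostgreSQL, so it says nothing about PostgreSQL's planner or the speedup reported above; it only shows that loading the ids into a temporary table and joining returns the same rows as the IN-list. The table and column names (`users`, `wanted_ids`) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [(i, f"user{i}") for i in range(1, 1001)],
)

wanted = [3, 500, 999, 5000]  # 5000 does not exist in users

# Load the ids into a temporary table instead of inlining them in the query.
conn.execute("CREATE TEMP TABLE wanted_ids (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO wanted_ids VALUES (?)", [(i,) for i in wanted])

joined = conn.execute(
    "SELECT u.id, u.name FROM users u "
    "INNER JOIN wanted_ids w ON w.id = u.id ORDER BY u.id"
).fetchall()

# Equivalent IN-list query, for comparison.
placeholders = ", ".join("?" * len(wanted))
in_list = conn.execute(
    f"SELECT id, name FROM users WHERE id IN ({placeholders}) ORDER BY id",
    wanted,
).fetchall()

print(joined == in_list)
```

Whether the join is actually faster on PostgreSQL depends on the version, the statistics and the list size, so it is worth comparing EXPLAIN ANALYZE output for both forms on real data.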
