A variation of the materialized view "Possible alternative" outlined by Erwin Brandstetter is possible.
Say, for example, that you don't want duplicates in the randomized values that are returned. So you will need to set a boolean value on the primary table containing your (non-randomized) set of values.
Assuming this is the input table:
id_values id | used
----+--------
1 | FALSE
2 | FALSE
3 | FALSE
4 | FALSE
5 | FALSE
...
Populate the ID_VALUES
table as needed. Then, as described by Erwin, create a materialized view that randomizes the ID_VALUES
table once:
CREATE MATERIALIZED VIEW id_values_randomized AS
SELECT id
FROM id_values
ORDER BY random();
Note that the materialized view does not contain the used column, because this will quickly become out-of-date. Nor does the view need to contain other columns that may be in the id_values
table.
In order to obtain (and "consume") random values, use an UPDATE-RETURNING on id_values
, selecting id_values
from id_values_randomized
with a join, and applying the desired criteria to obtain only relevant possibilities. For example:
UPDATE id_values
SET used = TRUE
WHERE id_values.id IN
(SELECT i.id
FROM id_values_randomized r INNER JOIN id_values i ON i.id = r.id
WHERE (NOT i.used)
LIMIT 5)
RETURNING id;
Change LIMIT
as necessary -- if you only need one random value at a time, change LIMIT
to 1
.
With the proper indexes on id_values
, I believe the UPDATE-RETURNING should execute very quickly with little load. It returns randomized values with one database round-trip. The criteria for "eligible" rows can be as complex as required. New rows can be added to the id_values
table at any time, and they will become accessible to the application as soon as the materialized view is refreshed (which can likely be run at an off-peak time). Creation and refresh of the materialized view will be slow, but it only needs to be executed when new id's are added to the id_values
table.