When to use SELECT ... FOR UPDATE?

Question

Please help me understand the use-case behind SELECT ... FOR UPDATE.

Question 1: Is the following a good example of when SELECT ... FOR UPDATE should be used?

Given:

rooms[id]
tags[id, name]
room_tags[room_id, tag_id]

room_id and tag_id are foreign keys.

The application wants to list all rooms and their tags, but needs to differentiate between rooms with no tags versus rooms that have been removed. If SELECT ... FOR UPDATE is not used, what could happen is:

Initially:

rooms contains [id = 1]
tags contains [id = 1, name = 'cats']
room_tags contains [room_id = 1, tag_id = 1]

Thread 1: SELECT id FROM rooms;
returns [id = 1]
Thread 2: DELETE FROM room_tags WHERE room_id = 1;
Thread 2: DELETE FROM rooms WHERE id = 1;
Thread 2: [commits the transaction]
Thread 1: SELECT tags.name FROM room_tags, tags WHERE room_tags.tag_id = 1 AND tags.id = room_tags.tag_id;
returns an empty list

Now Thread 1 thinks that room 1 has no tags, but in reality the room has been removed. To solve this problem, Thread 1 should SELECT id FROM rooms FOR UPDATE, thereby preventing Thread 2 from deleting from rooms until Thread 1 is done. Is that correct?

Question 2: When should one use SERIALIZABLE transaction isolation versus READ_COMMITTED with SELECT ... FOR UPDATE?

Answers are expected to be portable (not database-specific). If that's not possible, please explain why.

User · Accepted Answer

The only portable way to achieve consistency between rooms and tags, and to make sure rooms are never returned after they have been deleted, is to lock them with SELECT FOR UPDATE.

However, in some systems locking is a side effect of concurrency control, and you achieve the same result without specifying FOR UPDATE explicitly.

To solve this problem, Thread 1 should SELECT id FROM rooms FOR UPDATE, thereby preventing Thread 2 from deleting from rooms until Thread 1 is done. Is that correct?

This depends on the concurrency control your database system is using.

MyISAM in MySQL (and several other old systems) does lock the whole table for the duration of a query.
In SQL Server, SELECT queries place shared locks on the records / pages / tables they have examined, while DML queries place update locks (which later get promoted to exclusive or demoted to shared locks). Exclusive locks are incompatible with shared locks, so either the SELECT or the DELETE query will block until the other session commits.
In databases which use MVCC (like Oracle, PostgreSQL, MySQL with InnoDB), a DML query creates a copy of the record (in one way or another) and generally readers do not block writers and vice versa. For these databases, a SELECT FOR UPDATE would come in handy: it would make either the SELECT or the DELETE query block until the other session commits, just as SQL Server does.
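
As a rough illustration of the MVCC case, the two sessions from the question could look like the sketch below. It assumes the rooms / room_tags / tags tables from the question and uses syntax accepted by PostgreSQL and MySQL with InnoDB; Oracle starts transactions implicitly (no BEGIN), and SQL Server uses table hints such as UPDLOCK instead of FOR UPDATE. Session 2 is shown as comments because it runs on a separate connection.

```sql
-- Session 1 (reader)
BEGIN;
-- Lock the room rows so that a concurrent DELETE on them has to wait.
SELECT id FROM rooms FOR UPDATE;

-- Session 2 (writer), running concurrently on another connection:
--   BEGIN;
--   DELETE FROM room_tags WHERE room_id = 1;  -- touches only room_tags, which Session 1 has not locked
--   DELETE FROM rooms WHERE id = 1;           -- waits here for Session 1's row lock
--   COMMIT;

-- Session 1 continues. Session 2 has not committed, so its deletes are not visible.
SELECT tags.name
FROM room_tags
JOIN tags ON tags.id = room_tags.tag_id
WHERE room_tags.room_id = 1;

COMMIT;  -- releases the row locks; Session 2's DELETE on rooms can now proceed
```

The lock is only held while Session 1's transaction is open, so both SELECT statements have to run inside the same transaction for this to help.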

When should one use REPEATABLE_READ transaction isolation versus READ_COMMITTED with SELECT ... FOR UPDATE?

Generally, REPEATABLE READ does not forbid phantom rows (rows that appeared or disappeared in another transaction, rather than being modified).

In Oracle and earlier PostgreSQL versions, REPEATABLE READ is actually a synonym for SERIALIZABLE. Basically, this means that the transaction does not see changes made after it has started. So in this setup, the last Thread 1 query will return the room as if it has never been deleted (which may or may not be what you wanted). If you don't want to show the rooms after they have been deleted, you should lock the rows with SELECT FOR UPDATE.
In InnoDB, REPEATABLE READ and SERIALIZABLE are different things: readers in SERIALIZABLE mode set next-key locks on the records they evaluate, effectively preventing concurrent DML on them. So you don't need a SELECT FOR UPDATE in SERIALIZABLE mode, but you do need it in REPEATABLE READ or READ COMMITTED.
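
To make the snapshot behaviour concrete, here is a minimal sketch in PostgreSQL syntax, where current versions implement REPEATABLE READ as snapshot isolation. It assumes the tables and initial data from the question; other engines spell the isolation-level statement differently and may behave differently, as described above.

```sql
-- Session 1
BEGIN ISOLATION LEVEL REPEATABLE READ;
SELECT id FROM rooms;          -- returns room 1; the snapshot is taken at this first query

-- Session 2 now runs on another connection and commits:
--   DELETE FROM room_tags WHERE room_id = 1;
--   DELETE FROM rooms WHERE id = 1;

-- Session 1 keeps reading from its snapshot, so room 1 still appears to exist with its tag.
SELECT tags.name
FROM room_tags
JOIN tags ON tags.id = room_tags.tag_id
WHERE room_tags.room_id = 1;   -- returns 'cats'

COMMIT;
```

Neither session blocks the other here: Session 1 gets a consistent but possibly stale view, which is exactly the trade-off against locking the rows with SELECT FOR UPDATE.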

Note that the standard on isolation modes does prescribe that you don't see certain quirks in your queries but does not define how (with locking or with MVCC or otherwise).

When I say "you don't need SELECT FOR UPDATE" I really should have added "because of side effects of certain database engine implementation".

User · Answer

Short answers:

Q1: Yes.

Q2: Doesn't matter which you use.

Long answer:

A select ... for update will (as it implies) select certain rows but also lock them as if they have already been updated by the current transaction (or as if the identity update had been performed). This allows you to update them again in the current transaction and then commit, without another transaction being able to modify these rows in any way.

Another way of looking at it: it is as if the following two statements are executed atomically:

select * from my_table where my_condition;

update my_table set my_column = my_column where my_condition;

Since the rows affected by my_condition are locked, no other transaction can modify them in any way, and hence the transaction isolation level makes no difference here.

Note also that transaction isolation level is independent of locking: setting a different isolation level doesn't allow you to get around locking and update rows in a different transaction that are locked by your transaction.

What transaction isolation levels do guarantee (at different levels) is the consistency of data while transactions are in progress.
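
The read-then-update pattern this answer describes might look like the following sketch. It reuses the tags table from the question; the new name 'kittens' is a made-up value for illustration, and the syntax is the one accepted by PostgreSQL and MySQL with InnoDB.

```sql
BEGIN;

-- Read the row and lock it, as if an identity update had already been applied to it.
SELECT name FROM tags WHERE id = 1 FOR UPDATE;

-- Other transactions that try to UPDATE or DELETE this row now wait until we finish.
-- The application can decide on the new value here, based on what it just read.
UPDATE tags SET name = 'kittens' WHERE id = 1;

COMMIT;  -- publishes the change and releases the row lock
```

Without FOR UPDATE on the first statement, another transaction could change or delete the row between the SELECT and the UPDATE.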
