IF EXISTS before INSERT UPDATE DELETE for optimization

Question

There is quite often situation when you need to execute INSERT  UPDATE or DELETE statement based on some condition  And my question is whether the affect on the performance of the query add IF EXISTS before the command   Example  IF EXISTS SELECT 1 FROM Contacs WHERE  Type    1      UPDATE Contacs SET  Deleted    1 WHERE  Type    1   What about INSERTs or DELETEs

User · Answer

If you re using MySQL  then you can use insert     on duplicate

User · Answer

That is not useful for just one update delete insert  Possibly adds performance if several operators after if condition  In last case better write    update a set    where    if   rowcount  gt  0  begin        end

User · Answer

I m not completely sure  but I get the impression that this question is really about upsert  which is the following atomic operation    If the row exists in both the source and target  UPDATE the target  If the row only exists in the source  INSERT the row into the target   Optionally  If the row exists in the target but not the source  DELETE the row from the target    Developers-turned-DBAs often na  vely write it row-by-row  like this   -- For each row in source IF EXISTS  lt target expression gt       IF  delete flag   1         DELETE  lt target expression gt      ELSE         UPDATE target         SET  lt target columns gt     lt source values gt          WHERE  lt target expression gt  ELSE     INSERT target   lt target columns gt       VALUES   lt source values gt     This is just about the worst thing you can do  for several reasons    It has a race condition   The row can disappear between IF EXISTS and the subsequent DELETE or UPDATE  It s wasteful   For every transaction you have an extra operation being performed  maybe it s trivial  but that depends entirely on how well you ve indexed  Worst of all - it s following an iterative model  thinking about these problems at the level of a single row   This will have the largest  worst  impact of all on overall performance    One very minor  and I emphasize minor  optimization is to just attempt the UPDATE anyway  if the row doesn t exist    ROWCOUNT will be 0 and you can then  safely  insert   -- For each row in source BEGIN TRAN  UPDATE target SET  lt target columns gt     lt source values gt  WHERE  lt target expression gt   IF    ROWCOUNT   0      INSERT target   lt target columns gt       VALUES   lt source values gt    COMMIT   Worst-case  this will still perform two operations for every transaction  but at least there s a chance of only performing one  and it also eliminates the race condition  kind of    But the real issue is that this is still being done for each row in the source   Before SQL Server 2008  you had to use an awkward 3-stage model to deal with this at the set level  still better than row-by-row    BEGIN TRAN  INSERT target   lt target columns gt   SELECT  lt source columns gt  FROM source s WHERE s id NOT IN  SELECT id FROM target   UPDATE t SET  lt target columns gt     lt source columns gt  FROM target t INNER JOIN source s ON t d   s id  DELETE t FROM target t WHERE t id NOT IN  SELECT id FROM source   COMMIT   As I said  performance was pretty lousy on this  but still a lot better than the one-row-at-a-time approach   SQL Server 2008  however  finally introduced MERGE syntax  so now all you have to do is this   MERGE target USING source ON target id   source id WHEN MATCHED THEN UPDATE  lt target columns gt     lt source columns gt  WHEN NOT MATCHED THEN INSERT   lt target columns gt   VALUES   lt source columns gt   WHEN NOT MATCHED BY SOURCE THEN DELETE    That s it   One statement   If you re using SQL Server 2008 and need to perform any sequence of INSERT  UPDATE and DELETE depending on whether or not the row already exists - even if it s just one row - there is no excuse not to be using MERGE   You can even OUTPUT the rows affected by a MERGE into a table variable if you need to find out afterward what was done   Simple  fast  and risk-free   Do it

User · Answer

Yes this will affect performance  the degree to which performance will be affected will be affected by a number of factors   Effectively you are doing the same query  twice   in your example   Ask yourself whether or not you need to be this defensive in your query and in what situations would the row not be there  Also  with an update statement the rows affected is probably a better way to determine if anything has been updated

User · Answer

The performance of an IF EXISTS statement   IF EXISTS SELECT 1 FROM mytable WHERE someColumn   someValue    depends on the indexes present to satisfy the query

User · Answer

IF EXISTS    UPDATE    Don t do it  It forces two scans seeks instead of one   If update doesn t find a match on the WHERE clause  the cost of the update statement is just a seek scan   If it does find a match  and if you preface it w  IF EXISTS  it has to find the same match twice  And in a concurrent environment  what was true for the EXISTS may not be true any longer for the UPDATE   This is precisely why UPDATE DELETE INSERT statements allow a WHERE clause  Use it

User · Answer

You shouldn t do this in most cases  Depending on your transaction level you have created a race condition  now in your example here it wouldn t matter to much  but the data can be changed from the first select to the update  And all you ve done is forced SQL to do more work  The best way to know for sure is to test the two differences and see which one gives you the appropriate performance

User · Answer

IF EXISTS will basically do a SELECT - the same one that UPDATE would   As such  it will decrease performance - if there s nothing to update  you did the same amount of work  UPDATE would have queried same lack of rows as your select  and if there s something to update  you juet did an un-needed select

User · Answer

This largely repeats the preceding  by time  five  no  six   no  seven  answers  but   Yes  the IF EXISTS structure that you have by and large will double the work done by the database  While IF EXISTS will  stop  when it finds the first matching row  it doesn t need to find them all   it s still extra and ultimately pointless effort--for updates and deletes    If no such row s  exist  IF EXISTS will a full scan  table or index  to determine this  If one or more such rows exist  IF EXISTS will read enough of the table index to find the first one  and then UPDATE or DELETE will then re-read that the table to find it again and process it -- and it will read  the rest of  the table to see if there are any more to process as well   Fast enough if properly indexed  but still     So either way  you ll end up reading the entire table or index at least once  But  why bother with the IF EXISTS in the first place   UPDATE Contacs SET  Deleted    1 WHERE  Type    1    or the similar DELETE will work fine whether or not there are any rows found to process  No rows  table scanned  nothing modified  you re done  1  rows  table scanned  everything that ought to be is modified  done again  One pass  no fuss  no muss  no having to worry about  did the database get changed by another user between my first query and my second query    INSERT is the situation where it might be useful -- check if the row is present before adding it  to avoid Primary or Unique Key violations  Of course you have to worry about concurrency -- what if someone else is trying to add this row at the same time as you  Wrapping this all into a single INSERT would handle it all in an implicit transaction  remember your ACID properties     INSERT Contacs  col1  col2  etc  values  val1  val2  etc  where not exists  select 1 from Contacs where col1   val1  IF   rowcount   0 then  lt didn t insert  process accordingly gt

User · Answer

There is a slight effect  since you re doing the same check twice  at least in your example   IF EXISTS SELECT 1 FROM Contacs WHERE  Type    1    Has to query  see if there are any  if true then   UPDATE Contacs SET  Deleted    1 WHERE  Type    1   Has to query  see which ones   same check twice for no reason   Now if the condition you re looking for is indexed it ought to be quick  but for large tables you could see some delay just because you re running the select

User · Answer

You should not do it for UPDATE and DELETE  as if there is impact on performance  it is not a positive one    For INSERT there might be situations where your INSERT will raise an exception  UNIQUE CONSTRAINT violation etc   in which case you might want to prevent it with the IF EXISTS and handle it more gracefully

User · Answer

Neither  UPDATE     IF    ROWCOUNT   0  INSERT   nor  IF EXISTS      UPDATE ELSE INSERT   patterns work as expected under high concurrency  Both may fail  Both may fail very frequently  MERGE is the king - it holds up much better Let us do some stress testing and see for ourselves   Here is the table we shall be using   CREATE TABLE dbo TwoINTs             ID INT NOT NULL PRIMARY KEY        i1 INT NOT NULL         i2 INT NOT NULL         version ROWVERSION         GO  INSERT  INTO dbo TwoINTs           ID  i1  i2   VALUES    1  0  0           IF EXISTS      THEN pattern frequently fails under high concurrency   Let us insert or update rows in a loop using the following simple logic  if a row with given ID exists  update it  and otherwise insert a new one  The following loop implements this logic  Cut and paste it into two tabs  switch into text mode in both tabs  and run them simultaneously   -- hit Ctrl T to execute in text mode  SET NOCOUNT ON    DECLARE  ID INT    SET  ID   0   WHILE  ID  gt  -100000     BEGIN           SET  ID     SELECT  MIN ID                      FROM    dbo TwoINTs                     - 1           BEGIN TRY                BEGIN TRANSACTION               IF EXISTS   SELECT                            FROM    dbo TwoINTs                         WHERE   ID    ID                   BEGIN                       UPDATE  dbo TwoINTs                     SET     i1   1                     WHERE   ID    ID                   END               ELSE                 BEGIN                       INSERT  INTO dbo TwoINTs                               ID  i1  i2                       VALUES     ID  0  0                     END               COMMIT            END TRY         BEGIN CATCH               ROLLBACK                SELECT  error message             END CATCH       END      When we run this script simultaneously in two tabs  we shall immediately get a huge amount of primary key violations in both tabs  This demonstrates how unreliable the IF EXISTS pattern is when it executes under high concurrency   Note  this example also demonstrates that it is not safe to use SELECT MAX ID  1 or SELECT MIN ID -1  as the next available unique value if we do it under concurrency

[sql] IF EXISTS before INSERT, UPDATE, DELETE for optimization

Examples related to sql

Examples related to sql-server

Examples related to tsql

Examples related to optimization