How to find rows in one table that have no corresponding row in another table

Question

I have a 1:1 relationship between two tables. I want to find all the rows in table A that don't have a corresponding row in table B. I use this query:

SELECT id
  FROM tableA
 WHERE id NOT IN (SELECT id
                    FROM tableB)
ORDER BY id desc

id is the primary key in both tables. Apart from primary key indices, I also have an index on tableA(id desc).

Using H2 (Java embedded database), this results in a full table scan of tableB. I want to avoid a full table scan.

How can I rewrite this query to run quickly? What index should I use?

User · Answer

You have to check every ID in tableA against every ID in tableB. A fully featured RDBMS (such as Oracle) would be able to optimize that into an INDEX FAST FULL SCAN and not touch the table at all. I don't know whether H2's optimizer is as smart as that.

H2 does support the MINUS syntax, so you should try this:

select id from tableA
minus
select id from tableB
order by id desc

That may perform faster; it is certainly worth benchmarking.
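As a rough illustration of the set-difference approach, the sketch below uses Python's built-in sqlite3 module as a stand-in for H2 (an assumption made only so the example runs without extra dependencies). SQLite does not accept MINUS, so the query uses EXCEPT, the SQL-standard spelling of the same operation. The sample rows are made up.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for H2 (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (id INTEGER PRIMARY KEY);
    CREATE TABLE tableB (id INTEGER PRIMARY KEY);
    INSERT INTO tableA (id) VALUES (0), (1), (2), (3);
    INSERT INTO tableB (id) VALUES (1), (3), (4), (5);
""")

# EXCEPT is the standard equivalent of MINUS: ids in tableA minus ids in tableB.
set_difference_sql = """
    SELECT id FROM tableA
    EXCEPT
    SELECT id FROM tableB
    ORDER BY id DESC
"""
missing_ids = [row[0] for row in conn.execute(set_difference_sql)]
print(missing_ids)
```

Because id is a primary key in both tables, the duplicate elimination that EXCEPT/MINUS performs does not change the result here.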

User · Answer

select parentTable.id from parentTable
left outer join childTable on (parentTable.id = childTable.parentTableID)
where childTable.id is null

User · Answer

You can also use exists, since sometimes it's faster than left join. You'd have to benchmark them to figure out which one you want to use.

select
    id
from
    tableA a
where
    not exists
    (select 1 from tableB b where b.id = a.id)

To show that exists can be more efficient than a left join, here are the execution plans of these queries in SQL Server 2008:

left join - total subtree cost: 1.09724

exists - total subtree cost: 1.07421
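One way to run such a benchmark is a small timing harness like the sketch below. It again assumes SQLite (via Python's sqlite3 module) as a stand-in engine, and the row counts are hypothetical, so the timings say nothing about H2 or SQL Server. The point is only the shape of the comparison: run each candidate, check that all of them return the same rows, and compare elapsed time.

```python
import sqlite3
import time

# Stand-in engine (assumption): in-memory SQLite rather than H2.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (id INTEGER PRIMARY KEY);
    CREATE TABLE tableB (id INTEGER PRIMARY KEY);
""")
# Hypothetical data: tableA holds ids 0..19999, tableB only the even ones.
conn.executemany("INSERT INTO tableA (id) VALUES (?)",
                 ((i,) for i in range(20000)))
conn.executemany("INSERT INTO tableB (id) VALUES (?)",
                 ((i,) for i in range(0, 20000, 2)))

candidates = {
    "not in": "SELECT id FROM tableA "
              "WHERE id NOT IN (SELECT id FROM tableB) ORDER BY id DESC",
    "left join": "SELECT a.id FROM tableA a "
                 "LEFT JOIN tableB b ON a.id = b.id "
                 "WHERE b.id IS NULL ORDER BY a.id DESC",
    "not exists": "SELECT a.id FROM tableA a "
                  "WHERE NOT EXISTS (SELECT 1 FROM tableB b WHERE b.id = a.id) "
                  "ORDER BY a.id DESC",
}

results = {}
for name, sql in candidates.items():
    start = time.perf_counter()
    results[name] = [row[0] for row in conn.execute(sql)]
    elapsed = time.perf_counter() - start
    print(f"{name}: {len(results[name])} rows in {elapsed:.4f}s")
```

For a meaningful comparison, repeat each query several times on realistic data and run it against the actual target database.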

User · Answer

select tableA.id from tableA left outer join tableB on (tableA.id = tableB.id)
where tableB.id is null
order by tableA.id desc

If your db knows how to do index intersections, this will only touch the primary key index.
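Whether a given engine really stays on the index is something to check in its query plan rather than assume. The sketch below shows the idea with SQLite's EXPLAIN QUERY PLAN, using Python's sqlite3 module as a stand-in (an assumption; H2 has its own EXPLAIN statement, and its output will look different). The plan text varies by SQLite version, so no particular output is shown.

```python
import sqlite3

# Stand-in engine (assumption): in-memory SQLite rather than H2.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (id INT PRIMARY KEY);
    CREATE TABLE tableB (id INT PRIMARY KEY);
""")

anti_join_sql = """
    SELECT tableA.id
    FROM tableA
    LEFT OUTER JOIN tableB ON (tableA.id = tableB.id)
    WHERE tableB.id IS NULL
    ORDER BY tableA.id DESC
"""

# Each plan row ends with a human-readable description of one step,
# including whether that step scans a table or searches an index.
plan = conn.execute("EXPLAIN QUERY PLAN " + anti_join_sql).fetchall()
for row in plan:
    print(row[-1])
```

If a step reports a scan of tableB rather than an index search, the rewrite has not avoided the full table scan on that engine.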

User · Answer

I can't tell you which of these methods will be best on H2 (or even if all of them will work), but I did write an article detailing all of the (good) methods available in TSQL. You can give them a shot and see if any of them works for you:

http://code.msdn.microsoft.com/SQLExamples/Wiki/View.aspx?title=QueryBasedUponAbsenceOfData&referringTitle=Home

User · Answer

For my small dataset, Oracle gives almost all of these queries the exact same plan that uses the primary key indexes without touching the table. The exception is the MINUS version, which manages to do fewer consistent gets despite the higher plan cost.

--Create Sample Data.
drop table tableA;
drop table tableB;

create table tableA as
   select rownum-1 ID, chr(rownum-1+70) bb, chr(rownum-1+100) cc
      from dual connect by rownum<=4;

create table tableB as
   select rownum ID, chr(rownum+70) data1, chr(rownum+100) cc from dual
   UNION ALL
   select rownum+2 ID, chr(rownum+70) data1, chr(rownum+100) cc
      from dual connect by rownum<=3;

alter table tableA Add Primary Key (ID);
alter table tableB Add Primary Key (ID);

--View Tables.
select * from tableA;
select * from tableB;

--Find all rows in tableA that don't have a corresponding row in tableB.

--Method 1.
SELECT id FROM tableA WHERE id NOT IN (SELECT id FROM tableB) ORDER BY id DESC;

--Method 2.
SELECT tableA.id FROM tableA LEFT JOIN tableB ON (tableA.id = tableB.id)
WHERE tableB.id IS NULL ORDER BY tableA.id DESC;

--Method 3.
SELECT id FROM tableA a WHERE NOT EXISTS (SELECT 1 FROM tableB b WHERE b.id = a.id)
   ORDER BY id DESC;

--Method 4.
SELECT id FROM tableA
MINUS
SELECT id FROM tableB ORDER BY id DESC;
