[sql] SQL join format - nested inner joins

I have the following SQL statement in a legacy system I'm refactoring. It is an abbreviated view for the purposes of this question, just returning count(*) for the time being.

SELECT COUNT(*)
FROM Table1 
    INNER JOIN Table2 
        INNER JOIN Table3 ON Table2.Key = Table3.Key AND Table2.Key2 = Table3.Key2 
    ON Table1.DifferentKey = Table3.DifferentKey

It is generating a very large number of records and killing the system, but could someone please explain the syntax? And can this be expressed in any other way?

  • Table1 contains 419 rows
  • Table2 contains 3374 rows
  • Table3 contains 28182 rows

EDIT:

Suggested reformat

SELECT COUNT(*)
FROM Table1 
    INNER JOIN Table3
          ON Table1.DifferentKey = Table3.DifferentKey
    INNER JOIN Table2 
          ON Table2.Key = Table3.Key AND Table2.Key2 = Table3.Key2

This question is related to sql sql-server sql-server-2005 tsql

The answer is


For readability, I restructured the query... starting with the apparent top-most level being Table1, which then ties to Table3, and then table3 ties to table2. Much easier to follow if you follow the chain of relationships.

Now, to answer your question. You are getting a large count as the result of a Cartesian product. For each record in Table1 that matches in Table3 you will have X * Y. Then, for each match between table3 and Table2 will have the same impact... Y * Z... So your result for just one possible ID in table 1 can have X * Y * Z records.

This is based on not knowing how the normalization or content is for your tables... if the key is a PRIMARY key or not..

Ex:
Table 1       
DiffKey    Other Val
1          X
1          Y
1          Z

Table 3
DiffKey   Key    Key2  Tbl3 Other
1         2      6     V
1         2      6     X
1         2      6     Y
1         2      6     Z

Table 2
Key    Key2   Other Val
2      6      a
2      6      b
2      6      c
2      6      d
2      6      e

So, Table 1 joining to Table 3 will result (in this scenario) with 12 records (each in 1 joined with each in 3). Then, all that again times each matched record in table 2 (5 records)... total of 60 ( 3 tbl1 * 4 tbl3 * 5 tbl2 )count would be returned.

So, now, take that and expand based on your 1000's of records and you see how a messed-up structure could choke a cow (so-to-speak) and kill performance.

SELECT
      COUNT(*)
   FROM
      Table1 
         INNER JOIN Table3
            ON Table1.DifferentKey = Table3.DifferentKey
            INNER JOIN Table2
               ON Table3.Key =Table2.Key
               AND Table3.Key2 = Table2.Key2 

Since you've already received help on the query, I'll take a poke at your syntax question:

The first query employs some lesser-known ANSI SQL syntax which allows you to nest joins between the join and on clauses. This allows you to scope/tier your joins and probably opens up a host of other evil, arcane things.

Now, while a nested join cannot refer any higher in the join hierarchy than its immediate parent, joins above it or outside of its branch can refer to it... which is precisely what this ugly little guy is doing:

select
 count(*)
from Table1 as t1
join Table2 as t2
    join Table3 as t3
    on t2.Key = t3.Key                   -- join #1
    and t2.Key2 = t3.Key2 
on t1.DifferentKey = t3.DifferentKey     -- join #2  

This looks a little confusing because join #2 is joining t1 to t2 without specifically referencing t2... however, it references t2 indirectly via t3 -as t3 is joined to t2 in join #1. While that may work, you may find the following a bit more (visually) linear and appealing:

select
 count(*)
from Table1 as t1
    join Table3 as t3
        join Table2 as t2
        on t2.Key = t3.Key                   -- join #1
        and t2.Key2 = t3.Key2   
    on t1.DifferentKey = t3.DifferentKey     -- join #2

Personally, I've found that nesting in this fashion keeps my statements tidy by outlining each tier of the relationship hierarchy. As a side note, you don't need to specify inner. join is implicitly inner unless explicitly marked otherwise.


Examples related to sql

Passing multiple values for same variable in stored procedure SQL permissions for roles Generic XSLT Search and Replace template Access And/Or exclusions Pyspark: Filter dataframe based on multiple conditions Subtracting 1 day from a timestamp date PYODBC--Data source name not found and no default driver specified select rows in sql with latest date for each ID repeated multiple times ALTER TABLE DROP COLUMN failed because one or more objects access this column Create Local SQL Server database

Examples related to sql-server

Passing multiple values for same variable in stored procedure SQL permissions for roles Count the Number of Tables in a SQL Server Database Visual Studio 2017 does not have Business Intelligence Integration Services/Projects ALTER TABLE DROP COLUMN failed because one or more objects access this column Create Local SQL Server database How to create temp table using Create statement in SQL Server? SQL Query Where Date = Today Minus 7 Days How do I pass a list as a parameter in a stored procedure? SQL Server date format yyyymmdd

Examples related to sql-server-2005

Add a row number to result set of a SQL query SQL Server : Transpose rows to columns Select info from table where row has max date How to query for Xml values and attributes from table in SQL Server? How to restore SQL Server 2014 backup in SQL Server 2008 SQL Server 2005 Using CHARINDEX() To split a string Is it necessary to use # for creating temp tables in SQL server? SQL Query to find the last day of the month JDBC connection to MSSQL server in windows authentication mode How to convert the system date format to dd/mm/yy in SQL Server 2008 R2?

Examples related to tsql

Passing multiple values for same variable in stored procedure Count the Number of Tables in a SQL Server Database Change Date Format(DD/MM/YYYY) in SQL SELECT Statement Stored procedure with default parameters Format number as percent in MS SQL Server EXEC sp_executesql with multiple parameters SQL Server after update trigger How to compare datetime with only date in SQL Server Text was truncated or one or more characters had no match in the target code page including the primary key in an unpivot Printing integer variable and string on same line in SQL