[sql] INNER JOIN vs LEFT JOIN performance in SQL Server

I've created SQL command that uses INNER JOIN on 9 tables, anyway this command takes a very long time (more than five minutes). So my folk suggested me to change INNER JOIN to LEFT JOIN because the performance of LEFT JOIN is better, despite what I know. After I changed it, the speed of query got significantly improved.

I would like to know why LEFT JOIN is faster than INNER JOIN?

My SQL command look like below: SELECT * FROM A INNER JOIN B ON ... INNER JOIN C ON ... INNER JOIN D and so on

Update: This is brief of my schema.

FROM sidisaleshdrmly a -- NOT HAVE PK AND FK
    INNER JOIN sidisalesdetmly b -- THIS TABLE ALSO HAVE NO PK AND FK
        ON a.CompanyCd = b.CompanyCd 
           AND a.SPRNo = b.SPRNo 
           AND a.SuffixNo = b.SuffixNo 
           AND a.dnno = b.dnno
    INNER JOIN exFSlipDet h -- PK = CompanyCd, FSlipNo, FSlipSuffix, FSlipLine
        ON a.CompanyCd = h.CompanyCd
           AND a.sprno = h.AcctSPRNo
    INNER JOIN exFSlipHdr c -- PK = CompanyCd, FSlipNo, FSlipSuffix
        ON c.CompanyCd = h.CompanyCd
           AND c.FSlipNo = h.FSlipNo 
           AND c.FSlipSuffix = h.FSlipSuffix 
    INNER JOIN coMappingExpParty d -- NO PK AND FK
        ON c.CompanyCd = d.CompanyCd
           AND c.CountryCd = d.CountryCd 
    INNER JOIN coProduct e -- PK = CompanyCd, ProductSalesCd
        ON b.CompanyCd = e.CompanyCd
           AND b.ProductSalesCd = e.ProductSalesCd 
    LEFT JOIN coUOM i -- PK = UOMId
        ON h.UOMId = i.UOMId 
    INNER JOIN coProductOldInformation j -- PK = CompanyCd, BFStatus, SpecCd
        ON a.CompanyCd = j.CompanyCd
            AND b.BFStatus = j.BFStatus
            AND b.ProductSalesCd = j.ProductSalesCd
    INNER JOIN coProductGroup1 g1 -- PK = CompanyCd, ProductCategoryCd, UsedDepartment, ProductGroup1Cd
        ON e.ProductGroup1Cd  = g1.ProductGroup1Cd
    INNER JOIN coProductGroup2 g2 -- PK = CompanyCd, ProductCategoryCd, UsedDepartment, ProductGroup2Cd
        ON e.ProductGroup1Cd  = g2.ProductGroup1Cd

This question is related to sql sql-server performance

The answer is


Your performance problems are more likely to be because of the number of joins you are doing and whether the columns you are joining on have indexes or not.

Worst case you could easily be doing 9 whole table scans for each join.


If everything works as it should it shouldn't, BUT we all know everything doesn't work the way it should especially when it comes to the query optimizer, query plan caching and statistics.

First I would suggest rebuilding index and statistics, then clearing the query plan cache just to make sure that's not screwing things up. However I've experienced problems even when that's done.

I've experienced some cases where a left join has been faster than a inner join.

The underlying reason is this: If you have two tables and you join on a column with an index (on both tables). The inner join will produce the same result no matter if you loop over the entries in the index on table one and match with index on table two as if you would do the reverse: Loop over entries in the index on table two and match with index in table one. The problem is when you have misleading statistics, the query optimizer will use the statistics of the index to find the table with least matching entries (based on your other criteria). If you have two tables with 1 million in each, in table one you have 10 rows matching and in table two you have 100000 rows matching. The best way would be to do an index scan on table one and matching 10 times in table two. The reverse would be an index scan that loops over 100000 rows and tries to match 100000 times and only 10 succeed. So if the statistics isn't correct the optimizer might choose the wrong table and index to loop over.

If the optimizer chooses to optimize the left join in the order it is written it will perform better than the inner join.

BUT, the optimizer may also optimize a left join sub-optimally as a left semi join. To make it choose the one you want you can use the force order hint.


Try both queries (the one with inner and left join) with OPTION (FORCE ORDER) at the end and post the results. OPTION (FORCE ORDER) is a query hint that forces the optimizer to build the execution plan with the join order you provided in the query.

If INNER JOIN starts performing as fast as LEFT JOIN, it's because:

  • In a query composed entirely by INNER JOINs, the join order doesn't matter. This gives freedom for the query optimizer to order the joins as it sees fit, so the problem might rely on the optimizer.
  • With LEFT JOIN, that's not the case because changing the join order will alter the results of the query. This means the engine must follow the join order you provided on the query, which might be better than the optimized one.

Don't know if this answers your question but I was once in a project that featured highly complex queries making calculations, which completely messed up the optimizer. We had cases where a FORCE ORDER would reduce the execution time of a query from 5 minutes to 10 seconds.


From my comparisons, I find that they have the exact same execution plan. There're three scenarios:

  1. If and when they return the same results, they have the same speed. However, we must keep in mind that they are not the same queries, and that LEFT JOIN will possibly return more results (when some ON conditions aren't met) --- this is why it's usually slower.

  2. When the main table (first non-const one in the execution plan) has a restrictive condition (WHERE id = ?) and the corresponding ON condition is on a NULL value, the "right" table is not joined --- this is when LEFT JOIN is faster.

  3. As discussed in Point 1, usually INNER JOIN is more restrictive and returns fewer results and is therefore faster.

Both use (the same) indices.


There is one important scenario that can lead to an outer join being faster than an inner join that has not been discussed yet.

When using an outer join, the optimizer is always free to drop the outer joined table from the execution plan if the join columns are the PK of the outer table, and none of the outer table columns are referenced outside of the outer join itself. For example SELECT A.* FROM A LEFT OUTER JOIN B ON A.KEY=B.KEY and B.KEY is the PK for B. Both Oracle (I believe I was using release 10) and Sql Server (I used 2008 R2) prune table B from the execution plan.

The same is not necessarily true for an inner join: SELECT A.* FROM A INNER JOIN B ON A.KEY=B.KEY may or may not require B in the execution plan depending on what constraints exist.

If A.KEY is a nullable foreign key referencing B.KEY, then the optimizer cannot drop B from the plan because it must confirm that a B row exists for every A row.

If A.KEY is a mandatory foreign key referencing B.KEY, then the optimizer is free to drop B from the plan because the constraints guarantee the existence of the row. But just because the optimizer can drop the table from the plan, doesn't mean it will. SQL Server 2008 R2 does NOT drop B from the plan. Oracle 10 DOES drop B from the plan. It is easy to see how the outer join will out-perform the inner join on SQL Server in this case.

This is a trivial example, and not practical for a stand-alone query. Why join to a table if you don't need to?

But this could be a very important design consideration when designing views. Frequently a "do-everything" view is built that joins everything a user might need related to a central table. (Especially if there are naive users doing ad-hoc queries that do not understand the relational model) The view may include all the relevent columns from many tables. But the end users might only access columns from a subset of the tables within the view. If the tables are joined with outer joins, then the optimizer can (and does) drop the un-needed tables from the plan.

It is critical to make sure that the view using outer joins gives the correct results. As Aaronaught has said - you cannot blindly substitute OUTER JOIN for INNER JOIN and expect the same results. But there are times when it can be useful for performance reasons when using views.

One last note - I haven't tested the impact on performance in light of the above, but in theory it seems you should be able to safely replace an INNER JOIN with an OUTER JOIN if you also add the condition <FOREIGN_KEY> IS NOT NULL to the where clause.


Have done a number of comparisons between left outer and inner joins and have not been able to find a consisten difference. There are many variables. Am working on a reporting database with thousands of tables many with a large number of fields, many changes over time (vendor versions and local workflow) . It is not possible to create all of the combinations of covering indexes to meet the needs of such a wide variety of queries and handle historical data. Have seen inner queries kill server performance because two large (millions to tens of millions of rows) tables are inner joined both pulling a large number of fields and no covering index exists.

The biggest issue though, doesn't seem to appeaer in the discussions above. Maybe your database is well designed with triggers and well designed transaction processing to ensure good data. Mine frequently has NULL values where they aren't expected. Yes the table definitions could enforce no-Nulls but that isn't an option in my environment.

So the question is... do you design your query only for speed, a higher priority for transaction processing that runs the same code thousands of times a minute. Or do you go for accuracy that a left outer join will provide. Remember that inner joins must find matches on both sides, so an unexpected NULL will not only remove data from the two tables but possibly entire rows of information. And it happens so nicely, no error messages.

You can be very fast as getting 90% of the needed data and not discover the inner joins have silently removed information. Sometimes inner joins can be faster, but I don't believe anyone making that assumption unless they have reviewed the execution plan. Speed is important, but accuracy is more important.


Outer joins can offer superior performance when used in views.

Say you have a query that involves a view, and that view is comprised of 10 tables joined together. Say your query only happens to use columns from 3 out of those 10 tables.

If those 10 tables had been inner-joined together, then the query optimizer would have to join them all even though your query itself doesn't need 7 out of 10 of the tables. That's because the inner joins themselves might filter down the data, making them essential to compute.

If those 10 tables had been outer-joined together instead, then the query optimizer would only actually join the ones that were necessary: 3 out of 10 of them in this case. That's because the joins themselves are no longer filtering the data, and thus unused joins can be skipped.

Source: http://www.sqlservercentral.com/blogs/sql_coach/2010/07/29/poor-little-misunderstood-views/


I found something interesting in SQL server when checking if inner joins are faster than left joins.

If you dont include the items of the left joined table, in the select statement, the left join will be faster than the same query with inner join.

If you do include the left joined table in the select statement, the inner join with the same query was equal or faster than the left join.


Examples related to sql

Passing multiple values for same variable in stored procedure SQL permissions for roles Generic XSLT Search and Replace template Access And/Or exclusions Pyspark: Filter dataframe based on multiple conditions Subtracting 1 day from a timestamp date PYODBC--Data source name not found and no default driver specified select rows in sql with latest date for each ID repeated multiple times ALTER TABLE DROP COLUMN failed because one or more objects access this column Create Local SQL Server database

Examples related to sql-server

Passing multiple values for same variable in stored procedure SQL permissions for roles Count the Number of Tables in a SQL Server Database Visual Studio 2017 does not have Business Intelligence Integration Services/Projects ALTER TABLE DROP COLUMN failed because one or more objects access this column Create Local SQL Server database How to create temp table using Create statement in SQL Server? SQL Query Where Date = Today Minus 7 Days How do I pass a list as a parameter in a stored procedure? SQL Server date format yyyymmdd

Examples related to performance

Why is 2 * (i * i) faster than 2 * i * i in Java? What is the difference between spark.sql.shuffle.partitions and spark.default.parallelism? How to check if a key exists in Json Object and get its value Why does C++ code for testing the Collatz conjecture run faster than hand-written assembly? Most efficient way to map function over numpy array The most efficient way to remove first N elements in a list? Fastest way to get the first n elements of a List into an Array Why is "1000000000000000 in range(1000000000000001)" so fast in Python 3? pandas loc vs. iloc vs. at vs. iat? Android Recyclerview vs ListView with Viewholder