[sql-server] INSERT INTO vs SELECT INTO

What is the difference between using

SELECT ... INTO MyTable FROM...

and

INSERT INTO MyTable (...)
SELECT ... FROM ....

?

From BOL [ INSERT, SELECT...INTO ], I know that using SELECT...INTO will create the insertion table on the default file group if it doesn't already exist, and that the logging for this statement depends on the recovery model of the database.

  1. Which statement is preferable?
  2. Are there other performance implications?
  3. What is a good use case for SELECT...INTO over INSERT INTO ...?

Edit: I already stated that I know that SELECT INTO... creates a table where it doesn't exist. What I want to know is why SQL includes this statement. Is it doing something different behind the scenes when inserting rows, or is it just syntactic sugar on top of a CREATE TABLE and INSERT INTO?

This question is related to sql-server tsql sql-server-2008

The answer is


  1. Which statement is preferable? Depends on what you are doing.

  2. Are there other performance implications? If the table is a permanent table, you can create indexes at the time of table creation, which has implications for performance, both negative and positive. SELECT INTO does not recreate the indexes that exist on the source tables, so subsequent use of the new table may be slower than it needs to be.

  3. What is a good use case for SELECT...INTO over INSERT INTO ...? SELECT INTO is used when you may not know the table structure in advance. It is faster to write than a CREATE TABLE plus an INSERT statement, so it is sometimes used to speed up development. It is often quicker to use when you are creating a temp table to test things or a backup table from a specific query (maybe records you are going to delete). It should be rare to see it in production code that runs multiple times (except for temp tables), because it fails if the table already exists.

It is sometimes used inappropriately by people who don't know what they are doing, and they can cause havoc in the database as a result. I strongly feel it is inappropriate to use SELECT INTO for anything other than a throwaway table (a temporary backup, a temp table that will go away at the end of the stored proc, etc.). Permanent tables need real thought as to their design, and SELECT INTO makes it easy to avoid thinking about anything, even something as basic as which columns and which data types to use.

In general, I prefer the CREATE TABLE and INSERT statements: you have more control, and they are better for repeatable processes. Further, if the table is a permanent table, it should be created from a separate CREATE TABLE script (one that is in source control), because permanent objects should not, in general, be created in the same code that inserts, deletes, updates, or selects from a table. Object changes should be handled separately from data changes because objects have implications beyond the needs of a specific insert/update/select/delete. You need to consider the best data types, think about FK constraints, PKs and other constraints, consider auditing requirements, think about indexing, etc.
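As a rough sketch of that preference, the script below defines the target explicitly and loads it in a separate statement. All table and column names are hypothetical, and temp tables stand in for what would normally be permanent tables created from their own script.

```sql
-- Hypothetical source data; #Orders stands in for an existing table.
CREATE TABLE #Orders
(
    OrderId     int           NOT NULL,
    CustomerId  int           NOT NULL,
    OrderDate   date          NOT NULL,
    TotalAmount decimal(12,2) NOT NULL
);
INSERT INTO #Orders (OrderId, CustomerId, OrderDate, TotalAmount)
VALUES (1, 10, '20190315', 25.00),
       (2, 11, '20210702', 99.50);

-- Explicit definition: data types, nullability, key, and a check constraint
-- are decided up front instead of being inferred from a query.
CREATE TABLE #OrderArchive
(
    OrderId     int           NOT NULL PRIMARY KEY,
    CustomerId  int           NOT NULL,
    OrderDate   date          NOT NULL,
    TotalAmount decimal(12,2) NOT NULL CHECK (TotalAmount >= 0)
);

-- The data load is a separate statement against the existing table.
INSERT INTO #OrderArchive (OrderId, CustomerId, OrderDate, TotalAmount)
SELECT OrderId, CustomerId, OrderDate, TotalAmount
FROM #Orders
WHERE OrderDate < '20200101';
```

With SELECT INTO, none of the key or constraint decisions above would be made; the new table would simply take whatever column types the query produced.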


Each statement has a distinct use case. They are not interchangeable.

SELECT...INTO MyTable... creates a new MyTable where one did not exist before.

INSERT INTO MyTable...SELECT... is used when MyTable already exists.
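A minimal side-by-side sketch of the two statements, using hypothetical temp tables (#Source, #CopyA, #CopyB) so it can be run in any session:

```sql
-- #Source is a hypothetical stand-in for any existing table.
CREATE TABLE #Source (Id int NOT NULL, Name varchar(50) NOT NULL);
INSERT INTO #Source (Id, Name) VALUES (1, 'alpha'), (2, 'beta');

-- SELECT ... INTO creates #CopyA as part of the statement;
-- #CopyA must not exist beforehand.
SELECT Id, Name
INTO #CopyA
FROM #Source;

-- INSERT ... SELECT requires the target to exist already.
CREATE TABLE #CopyB (Id int NOT NULL, Name varchar(50) NOT NULL);

INSERT INTO #CopyB (Id, Name)
SELECT Id, Name
FROM #Source;
```

Running the SELECT ... INTO #CopyA statement a second time in the same session would fail, because the object already exists.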


SELECT INTO is typically used to generate temp tables or to copy another table (data and/or structure).

In day to day code you use INSERT because your tables should already exist to be read, UPDATEd, DELETEd, JOINed etc. Note: the INTO keyword is optional with INSERT

That is, applications won't normally create and drop tables as part of normal operations unless it is a temporary table for some scope limited and specific usage.

A table created by SELECT INTO will have no keys or indexes or constraints unlike a real, persisted, already existing table

The 2 aren't directly comparable because they have almost no overlap in usage


I only want to cover the second point of the question, the one about performance, because nobody else has covered it. SELECT INTO can be a lot faster than INSERT INTO for tables with large datasets, largely because SELECT INTO can be minimally logged under the simple or bulk-logged recovery model. I prefer SELECT INTO when I have to read a very large table. An INSERT INTO for a table with 10 million rows may take hours while SELECT INTO will do this in minutes. As far as losing the indexes on the new table is concerned, you can recreate them with a query afterwards and still save a lot of time compared to INSERT INTO.
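The approach described here (copy first, index afterwards) might look like the sketch below. The table, column, and index names are hypothetical, and the tiny row count is only there to keep the script self-contained; it says nothing about timing.

```sql
-- Hypothetical stand-in for a very large source table.
CREATE TABLE #BigSource (Id int NOT NULL, Payload varchar(100) NOT NULL);
INSERT INTO #BigSource (Id, Payload)
VALUES (1, 'row one'), (2, 'row two'), (3, 'row three');

-- Step 1: copy the data. The new table is a heap with no indexes.
SELECT Id, Payload
INTO #BigCopy
FROM #BigSource;

-- Step 2: recreate the indexes you need once the data is in place.
CREATE CLUSTERED INDEX IX_BigCopy_Id ON #BigCopy (Id);
CREATE NONCLUSTERED INDEX IX_BigCopy_Payload ON #BigCopy (Payload);
```

Whether this beats an INSERT into a pre-indexed table depends on the recovery model, the data volume, and the indexes involved, so it is worth measuring on your own data.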


The simple difference between SELECT INTO and INSERT INTO is:

--> SELECT INTO does not need an existing table. If you want to copy the data of table A, you just type SELECT * INTO [tablename] FROM A. Here, tablename must be a new table; in SQL Server it is created with the same columns and data types as table A, and the statement fails if the table already exists.

--> INSERT INTO does need an existing table: INSERT INTO [tablename] SELECT * FROM A;. Here, tablename is an existing table.

SELECT INTO is usually more popular for copying data, especially for backups.

Use whichever fits your requirement; it is the developer's choice which one suits the scenario.

Performance-wise, INSERT INTO is fast.

References :

https://www.w3schools.com/sql/sql_insert_into_select.asp
https://www.w3schools.com/sql/sql_select_into.asp


Actually, SELECT ... INTO not only creates the table but will also fail if it already exists, so basically the only time you would use it is when the table you are inserting into does not exist.
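Because of that failure mode, scripts that are rerun in the same session often drop the target first. The sketch below shows one common guard using OBJECT_ID; the temp table names are hypothetical.

```sql
-- Drop leftovers from a previous run in this session, if any.
IF OBJECT_ID('tempdb..#Staging') IS NOT NULL
    DROP TABLE #Staging;
IF OBJECT_ID('tempdb..#Input') IS NOT NULL
    DROP TABLE #Input;

-- Hypothetical source data.
CREATE TABLE #Input (Id int NOT NULL, Amount decimal(10,2) NOT NULL);
INSERT INTO #Input (Id, Amount) VALUES (1, 10.00), (2, 20.00);

-- Safe to run repeatedly: #Staging is guaranteed not to exist at this point.
SELECT Id, Amount
INTO #Staging
FROM #Input
WHERE Amount > 5;
```

Without the guard, the second run of the SELECT ... INTO in the same session would stop with an "object already exists" error.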

In regards to your EDIT:

I personally mainly use SELECT ... INTO when I am creating a temp table. That to me is the main use. However I also use it when creating new tables with many columns with similar structures to other tables and then edit it in order to save time.
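One way to do the second thing mentioned here (clone a structure, then edit it) is a SELECT ... INTO with a predicate that is never true, so that only the column definitions are copied. This is a sketch with hypothetical names, not necessarily how the author does it.

```sql
-- Hypothetical table whose column layout we want to reuse.
CREATE TABLE #Template
(
    Id        int          NOT NULL,
    FirstName varchar(50)  NOT NULL,
    LastName  varchar(50)  NOT NULL,
    Notes     varchar(200) NULL
);

-- WHERE 1 = 0 returns no rows, so only the columns, data types,
-- and nullability are carried over into the new table.
SELECT *
INTO #TemplateCopy
FROM #Template
WHERE 1 = 0;

-- The copy can then be edited to suit its new purpose.
ALTER TABLE #TemplateCopy ADD CreatedAt datetime NULL;
```

Keys, indexes, and constraints are not copied, so they still have to be added by hand.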


SELECT INTO creates a new table for you and then inserts records into it from the source table. The newly created table has the same column structure as the source table. If you try to use SELECT INTO with an existing table, it will produce an error, because it will try to create a new table with the same name. INSERT INTO requires the table to exist in your database before you insert rows into it.


The primary difference is that SELECT INTO MyTable will create a new table called MyTable with the results, while INSERT INTO requires that MyTable already exists.

You would use SELECT INTO only in the case where the table didn't exist and you wanted to create it based on the results of your query. As such, these two statements really are not comparable. They do very different things.

In general, SELECT INTO is used more often for one off tasks, while INSERT INTO is used regularly to add rows to tables.

EDIT:
While you can use CREATE TABLE and INSERT INTO to accomplish what SELECT INTO does, with SELECT INTO you do not have to know the table definition beforehand. SELECT INTO is probably included in SQL because it makes tasks like ad hoc reporting or copying tables much easier.


The other answers are all great/correct (the main difference is whether the DestTable exists already (INSERT), or doesn't exist yet (SELECT ... INTO))

You may prefer to use INSERT (instead of SELECT ... INTO), if you want to be able to COUNT(*) the rows that have been inserted so far.

Using SELECT COUNT(*) ... WITH (NOLOCK) is a simple, crude technique that may help you check the "progress" of the INSERT, which is helpful for a long-running insert.

If you use INSERT DestTable SELECT ... FROM SrcTable, then a SELECT COUNT(*) FROM DestTable WITH (NOLOCK) query run from another session would work.
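A sketch of that progress check is shown below. DestTable and SrcTable are the names used above; the columns and sample rows are hypothetical, and the script is meant for a scratch database. With so few rows the insert finishes instantly, so the count is only meaningful as a progress indicator when the insert is long-running and the count is issued from a second session.

```sql
-- Setup (hypothetical columns and data).
CREATE TABLE dbo.SrcTable  (Id int NOT NULL, Payload varchar(50) NOT NULL);
CREATE TABLE dbo.DestTable (Id int NOT NULL, Payload varchar(50) NOT NULL);
INSERT INTO dbo.SrcTable (Id, Payload)
VALUES (1, 'a'), (2, 'b'), (3, 'c');

-- Session 1: the (potentially long-running) load into the existing table.
INSERT INTO dbo.DestTable (Id, Payload)
SELECT Id, Payload
FROM dbo.SrcTable;

-- Session 2: run this while session 1 is still working.
-- NOLOCK reads uncommitted rows, so treat the result as an approximation.
SELECT COUNT(*) AS RowsSoFar
FROM dbo.DestTable WITH (NOLOCK);
```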


SELECT INTO for large datasets may be good only for a single user with a single connection to the database doing a bulk operation. I do not recommend using

SELECT * INTO table

as this creates one big transaction and takes a schema lock to create the object, preventing other users from creating objects or accessing system objects until the SELECT INTO operation completes.

As a proof of concept, open 2 sessions. In the first session, try to

select into temp table from a huge table 

and in the second session try to

create a temp table 

and check the locks, the blocking, and how long the second session takes to create its temp table object. My recommendation: it is always good practice to use CREATE TABLE and INSERT statements and, if minimal logging is needed, to use trace flag 610.
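The experiment described above could be set up roughly as follows. This is a sketch under assumptions: the cross join of sys.all_objects is just a hypothetical way to produce a few million rows, the temp table names are made up, and the monitoring queries need VIEW SERVER STATE permission. Whether any blocking actually shows up may depend on the SQL Server version and configuration.

```sql
-- Session 1: build a large temp table with SELECT ... INTO.
SELECT a.object_id AS Id, b.name AS Name
INTO #HugeCopy
FROM sys.all_objects AS a
CROSS JOIN sys.all_objects AS b;

-- Session 2: while session 1 is running, time the creation of an
-- unrelated temp table, for example:
--   CREATE TABLE #Other (Id int NOT NULL);

-- Any session: list requests that are currently blocked, and by whom.
SELECT r.session_id, r.blocking_session_id, r.wait_type, r.wait_time, r.command
FROM sys.dm_exec_requests AS r
WHERE r.blocking_session_id <> 0;

-- Any session: list the locks currently held or requested in tempdb.
SELECT l.request_session_id, l.resource_type, l.request_mode, l.request_status
FROM sys.dm_tran_locks AS l
WHERE l.resource_database_id = DB_ID('tempdb');
```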

