I'm running a performance comparison between using 1000 INSERT statements:
INSERT INTO T_TESTS (TestId, FirstName, LastName, Age)
VALUES ('6f3f7257-a3d8-4a78-b2e1-c9b767cfe1c1', 'First 0', 'Last 0', 0)
INSERT INTO T_TESTS (TestId, FirstName, LastName, Age)
VALUES ('32023304-2e55-4768-8e52-1ba589b82c8b', 'First 1', 'Last 1', 1)
...
INSERT INTO T_TESTS (TestId, FirstName, LastName, Age)
VALUES ('f34d95a7-90b1-4558-be10-6ceacd53e4c4', 'First 999', 'Last 999', 999)
...versus using a single INSERT statement with 1000 values:
INSERT INTO T_TESTS (TestId, FirstName, LastName, Age)
VALUES
('db72b358-e9b5-4101-8d11-7d7ea3a0ae7d', 'First 0', 'Last 0', 0),
('6a4874ab-b6a3-4aa4-8ed4-a167ab21dd3d', 'First 1', 'Last 1', 1),
...
('9d7f2a58-7e57-4ed4-ba54-5e9e335fb56c', 'First 999', 'Last 999', 999)
To my big surprise, the results are the opposite of what I thought:
The test is executed directly in SQL Server Management Studio, with SQL Server Profiler used for measurement. I got similar results running it from C# code using SqlClient, which is even more surprising considering all the DAL layer round trips.
Can this be reasonable or somehow explained? How can a supposedly faster method result in 10 times (!) worse performance?
Thank you.
EDIT: Attaching execution plans for both:
I ran into a similar situation trying to convert a table with several hundred thousand rows using a C++ program (MFC/ODBC).
Since this operation took a very long time, I figured I would bundle multiple inserts into one (up to 1000 rows per statement, due to MSSQL limitations). My guess was that a lot of single INSERT statements would create an overhead similar to what is described here.
However, it turns out that the conversion actually took quite a bit longer:
             Method 1         Method 2        Method 3
             Single Insert    Multi Insert    Joined Inserts
Rows         1000             1000            1000
Insert       390 ms           765 ms          270 ms
per Row      0.390 ms         0.765 ms        0.270 ms
So, 1000 single calls to CDatabase::ExecuteSql each with a single INSERT statement (method 1) are roughly twice as fast as a single call to CDatabase::ExecuteSql with a multi-line INSERT statement with 1000 value tuples (method 2).
Update: So, the next thing I tried was to bundle 1000 separate INSERT statements into a single string and have the server execute that (method 3). It turns out this is even a bit faster than method 1.
Edit: I am using Microsoft SQL Server Express Edition (64-bit) v10.0.2531.0
It is not too surprising: the execution plan for the tiny insert is computed once, and then reused 1000 times. Parsing and preparing the plan is quick, because it has only four values to deal with. A 1000-row plan, on the other hand, needs to deal with 4000 values (or 4000 parameters if you parameterized your C# tests). This could easily eat up the time savings you gain by eliminating 999 round trips to SQL Server, especially if your network is not overly slow.
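One way to check whether the single-row plan really is reused is to look at the plan cache after running the 1000 separate statements. The following is a minimal sketch, assuming SQL Server 2005 or later and a login with VIEW SERVER STATE permission; the LIKE filter on T_TESTS is only an illustrative way of finding the relevant entries, not something from the original test.

```sql
-- List cached plans whose text mentions T_TESTS, most reused first.
-- usecounts shows how many times each cached plan has been looked up.
SELECT cp.usecounts,
       cp.objtype,          -- e.g. 'Adhoc' or 'Prepared'
       cp.size_in_bytes,
       st.text
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
WHERE st.text LIKE '%T_TESTS%'
  -- exclude this diagnostic query itself from the results
  AND st.text NOT LIKE '%dm_exec_cached_plans%'
ORDER BY cp.usecounts DESC;
```

If the single-row statements share a plan, one entry should show a high usecounts value. If each statement shows up as its own entry with a low count, the plan is not being reused and the explanation above would not apply as stated.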
The issue probably has to do with the time it takes to compile the query.
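To see how much of the elapsed time goes to compilation rather than execution, SET STATISTICS TIME can be switched on around the statement. This is a sketch, assuming the T_TESTS table from the question exists and that the TestId values below are not already present in it; the three tuples stand in for the full 1000-row list.

```sql
-- Report parse/compile time and execution time separately
-- (shown on the Messages tab in Management Studio).
SET STATISTICS TIME ON;

-- Multi-row form, shortened to three tuples for illustration.
INSERT INTO T_TESTS (TestId, FirstName, LastName, Age)
VALUES
('db72b358-e9b5-4101-8d11-7d7ea3a0ae7d', 'First 0', 'Last 0', 0),
('6a4874ab-b6a3-4aa4-8ed4-a167ab21dd3d', 'First 1', 'Last 1', 1),
('9d7f2a58-7e57-4ed4-ba54-5e9e335fb56c', 'First 999', 'Last 999', 999);

SET STATISTICS TIME OFF;
```

The output contains a "SQL Server parse and compile time" line and a separate "SQL Server Execution Times" line. Comparing the two for the full 1000-row statement should show whether compilation dominates. Note that if a plan for the exact same statement text is already cached, the reported compile time will be close to zero.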
If you want to speed up the inserts, what you really need to do is wrap them in a transaction:
BEGIN TRAN;
INSERT INTO T_TESTS (TestId, FirstName, LastName, Age)
VALUES ('6f3f7257-a3d8-4a78-b2e1-c9b767cfe1c1', 'First 0', 'Last 0', 0);
INSERT INTO T_TESTS (TestId, FirstName, LastName, Age)
VALUES ('32023304-2e55-4768-8e52-1ba589b82c8b', 'First 1', 'Last 1', 1);
...
INSERT INTO T_TESTS (TestId, FirstName, LastName, Age)
VALUES ('f34d95a7-90b1-4558-be10-6ceacd53e4c4', 'First 999', 'Last 999', 999);
COMMIT TRAN;
From C#, you might also consider using a table-valued parameter. Issuing multiple commands in a single batch, by separating them with semicolons, is another approach that will also help.
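A table-valued parameter could be set up roughly as follows. This is only a sketch: the names dbo.TestRowType and dbo.InsertTests are hypothetical, and the column types are assumptions, since the definition of T_TESTS is not shown in the question. Table-valued parameters require SQL Server 2008 or later.

```sql
-- Hypothetical table type mirroring the columns of T_TESTS
-- (column types are assumed, adjust to the real table definition).
CREATE TYPE dbo.TestRowType AS TABLE
(
    TestId    UNIQUEIDENTIFIER NOT NULL,
    FirstName NVARCHAR(50),
    LastName  NVARCHAR(50),
    Age       INT
);
GO

-- Hypothetical procedure taking the whole row set as one parameter.
-- Table-valued parameters must be declared READONLY.
CREATE PROCEDURE dbo.InsertTests
    @Rows dbo.TestRowType READONLY
AS
BEGIN
    SET NOCOUNT ON;

    INSERT INTO T_TESTS (TestId, FirstName, LastName, Age)
    SELECT TestId, FirstName, LastName, Age
    FROM @Rows;
END;
GO

-- Usage from T-SQL: fill a variable of the table type and pass it in.
DECLARE @Rows dbo.TestRowType;

INSERT INTO @Rows (TestId, FirstName, LastName, Age)
VALUES (NEWID(), 'First 0', 'Last 0', 0);

EXEC dbo.InsertTests @Rows = @Rows;
```

From C#, the rows would typically be passed as a SqlParameter with SqlDbType.Structured and its TypeName set to the table type, for example filled from a DataTable, so that all rows travel to the server in one call and are inserted by a single set-based statement.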
Source: Stackoverflow.com