[sql] SQL Bulk Insert with FIRSTROW parameter skips the following line

I can't seem to figure out how this is happening.

Here's an example of the file that I'm attempting to bulk insert into SQL server 2005:

***A NICE HEADER HERE***
0000001234|SSNV|00013893-03JUN09
0000005678|ABCD|00013893-03JUN09
0000009112|0000|00013893-03JUN09
0000009112|0000|00013893-03JUN09

Here's my bulk insert statement:

BULK INSERT sometable
FROM 'E:\filefromabove.txt
WITH
(
FIRSTROW = 2,
FIELDTERMINATOR= '|',
ROWTERMINATOR = '\n'
)

But, for some reason the only output I can get is:

0000005678|ABCD|00013893-03JUN09
0000009112|0000|00013893-03JUN09
0000009112|0000|00013893-03JUN09

The first record always gets skipped, unless I remove the header altogether and don't use the FIRSTROW parameter. How is this possible?

Thanks in advance!

This question is related to sql sql-server-2005 bulkinsert

The answer is


You can use the below snippet

BULK INSERT TextData
FROM 'E:\filefromabove.txt'
WITH
(
FIRSTROW = 2,
FIELDTERMINATOR = '|',  --CSV field delimiter
ROWTERMINATOR = '\n',   --Use to shift the control to next row
ERRORFILE = 'E:\ErrorRows.csv',
TABLOCK
)

Given how mangled some data can look after BCP importing into SQL Server from non-SQL data sources, I'd suggest doing all the BCP import into some scratch tables first.

For example

truncate table Address_Import_tbl

BULK INSERT dbo.Address_Import_tbl FROM 'E:\external\SomeDataSource\Address.csv' WITH ( FIELDTERMINATOR = '|', ROWTERMINATOR = '\n', MAXERRORS = 10 )

Make sure all the columns in Address_Import_tbl are nvarchar(), to make it as agnostic as possible, and avoid type conversion errors.

Then apply whatever fixes you need to Address_Import_tbl. Like deleting the unwanted header.

Then run a INSERT SELECT query, to copy from Address_Import_tbl to Address_tbl, along with any datatype conversions you need. For example, to cast imported dates to SQL DATETIME.


I found it easiest to just read the entire line into one column then parse out the data using XML.

IF (OBJECT_ID('tempdb..#data') IS NOT NULL) DROP TABLE #data
CREATE TABLE #data (data VARCHAR(MAX))

BULK INSERT #data FROM 'E:\filefromabove.txt' WITH (FIRSTROW = 2, ROWTERMINATOR = '\n')

IF (OBJECT_ID('tempdb..#dataXml') IS NOT NULL) DROP TABLE #dataXml
CREATE TABLE #dataXml (ID INT NOT NULL IDENTITY(1,1) PRIMARY KEY CLUSTERED, data XML)

INSERT #dataXml (data)
SELECT CAST('<r><d>' + REPLACE(data, '|', '</d><d>') + '</d></r>' AS XML)
FROM #data

SELECT  d.data.value('(/r//d)[1]', 'varchar(max)') AS col1,
        d.data.value('(/r//d)[2]', 'varchar(max)') AS col2,
        d.data.value('(/r//d)[3]', 'varchar(max)') AS col3
FROM #dataXml d

To let SQL handle quote escape and everything else do this

BULK INSERT Test_CSV
FROM  'C:\MyCSV.csv' 
WITH (
 FORMAT='CSV'
 --FIRSTROW = 2,  --uncomment this if your CSV contains header, so start parsing at line 2
);

In regards to other answers, here is valuable info as well:

I keep seeing this in all answers: ROWTERMINATOR = '\n'
The \n means LF and it is Linux style EOL

In Windows the EOL is made of 2 chars CRLF so you need ROWTERMINATOR = '\r\n'

enter image description here

enter image description here


Maybe check that the header has the same line-ending as the actual data rows (as specified in ROWTERMINATOR)?

Update: from MSDN:

The FIRSTROW attribute is not intended to skip column headers. Skipping headers is not supported by the BULK INSERT statement. When skipping rows, the SQL Server Database Engine looks only at the field terminators, and does not validate the data in the fields of skipped rows.


Examples related to sql

Passing multiple values for same variable in stored procedure SQL permissions for roles Generic XSLT Search and Replace template Access And/Or exclusions Pyspark: Filter dataframe based on multiple conditions Subtracting 1 day from a timestamp date PYODBC--Data source name not found and no default driver specified select rows in sql with latest date for each ID repeated multiple times ALTER TABLE DROP COLUMN failed because one or more objects access this column Create Local SQL Server database

Examples related to sql-server-2005

Add a row number to result set of a SQL query SQL Server : Transpose rows to columns Select info from table where row has max date How to query for Xml values and attributes from table in SQL Server? How to restore SQL Server 2014 backup in SQL Server 2008 SQL Server 2005 Using CHARINDEX() To split a string Is it necessary to use # for creating temp tables in SQL server? SQL Query to find the last day of the month JDBC connection to MSSQL server in windows authentication mode How to convert the system date format to dd/mm/yy in SQL Server 2008 R2?

Examples related to bulkinsert

Insert Multiple Rows Into Temp Table With SQL Server 2012 PostgreSQL - SQL state: 42601 syntax error Import CSV file into SQL Server Cannot bulk load. Operating system error code 5 (Access is denied.) How to Bulk Insert from XLSX file extension? Bulk Insert Correctly Quoted CSV File in SQL Server How to speed up insertion performance in PostgreSQL BULK INSERT with identity (auto-increment) column How do I temporarily disable triggers in PostgreSQL? mongodb: insert if not exists