[sql-server] Turning a Comma Separated string into individual rows

I have a SQL Table like this:

| SomeID         | OtherID     | Data
+----------------+-------------+-------------------
| abcdef-.....   | cdef123-... | 18,20,22
| abcdef-.....   | 4554a24-... | 17,19
| 987654-.....   | 12324a2-... | 13,19,20

is there a query where I can perform a query like SELECT OtherID, SplitData WHERE SomeID = 'abcdef-.......' that returns individual rows, like this:

| OtherID     | SplitData
+-------------+-------------------
| cdef123-... | 18
| cdef123-... | 20
| cdef123-... | 22
| 4554a24-... | 17
| 4554a24-... | 19

Basically split my data at the comma into individual rows?

I am aware that storing a comma-separated string into a relational database sounds dumb, but the normal use case in the consumer application makes that really helpful.

I don't want to do the split in the application as I need paging, so I wanted to explore options before refactoring the whole app.

It's SQL Server 2008 (non-R2).

This question is related to sql-server tsql split comma

The answer is


As of Feb 2016 - see the TALLY Table Example - very likely to outperform my TVF below, from Feb 2014. Keeping original post below for posterity:


Too much repeated code for my liking in the above examples. And I dislike the performance of CTEs and XML. Also, an explicit Id so that consumers that are order specific can specify an ORDER BY clause.

CREATE FUNCTION dbo.Split
(
    @Line nvarchar(MAX),
    @SplitOn nvarchar(5) = ','
)
RETURNS @RtnValue table
(
    Id INT NOT NULL IDENTITY(1,1) PRIMARY KEY CLUSTERED,
    Data nvarchar(100) NOT NULL
)
AS
BEGIN
    IF @Line IS NULL RETURN

    DECLARE @split_on_len INT = LEN(@SplitOn)
    DECLARE @start_at INT = 1
    DECLARE @end_at INT
    DECLARE @data_len INT

    WHILE 1=1
    BEGIN
        SET @end_at = CHARINDEX(@SplitOn,@Line,@start_at)
        SET @data_len = CASE @end_at WHEN 0 THEN LEN(@Line) ELSE @end_at-@start_at END
        INSERT INTO @RtnValue (data) VALUES( SUBSTRING(@Line,@start_at,@data_len) );
        IF @end_at = 0 BREAK;
        SET @start_at = @end_at + @split_on_len
    END

    RETURN
END

By creating this function ([DelimitedSplit]) which splits a string, you could do an OUTER APPLY to your SELECT.

CREATE FUNCTION [dbo].[DelimitedSplit]
--===== Define I/O parameters
        (@pString VARCHAR(8000), @pDelimiter CHAR(1))
--WARNING!!! DO NOT USE MAX DATA-TYPES HERE!  IT WILL KILL PERFORMANCE!
RETURNS TABLE WITH SCHEMABINDING AS
 RETURN
--===== "Inline" CTE Driven "Tally Table" produces values from 1 up to 10,000...
     -- enough to cover VARCHAR(8000)
  WITH E1(N) AS (
                 SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
                 SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
                 SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
                ),                          --10E+1 or 10 rows
       E2(N) AS (SELECT 1 FROM E1 a INNER JOIN E1 b ON b.N = a.N), --10E+2 or 100 rows
       E4(N) AS (SELECT 1 FROM E2 a INNER JOIN E2 b ON b.N = a.N), --10E+4 or 10,000 rows max
 cteTally(N) AS (--==== This provides the "base" CTE and limits the number of rows right up front
                     -- for both a performance gain and prevention of accidental "overruns"
                 SELECT TOP (ISNULL(DATALENGTH(@pString),0)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
                ),
cteStart(N1) AS (--==== This returns N+1 (starting position of each "element" just once for each delimiter)
                 SELECT 1 UNION ALL
                 SELECT t.N+1 FROM cteTally t WHERE SUBSTRING(@pString,t.N,1) = @pDelimiter
                ),
cteLen(N1,L1) AS(--==== Return start and length (for use in substring)
                 SELECT s.N1,
                        ISNULL(NULLIF(CHARINDEX(@pDelimiter,@pString,s.N1),0)-s.N1,8000)
                   FROM cteStart s
                )
--===== Do the actual split. The ISNULL/NULLIF combo handles the length for the final element when no delimiter is found.
 SELECT ItemNumber = ROW_NUMBER() OVER(ORDER BY l.N1),
        Item       = SUBSTRING(@pString, l.N1, l.L1)
   FROM cteLen l
;

TEST

CREATE TABLE #Testdata
(
    SomeID INT,
    OtherID INT,
    String VARCHAR(MAX)
)

INSERT #Testdata SELECT 1,  9, '18,20,22'
INSERT #Testdata SELECT 2,  8, '17,19'
INSERT #Testdata SELECT 3,  7, '13,19,20'
INSERT #Testdata SELECT 4,  6, ''
INSERT #Testdata SELECT 9, 11, '1,2,3,4'

SELECT
 *
FROM #Testdata
OUTER APPLY [dbo].[DelimitedSplit](String,',')

DROP TABLE #Testdata

RESULT

SomeID  OtherID String      ItemNumber  Item
1       9       18,20,22    1           18
1       9       18,20,22    2           20
1       9       18,20,22    3           22
2       8       17,19       1           17
2       8       17,19       2           19
3       7       13,19,20    1           13
3       7       13,19,20    2           19
3       7       13,19,20    3           20
4       6       1   
9       11      1,2,3,4     1           1
9       11      1,2,3,4     2           2
9       11      1,2,3,4     3           3
9       11      1,2,3,4     4           4

select t.OtherID,x.Kod
    from testData t
    cross apply (select Code from dbo.Split(t.Data,',') ) x

Finally, the wait is over with SQL Server 2016. They have introduced the Split string function, STRING_SPLIT:

select OtherID, cs.Value --SplitData
from yourtable
cross apply STRING_SPLIT (Data, ',') cs

All the other methods to split string like XML, Tally table, while loop, etc.. have been blown away by this STRING_SPLIT function.

Here is an excellent article with performance comparison: Performance Surprises and Assumptions: STRING_SPLIT.

For older versions, using tally table here is one split string function(best possible approach)

CREATE FUNCTION [dbo].[DelimitedSplit8K]
        (@pString VARCHAR(8000), @pDelimiter CHAR(1))
RETURNS TABLE WITH SCHEMABINDING AS
 RETURN
--===== "Inline" CTE Driven "Tally Table" produces values from 0 up to 10,000...
     -- enough to cover NVARCHAR(4000)
  WITH E1(N) AS (
                 SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL 
                 SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL 
                 SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
                ),                          --10E+1 or 10 rows
       E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
       E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max
 cteTally(N) AS (--==== This provides the "base" CTE and limits the number of rows right up front
                     -- for both a performance gain and prevention of accidental "overruns"
                 SELECT TOP (ISNULL(DATALENGTH(@pString),0)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
                ),
cteStart(N1) AS (--==== This returns N+1 (starting position of each "element" just once for each delimiter)
                 SELECT 1 UNION ALL
                 SELECT t.N+1 FROM cteTally t WHERE SUBSTRING(@pString,t.N,1) = @pDelimiter
                ),
cteLen(N1,L1) AS(--==== Return start and length (for use in substring)
                 SELECT s.N1,
                        ISNULL(NULLIF(CHARINDEX(@pDelimiter,@pString,s.N1),0)-s.N1,8000)
                   FROM cteStart s
                )
--===== Do the actual split. The ISNULL/NULLIF combo handles the length for the final element when no delimiter is found.
 SELECT ItemNumber = ROW_NUMBER() OVER(ORDER BY l.N1),
        Item       = SUBSTRING(@pString, l.N1, l.L1)
   FROM cteLen l
;

Referred from Tally OH! An Improved SQL 8K “CSV Splitter” Function


Nice to see that it have been solved in the 2016 version, but for all of those that is not on that, here are two generalized and simplified versions of the methods above.

The XML-method is shorter, but of course requires the string to allow for the xml-trick (no 'bad' chars.)

XML-Method:

create function dbo.splitString(@input Varchar(max), @Splitter VarChar(99)) returns table as
Return
    SELECT Split.a.value('.', 'VARCHAR(max)') AS Data FROM
    ( SELECT CAST ('<M>' + REPLACE(@input, @Splitter, '</M><M>') + '</M>' AS XML) AS Data 
    ) AS A CROSS APPLY Data.nodes ('/M') AS Split(a); 

Recursive method:

create function dbo.splitString(@input Varchar(max), @Splitter Varchar(99)) returns table as
Return
  with tmp (DataItem, ix) as
   ( select @input  , CHARINDEX('',@Input)  --Recu. start, ignored val to get the types right
     union all
     select Substring(@input, ix+1,ix2-ix-1), ix2
     from (Select *, CHARINDEX(@Splitter,@Input+@Splitter,ix+1) ix2 from tmp) x where ix2<>0
   ) select DataItem from tmp where ix<>0

Function in action

Create table TEST_X (A int, CSV Varchar(100));
Insert into test_x select 1, 'A,B';
Insert into test_x select 2, 'C,D';

Select A,data from TEST_X x cross apply dbo.splitString(x.CSV,',') Y;

Drop table TEST_X

XML-METHOD 2: Unicode Friendly (Addition courtesy of Max Hodges) create function dbo.splitString(@input nVarchar(max), @Splitter nVarchar(99)) returns table as Return SELECT Split.a.value('.', 'NVARCHAR(max)') AS Data FROM ( SELECT CAST ('<M>' + REPLACE(@input, @Splitter, '</M><M>') + '</M>' AS XML) AS Data ) AS A CROSS APPLY Data.nodes ('/M') AS Split(a);


You can use the following function to extract data

CREATE FUNCTION [dbo].[SplitString]
(    
    @RowData NVARCHAR(MAX),
    @Delimeter NVARCHAR(MAX)
)
RETURNS @RtnValue TABLE 
(
    ID INT IDENTITY(1,1),
    Data NVARCHAR(MAX)
) 
AS
BEGIN 
    DECLARE @Iterator INT
    SET @Iterator = 1

    DECLARE @FoundIndex INT
    SET @FoundIndex = CHARINDEX(@Delimeter,@RowData)

    WHILE (@FoundIndex>0)
    BEGIN
        INSERT INTO @RtnValue (data)
        SELECT 
            Data = LTRIM(RTRIM(SUBSTRING(@RowData, 1, @FoundIndex - 1)))

        SET @RowData = SUBSTRING(@RowData,
                @FoundIndex + DATALENGTH(@Delimeter) / 2,
                LEN(@RowData))

        SET @Iterator = @Iterator + 1
        SET @FoundIndex = CHARINDEX(@Delimeter, @RowData)
    END

    INSERT INTO @RtnValue (Data)
    SELECT Data = LTRIM(RTRIM(@RowData))

    RETURN
END

Below works on sql server 2008

select *, ROW_NUMBER() OVER(order by items) as row# 
from 
( select 134 myColumn1, 34 myColumn2, 'd,c,k,e,f,g,h,a' comaSeperatedColumn) myTable
    cross apply 
SPLIT (rtrim(comaSeperatedColumn), ',') splitedTable -- gives 'items'  column 

Will get all Cartesian product with the origin table columns plus "items" of split table.


Very late but try this out:

SELECT ColumnID, Column1, value  --Do not change 'value' name. Leave it as it is.
FROM tbl_Sample  
CROSS APPLY STRING_SPLIT(Tags, ','); --'Tags' is the name of column containing comma separated values

So we were having this: tbl_Sample :

ColumnID|   Column1 |   Tags
--------|-----------|-------------
1       |   ABC     |   10,11,12    
2       |   PQR     |   20,21,22

After running this query:

ColumnID|   Column1 |   value
--------|-----------|-----------
1       |   ABC     |   10
1       |   ABC     |   11
1       |   ABC     |   12
2       |   PQR     |   20
2       |   PQR     |   21
2       |   PQR     |   22

Thanks!


DECLARE @id_list VARCHAR(MAX) = '1234,23,56,576,1231,567,122,87876,57553,1216'
DECLARE @table TABLE ( id VARCHAR(50) )
DECLARE @x INT = 0
DECLARE @firstcomma INT = 0
DECLARE @nextcomma INT = 0

SET @x = LEN(@id_list) - LEN(REPLACE(@id_list, ',', '')) + 1 -- number of ids in id_list

WHILE @x > 0
    BEGIN
        SET @nextcomma = CASE WHEN CHARINDEX(',', @id_list, @firstcomma + 1) = 0
                              THEN LEN(@id_list) + 1
                              ELSE CHARINDEX(',', @id_list, @firstcomma + 1)
                         END
        INSERT  INTO @table
        VALUES  ( SUBSTRING(@id_list, @firstcomma + 1, (@nextcomma - @firstcomma) - 1) )
        SET @firstcomma = CHARINDEX(',', @id_list, @firstcomma + 1)
        SET @x = @x - 1
    END

SELECT  *
FROM    @table

Function

CREATE FUNCTION dbo.SplitToRows (@column varchar(100), @separator varchar(10))
RETURNS @rtnTable TABLE
  (
  ID int identity(1,1),
  ColumnA varchar(max)
  )
 AS
BEGIN
    DECLARE @position int = 0
    DECLARE @endAt int = 0
    DECLARE @tempString varchar(100)

    set @column = ltrim(rtrim(@column))

    WHILE @position<=len(@column)
    BEGIN       
        set @endAt = CHARINDEX(@separator,@column,@position)
            if(@endAt=0)
            begin
            Insert into @rtnTable(ColumnA) Select substring(@column,@position,len(@column)-@position)
            break;
            end
        set @tempString = substring(ltrim(rtrim(@column)),@position,@endAt-@position)

        Insert into @rtnTable(ColumnA) select @tempString
        set @position=@endAt+1;
    END
    return
END

Use case

select * from dbo.SplitToRows('T14; p226.0001; eee; 3554;', ';')

Or just a select with multiple result set

DECLARE @column varchar(max)= '1234; 4748;abcde; 324432'
DECLARE @separator varchar(10) = ';'
DECLARE @position int = 0
DECLARE @endAt int = 0
DECLARE @tempString varchar(100)

set @column = ltrim(rtrim(@column))

WHILE @position<=len(@column)
BEGIN       
    set @endAt = CHARINDEX(@separator,@column,@position)
        if(@endAt=0)
        begin
        Select substring(@column,@position,len(@column)-@position)
        break;
        end
    set @tempString = substring(ltrim(rtrim(@column)),@position,@endAt-@position)

    select @tempString
    set @position=@endAt+1;
END

When using this approach you have to make sure that none of your values contains something that would be illegal XML – user1151923

I always use the XML method. Make sure you use VALID XML. I have two functions to convert between valid XML and Text. (I tend to strip out the carriage returns as I don't usually need them.

CREATE FUNCTION dbo.udf_ConvertTextToXML (@Text varchar(MAX)) 
    RETURNS varchar(MAX)
AS
    BEGIN
        SET @Text = REPLACE(@Text,CHAR(10),'')
        SET @Text = REPLACE(@Text,CHAR(13),'')
        SET @Text = REPLACE(@Text,'<','&lt;')
        SET @Text = REPLACE(@Text,'&','&amp;')
        SET @Text = REPLACE(@Text,'>','&gt;')
        SET @Text = REPLACE(@Text,'''','&apos;')
        SET @Text = REPLACE(@Text,'"','&quot;')
    RETURN @Text
END


CREATE FUNCTION dbo.udf_ConvertTextFromXML (@Text VARCHAR(MAX)) 
    RETURNS VARCHAR(max)
AS
    BEGIN
        SET @Text = REPLACE(@Text,'&lt;','<')
        SET @Text = REPLACE(@Text,'&amp;','&')
        SET @Text = REPLACE(@Text,'&gt;','>')
        SET @Text = REPLACE(@Text,'&apos;','''')
        SET @Text = REPLACE(@Text,'&quot;','"')
    RETURN @Text
END

Check this

 SELECT A.OtherID,  
     Split.a.value('.', 'VARCHAR(100)') AS Data  
 FROM  
 (
     SELECT OtherID,  
         CAST ('<M>' + REPLACE(Data, ',', '</M><M>') + '</M>' AS XML) AS Data  
     FROM  Table1
 ) AS A CROSS APPLY Data.nodes ('/M') AS Split(a); 

I know it has a lot of answers, but I want to write my version of split function like others and like string_split SQL Server 2016 native function.

create function [dbo].[Split]
(
    @Value nvarchar(max),
    @Delimiter nvarchar(50)
)
returns @tbl table
(
    Seq int primary key identity(1, 1),
    Value nvarchar(max)
)
as begin
    declare @Xml xml = cast('<d>' + replace(@Value, @Delimiter, '</d><d>') + '</d>' as xml)

    insert into @tbl
            (Value)
    select  a.split.value('.', 'nvarchar(max)') as Value
    from    @Xml.nodes('/d') a(split)
    
    return
end
  • Seq column is primary key to support fast join with other real table or Split function returned table.
  • Used XML function to support large data (looping version will slow down significantly when you have large data)

Here's a answer to question.

CREATE TABLE Testdata
(
    SomeID INT,
    OtherID INT,
    String VARCHAR(MAX)
)

INSERT Testdata SELECT 1,  9, '18,20,22'
INSERT Testdata SELECT 2,  8, '17,19'
INSERT Testdata SELECT 3,  7, '13,19,20'
INSERT Testdata SELECT 4,  6, ''
INSERT Testdata SELECT 9, 11, '1,2,3,4'


select  t.SomeID, t.OtherID, s.Value
from    Testdata t
        cross apply dbo.Split(t.String, ',') s

--Output
SomeID  OtherID Value
1       9       18
1       9       20
1       9       22
2       8       17
2       8       19
3       7       13
3       7       19
3       7       20
4       6       
9       11      1
9       11      2
9       11      3
9       11      4

Joining Split with other split

declare @Names nvarchar(max) = 'a,b,c,d'
declare @Codes nvarchar(max) = '10,20,30,40'

select  n.Seq, n.Value Name, c.Value Code
from    dbo.Split(@Names, ',') n
        inner join dbo.Split(@Codes, ',') c on n.Seq = c.Seq

--Output
Seq Name    Code
1   a       10
2   b       20
3   c       30
4   d       40

Split two times

declare @NationLocSex nvarchar(max) = 'Korea,Seoul,1;Vietnam,Kiengiang,0;China,Xian,0'

; with rows as
(
    select  Value
    from    dbo.Split(@NationLocSex, ';')
)
select  rw.Value r, cl.Value c
from    rows rw
        cross apply dbo.Split(rw.Value, ',') cl

--Output
r                       c
Korea,Seoul,1           Korea
Korea,Seoul,1           Seoul
Korea,Seoul,1           1
Vietnam,Kiengiang,0     Vietnam
Vietnam,Kiengiang,0     Kiengiang
Vietnam,Kiengiang,0     0
China,Xian,0            China
China,Xian,0            Xian
China,Xian,0            0

Split to columns

declare @Numbers nvarchar(50) = 'First,Second,Third'

; with t as
(
    select  case when Seq = 1 then Value end f1,
            case when Seq = 2 then Value end f2,
            case when Seq = 3 then Value end f3
    from    dbo.Split(@Numbers, ',')
)
select  min(f1) f1, min(f2) f2, min(f3) f3
from    t

--Output
f1      f2      f3
First   Second  Third

Generate rows by range


declare @Ranges nvarchar(50) = '1-2,4-6'

declare @Numbers table (Num int)
insert into @Numbers values (1),(2),(3),(4),(5),(6),(7),(8)

; with t as
(
    select  r.Seq, r.Value,
            min(case when ft.Seq = 1 then ft.Value end) ValueFrom,
            min(case when ft.Seq = 2 then ft.Value end) ValueTo
    from    dbo.Split(@Ranges, ',') r
            cross apply dbo.Split(r.Value, '-') ft
    group by r.Seq, r.Value
)
select  t.Seq, t.Value, t.ValueFrom, t.ValueTo, n.Num
from    t
        inner join @Numbers n on n.Num between t.ValueFrom and t.ValueTo

--Output
Seq Value   ValueFrom   ValueTo Num
1   1-2     1           2       1
1   1-2     1           2       2
2   4-6     4           6       4
2   4-6     4           6       5
2   4-6     4           6       6


Please refer below TSQL. STRING_SPLIT function is available only under compatibility level 130 and above.

TSQL:

DECLARE @stringValue NVARCHAR(400) = 'red,blue,green,yellow,black'  
DECLARE @separator CHAR = ','

SELECT [value]  As Colour
FROM STRING_SPLIT(@stringValue, @separator); 

RESULT:

Colour

red blue green yellow black


;WITH tmp(SomeID, OtherID, DataItem, Data) as (
    SELECT SomeID, OtherID, LEFT(Data, CHARINDEX(',',Data+',')-1),
        STUFF(Data, 1, CHARINDEX(',',Data+','), '')
FROM Testdata
WHERE Data > ''
)
SELECT SomeID, OtherID, Data
FROM tmp
ORDER BY SomeID

with only tiny little modification to above query...


Examples related to sql-server

Passing multiple values for same variable in stored procedure SQL permissions for roles Count the Number of Tables in a SQL Server Database Visual Studio 2017 does not have Business Intelligence Integration Services/Projects ALTER TABLE DROP COLUMN failed because one or more objects access this column Create Local SQL Server database How to create temp table using Create statement in SQL Server? SQL Query Where Date = Today Minus 7 Days How do I pass a list as a parameter in a stored procedure? SQL Server date format yyyymmdd

Examples related to tsql

Passing multiple values for same variable in stored procedure Count the Number of Tables in a SQL Server Database Change Date Format(DD/MM/YYYY) in SQL SELECT Statement Stored procedure with default parameters Format number as percent in MS SQL Server EXEC sp_executesql with multiple parameters SQL Server after update trigger How to compare datetime with only date in SQL Server Text was truncated or one or more characters had no match in the target code page including the primary key in an unpivot Printing integer variable and string on same line in SQL

Examples related to split

Parameter "stratify" from method "train_test_split" (scikit Learn) Pandas split DataFrame by column value How to split large text file in windows? Attribute Error: 'list' object has no attribute 'split' Split function in oracle to comma separated values with automatic sequence How would I get everything before a : in a string Python Split String by delimiter position using oracle SQL JavaScript split String with white space Split a String into an array in Swift? Split pandas dataframe in two if it has more than 10 rows

Examples related to comma

comma separated string of selected values in mysql How to Delete a directory from Hadoop cluster which is having comma(,) in its name? PHP remove commas from numeric strings Turning a Comma Separated string into individual rows