[sql] How to create a SQL Server function to "join" multiple rows from a subquery into a single delimited field?

To illustrate, assume that I have two tables as follows:

VehicleID Name
1         Chuck
2         Larry

LocationID VehicleID City
1          1         New York
2          1         Seattle
3          1         Vancouver
4          2         Los Angeles
5          2         Houston

I want to write a query to return the following results:

VehicleID Name    Locations
1         Chuck   New York, Seattle, Vancouver
2         Larry   Los Angeles, Houston

I know that this can be done using server side cursors, ie:

DECLARE @VehicleID int
DECLARE @VehicleName varchar(100)
DECLARE @LocationCity varchar(100)
DECLARE @Locations varchar(4000)
DECLARE @Results TABLE
(
  VehicleID int
  Name varchar(100)
  Locations varchar(4000)
)

DECLARE VehiclesCursor CURSOR FOR
SELECT
  [VehicleID]
, [Name]
FROM [Vehicles]

OPEN VehiclesCursor

FETCH NEXT FROM VehiclesCursor INTO
  @VehicleID
, @VehicleName
WHILE @@FETCH_STATUS = 0
BEGIN

  SET @Locations = ''

  DECLARE LocationsCursor CURSOR FOR
  SELECT
    [City]
  FROM [Locations]
  WHERE [VehicleID] = @VehicleID

  OPEN LocationsCursor

  FETCH NEXT FROM LocationsCursor INTO
    @LocationCity
  WHILE @@FETCH_STATUS = 0
  BEGIN
    SET @Locations = @Locations + @LocationCity

    FETCH NEXT FROM LocationsCursor INTO
      @LocationCity
  END
  CLOSE LocationsCursor
  DEALLOCATE LocationsCursor

  INSERT INTO @Results (VehicleID, Name, Locations) SELECT @VehicleID, @Name, @Locations

END     
CLOSE VehiclesCursor
DEALLOCATE VehiclesCursor

SELECT * FROM @Results

However, as you can see, this requires a great deal of code. What I would like is a generic function that would allow me to do something like this:

SELECT VehicleID
     , Name
     , JOIN(SELECT City FROM Locations WHERE VehicleID = Vehicles.VehicleID, ', ') AS Locations
FROM Vehicles

Is this possible? Or something similar?

This question is related to sql sql-server string-concatenation

The answer is


I don't belive there's a way to do it within one query, but you can play tricks like this with a temporary variable:

declare @s varchar(max)
set @s = ''
select @s = @s + City + ',' from Locations

select @s

It's definitely less code than walking over a cursor, and probably more efficient.


The below code will work for Sql Server 2000/2005/2008

CREATE FUNCTION fnConcatVehicleCities(@VehicleId SMALLINT)
RETURNS VARCHAR(1000) AS
BEGIN
  DECLARE @csvCities VARCHAR(1000)
  SELECT @csvCities = COALESCE(@csvCities + ', ', '') + COALESCE(City,'')
  FROM Vehicles 
  WHERE VehicleId = @VehicleId 
  return @csvCities
END

-- //Once the User defined function is created then run the below sql

SELECT VehicleID
     , dbo.fnConcatVehicleCities(VehicleId) AS Locations
FROM Vehicles
GROUP BY VehicleID

VERSION NOTE: You must be using SQL Server 2005 or greater with Compatibility Level set to 90 or greater for this solution.

See this MSDN article for the first example of creating a user-defined aggregate function that concatenates a set of string values taken from a column in a table.

My humble recommendation would be to leave out the appended comma so you can use your own ad-hoc delimiter, if any.

Referring to the C# version of Example 1:

change:  this.intermediateResult.Append(value.Value).Append(',');
    to:  this.intermediateResult.Append(value.Value);

And

change:  output = this.intermediateResult.ToString(0, this.intermediateResult.Length - 1);
    to:  output = this.intermediateResult.ToString();

That way when you use your custom aggregate, you can opt to use your own delimiter, or none at all, such as:

SELECT dbo.CONCATENATE(column1 + '|') from table1

NOTE: Be careful about the amount of the data you attempt to process in your aggregate. If you try to concatenate thousands of rows or many very large datatypes you may get a .NET Framework error stating "[t]he buffer is insufficient."


I've found a solution by creating the following function:

CREATE FUNCTION [dbo].[JoinTexts]
(
  @delimiter VARCHAR(20) ,
  @whereClause VARCHAR(1)
)
RETURNS VARCHAR(MAX)
AS 
BEGIN
    DECLARE @Texts VARCHAR(MAX)

    SELECT  @Texts = COALESCE(@Texts + @delimiter, '') + T.Texto
    FROM    SomeTable AS T
    WHERE   T.SomeOtherColumn = @whereClause

    RETURN @Texts
END
GO

Usage:

SELECT dbo.JoinTexts(' , ', 'Y')

Note that Matt's code will result in an extra comma at the end of the string; using COALESCE (or ISNULL for that matter) as shown in the link in Lance's post uses a similar method but doesn't leave you with an extra comma to remove. For the sake of completeness, here's the relevant code from Lance's link on sqlteam.com:

DECLARE @EmployeeList varchar(100)
SELECT @EmployeeList = COALESCE(@EmployeeList + ', ', '') + 
    CAST(EmpUniqueID AS varchar(5))
FROM SalesCallsEmployees
WHERE SalCal_UniqueID = 1

Try this query

SELECT v.VehicleId, v.Name, ll.LocationList
FROM Vehicles v 
LEFT JOIN 
    (SELECT 
     DISTINCT
        VehicleId,
        REPLACE(
            REPLACE(
                REPLACE(
                    (
                        SELECT City as c 
                        FROM Locations x 
                        WHERE x.VehicleID = l.VehicleID FOR XML PATH('')
                    ),    
                    '</c><c>',', '
                 ),
             '<c>',''
            ),
        '</c>', ''
        ) AS LocationList
    FROM Locations l
) ll ON ll.VehicleId = v.VehicleId

In SQL Server 2005

SELECT Stuff(
  (SELECT N', ' + Name FROM Names FOR XML PATH(''),TYPE)
  .value('text()[1]','nvarchar(max)'),1,2,N'')

In SQL Server 2016

you can use the FOR JSON syntax

i.e.

SELECT per.ID,
Emails = JSON_VALUE(
   REPLACE(
     (SELECT _ = em.Email FROM Email em WHERE em.Person = per.ID FOR JSON PATH)
    ,'"},{"_":"',', '),'$[0]._'
) 
FROM Person per

And the result will become

Id  Emails
1   [email protected]
2   NULL
3   [email protected], [email protected]

This will work even your data contains invalid XML characters

the '"},{"":"' is safe because if you data contain '"},{"":"', it will be escaped to "},{\"_\":\"

You can replace ', ' with any string separator


And in SQL Server 2017, Azure SQL Database

You can use the new STRING_AGG function


With the other answers, the person reading the answer must be aware of the vehicle table and create the vehicle table and data to test a solution.

Below is an example that uses SQL Server "Information_Schema.Columns" table. By using this solution, no tables need to be created or data added. This example creates a comma separated list of column names for all tables in the database.

SELECT
    Table_Name
    ,STUFF((
        SELECT ',' + Column_Name
        FROM INFORMATION_SCHEMA.Columns Columns
        WHERE Tables.Table_Name = Columns.Table_Name
        ORDER BY Column_Name
        FOR XML PATH ('')), 1, 1, ''
    )Columns
FROM INFORMATION_SCHEMA.Columns Tables
GROUP BY TABLE_NAME 

From what I can see FOR XML (as posted earlier) is the only way to do it if you want to also select other columns (which I'd guess most would) as the OP does. Using COALESCE(@var... does not allow inclusion of other columns.

Update: Thanks to programmingsolutions.net there is a way to remove the "trailing" comma to. By making it into a leading comma and using the STUFF function of MSSQL you can replace the first character (leading comma) with an empty string as below:

stuff(
    (select ',' + Column 
     from Table
         inner where inner.Id = outer.Id 
     for xml path('')
), 1,1,'') as Values

In a single SQL query, without using the FOR XML clause.
A Common Table Expression is used to recursively concatenate the results.

-- rank locations by incrementing lexicographical order
WITH RankedLocations AS (
  SELECT
    VehicleID,
    City,
    ROW_NUMBER() OVER (
        PARTITION BY VehicleID 
        ORDER BY City
    ) Rank
  FROM
    Locations
),
-- concatenate locations using a recursive query
-- (Common Table Expression)
Concatenations AS (
  -- for each vehicle, select the first location
  SELECT
    VehicleID,
    CONVERT(nvarchar(MAX), City) Cities,
    Rank
  FROM
    RankedLocations
  WHERE
    Rank = 1

  -- then incrementally concatenate with the next location
  -- this will return intermediate concatenations that will be 
  -- filtered out later on
  UNION ALL

  SELECT
    c.VehicleID,
    (c.Cities + ', ' + l.City) Cities,
    l.Rank
  FROM
    Concatenations c -- this is a recursion!
    INNER JOIN RankedLocations l ON
        l.VehicleID = c.VehicleID 
        AND l.Rank = c.Rank + 1
),
-- rank concatenation results by decrementing length 
-- (rank 1 will always be for the longest concatenation)
RankedConcatenations AS (
  SELECT
    VehicleID,
    Cities,
    ROW_NUMBER() OVER (
        PARTITION BY VehicleID 
        ORDER BY Rank DESC
    ) Rank
  FROM 
    Concatenations
)
-- main query
SELECT
  v.VehicleID,
  v.Name,
  c.Cities
FROM
  Vehicles v
  INNER JOIN RankedConcatenations c ON 
    c.VehicleID = v.VehicleID 
    AND c.Rank = 1

If you're running SQL Server 2005, you can write a custom CLR aggregate function to handle this.

C# version:

using System;
using System.Data;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using System.Text;
using Microsoft.SqlServer.Server;
[Serializable]
[Microsoft.SqlServer.Server.SqlUserDefinedAggregate(Format.UserDefined,MaxByteSize=8000)]
public class CSV:IBinarySerialize
{
    private StringBuilder Result;
    public void Init() {
        this.Result = new StringBuilder();
    }

    public void Accumulate(SqlString Value) {
        if (Value.IsNull) return;
        this.Result.Append(Value.Value).Append(",");
    }
    public void Merge(CSV Group) {
        this.Result.Append(Group.Result);
    }
    public SqlString Terminate() {
        return new SqlString(this.Result.ToString());
    }
    public void Read(System.IO.BinaryReader r) {
        this.Result = new StringBuilder(r.ReadString());
    }
    public void Write(System.IO.BinaryWriter w) {
        w.Write(this.Result.ToString());
    }
}

Mun's answer didn't work for me so I made some changes to that answer to get it to work. Hope this helps someone. Using SQL Server 2012:

SELECT [VehicleID]
     , [Name]
     , STUFF((SELECT DISTINCT ',' + CONVERT(VARCHAR,City) 
         FROM [Location] 
         WHERE (VehicleID = Vehicle.VehicleID) 
         FOR XML PATH ('')), 1, 2, '') AS Locations
FROM [Vehicle]

Examples related to sql

Passing multiple values for same variable in stored procedure SQL permissions for roles Generic XSLT Search and Replace template Access And/Or exclusions Pyspark: Filter dataframe based on multiple conditions Subtracting 1 day from a timestamp date PYODBC--Data source name not found and no default driver specified select rows in sql with latest date for each ID repeated multiple times ALTER TABLE DROP COLUMN failed because one or more objects access this column Create Local SQL Server database

Examples related to sql-server

Passing multiple values for same variable in stored procedure SQL permissions for roles Count the Number of Tables in a SQL Server Database Visual Studio 2017 does not have Business Intelligence Integration Services/Projects ALTER TABLE DROP COLUMN failed because one or more objects access this column Create Local SQL Server database How to create temp table using Create statement in SQL Server? SQL Query Where Date = Today Minus 7 Days How do I pass a list as a parameter in a stored procedure? SQL Server date format yyyymmdd

Examples related to string-concatenation

How do I concatenate strings? How do I concatenate strings in Swift? How to set the id attribute of a HTML element dynamically with angularjs (1.x)? C++ String Concatenation operator<< how to use concatenate a fixed string and a variable in Python How do I concatenate strings and variables in PowerShell? Concatenating variables in Bash SQL NVARCHAR and VARCHAR Limits Concatenating string and integer in python PHP string concatenation