[sql] Simulating group_concat MySQL function in Microsoft SQL Server 2005?

I'm trying to migrate a MySQL-based app over to Microsoft SQL Server 2005 (not by choice, but that's life).

In the original app, we used almost entirely ANSI-SQL compliant statements, with one significant exception -- we used MySQL's group_concat function fairly frequently.

group_concat, by the way, does this: given a table of, say, employee names and projects...

SELECT empName, projID FROM project_members;


ANDY   |  A100
ANDY   |  B391
ANDY   |  X010
TOM    |  A100
TOM    |  A510

... and here's what you get with group_concat:

    SELECT empName, group_concat(projID SEPARATOR ' / ')
    FROM project_members
    GROUP BY empName;


ANDY   |  A100 / B391 / X010
TOM    |  A100 / A510

So what I'd like to know is: Is it possible to write, say, a user-defined function in SQL Server which emulates the functionality of group_concat?

I have almost no experience using UDFs, stored procedures, or anything like that, just straight-up SQL, so please err on the side of too much explanation :)

I tried these, but for my purposes in MS SQL Server 2005 the following, which I found at xaprb, was the most useful:

declare @result varchar(8000);
set @result = '';
select @result = @result + name + ' '
from master.dbo.systypes;
select rtrim(@result);

@Mark, as you mentioned, it was the space character that caused issues for me.

For my fellow Googlers out there, here's a very simple plug-and-play solution that worked for me after struggling with the more complex solutions for a while:

SELECT distinct empName,
NewColumnName=STUFF((SELECT ','+ CONVERT(VARCHAR(10), projID ) 
                     FROM returns 
                     WHERE empName=t.empName FOR XML PATH('')) , 1 , 1 , '' )
FROM returns t

Notice that I had to convert the ID into a VARCHAR in order to concatenate it as a string. If you don't have to do that, here's an even simpler version:

SELECT distinct empName,
NewColumnName=STUFF((SELECT ','+ projID
                     FROM returns 
                     WHERE empName=t.empName FOR XML PATH('')) , 1 , 1 , '' )
FROM returns t

All credit for this goes to here: https://social.msdn.microsoft.com/Forums/sqlserver/en-US/9508abc2-46e7-4186-b57f-7f368374e084/replicating-groupconcat-function-of-mysql-in-sql-server?forum=transactsql
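A small sketch of what the STUFF(..., 1, 1, '') wrapper in the queries above is doing, using a hard-coded string in place of the subquery result (the literal below is just an illustration built from the question's sample data). The inner FOR XML PATH('') subquery yields one string with a comma in front of every value, and STUFF then deletes that first comma:

```sql
-- STUFF(string, start, length, replacement) removes `length` characters
-- beginning at position `start` and puts `replacement` in their place.
-- Here: start at character 1, remove 1 character, insert nothing.
SELECT STUFF(',A100,B391,X010', 1, 1, '') AS projIDs;
-- returns: A100,B391,X010
```

If you use a longer separator such as ', ', the third argument has to match its length (2 in that case). Also keep in mind that FOR XML PATH('') entity-encodes characters like & and <, so values containing them will come back escaped unless you unwrap the XML.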

I may be a bit late to the party but this method works for me and is easier than the COALESCE method.

SELECT STUFF(
             (SELECT ',' + Column_Name 
              FROM Table_Name
              FOR XML PATH (''))
             , 1, 1, '')

Have a look at the GROUP_CONCAT project on GitHub; I think it does exactly what you are looking for:

This project contains a set of SQLCLR User-defined Aggregate functions (SQLCLR UDAs) that collectively offer similar functionality to the MySQL GROUP_CONCAT function. There are multiple functions to ensure the best performance based on the functionality required...

Possibly too late to be of benefit now, but is this not the easiest way to do things?

SELECT     empName, projIDs = replace
                          ((SELECT Surname AS [data()]
                              FROM project_members
                              WHERE  empName = a.empName
                              ORDER BY empName FOR xml path('')), ' ', REQUIRED SEPARATOR)
FROM         project_members a
GROUP BY empName

About J Hardiman's answer, how about:

SELECT empName, projIDs=
  REPLACE(
    REPLACE(
      (SELECT REPLACE(projID, ' ', '-somebody-puts-microsoft-out-of-his-misery-please-') AS [data()] FROM project_members WHERE empName=a.empName FOR XML PATH('')), 
      ' ', 
      ' / '), 
    '-somebody-puts-microsoft-out-of-his-misery-please-',
    ' ') 
  FROM project_members a WHERE empName IS NOT NULL GROUP BY empName

By the way, is the use of "Surname" a typo or am I not understanding a concept here?

Anyway, thanks a lot guys cuz it saved me quite some time :)

UPDATE 2020: SQL Server 2016+ JSON Serialization and De-serialization Examples

The data provided by the OP inserted into a temporary table called #project_members

drop table if exists #project_members;
create table #project_members(
  empName        varchar(20) not null,
  projID         varchar(20) not null);
insert #project_members(empName, projID) values
('ANDY', 'A100'),
('ANDY', 'B391'),
('ANDY', 'X010'),
('TOM', 'A100'),
('TOM', 'A510');

How to serialize this data into a single JSON string with a nested array containing projID's

select empName, (select pm_json.projID 
                 from #project_members pm_json 
                 where pm.empName=pm_json.empName 
                 for json path, root('projList')) projJSON
from #project_members pm
group by empName
for json path;


[
  {
    "empName": "ANDY",
    "projJSON": {
      "projList": [
        { "projID": "A100" },
        { "projID": "B391" },
        { "projID": "X010" }
      ]
    }
  },
  {
    "empName": "TOM",
    "projJSON": {
      "projList": [
        { "projID": "A100" },
        { "projID": "A510" }
      ]
    }
  }
]

How to de-serialize this data from a single JSON string back to its original rows and columns

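declare @json           nvarchar(max)=N'[{"empName":"ANDY","projJSON":{"projList":[{"projID":"A100"},
                                         {"projID":"B391"},{"projID":"X010"}]}},{"empName":"TOM","projJSON":
                                         {"projList":[{"projID":"A100"},{"projID":"A510"}]}}]';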

select oj.empName, noj.projID 
from openjson(@json) with (empName        varchar(20),
                           projJSON       nvarchar(max) as json) oj
     cross apply openjson(oj.projJSON, '$.projList') with (projID    varchar(20)) noj;


empName projID
ANDY    A100
ANDY    B391
ANDY    X010
TOM     A100
TOM     A510

How to persist the unique empName to a table and store the projID's in a nested JSON array

drop table if exists #project_members_with_json;
create table #project_members_with_json(
  empName        varchar(20) unique not null,
  projJSON       nvarchar(max) not null);
insert #project_members_with_json(empName, projJSON) 
select empName, (select pm_json.projID 
                 from #project_members pm_json 
                 where pm.empName=pm_json.empName 
                 for json path, root('projList')) 
from #project_members pm
group by empName;


empName projJSON
ANDY    {"projList":[{"projID":"A100"},{"projID":"B391"},{"projID":"X010"}]}
TOM     {"projList":[{"projID":"A100"},{"projID":"A510"}]}

How to de-serialize from a table with unique empName and nested JSON array column containing projID's

select wj.empName, oj.projID
from
  #project_members_with_json wj
 cross apply
  openjson(wj.projJSON, '$.projList') with (projID    varchar(20)) oj;


empName projID
ANDY    A100
ANDY    B391
ANDY    X010
TOM     A100
TOM     A510

To concatenate all the project manager names for projects that have multiple project managers, write:

SELECT a.project_id,a.project_name,Stuff((SELECT N'/ ' + first_name + ', '+last_name FROM projects_v 
where a.project_id=project_id
 FOR XML PATH(''),TYPE).value('text()[1]','nvarchar(max)'),1,2,N''
) mgr_names
from projects_v a
group by a.project_id,a.project_name

SQL Server 2017 does introduce a new aggregate function

STRING_AGG ( expression, separator).

Concatenates the values of string expressions and places separator values between them. The separator is not added at the end of string.

The concatenated elements can be ordered by appending WITHIN GROUP (ORDER BY some_expression)
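As a rough sketch, this is how the query from the question might look with STRING_AGG, assuming the project_members table from the original post and SQL Server 2017 or later (the projIDs alias and the ordering by projID are my own choices for illustration):

```sql
-- Requires SQL Server 2017+.
-- One row per employee, project IDs joined with ' / ' in projID order.
SELECT empName,
       STRING_AGG(projID, ' / ') WITHIN GROUP (ORDER BY projID) AS projIDs
FROM project_members
GROUP BY empName
ORDER BY empName;
```

With the sample data from the question, this should give ANDY | A100 / B391 / X010 and TOM | A100 / A510, the same as the MySQL group_concat output.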

For versions 2005-2016 I typically use the XML method in the accepted answer.

However, this can fail in some circumstances. For example, if the data to be concatenated contains CHAR(29), you see:

FOR XML could not serialize the data ... because it contains a character (0x001D) which is not allowed in XML.

A more robust method that can deal with all characters would be to use a CLR aggregate. However, applying an ordering to the concatenated elements is more difficult with this approach.

The method of assigning to a variable is not guaranteed and should be avoided in production code.

