Categories
sql sql-server sql-server-2005 string-aggregation

Simulating group_concat MySQL function in Microsoft SQL Server 2005?

361

I’m trying to migrate a MySQL-based app over to Microsoft SQL Server 2005 (not by choice, but that’s life).

In the original app, we used almost entirely ANSI-SQL compliant statements, with one significant exception — we used MySQL’s group_concat function fairly frequently.

group_concat, by the way, does this: given a table of, say, employee names and projects…

SELECT empName, projID FROM project_members;

returns:

ANDY   |  A100
ANDY   |  B391
ANDY   |  X010
TOM    |  A100
TOM    |  A510

… and here’s what you get with group_concat:

SELECT 
    empName, group_concat(projID SEPARATOR "https://stackoverflow.com/") 
FROM 
    project_members 
GROUP BY 
    empName;

returns:

ANDY   |  A100 / B391 / X010
TOM    |  A100 / A510

So what I’d like to know is: Is it possible to write, say, a user-defined function in SQL Server which emulates the functionality of group_concat?

I have almost no experience using UDFs, stored procedures, or anything like that, just straight-up SQL, so please err on the side of too much explanation 🙂

7

174

No REAL easy way to do this. Lots of ideas out there, though.

Best one I’ve found:

SELECT table_name, LEFT(column_names , LEN(column_names )-1) AS column_names
FROM information_schema.columns AS extern
CROSS APPLY
(
    SELECT column_name + ','
    FROM information_schema.columns AS intern
    WHERE extern.table_name = intern.table_name
    FOR XML PATH('')
) pre_trimmed (column_names)
GROUP BY table_name, column_names;

Or a version that works correctly if the data might contain characters such as <

WITH extern
     AS (SELECT DISTINCT table_name
         FROM   INFORMATION_SCHEMA.COLUMNS)
SELECT table_name,
       LEFT(y.column_names, LEN(y.column_names) - 1) AS column_names
FROM   extern
       CROSS APPLY (SELECT column_name + ','
                    FROM   INFORMATION_SCHEMA.COLUMNS AS intern
                    WHERE  extern.table_name = intern.table_name
                    FOR XML PATH(''), TYPE) x (column_names)
       CROSS APPLY (SELECT x.column_names.value('.', 'NVARCHAR(MAX)')) y(column_names) 

3

  • 1

    This example worked for me, but I tried doing another aggregation and it didn’t work, gave me an error: “the correlation name ‘pre_trimmed’ is specified multiple times in a FROM clause.”

    Jan 22, 2010 at 19:24

  • 7

    ‘pre_trimmed’ is just an alias for the subquery. Aliases are required for subqueries and have to be unique, so for another subquery change it to something unique…

    – Koen

    Mar 12, 2012 at 16:57

  • 2

    can you show an example without table_name as a column name it’s confusing.

    – S.Mason

    Nov 17, 2017 at 21:47

172

I may be a bit late to the party but this method works for me and is easier than the COALESCE method.

SELECT STUFF(
             (SELECT ',' + Column_Name 
              FROM Table_Name
              FOR XML PATH (''))
             , 1, 1, '')

2

  • 1

    This only shows how to concat values – group_concat concats them by group, which is more challenging (and what the OP appears to require). See the accepted answer to SO 15154644 for how to do this – the WHERE clause is the critical addition

    – DJDave

    Jan 11, 2018 at 13:26

  • @DJDave was referring to this answer. See also the accepted answer to a similar question.

    Feb 27, 2019 at 20:19

54

SQL Server 2017 does introduce a new aggregate function

STRING_AGG ( expression, separator).

Concatenates the values of string expressions and places separator
values between them. The separator is not added at the end of string.

The concatenated elements can be ordered by appending WITHIN GROUP (ORDER BY some_expression)

For versions 2005-2016 I typically use the XML method in the accepted answer.

This can fail in some circumstances however. e.g. if the data to be concatenated contains CHAR(29) you see

FOR XML could not serialize the data … because it
contains a character (0x001D) which is not allowed in XML.

A more robust method that can deal with all characters would be to use a CLR aggregate. However applying an ordering to the concatenated elements is more difficult with this approach.

The method of assigning to a variable is not guaranteed and should be avoided in production code.

1