[sql] How to concatenate strings of a string field in a PostgreSQL 'group by' query?

I am looking for a way to concatenate the strings of a field within a group by query. So for example, I have a table:

ID   COMPANY_ID   EMPLOYEE
1    1            Anna
2    1            Bill
3    2            Carol
4    2            Dave

and I wanted to group by company_id to get something like:

COMPANY_ID   EMPLOYEE
1            Anna, Bill
2            Carol, Dave

There is a built-in function in mySQL to do this group_concat

This question is related to sql postgresql group-by string-aggregation

The answer is


According to version PostgreSQL 9.0 and above you can use the aggregate function called string_agg. Your new SQL should look something like this:

SELECT company_id, string_agg(employee, ', ')
    FROM mytable GROUP BY company_id;

As from PostgreSQL 9.0 you can use the aggregate function called string_agg. Your new SQL should look something like this:

SELECT company_id, string_agg(employee, ', ')
FROM mytable
GROUP BY company_id;


If you are on Amazon Redshift, where string_agg is not supported, try using listagg.

SELECT company_id, listagg(EMPLOYEE, ', ') as employees
FROM EMPLOYEE_table
GROUP BY company_id;

I claim no credit for the answer because I found it after some searching:

What I didn't know is that PostgreSQL allows you to define your own aggregate functions with CREATE AGGREGATE

This post on the PostgreSQL list shows how trivial it is to create a function to do what's required:

CREATE AGGREGATE textcat_all(
  basetype    = text,
  sfunc       = textcat,
  stype       = text,
  initcond    = ''
);

SELECT company_id, textcat_all(employee || ', ')
FROM mytable
GROUP BY company_id;

How about using Postgres built-in array functions? At least on 8.4 this works out of the box:

SELECT company_id, array_to_string(array_agg(employee), ',')
FROM mytable
GROUP BY company_id;

Following up on Kev's answer, using the Postgres docs:

First, create an array of the elements, then use the built-in array_to_string function.

CREATE AGGREGATE array_accum (anyelement)
(
 sfunc = array_append,
 stype = anyarray,
 initcond = '{}'
);

select array_to_string(array_accum(name),'|') from table group by id;

As already mentioned, creating your own aggregate function is the right thing to do. Here is my concatenation aggregate function (you can find details in French):

CREATE OR REPLACE FUNCTION concat2(text, text) RETURNS text AS '
    SELECT CASE WHEN $1 IS NULL OR $1 = \'\' THEN $2
            WHEN $2 IS NULL OR $2 = \'\' THEN $1
            ELSE $1 || \' / \' || $2
            END; 
'
 LANGUAGE SQL;

CREATE AGGREGATE concatenate (
  sfunc = concat2,
  basetype = text,
  stype = text,
  initcond = ''

);

And then use it as:

SELECT company_id, concatenate(employee) AS employees FROM ...

This latest announcement list snippet might be of interest if you'll be upgrading to 8.4:

Until 8.4 comes out with a super-effient native one, you can add the array_accum() function in the PostgreSQL documentation for rolling up any column into an array, which can then be used by application code, or combined with array_to_string() to format it as a list:

http://www.postgresql.org/docs/current/static/xaggr.html

I'd link to the 8.4 development docs but they don't seem to list this feature yet.


Use STRING_AGG function for PostgreSQL and Google BigQuery SQL:

SELECT company_id, STRING_AGG(employee, ', ')
FROM employees
GROUP BY company_id;

I'm using Jetbrains Rider and it was a hassle copying the results from above examples to re-execute because it seemed to wrap it all in JSON. This joins them into a single statement that was easier to run

select string_agg('drop table if exists "' || tablename || '" cascade', ';') 
from pg_tables where schemaname != $$pg_catalog$$ and tableName like $$rm_%$$

I found this PostgreSQL documentation helpful: http://www.postgresql.org/docs/8.0/interactive/functions-conditional.html.

In my case, I sought plain SQL to concatenate a field with brackets around it, if the field is not empty.

select itemid, 
  CASE 
    itemdescription WHEN '' THEN itemname 
    ELSE itemname || ' (' || itemdescription || ')' 
  END 
from items;

Following yet again on the use of a custom aggregate function of string concatenation: you need to remember that the select statement will place rows in any order, so you will need to do a sub select in the from statement with an order by clause, and then an outer select with a group by clause to aggregate the strings, thus:

SELECT custom_aggregate(MY.special_strings)
FROM (SELECT special_strings, grouping_column 
        FROM a_table 
        ORDER BY ordering_column) MY
GROUP BY MY.grouping_column

You can also use format function. Which can also implicitly take care of type conversion of text, int, etc by itself.

create or replace function concat_return_row_count(tbl_name text, column_name text, value int)
returns integer as $row_count$
declare
total integer;
begin
    EXECUTE format('select count(*) from %s WHERE %s = %s', tbl_name, column_name, value) INTO total;
    return total;
end;
$row_count$ language plpgsql;


postgres=# select concat_return_row_count('tbl_name','column_name',2); --2 is the value

Examples related to sql

Passing multiple values for same variable in stored procedure SQL permissions for roles Generic XSLT Search and Replace template Access And/Or exclusions Pyspark: Filter dataframe based on multiple conditions Subtracting 1 day from a timestamp date PYODBC--Data source name not found and no default driver specified select rows in sql with latest date for each ID repeated multiple times ALTER TABLE DROP COLUMN failed because one or more objects access this column Create Local SQL Server database

Examples related to postgresql

Subtracting 1 day from a timestamp date pgadmin4 : postgresql application server could not be contacted. Psql could not connect to server: No such file or directory, 5432 error? How to persist data in a dockerized postgres database using volumes input file appears to be a text format dump. Please use psql Postgres: check if array field contains value? Add timestamp column with default NOW() for new rows only Can't connect to Postgresql on port 5432 How to insert current datetime in postgresql insert query Connecting to Postgresql in a docker container from outside

Examples related to group-by

SELECT list is not in GROUP BY clause and contains nonaggregated column .... incompatible with sql_mode=only_full_group_by Count unique values using pandas groupby Pandas group-by and sum Count unique values with pandas per groups Group dataframe and get sum AND count? Error related to only_full_group_by when executing a query in MySql Pandas sum by groupby, but exclude certain columns Using DISTINCT along with GROUP BY in SQL Server Python Pandas : group by in group by and average? How do I create a new column from the output of pandas groupby().sum()?

Examples related to string-aggregation

How to make a query with group_concat in sql server SQL Query to concatenate column values from multiple rows in Oracle Postgresql GROUP_CONCAT equivalent? How do I create a comma-separated list using a SQL query? How can multiple rows be concatenated into one in Oracle without creating a stored procedure? How can I combine multiple rows into a comma-delimited list in Oracle? Simulating group_concat MySQL function in Microsoft SQL Server 2005? How to concatenate strings of a string field in a PostgreSQL 'group by' query?