[sql-server] Understanding PIVOT function in T-SQL

I am very new to SQL.

I have a table like this:

ID | TeamID | UserID | ElementID | PhaseID | Effort
-----------------------------------------------------
1  |   1    |  1      |   3       |  5     |   6.74
2  |   1    |  1      |   3       |  6     |   8.25
3  |   1    |  1      |   4       |  1     |   2.23
4  |   1    |  1      |   4       |  5     |   6.8
5  |   1    |  1      |   4       |  6     |   1.5

And I was told to get data like this

ElementID | PhaseID1 | PhaseID5 | PhaseID6
--------------------------------------------
    3     |   NULL   |   6.74   |   8.25
    4     |   2.23   |   6.8    |   1.5

I understand I need to use PIVOT function. But can't understand it clearly. It would be great help if somebody can explain it in above case.(or any alternatives if any)

This question is related to sql-server sql-server-2008 tsql pivot

The answer is


These are the very basic pivot example kindly go through that.

SQL SERVER – PIVOT and UNPIVOT Table Examples

Example from above link for the product table:

SELECT PRODUCT, FRED, KATE
FROM (
SELECT CUST, PRODUCT, QTY
FROM Product) up
 PIVOT (SUM(QTY) FOR CUST IN (FRED, KATE)) AS pvt
ORDER BY PRODUCT

renders:

 PRODUCT FRED  KATE
 --------------------
 BEER     24    12
 MILK      3     1
 SODA   NULL     6
 VEG    NULL     5

Similar examples can be found in the blog post Pivot tables in SQL Server. A simple sample


Ive something to add here which no one mentioned.

The pivot function works great when the source has 3 columns: One for the aggregate, one to spread as columns with for, and one as a pivot for row distribution. In the product example it's QTY, CUST, PRODUCT.

However, if you have more columns in the source it will break the results into multiple rows instead of one row per pivot based on unique values per additional column (as Group By would do in a simple query).

See this example, ive added a timestamp column to the source table:

enter image description here

Now see its impact:

SELECT CUST, MILK

FROM Product
-- FROM (SELECT CUST, Product, QTY FROM PRODUCT) p
PIVOT (
    SUM(QTY) FOR PRODUCT IN (MILK)
) AS pvt

ORDER BY CUST

enter image description here


In order to fix this, you can either pull a subquery as a source as everyone has done above - with only 3 columns (this is not always going to work for your scenario, imagine if you need to put a where condition for the timestamp).

Second solution is to use a group by and do a sum of the pivoted column values again.

SELECT 
CUST, 
sum(MILK) t_MILK

FROM Product
PIVOT (
    SUM(QTY) FOR PRODUCT IN (MILK)
) AS pvt

GROUP BY CUST
ORDER BY CUST

GO

enter image description here


To set Compatibility error

use this before using pivot function

ALTER DATABASE [dbname] SET COMPATIBILITY_LEVEL = 100  

A pivot is used to convert one of the columns in your data set from rows into columns (this is typically referred to as the spreading column). In the example you have given, this means converting the PhaseID rows into a set of columns, where there is one column for each distinct value that PhaseID can contain - 1, 5 and 6 in this case.

These pivoted values are grouped via the ElementID column in the example that you have given.

Typically you also then need to provide some form of aggregation that gives you the values referenced by the intersection of the spreading value (PhaseID) and the grouping value (ElementID). Although in the example given the aggregation that will be used is unclear, but involves the Effort column.

Once this pivoting is done, the grouping and spreading columns are used to find an aggregation value. Or in your case, ElementID and PhaseIDX lookup Effort.

Using the grouping, spreading, aggregation terminology you will typically see example syntax for a pivot as:

WITH PivotData AS
(
    SELECT <grouping column>
        , <spreading column>
        , <aggregation column>
    FROM <source table>
)
SELECT <grouping column>, <distinct spreading values>
FROM PivotData
    PIVOT (<aggregation function>(<aggregation column>)
        FOR <spreading column> IN <distinct spreading values>));

This gives a graphical explanation of how the grouping, spreading and aggregation columns convert from the source to pivoted tables if that helps further.


    SELECT <non-pivoted column>,
    [first pivoted column] AS <column name>,
    [second pivoted column] AS <column name>,
    ...
    [last pivoted column] AS <column name>
FROM
    (<SELECT query that produces the data>)
    AS <alias for the source query>
PIVOT
(
    <aggregation function>(<column being aggregated>)
FOR
[<column that contains the values that will become column headers>]
    IN ( [first pivoted column], [second pivoted column],
    ... [last pivoted column])
) AS <alias for the pivot table>
<optional ORDER BY clause>;

USE AdventureWorks2008R2 ;
GO
SELECT DaysToManufacture, AVG(StandardCost) AS AverageCost 
FROM Production.Product
GROUP BY DaysToManufacture;

    DaysToManufacture          AverageCost
0                          5.0885
1                          223.88
2                          359.1082
4                          949.4105

    -- Pivot table with one row and five columns
SELECT 'AverageCost' AS Cost_Sorted_By_Production_Days, 
[0], [1], [2], [3], [4]
FROM
(SELECT DaysToManufacture, StandardCost 
    FROM Production.Product) AS SourceTable
PIVOT
(
AVG(StandardCost)
FOR DaysToManufacture IN ([0], [1], [2], [3], [4])
) AS PivotTable;




Here is the result set.
Cost_Sorted_By_Production_Days    0         1         2           3       4       
AverageCost                       5.0885    223.88    359.1082    NULL    949.4105

FOR XML PATH might not work on Microsoft Azure Synapse Serve. A possible alternative, following @Taryn dynamic generated cols approach, same results is obtained by using STRING_AGG.

DECLARE @cols AS NVARCHAR(MAX), @query AS NVARCHAR(MAX)

SELECT @cols = STRING_AGG(QUOTENAME(c.phaseid),', ')
/*OPTIONAL: within group (order by cast(t1.[FLOW_SP_SLPM] as INT) asc)*/
FROM (SELECT phaseid FROM temp
GROUP BY phaseid) c

set @query = 'SELECT elementid,' + @cols + ' from 
             (
                select elementid,
                phaseid,
                effort
                from temp
            ) x
            PIVOT 
            (
                max(effort)
                for phaseid in (' + @cols + ')
            ) p '

execute(@query)

Examples related to sql-server

Passing multiple values for same variable in stored procedure SQL permissions for roles Count the Number of Tables in a SQL Server Database Visual Studio 2017 does not have Business Intelligence Integration Services/Projects ALTER TABLE DROP COLUMN failed because one or more objects access this column Create Local SQL Server database How to create temp table using Create statement in SQL Server? SQL Query Where Date = Today Minus 7 Days How do I pass a list as a parameter in a stored procedure? SQL Server date format yyyymmdd

Examples related to sql-server-2008

Violation of PRIMARY KEY constraint. Cannot insert duplicate key in object How to Use Multiple Columns in Partition By And Ensure No Duplicate Row is Returned SQL Server : How to test if a string has only digit characters Conversion of a varchar data type to a datetime data type resulted in an out-of-range value in SQL query Get last 30 day records from today date in SQL Server How to subtract 30 days from the current date using SQL Server Calculate time difference in minutes in SQL Server SQL Connection Error: System.Data.SqlClient.SqlException (0x80131904) SQL Server Service not available in service list after installation of SQL Server Management Studio How to delete large data of table in SQL without log?

Examples related to tsql

Passing multiple values for same variable in stored procedure Count the Number of Tables in a SQL Server Database Change Date Format(DD/MM/YYYY) in SQL SELECT Statement Stored procedure with default parameters Format number as percent in MS SQL Server EXEC sp_executesql with multiple parameters SQL Server after update trigger How to compare datetime with only date in SQL Server Text was truncated or one or more characters had no match in the target code page including the primary key in an unpivot Printing integer variable and string on same line in SQL

Examples related to pivot

Laravel, sync() - how to sync an array and also pass additional pivot fields? Construct pandas DataFrame from list of tuples of (row,col,values) How to convert Rows to Columns in Oracle? TSQL PIVOT MULTIPLE COLUMNS Convert Rows to columns using 'Pivot' in SQL Server Efficiently convert rows to columns in sql server MySQL - sum column value(s) based on row from the same table SQL Server Pivot Table with multiple column aggregates Simple way to transpose columns and rows in SQL? MySQL pivot table query with dynamic columns