[sql] Partition Function COUNT() OVER possible using DISTINCT

I'm trying to write the following in order to get a running total of distinct NumUsers, like so:

NumUsers = COUNT(DISTINCT [UserAccountKey]) OVER (PARTITION BY [Mth])

Management studio doesn't seem too happy about this. The error disappears when I remove the DISTINCT keyword, but then it won't be a distinct count.

DISTINCT does not appear to be possible within the partition functions. How do I go about finding the distinct count? Do I use a more traditional method such as a correlated subquery?

Looking into this a bit further, maybe these OVER functions work differently to Oracle in the way that they cannot be used in SQL-Server to calculate running totals.

I've added a live example here on SQLfiddle where I attempt to use a partition function to calculate a running total.

This question is related to sql sql-server tsql sql-server-2008-r2 sql-server-2014

The answer is


Necromancing:

It's relativiely simple to emulate a COUNT DISTINCT over PARTITION BY with MAX via DENSE_RANK:

;WITH baseTable AS
(
    SELECT 'RM1' AS RM, 'ADR1' AS ADR
    UNION ALL SELECT 'RM1' AS RM, 'ADR1' AS ADR
    UNION ALL SELECT 'RM2' AS RM, 'ADR1' AS ADR
    UNION ALL SELECT 'RM2' AS RM, 'ADR2' AS ADR
    UNION ALL SELECT 'RM2' AS RM, 'ADR2' AS ADR
    UNION ALL SELECT 'RM2' AS RM, 'ADR3' AS ADR
    UNION ALL SELECT 'RM3' AS RM, 'ADR1' AS ADR
    UNION ALL SELECT 'RM2' AS RM, 'ADR1' AS ADR
    UNION ALL SELECT 'RM3' AS RM, 'ADR1' AS ADR
    UNION ALL SELECT 'RM3' AS RM, 'ADR2' AS ADR
)
,CTE AS
(
    SELECT RM, ADR, DENSE_RANK() OVER(PARTITION BY RM ORDER BY ADR) AS dr 
    FROM baseTable
)
SELECT
     RM
    ,ADR

    ,COUNT(CTE.ADR) OVER (PARTITION BY CTE.RM ORDER BY ADR) AS cnt1 
    ,COUNT(CTE.ADR) OVER (PARTITION BY CTE.RM) AS cnt2 
    -- Not supported
    --,COUNT(DISTINCT CTE.ADR) OVER (PARTITION BY CTE.RM ORDER BY CTE.ADR) AS cntDist
    ,MAX(CTE.dr) OVER (PARTITION BY CTE.RM ORDER BY CTE.RM) AS cntDistEmu 
FROM CTE

Note:
This assumes the fields in question are NON-nullable fields.
If there is one or more NULL-entries in the fields, you need to subtract 1.


I use a solution that is similar to that of David above, but with an additional twist if some rows should be excluded from the count. This assumes that [UserAccountKey] is never null.

-- subtract an extra 1 if null was ranked within the partition,
-- which only happens if there were rows where [Include] <> 'Y'
dense_rank() over (
  partition by [Mth] 
  order by case when [Include] = 'Y' then [UserAccountKey] else null end asc
) 
+ dense_rank() over (
  partition by [Mth] 
  order by case when [Include] = 'Y' then [UserAccountKey] else null end desc
)
- max(case when [Include] = 'Y' then 0 else 1 end) over (partition by [Mth])
- 1

An SQL Fiddle with an extended example can be found here.


I think the only way of doing this in SQL-Server 2008R2 is to use a correlated subquery, or an outer apply:

SELECT  datekey,
        COALESCE(RunningTotal, 0) AS RunningTotal,
        COALESCE(RunningCount, 0) AS RunningCount,
        COALESCE(RunningDistinctCount, 0) AS RunningDistinctCount
FROM    document
        OUTER APPLY
        (   SELECT  SUM(Amount) AS RunningTotal,
                    COUNT(1) AS RunningCount,
                    COUNT(DISTINCT d2.dateKey) AS RunningDistinctCount
            FROM    Document d2
            WHERE   d2.DateKey <= document.DateKey
        ) rt;

This can be done in SQL-Server 2012 using the syntax you have suggested:

SELECT  datekey,
        SUM(Amount) OVER(ORDER BY DateKey) AS RunningTotal
FROM    document

However, use of DISTINCT is still not allowed, so if DISTINCT is required and/or if upgrading isn't an option then I think OUTER APPLY is your best option


Examples related to sql

Passing multiple values for same variable in stored procedure SQL permissions for roles Generic XSLT Search and Replace template Access And/Or exclusions Pyspark: Filter dataframe based on multiple conditions Subtracting 1 day from a timestamp date PYODBC--Data source name not found and no default driver specified select rows in sql with latest date for each ID repeated multiple times ALTER TABLE DROP COLUMN failed because one or more objects access this column Create Local SQL Server database

Examples related to sql-server

Passing multiple values for same variable in stored procedure SQL permissions for roles Count the Number of Tables in a SQL Server Database Visual Studio 2017 does not have Business Intelligence Integration Services/Projects ALTER TABLE DROP COLUMN failed because one or more objects access this column Create Local SQL Server database How to create temp table using Create statement in SQL Server? SQL Query Where Date = Today Minus 7 Days How do I pass a list as a parameter in a stored procedure? SQL Server date format yyyymmdd

Examples related to tsql

Passing multiple values for same variable in stored procedure Count the Number of Tables in a SQL Server Database Change Date Format(DD/MM/YYYY) in SQL SELECT Statement Stored procedure with default parameters Format number as percent in MS SQL Server EXEC sp_executesql with multiple parameters SQL Server after update trigger How to compare datetime with only date in SQL Server Text was truncated or one or more characters had no match in the target code page including the primary key in an unpivot Printing integer variable and string on same line in SQL

Examples related to sql-server-2008-r2

A connection was successfully established with the server, but then an error occurred during the login process. (Error Number: 233) Rebuild all indexes in a Database How to get last 7 days data from current datetime to last 7 days in sql server Get last 30 day records from today date in SQL Server what is numeric(18, 0) in sql server 2008 r2 Conversion failed when converting date and/or time from character string in SQL SERVER 2008 'MOD' is not a recognized built-in function name How to Convert datetime value to yyyymmddhhmmss in SQL server? SQL Server stored procedure Nullable parameter SQL Server 2008 R2 can't connect to local database in Management Studio

Examples related to sql-server-2014

Could not find server 'server name' in sys.servers. SQL Server 2014 How do I create a local database inside of Microsoft SQL Server 2014? SQL Server after update trigger Possible to restore a backup of SQL Server 2014 on SQL Server 2012? How to restore SQL Server 2014 backup in SQL Server 2008 Partition Function COUNT() OVER possible using DISTINCT