[sql] Oracle "Partition By" Keyword

Can someone please explain what the partition by keyword does and give a simple example of it in action, as well as why one would want to use it? I have a SQL query written by someone else and I'm trying to figure out what it does.

An example of partition by:

SELECT empno, deptno, COUNT(*) 
OVER (PARTITION BY deptno) DEPT_COUNT
FROM emp

The examples I've seen online seem a bit too in-depth.

This question is related to sql oracle window-functions

The answer is


I think, this example suggests a small nuance on how the partitioning works and how group by works. My example is from Oracle 12, if my example happens to be a compiling bug.

I tried :

SELECT t.data_key
,      SUM ( CASE when t.state = 'A' THEN 1 ELSE 0 END) 
OVER   (PARTITION BY t.data_key) count_a_rows
,      SUM ( CASE when t.state = 'B' THEN 1 ELSE 0 END) 
OVER   (PARTITION BY t.data_key) count_b_rows
,      SUM ( CASE when t.state = 'C' THEN 1 ELSE 0 END) 
OVER   (PARTITION BY t.data_key) count_c_rows
,      COUNT (1) total_rows
from mytable t
group by t.data_key  ---- This does not compile as the compiler feels that t.state isn't in the group by and doesn't recognize the aggregation I'm looking for

This however works as expected :

SELECT distinct t.data_key
,      SUM ( CASE when t.state = 'A' THEN 1 ELSE 0 END) 
OVER   (PARTITION BY t.data_key) count_a_rows
,      SUM ( CASE when t.state = 'B' THEN 1 ELSE 0 END) 
OVER   (PARTITION BY t.data_key) count_b_rows
,      SUM ( CASE when t.state = 'C' THEN 1 ELSE 0 END) 
OVER   (PARTITION BY t.data_key) count_c_rows
,      COUNT (1) total_rows
from mytable t;

Producing the number of elements in each state based on the external key "data_key". So, if, data_key = 'APPLE' had 3 rows with state 'A', 2 rows with state 'B', a row with state 'C', the corresponding row for 'APPLE' would be 'APPLE', 3, 2, 1, 6.


the over partition keyword is as if we are partitioning the data by client_id creation a subset of each client id

select client_id, operation_date,
       row_number() count(*) over (partition by client_id order by client_id ) as operationctrbyclient
from client_operations e
order by e.client_id;

this query will return the number of operations done by the client_id


The concept is very well explained by the accepted answer, but I find that the more example one sees, the better it sinks in. Here's an incremental example:

1) Boss says "get me number of items we have in stock grouped by brand"

You say: "no problem"

SELECT 
      BRAND
      ,COUNT(ITEM_ID) 
FROM 
      ITEMS
GROUP BY 
      BRAND;

Result:

+--------------+---------------+
|  Brand       |   Count       | 
+--------------+---------------+
| H&M          |     50        |
+--------------+---------------+
| Hugo Boss    |     100       |
+--------------+---------------+
| No brand     |     22        |
+--------------+---------------+

2) The boss says "Now get me a list of all items, with their brand AND number of items that the respective brand has"

You may try:

 SELECT 
      ITEM_NR
      ,BRAND
      ,COUNT(ITEM_ID) 
 FROM 
      ITEMS
 GROUP BY 
      BRAND;

But you get:

ORA-00979: not a GROUP BY expression 

This is where the OVER (PARTITION BY BRAND) comes in:

 SELECT 
      ITEM_NR
      ,BRAND
      ,COUNT(ITEM_ID) OVER (PARTITION BY BRAND) 
 FROM 
      ITEMS;

Whic means:

  • COUNT(ITEM_ID) - get the number of items
  • OVER - Over the set of rows
  • (PARTITION BY BRAND) - that have the same brand

And the result is:

+--------------+---------------+----------+
|  Items       |  Brand        | Count()  |
+--------------+---------------+----------+
|  Item 1      |  Hugo Boss    |   100    | 
+--------------+---------------+----------+
|  Item 2      |  Hugo Boss    |   100    | 
+--------------+---------------+----------+
|  Item 3      |  No brand     |   22     | 
+--------------+---------------+----------+
|  Item 4      |  No brand     |   22     | 
+--------------+---------------+----------+
|  Item 5      |  H&M          |   50     | 
+--------------+---------------+----------+

etc...


EMPNO     DEPTNO DEPT_COUNT

 7839         10          4
 5555         10          4
 7934         10          4
 7782         10          4 --- 4 records in table for dept 10
 7902         20          4
 7566         20          4
 7876         20          4
 7369         20          4 --- 4 records in table for dept 20
 7900         30          6
 7844         30          6
 7654         30          6
 7521         30          6
 7499         30          6
 7698         30          6 --- 6 records in table for dept 30

Here we are getting count for respective deptno. As for deptno 10 we have 4 records in table emp similar results for deptno 20 and 30 also.


It is the SQL extension called analytics. The "over" in the select statement tells oracle that the function is a analytical function, not a group by function. The advantage to using analytics is that you can collect sums, counts, and a lot more with just one pass through of the data instead of looping through the data with sub selects or worse, PL/SQL.

It does look confusing at first but this will be second nature quickly. No one explains it better then Tom Kyte. So the link above is great.

Of course, reading the documentation is a must.


Examples related to sql

Passing multiple values for same variable in stored procedure SQL permissions for roles Generic XSLT Search and Replace template Access And/Or exclusions Pyspark: Filter dataframe based on multiple conditions Subtracting 1 day from a timestamp date PYODBC--Data source name not found and no default driver specified select rows in sql with latest date for each ID repeated multiple times ALTER TABLE DROP COLUMN failed because one or more objects access this column Create Local SQL Server database

Examples related to oracle

concat yesterdays date with a specific time ORA-28001: The password has expired how to modify the size of a column How to create a blank/empty column with SELECT query in oracle? Find the number of employees in each department - SQL Oracle Query to display all tablespaces in a database and datafiles When or Why to use a "SET DEFINE OFF" in Oracle Database How to insert date values into table error: ORA-65096: invalid common user or role name in oracle In Oracle SQL: How do you insert the current date + time into a table?

Examples related to window-functions

What is ROWS UNBOUNDED PRECEDING used for in Teradata? Pandas get topmost n records within each group Select Row number in postgres What's the difference between RANK() and DENSE_RANK() functions in oracle? PostgreSQL unnest() with element number SQL Server: Difference between PARTITION BY and GROUP BY OVER clause in Oracle Oracle "Partition By" Keyword