We can take a simple example.
Consider a table named TableA
with the following values:
id firstname lastname Mark
-------------------------------------------------------------------
1 arun prasanth 40
2 ann antony 45
3 sruthy abc 41
6 new abc 47
1 arun prasanth 45
1 arun prasanth 49
2 ann antony 49
GROUP BY
The SQL GROUP BY clause can be used in a SELECT statement to collect data across multiple records and group the results by one or more columns.
In more simple words GROUP BY statement is used in conjunction with the aggregate functions to group the result-set by one or more columns.
Syntax:
SELECT expression1, expression2, ... expression_n,
aggregate_function (aggregate_expression)
FROM tables
WHERE conditions
GROUP BY expression1, expression2, ... expression_n;
We can apply GROUP BY
in our table:
select SUM(Mark)marksum,firstname from TableA
group by id,firstName
Results:
marksum firstname
----------------
94 ann
134 arun
47 new
41 sruthy
In our real table we have 7 rows and when we apply GROUP BY id
, the server group the results based on id
:
In simple words:
here
GROUP BY
normally reduces the number of rows returned by rolling them up and calculatingSum()
for each row.
PARTITION BY
Before going to PARTITION BY, let us look at the OVER
clause:
According to the MSDN definition:
OVER clause defines a window or user-specified set of rows within a query result set. A window function then computes a value for each row in the window. You can use the OVER clause with functions to compute aggregated values such as moving averages, cumulative aggregates, running totals, or a top N per group results.
PARTITION BY will not reduce the number of rows returned.
We can apply PARTITION BY in our example table:
SELECT SUM(Mark) OVER (PARTITION BY id) AS marksum, firstname FROM TableA
Result:
marksum firstname
-------------------
134 arun
134 arun
134 arun
94 ann
94 ann
41 sruthy
47 new
Look at the results - it will partition the rows and returns all rows, unlike GROUP BY.