[sql] What is the difference between HAVING and WHERE in SQL?

What is the difference between HAVING and WHERE in an SQL SELECT statement?

EDIT: I have marked Steven's answer as the correct one as it contained the key bit of information on the link:

When GROUP BY is not used, HAVING behaves like a WHERE clause

The situation I had seen the WHERE in did not have GROUP BY and is where my confusion started. Of course, until you know this you can't specify it in the question.

This question is related to sql where-clause having

The answer is


In an Aggregate query, (Any query Where an aggregate function is used) Predicates in a where clause are evaluated before the aggregated intermediate result set is generated,

Predicates in a Having clause are applied to the aggregate result set AFTER it has been generated. That's why predicate conditions on aggregate values must be placed in Having clause, not in the Where clause, and why you can use aliases defined in the Select clause in a Having Clause, but not in a Where Clause.


WHERE clause does not work for aggregate functions
means : you should not use like this bonus : table name

SELECT name  
FROM bonus  
GROUP BY name  
WHERE sum(salary) > 200  

HERE Instead of using WHERE clause you have to use HAVING..

without using GROUP BY clause, HAVING clause just works as WHERE clause

SELECT name  
FROM bonus  
GROUP BY name  
HAVING sum(salary) > 200  

WHERE is applied as a limitation on the set returned by SQL; it uses SQL's built-in set oeprations and indexes and therefore is the fastest way to filter result sets. Always use WHERE whenever possible.

HAVING is necessary for some aggregate filters. It filters the query AFTER sql has retrieved, assembled, and sorted the results. Therefore, it is much slower than WHERE and should be avoided except in those situations that require it.

SQL Server will let you get away with using HAVING even when WHERE would be much faster. Don't do it.


In an Aggregate query, (Any query Where an aggregate function is used) Predicates in a where clause are evaluated before the aggregated intermediate result set is generated,

Predicates in a Having clause are applied to the aggregate result set AFTER it has been generated. That's why predicate conditions on aggregate values must be placed in Having clause, not in the Where clause, and why you can use aliases defined in the Select clause in a Having Clause, but not in a Where Clause.


HAVING: is used to check conditions after the aggregation takes place.
WHERE: is used to check conditions before the aggregation takes place.

This code:

select City, CNT=Count(1)
From Address
Where State = 'MA'
Group By City

Gives you a table of all cities in MA and the number of addresses in each city.

This code:

select City, CNT=Count(1)
From Address
Where State = 'MA'
Group By City
Having Count(1)>5

Gives you a table of cities in MA with more than 5 addresses and the number of addresses in each city.


I use HAVING for constraining a query based on the results of an aggregate function. E.G. select * in blahblahblah group by SOMETHING having count(SOMETHING)>0


From here.

the SQL standard requires that HAVING must reference only columns in the GROUP BY clause or columns used in aggregate functions

as opposed to the WHERE clause which is applied to database rows


When GROUP BY is not used, the WHERE and HAVING clauses are essentially equivalent.

However, when GROUP BY is used:

  • The WHERE clause is used to filter records from a result. The filtering occurs before any groupings are made.
  • The HAVING clause is used to filter values from a group (i.e., to check conditions after aggregation into groups has been performed).

The HAVING clause was added to SQL because the WHERE keyword could not be used with aggregate functions.

Check out this w3schools link for more information

Syntax:

SELECT column_name, aggregate_function(column_name)
FROM table_name
WHERE column_name operator value
GROUP BY column_name
HAVING aggregate_function(column_name) operator value

A query such as this:

SELECT column_name, COUNT( column_name ) AS column_name_tally
  FROM table_name
 WHERE column_name < 3
 GROUP 
    BY column_name
HAVING COUNT( column_name ) >= 3;

...may be rewritten using a derived table (and omitting the HAVING) like this:

SELECT column_name, column_name_tally
  FROM (
        SELECT column_name, COUNT(column_name) AS column_name_tally
          FROM table_name
         WHERE column_name < 3
         GROUP 
            BY column_name
       ) pointless_range_variable_required_here
 WHERE column_name_tally >= 3;

In an Aggregate query, (Any query Where an aggregate function is used) Predicates in a where clause are evaluated before the aggregated intermediate result set is generated,

Predicates in a Having clause are applied to the aggregate result set AFTER it has been generated. That's why predicate conditions on aggregate values must be placed in Having clause, not in the Where clause, and why you can use aliases defined in the Select clause in a Having Clause, but not in a Where Clause.


From here.

the SQL standard requires that HAVING must reference only columns in the GROUP BY clause or columns used in aggregate functions

as opposed to the WHERE clause which is applied to database rows


While working on a project, this was also my question. As stated above, the HAVING checks the condition on the query result already found. But WHERE is for checking condition while query runs.

Let me give an example to illustrate this. Suppose you have a database table like this.

usertable{ int userid, date datefield, int dailyincome }

Suppose, the following rows are in table:

1, 2011-05-20, 100

1, 2011-05-21, 50

1, 2011-05-30, 10

2, 2011-05-30, 10

2, 2011-05-20, 20

Now, we want to get the userids and sum(dailyincome) whose sum(dailyincome)>100

If we write:

SELECT userid, sum(dailyincome) FROM usertable WHERE sum(dailyincome)>100 GROUP BY userid

This will be an error. The correct query would be:

SELECT userid, sum(dailyincome) FROM usertable GROUP BY userid HAVING sum(dailyincome)>100


HAVING is used when you are using an aggregate such as GROUP BY.

SELECT edc_country, COUNT(*)
FROM Ed_Centers
GROUP BY edc_country
HAVING COUNT(*) > 1
ORDER BY edc_country;

I use HAVING for constraining a query based on the results of an aggregate function. E.G. select * in blahblahblah group by SOMETHING having count(SOMETHING)>0


WHERE is applied as a limitation on the set returned by SQL; it uses SQL's built-in set oeprations and indexes and therefore is the fastest way to filter result sets. Always use WHERE whenever possible.

HAVING is necessary for some aggregate filters. It filters the query AFTER sql has retrieved, assembled, and sorted the results. Therefore, it is much slower than WHERE and should be avoided except in those situations that require it.

SQL Server will let you get away with using HAVING even when WHERE would be much faster. Don't do it.


The HAVING clause was added to SQL because the WHERE keyword could not be used with aggregate functions.

Check out this w3schools link for more information

Syntax:

SELECT column_name, aggregate_function(column_name)
FROM table_name
WHERE column_name operator value
GROUP BY column_name
HAVING aggregate_function(column_name) operator value

A query such as this:

SELECT column_name, COUNT( column_name ) AS column_name_tally
  FROM table_name
 WHERE column_name < 3
 GROUP 
    BY column_name
HAVING COUNT( column_name ) >= 3;

...may be rewritten using a derived table (and omitting the HAVING) like this:

SELECT column_name, column_name_tally
  FROM (
        SELECT column_name, COUNT(column_name) AS column_name_tally
          FROM table_name
         WHERE column_name < 3
         GROUP 
            BY column_name
       ) pointless_range_variable_required_here
 WHERE column_name_tally >= 3;

Where clause is used to filter out rows from all over the database. But Having clause is used to filter out rows from a specific group of database. Having clause can also help in aggregate function like min/max/average

for more detail.. what's the difference between WHERE and HAVING what is the difference between WHERE and HAVING


The HAVING clause was added to SQL because the WHERE keyword could not be used with aggregate functions.

Check out this w3schools link for more information

Syntax:

SELECT column_name, aggregate_function(column_name)
FROM table_name
WHERE column_name operator value
GROUP BY column_name
HAVING aggregate_function(column_name) operator value

A query such as this:

SELECT column_name, COUNT( column_name ) AS column_name_tally
  FROM table_name
 WHERE column_name < 3
 GROUP 
    BY column_name
HAVING COUNT( column_name ) >= 3;

...may be rewritten using a derived table (and omitting the HAVING) like this:

SELECT column_name, column_name_tally
  FROM (
        SELECT column_name, COUNT(column_name) AS column_name_tally
          FROM table_name
         WHERE column_name < 3
         GROUP 
            BY column_name
       ) pointless_range_variable_required_here
 WHERE column_name_tally >= 3;

In an Aggregate query, (Any query Where an aggregate function is used) Predicates in a where clause are evaluated before the aggregated intermediate result set is generated,

Predicates in a Having clause are applied to the aggregate result set AFTER it has been generated. That's why predicate conditions on aggregate values must be placed in Having clause, not in the Where clause, and why you can use aliases defined in the Select clause in a Having Clause, but not in a Where Clause.


I use HAVING for constraining a query based on the results of an aggregate function. E.G. select * in blahblahblah group by SOMETHING having count(SOMETHING)>0


HAVING: is used to check conditions after the aggregation takes place.
WHERE: is used to check conditions before the aggregation takes place.

This code:

select City, CNT=Count(1)
From Address
Where State = 'MA'
Group By City

Gives you a table of all cities in MA and the number of addresses in each city.

This code:

select City, CNT=Count(1)
From Address
Where State = 'MA'
Group By City
Having Count(1)>5

Gives you a table of cities in MA with more than 5 addresses and the number of addresses in each city.


I use HAVING for constraining a query based on the results of an aggregate function. E.G. select * in blahblahblah group by SOMETHING having count(SOMETHING)>0


HAVING: is used to check conditions after the aggregation takes place.
WHERE: is used to check conditions before the aggregation takes place.

This code:

select City, CNT=Count(1)
From Address
Where State = 'MA'
Group By City

Gives you a table of all cities in MA and the number of addresses in each city.

This code:

select City, CNT=Count(1)
From Address
Where State = 'MA'
Group By City
Having Count(1)>5

Gives you a table of cities in MA with more than 5 addresses and the number of addresses in each city.


From here.

the SQL standard requires that HAVING must reference only columns in the GROUP BY clause or columns used in aggregate functions

as opposed to the WHERE clause which is applied to database rows


Difference b/w WHERE and HAVING clause:

The main difference between WHERE and HAVING clause is, WHERE is used for row operations and HAVING is used for column operations.

Why we need HAVING clause?

As we know, aggregate functions can only be performed on columns, so we can not use aggregate functions in WHERE clause. Therefore, we use aggregate functions in HAVING clause.


When GROUP BY is not used, the WHERE and HAVING clauses are essentially equivalent.

However, when GROUP BY is used:

  • The WHERE clause is used to filter records from a result. The filtering occurs before any groupings are made.
  • The HAVING clause is used to filter values from a group (i.e., to check conditions after aggregation into groups has been performed).

Resource from Here


HAVING is used when you are using an aggregate such as GROUP BY.

SELECT edc_country, COUNT(*)
FROM Ed_Centers
GROUP BY edc_country
HAVING COUNT(*) > 1
ORDER BY edc_country;

Where clause is used to filter out rows from all over the database. But Having clause is used to filter out rows from a specific group of database. Having clause can also help in aggregate function like min/max/average

for more detail.. what's the difference between WHERE and HAVING what is the difference between WHERE and HAVING


HAVING is used when you are using an aggregate such as GROUP BY.

SELECT edc_country, COUNT(*)
FROM Ed_Centers
GROUP BY edc_country
HAVING COUNT(*) > 1
ORDER BY edc_country;

WHERE clause is used for comparing values in the base table, whereas the HAVING clause can be used for filtering the results of aggregate functions in the result set of the query Click here!


From here.

the SQL standard requires that HAVING must reference only columns in the GROUP BY clause or columns used in aggregate functions

as opposed to the WHERE clause which is applied to database rows


I had a problem and found out another difference between WHERE and HAVING. It does not act the same way on indexed columns.

WHERE my_indexed_row = 123 will show rows and automatically perform a "ORDER ASC" on other indexed rows.

HAVING my_indexed_row = 123 shows everything from the oldest "inserted" row to the newest one, no ordering.


The difference between the two is in the relationship to the GROUP BY clause:

  • WHERE comes before GROUP BY; SQL evaluates the WHERE clause before it groups records.

  • HAVING comes after GROUP BY; SQL evaluates HAVING after it groups records.

select statement diagram

References


When GROUP BY is not used, the WHERE and HAVING clauses are essentially equivalent.

However, when GROUP BY is used:

  • The WHERE clause is used to filter records from a result. The filtering occurs before any groupings are made.
  • The HAVING clause is used to filter values from a group (i.e., to check conditions after aggregation into groups has been performed).

Resource from Here


Number one difference for me: if HAVING was removed from the SQL language then life would go on more or less as before. Certainly, a minority queries would need to be rewritten using a derived table, CTE, etc but they would arguably be easier to understand and maintain as a result. Maybe vendors' optimizer code would need to be rewritten to account for this, again an opportunity for improvement within the industry.

Now consider for a moment removing WHERE from the language. This time the majority of queries in existence would need to be rewritten without an obvious alternative construct. Coders would have to get creative e.g. inner join to a table known to contain exactly one row (e.g. DUAL in Oracle) using the ON clause to simulate the prior WHERE clause. Such constructions would be contrived; it would be obvious there was something was missing from the language and the situation would be worse as a result.

TL;DR we could lose HAVING tomorrow and things would be no worse, possibly better, but the same cannot be said of WHERE.


From the answers here, it seems that many folk don't realize that a HAVING clause may be used without a GROUP BY clause. In this case, the HAVING clause is applied to the entire table expression and requires that only constants appear in the SELECT clause. Typically the HAVING clause will involve aggregates.

This is more useful than it sounds. For example, consider this query to test whether the name column is unique for all values in T:

SELECT 1 AS result
  FROM T
HAVING COUNT( DISTINCT name ) = COUNT( name );

There are only two possible results: if the HAVING clause is true then the result with be a single row containing the value 1, otherwise the result will be the empty set.


WHERE clause is used for comparing values in the base table, whereas the HAVING clause can be used for filtering the results of aggregate functions in the result set of the query Click here!


WHERE is applied as a limitation on the set returned by SQL; it uses SQL's built-in set oeprations and indexes and therefore is the fastest way to filter result sets. Always use WHERE whenever possible.

HAVING is necessary for some aggregate filters. It filters the query AFTER sql has retrieved, assembled, and sorted the results. Therefore, it is much slower than WHERE and should be avoided except in those situations that require it.

SQL Server will let you get away with using HAVING even when WHERE would be much faster. Don't do it.


Number one difference for me: if HAVING was removed from the SQL language then life would go on more or less as before. Certainly, a minority queries would need to be rewritten using a derived table, CTE, etc but they would arguably be easier to understand and maintain as a result. Maybe vendors' optimizer code would need to be rewritten to account for this, again an opportunity for improvement within the industry.

Now consider for a moment removing WHERE from the language. This time the majority of queries in existence would need to be rewritten without an obvious alternative construct. Coders would have to get creative e.g. inner join to a table known to contain exactly one row (e.g. DUAL in Oracle) using the ON clause to simulate the prior WHERE clause. Such constructions would be contrived; it would be obvious there was something was missing from the language and the situation would be worse as a result.

TL;DR we could lose HAVING tomorrow and things would be no worse, possibly better, but the same cannot be said of WHERE.


From the answers here, it seems that many folk don't realize that a HAVING clause may be used without a GROUP BY clause. In this case, the HAVING clause is applied to the entire table expression and requires that only constants appear in the SELECT clause. Typically the HAVING clause will involve aggregates.

This is more useful than it sounds. For example, consider this query to test whether the name column is unique for all values in T:

SELECT 1 AS result
  FROM T
HAVING COUNT( DISTINCT name ) = COUNT( name );

There are only two possible results: if the HAVING clause is true then the result with be a single row containing the value 1, otherwise the result will be the empty set.


One way to think of it is that the having clause is an additional filter to the where clause.

A WHERE clause is used filters records from a result. The filter occurs before any groupings are made. A HAVING clause is used to filter values from a group


When GROUP BY is not used, the WHERE and HAVING clauses are essentially equivalent.

However, when GROUP BY is used:

  • The WHERE clause is used to filter records from a result. The filtering occurs before any groupings are made.
  • The HAVING clause is used to filter values from a group (i.e., to check conditions after aggregation into groups has been performed).

HAVING is used when you are using an aggregate such as GROUP BY.

SELECT edc_country, COUNT(*)
FROM Ed_Centers
GROUP BY edc_country
HAVING COUNT(*) > 1
ORDER BY edc_country;

Difference b/w WHERE and HAVING clause:

The main difference between WHERE and HAVING clause is, WHERE is used for row operations and HAVING is used for column operations.

Why we need HAVING clause?

As we know, aggregate functions can only be performed on columns, so we can not use aggregate functions in WHERE clause. Therefore, we use aggregate functions in HAVING clause.


WHERE is applied as a limitation on the set returned by SQL; it uses SQL's built-in set oeprations and indexes and therefore is the fastest way to filter result sets. Always use WHERE whenever possible.

HAVING is necessary for some aggregate filters. It filters the query AFTER sql has retrieved, assembled, and sorted the results. Therefore, it is much slower than WHERE and should be avoided except in those situations that require it.

SQL Server will let you get away with using HAVING even when WHERE would be much faster. Don't do it.


One way to think of it is that the having clause is an additional filter to the where clause.

A WHERE clause is used filters records from a result. The filter occurs before any groupings are made. A HAVING clause is used to filter values from a group


HAVING: is used to check conditions after the aggregation takes place.
WHERE: is used to check conditions before the aggregation takes place.

This code:

select City, CNT=Count(1)
From Address
Where State = 'MA'
Group By City

Gives you a table of all cities in MA and the number of addresses in each city.

This code:

select City, CNT=Count(1)
From Address
Where State = 'MA'
Group By City
Having Count(1)>5

Gives you a table of cities in MA with more than 5 addresses and the number of addresses in each city.


WHERE clause does not work for aggregate functions
means : you should not use like this bonus : table name

SELECT name  
FROM bonus  
GROUP BY name  
WHERE sum(salary) > 200  

HERE Instead of using WHERE clause you have to use HAVING..

without using GROUP BY clause, HAVING clause just works as WHERE clause

SELECT name  
FROM bonus  
GROUP BY name  
HAVING sum(salary) > 200  

The difference between the two is in the relationship to the GROUP BY clause:

  • WHERE comes before GROUP BY; SQL evaluates the WHERE clause before it groups records.

  • HAVING comes after GROUP BY; SQL evaluates HAVING after it groups records.

select statement diagram

References


I had a problem and found out another difference between WHERE and HAVING. It does not act the same way on indexed columns.

WHERE my_indexed_row = 123 will show rows and automatically perform a "ORDER ASC" on other indexed rows.

HAVING my_indexed_row = 123 shows everything from the oldest "inserted" row to the newest one, no ordering.


While working on a project, this was also my question. As stated above, the HAVING checks the condition on the query result already found. But WHERE is for checking condition while query runs.

Let me give an example to illustrate this. Suppose you have a database table like this.

usertable{ int userid, date datefield, int dailyincome }

Suppose, the following rows are in table:

1, 2011-05-20, 100

1, 2011-05-21, 50

1, 2011-05-30, 10

2, 2011-05-30, 10

2, 2011-05-20, 20

Now, we want to get the userids and sum(dailyincome) whose sum(dailyincome)>100

If we write:

SELECT userid, sum(dailyincome) FROM usertable WHERE sum(dailyincome)>100 GROUP BY userid

This will be an error. The correct query would be:

SELECT userid, sum(dailyincome) FROM usertable GROUP BY userid HAVING sum(dailyincome)>100