[sql] Reason for Column is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause

I got an error -

Column 'Employee.EmpID' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.


select loc.LocationID, emp.EmpID
from Employee as emp full join Location as loc 
on emp.LocationID = loc.LocationID
group by loc.LocationID 

This situation fits into the answer given by Bill Karwin.

correction for above, fits into answer by ExactaBox -

select loc.LocationID, count(emp.EmpID) -- not count(*), don't want to count nulls
from Employee as emp full join Location as loc 
on emp.LocationID = loc.LocationID
group by loc.LocationID 

ORIGINAL QUESTION -

For the SQL query -

select *
from Employee as emp full join Location as loc 
on emp.LocationID = loc.LocationID
group by (loc.LocationID)

I don't understand why I get this error. All I want to do is join the tables and then group all the employees in a particular location together.

I think I have a partial explanation for my own question. Tell me if its ok -

To group all employees that work in the same location we have to first mention the LocationID.

Then, we cannot/do not mention each employee ID next to it. Rather, we mention the total number of employees in that location, ie we should SUM() the employees working in that location. Why do we do it the latter way, i am not sure. So, this explains the "it is not contained in either an aggregate function" part of the error.

What is the explanation for the GROUP BY clause part of the error ?

This question is related to sql group-by aggregate-functions

The answer is


Suppose I have the following table T:

a   b
--------
1   abc
1   def
1   ghi
2   jkl
2   mno
2   pqr

And I do the following query:

SELECT a, b
FROM T
GROUP BY a

The output should have two rows, one row where a=1 and a second row where a=2.

But what should the value of b show on each of these two rows? There are three possibilities in each case, and nothing in the query makes it clear which value to choose for b in each group. It's ambiguous.

This demonstrates the single-value rule, which prohibits the undefined results you get when you run a GROUP BY query, and you include any columns in the select-list that are neither part of the grouping criteria, nor appear in aggregate functions (SUM, MIN, MAX, etc.).

Fixing it might look like this:

SELECT a, MAX(b) AS x
FROM T
GROUP BY a

Now it's clear that you want the following result:

a   x
--------
1   ghi
2   pqr

Your query will work in MYSQL if you set to disable ONLY_FULL_GROUP_BY server mode (and by default It is). But in this case, you are using different RDBMS. So to make your query work, add all non-aggregated columns to your GROUP BY clause, eg

SELECT col1, col2, SUM(col3) totalSUM
FROM tableName
GROUP BY col1, col2

Non-Aggregated columns means the column is not pass into aggregated functions like SUM, MAX, COUNT, etc..


Basically, what this error is saying is that if you are going to use the GROUP BY clause, then your result is going to be a relation/table with a row for each group, so in your SELECT statement you can only "select" the column that you are grouping by and use aggregate functions on that column because the other columns will not appear in the resulting table.


"All I want to do is join the tables and then group all the employees in a particular location together."

It sounds like what you want is for the output of the SQL statement to list every employee in the company, but first all the people in the Anaheim office, then the people in the Buffalo office, then the people in the Cleveland office (A, B, C, get it, obviously I don't know what locations you have).

In that case, lose the GROUP BY statement. All you need is ORDER BY loc.LocationID


Examples related to sql

Passing multiple values for same variable in stored procedure SQL permissions for roles Generic XSLT Search and Replace template Access And/Or exclusions Pyspark: Filter dataframe based on multiple conditions Subtracting 1 day from a timestamp date PYODBC--Data source name not found and no default driver specified select rows in sql with latest date for each ID repeated multiple times ALTER TABLE DROP COLUMN failed because one or more objects access this column Create Local SQL Server database

Examples related to group-by

SELECT list is not in GROUP BY clause and contains nonaggregated column .... incompatible with sql_mode=only_full_group_by Count unique values using pandas groupby Pandas group-by and sum Count unique values with pandas per groups Group dataframe and get sum AND count? Error related to only_full_group_by when executing a query in MySql Pandas sum by groupby, but exclude certain columns Using DISTINCT along with GROUP BY in SQL Server Python Pandas : group by in group by and average? How do I create a new column from the output of pandas groupby().sum()?

Examples related to aggregate-functions

Spark SQL: apply aggregate functions to a list of columns GROUP BY without aggregate function GROUP BY + CASE statement must appear in the GROUP BY clause or be used in an aggregate function Naming returned columns in Pandas aggregate function? Concatenate multiple result rows of one column into one, group by another column How to include "zero" / "0" results in COUNT aggregate? Apply multiple functions to multiple groupby columns Reason for Column is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause Optimal way to concatenate/aggregate strings