Using group by and having clause

Question

Using the following schema:

    Supplier (sid, name, status, city)
    Part (pid, name, color, weight, city)
    Project (jid, name, city)
    Supplies (sid, pid, jid, quantity)

1. Get supplier numbers and names for suppliers of parts supplied to at least two different projects.
2. Get supplier numbers and names for suppliers of the same part to at least two different projects.

These were my answers:

1.

    SELECT s.sid, s.name
    FROM Supplier s, Supplies su, Project pr
    WHERE s.sid = su.sid AND su.jid = pr.jid
    GROUP BY s.sid, s.name
    HAVING COUNT(DISTINCT pr.jid) >= 2

2.

    SELECT s.sid, s.name
    FROM Supplier s, Supplies su, Project pr, Part p
    WHERE s.sid = su.sid AND su.pid = p.pid AND su.jid = pr.jid
    GROUP BY s.sid, s.name
    HAVING COUNT(DISTINCT pr.jid) >= 2

Can anyone confirm if I wrote this correctly? I'm a little confused as to how the GROUP BY and HAVING clauses work.

User · Answer

What type of SQL database are you using (MSSQL, Oracle, etc.)?

I believe what you have written is correct.

You could also write the first query like this:

    SELECT s.sid, s.name
    FROM Supplier s
    WHERE (SELECT COUNT(DISTINCT pr.jid)
           FROM Supplies su, Project pr
           WHERE su.sid = s.sid
             AND pr.jid = su.jid) >= 2

It's a little more readable, and less mind-bending than trying to do it with GROUP BY. Performance may differ, though.
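To check that the correlated-subquery form and the GROUP BY / HAVING form agree, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows and supplier names are made up for illustration, only the columns needed by the queries are created, and other database engines may plan the two queries differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Supplier (sid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Project  (jid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Supplies (sid INTEGER, pid INTEGER, jid INTEGER, quantity INTEGER);

    -- Hypothetical sample data.
    INSERT INTO Supplier VALUES (1, 'Acme'), (2, 'Bolt'), (3, 'Core');
    INSERT INTO Project  VALUES (10, 'Bridge'), (20, 'Tower');
    -- Acme supplies two projects, Bolt supplies one project twice, Core supplies nothing.
    INSERT INTO Supplies VALUES (1, 100, 10, 5), (1, 101, 20, 7),
                                (2, 100, 10, 3), (2, 101, 10, 4);
""")

# Form 1: join everything, group per supplier, filter the groups with HAVING.
group_by_rows = conn.execute("""
    SELECT s.sid, s.name
    FROM Supplier s, Supplies su, Project pr
    WHERE s.sid = su.sid AND su.jid = pr.jid
    GROUP BY s.sid, s.name
    HAVING COUNT(DISTINCT pr.jid) >= 2
    ORDER BY s.sid
""").fetchall()

# Form 2: for each supplier, count its distinct projects in a correlated subquery.
subquery_rows = conn.execute("""
    SELECT s.sid, s.name
    FROM Supplier s
    WHERE (SELECT COUNT(DISTINCT pr.jid)
           FROM Supplies su, Project pr
           WHERE su.sid = s.sid AND pr.jid = su.jid) >= 2
    ORDER BY s.sid
""").fetchall()

print(group_by_rows)
print(subquery_rows)
```

On this sample data both forms return only Acme, since Bolt's two rows go to the same project and COUNT(DISTINCT ...) counts that project once.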

User · Answer

The semantics of HAVING

To better understand HAVING, you need to see it from a theoretical point of view.

A GROUP BY is a query that takes a table and summarizes it into another table. You summarize the original table by grouping it into subsets, based upon the attributes that you specify in the GROUP BY. Each of these groups will yield one tuple.

The HAVING is simply equivalent to a WHERE clause that is applied after the GROUP BY has executed and before the SELECT part of the query is computed.

Let's say your query is:

    select a, b, count(*)
    from Table
    where c > 100
    group by a, b
    having count(*) > 10;

The evaluation of this query can be seen as the following steps:

1. Perform the WHERE, eliminating rows that do not satisfy it.
2. Group the table into subsets based upon the values of a and b (each tuple in each subset has the same values of a and b).
3. Eliminate subsets that do not satisfy the HAVING condition.
4. Process each subset, outputting the values as indicated in the SELECT part of the query. This creates one output tuple per subset left after step 3.

You can extend this to any complex query, where Table can be any complex query that returns a table (a cross product, a join, a UNION, etc.).

In fact, HAVING is syntactic sugar and does not extend the power of SQL. Any given query:

    SELECT list
    FROM table
    GROUP BY attrList
    HAVING condition;

can be rewritten as:

    SELECT list from (
        SELECT listatt
        FROM table
        GROUP BY attrList) as Name
    WHERE condition;

The listatt is a list that includes the GROUP BY attributes and the expressions used in list and condition. It might be necessary to name some expressions in this list (with AS). For instance, the example query above can be rewritten as:

    select a, b, count
    from (select a, b, count(*) as count
          from Table
          where c > 100
          group by a, b) as someName
    where count > 10;

The solution you need

Your solution seems to be correct:

    SELECT s.sid, s.name
    FROM Supplier s, Supplies su, Project pr
    WHERE s.sid = su.sid AND su.jid = pr.jid
    GROUP BY s.sid, s.name
    HAVING COUNT(DISTINCT pr.jid) >= 2

You join the three tables, then use sid as a grouping attribute (name is functionally dependent on it, so it does not have an impact on the number of groups, but you must include it, otherwise it cannot be part of the select part of the statement). Then you remove the groups that do not satisfy your condition: the count of distinct pr.jid is >= 2, which is what you wanted originally.

Best solution to your problem

I personally prefer a simpler, cleaner solution:

You only need to group Supplies (sid, pid, jid, quantity) to find the sid of those that supply at least two projects. Then join it to the Supplier table to get the supplier name.

    SELECT sid, name from
        (SELECT sid from Supplies
         GROUP BY sid, pid
         HAVING count(DISTINCT jid) >= 2
        ) AS T1 NATURAL JOIN Supplier;

It will also be faster to execute, because the join is only done when needed, not all the time.

--dmg
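The WHERE, GROUP BY, HAVING order and the derived-table rewrite described in this answer can be tried out with a small sketch. It assumes SQLite through Python's sqlite3 module, a made-up table T(a, b, c) in place of Table, an alias cnt in place of count, and a group threshold of 2 instead of 10 so that a handful of rows is enough.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T (a TEXT, b TEXT, c INTEGER);
    -- Hypothetical rows: group (x, p) has three rows, but only two pass c > 100.
    INSERT INTO T VALUES
        ('x', 'p', 150), ('x', 'p', 200), ('x', 'p', 50),
        ('x', 'q', 300),
        ('y', 'p', 120), ('y', 'p', 130),
        ('y', 'q', 10),  ('y', 'q', 20);
""")

# WHERE drops rows first, GROUP BY builds the subsets, HAVING drops whole subsets.
with_having = conn.execute("""
    SELECT a, b, COUNT(*)
    FROM T
    WHERE c > 100
    GROUP BY a, b
    HAVING COUNT(*) >= 2
    ORDER BY a, b
""").fetchall()

# Same query with HAVING replaced by a WHERE over the grouped derived table.
with_derived_table = conn.execute("""
    SELECT a, b, cnt
    FROM (SELECT a, b, COUNT(*) AS cnt
          FROM T
          WHERE c > 100
          GROUP BY a, b) AS grouped
    WHERE cnt >= 2
    ORDER BY a, b
""").fetchall()

print(with_having)
print(with_derived_table)
```

Both queries return the groups (x, p) and (y, p) with a count of 2 each. The count for (x, p) is 2 rather than 3 because the row with c = 50 is removed by WHERE before any grouping happens, and (y, q) never forms a group at all.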

User · Answer

First of all, you should use the JOIN syntax rather than FROM table1, table2, and you should always limit the grouping to as few fields as you need.

Although I haven't tested it, your first query seems fine to me, but could be rewritten as:

    SELECT s.sid, s.name
    FROM
        Supplier s
        INNER JOIN (
            SELECT su.sid
            FROM Supplies su
            GROUP BY su.sid
            HAVING COUNT(DISTINCT su.jid) > 1
        ) g
            ON g.sid = s.sid

Or simplified as:

    SELECT sid, name
    FROM Supplier s
    WHERE (
        SELECT COUNT(DISTINCT su.jid)
        FROM Supplies su
        WHERE su.sid = s.sid
    ) > 1

However, your second query seems wrong to me, because you should also GROUP BY pid. The DISTINCT keeps a supplier from appearing once per qualifying part:

    SELECT DISTINCT s.sid, s.name
    FROM
        Supplier s
        INNER JOIN (
            SELECT su.sid
            FROM Supplies su
            GROUP BY su.sid, su.pid
            HAVING COUNT(DISTINCT su.jid) > 1
        ) g
            ON g.sid = s.sid

As you may have noticed in the query above, I used the INNER JOIN syntax to perform the filtering; however, it can also be written with EXISTS. A plain scalar comparison does not work here, because the grouped subquery returns one row per qualifying part rather than a single value:

    SELECT s.sid, s.name
    FROM Supplier s
    WHERE EXISTS (
        SELECT 1
        FROM Supplies su
        WHERE su.sid = s.sid
        GROUP BY su.pid
        HAVING COUNT(DISTINCT su.jid) > 1
    )
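To see why the second question needs pid in the grouping, here is a small sketch that runs both groupings on the same rows. It assumes SQLite through Python's sqlite3 module and invented sample data: Acme supplies two different parts to two projects, Bolt supplies the same parts to both projects, and Core supplies a single project.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Supplier (sid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Supplies (sid INTEGER, pid INTEGER, jid INTEGER, quantity INTEGER);

    -- Hypothetical sample data.
    INSERT INTO Supplier VALUES (1, 'Acme'), (2, 'Bolt'), (3, 'Core');
    INSERT INTO Supplies VALUES
        (1, 100, 10, 5), (1, 101, 20, 5),
        (2, 100, 10, 5), (2, 100, 20, 5), (2, 101, 10, 5), (2, 101, 20, 5),
        (3, 100, 10, 5);
""")

# Question 1: any parts, at least two projects. Group per supplier only.
any_part = conn.execute("""
    SELECT s.sid, s.name
    FROM Supplier s
    INNER JOIN Supplies su ON su.sid = s.sid
    GROUP BY s.sid, s.name
    HAVING COUNT(DISTINCT su.jid) >= 2
    ORDER BY s.sid
""").fetchall()

# Question 2: the same part, at least two projects. Group per (supplier, part).
same_part = conn.execute("""
    SELECT DISTINCT s.sid, s.name
    FROM Supplier s
    INNER JOIN (
        SELECT su.sid
        FROM Supplies su
        GROUP BY su.sid, su.pid
        HAVING COUNT(DISTINCT su.jid) > 1
    ) g ON g.sid = s.sid
    ORDER BY s.sid
""").fetchall()

print(any_part)
print(same_part)
```

With this data, Acme and Bolt satisfy the first question, but only Bolt satisfies the second, because none of Acme's parts reaches more than one project. Bolt matches through two parts, so the inner query yields its sid twice and DISTINCT collapses that to one row.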

User · Answer

1. Get supplier numbers and names for suppliers of parts supplied to at least two different projects.

    SELECT DISTINCT S.SID, S.NAME
    FROM SUPPLIES SP
    JOIN SUPPLIER S
      ON SP.SID = S.SID
    WHERE PID IN
      (SELECT PID FROM SUPPLIES GROUP BY PID HAVING COUNT(DISTINCT JID) >= 2)

I am not clear about your second question.

User · Answer

HAVING: applies filter conditions to each group of rows.

WHERE: applies a filter to individual rows.

User · Answer

We cannot use the WHERE clause with aggregate functions like COUNT(), MIN(), SUM(), etc., so the HAVING clause came into existence to overcome this problem in SQL. For an example of the HAVING clause, see this link: http://www.sqlfundamental.com/having-clause.php
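A quick way to see this restriction is to try both clauses on the same aggregate. The sketch below assumes SQLite through Python's sqlite3 module and a few made-up Supplies rows. The exact error text differs between database engines, so it is captured rather than hard-coded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Supplies (sid INTEGER, pid INTEGER, jid INTEGER, quantity INTEGER);
    -- Hypothetical sample data: supplier 1 has two rows, supplier 2 has one.
    INSERT INTO Supplies VALUES (1, 100, 10, 5), (1, 100, 20, 7), (2, 100, 10, 3);
""")

# An aggregate in WHERE is rejected: WHERE runs per row, before any groups exist.
try:
    conn.execute("SELECT sid FROM Supplies WHERE COUNT(*) >= 2 GROUP BY sid")
    where_error = None
except sqlite3.Error as exc:
    where_error = str(exc)

# The same condition in HAVING is evaluated once per group, after GROUP BY.
having_rows = conn.execute("""
    SELECT sid, COUNT(*)
    FROM Supplies
    GROUP BY sid
    HAVING COUNT(*) >= 2
""").fetchall()

print(where_error)
print(having_rows)
```

The first statement fails while the second returns only supplier 1 with its row count of 2.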
