The table name is "OrderDetails" and columns are given below:
OrderDetailID || ProductID || ProductName || OrderQuantity
I'm trying to select multiple columns and Group By ProductID while having SUM of OrderQuantity.
Select ProductID,ProductName,OrderQuantity Sum(OrderQuantity)
from OrderDetails Group By ProductID
But of course this code gives an error. I have to add other column names to group by, but that's not what I want and since my data has many items so results are unexpected that way.
Sample Data Query:
ProductID,ProductName,OrderQuantity from OrderDetails
Results are below:
ProductID ProductName OrderQuantity
1001 abc 5
1002 abc 23 (ProductNames can be same)
2002 xyz 8
3004 ytp 15
4001 aze 19
1001 abc 7 (2nd row of same ProductID)
Expected result:
ProductID ProductName OrderQuantity
1001 abc 12 (group by productID while summing)
1002 abc 23
2002 xyz 8
3004 ytp 15
4001 aze 19
How do I select multiple columns and Group By ProductID column since ProductName is not unique?
While doing that, also get the sum of the OrderQuantity column.
In my opinion this is a serious language flaw that puts SQL light years behind other languages. This is my incredibly hacky workaround. It is a total kludge but it always works.
Before I do I want to draw attention to @Peter Mortensen's answer, which in my opinion is the correct answer. The only reason I do the below instead is because most implementations of SQL have incredibly slow join operations and force you to break "don't repeat yourself". I need my queries to populate fast.
Also this is an old way of doing things. STRING_AGG and STRING_SPLIT are a lot cleaner. Again I do it this way because it always works.
-- remember Substring is 1 indexed, not 0 indexed
SELECT ProductId
, SUBSTRING (
MAX(enc.pnameANDoq), 1, CHARINDEX(';', MAX(enc.pnameANDoq)) - 1
) AS ProductName
, SUM ( CAST ( SUBSTRING (
MAX(enc.pnameAndoq), CHARINDEX(';', MAX(enc.pnameANDoq)) + 1, 9999
) AS INT ) ) AS OrderQuantity
FROM (
SELECT CONCAT (ProductName, ';', CAST(OrderQuantity AS VARCHAR(10)))
AS pnameANDoq, ProductID
FROM OrderDetails
) enc
GROUP BY ProductId
Or in plain language :
Performance wise I have always had superior performance using strings over things like, say, bigints. At least with microsoft and oracle substring is a fast operation.
This avoids the problems you run into when you use MAX() where when you use MAX() on multiple fields they no longer agree and come from different rows. In this case your data is guaranteed to be glued together exactly the way you asked it to be.
To access a 3rd or 4th field, you'll need nested substrings, "after the first semicolon look for a 2nd". This is why STRING_SPLIT is better if it is available.
Note : While outside the scope of your question this is especially useful when you are in the opposite situation and you're grouping on a combined key, but don't want every possible permutation displayed, that is you want to expose 'foo' and 'bar' as a combined key but want to group by 'foo'
I just wanted to add a more effective and generic way to solve this kind of problems. The main idea is about working with sub queries.
do your group by and join the same table on the ID of the table.
your case is more specific since your productId is not unique so there is 2 ways to solve this.
I will begin by the more specific solution:
Since your productId is not unique we will need an extra step which is to select DISCTINCT
product ids after grouping and doing the sub query like following:
WITH CTE_TEST AS (SELECT productId, SUM(OrderQuantity) Total
FROM OrderDetails
GROUP BY productId)
SELECT DISTINCT(OrderDetails.ProductID), OrderDetails.ProductName, CTE_TEST.Total
FROM OrderDetails
INNER JOIN CTE_TEST ON CTE_TEST.ProductID = OrderDetails.ProductID
this returns exactly what is expected
ProductID ProductName Total
1001 abc 12
1002 abc 23
2002 xyz 8
3004 ytp 15
4001 aze 19
But there a cleaner way to do this. I guess that ProductId
is a foreign key to products table and i guess that there should be and OrderId
primary key (unique) in this table.
in this case there are few steps to do to include extra columns while grouping on only one. It will be the same solution as following
Let's take this t_Value
table for example:
If i want to group by description and also display all columns.
All i have to do is:
WITH CTE_Name
subquery with your GroupBy column and COUNT conditionINNER JOIN
with CTE on the ID(primary key or unique constraint) columnand that's it!
Here is the query
WITH CTE_TEST AS (SELECT Description, MAX(Id) specID, COUNT(Description) quantity
FROM sch_dta.t_value
GROUP BY Description)
SELECT sch_dta.t_Value.*, CTE_TEST.quantity
FROM sch_dta.t_Value
INNER JOIN CTE_TEST ON CTE_TEST.specID = sch_dta.t_Value.Id
And here is the result:
SELECT ProductID, ProductName, OrderQuantity, SUM(OrderQuantity) FROM OrderDetails WHERE(OrderQuantity) IN(SELECT SUM(OrderQuantity) FROM OrderDetails GROUP BY OrderDetails) GROUP BY ProductID, ProductName, OrderQuantity;
I used the above solution to solve a similar problem in Oracle12c.
==EDIT==
I checked your question again and have concluded this can't be done.
ProductName is not unique, It must either be part of the Group By
or excluded from your results.
For example how would SQL present these results to you if you Group By
only ProductID?
ProductID | ProductName | OrderQuantity
---------------------------------------
1234 | abc | 1
1234 | def | 1
1234 | ghi | 1
1234 | jkl | 1
You can try the below query. I assume you have a single table for all your data.
SELECT OD.ProductID, OD.ProductName, CalQ.OrderQuantity
FROM (SELECT DISTINCT ProductID, ProductName
FROM OrderDetails) OD
INNER JOIN (SELECT ProductID, OrderQuantity SUM(OrderQuantity)
FROM OrderDetails
GROUP BY ProductID) CalQ
ON CalQ.ProductID = OD.ProductID
WITH CTE_SUM AS (
SELECT ProductID, Sum(OrderQuantity) AS TotalOrderQuantity
FROM OrderDetails GROUP BY ProductID
)
SELECT DISTINCT OrderDetails.ProductID, OrderDetails.ProductName, OrderDetails.OrderQuantity,CTE_SUM.TotalOrderQuantity
FROM
OrderDetails INNER JOIN CTE_SUM
ON OrderDetails.ProductID = CTE_SUM.ProductID
Please check if this works.
mysql GROUP_CONCAT
function could help https://dev.mysql.com/doc/refman/8.0/en/group-by-functions.html#function_group-concat
SELECT ProductID, GROUP_CONCAT(DISTINCT ProductName) as Names, SUM(OrderQuantity)
FROM OrderDetails GROUP BY ProductID
This would return:
ProductID Names OrderQuantity
1001 red 5
1002 red,black 6
1003 orange 8
1004 black,orange 15
Similar idea as the one @Urs Marian here posted https://stackoverflow.com/a/38779277/906265
You can try this:
Select ProductID,ProductName,Sum(OrderQuantity)
from OrderDetails Group By ProductID, ProductName
You're only required to Group By
columns that doesn't come with an aggregate function in the Select
clause. So you can just use Group By
ProductID and ProductName in this case.
Your Data
DECLARE @OrderDetails TABLE
(ProductID INT,ProductName VARCHAR(10), OrderQuantity INT)
INSERT INTO @OrderDetails VALUES
(1001,'abc',5),(1002,'abc',23),(2002,'xyz',8),
(3004,'ytp',15),(4001,'aze',19),(1001,'abc',7)
Query
Select ProductID, ProductName, Sum(OrderQuantity) AS Total
from @OrderDetails
Group By ProductID, ProductName ORDER BY ProductID
Result
+---------------------------------+
¦ ProductID ¦ ProductName ¦ Total ¦
¦-----------+-------------+-------¦
¦ 1001 ¦ abc ¦ 12 ¦
¦ 1002 ¦ abc ¦ 23 ¦
¦ 2002 ¦ xyz ¦ 8 ¦
¦ 3004 ¦ ytp ¦ 15 ¦
¦ 4001 ¦ aze ¦ 19 ¦
+---------------------------------+
I had a similar problem to the OP. Then I saw the answer from @Urs Marian which helped a lot. But additionally what I was looking for is, when there are multiple values in a column and they will be grouped, how I can get the last submitted value (e.g. ordered by a date/id column).
Example:
We have following table structure:
CREATE TABLE tablename(
[msgid] [int] NOT NULL,
[userid] [int] NOT NULL,
[username] [varchar](70) NOT NULL,
[message] [varchar](5000) NOT NULL
)
Now there are at least two datasets in the table:
+-------+--------+----------+---------+
| msgid | userid | username | message |
+-------+--------+----------+---------+
| 1 | 1 | userA | hello |
| 2 | 1 | userB | world |
+-------+--------+----------+---------+
Therefore following SQL script does work (checked on MSSQL) to group it, also if the same userid has different username values. In the example below, the username with the highest msgid will be shown:
SELECT m.userid,
(select top 1 username from table where userid = m.userid order by msgid desc) as username,
count(*) as messages
FROM tablename m
GROUP BY m.userid
ORDER BY count(*) DESC
Source: Stackoverflow.com