Ok, I found this post very interesting and I would like to share some of my knowledge on creating a query. Thanks for this Fluffeh. Others who may read this and may feel that I'm wrong are 101% free to edit and criticise my answer. (Honestly, I feel very thankful for correcting my mistake(s).)
I'll be posting some of the frequently asked questions in MySQL
tag.
Given this schema
CREATE TABLE MovieList
(
ID INT,
MovieName VARCHAR(25),
CONSTRAINT ml_pk PRIMARY KEY (ID),
CONSTRAINT ml_uq UNIQUE (MovieName)
);
INSERT INTO MovieList VALUES (1, 'American Pie');
INSERT INTO MovieList VALUES (2, 'The Notebook');
INSERT INTO MovieList VALUES (3, 'Discovery Channel: Africa');
INSERT INTO MovieList VALUES (4, 'Mr. Bean');
INSERT INTO MovieList VALUES (5, 'Expendables 2');
CREATE TABLE CategoryList
(
MovieID INT,
CategoryName VARCHAR(25),
CONSTRAINT cl_uq UNIQUE(MovieID, CategoryName),
CONSTRAINT cl_fk FOREIGN KEY (MovieID) REFERENCES MovieList(ID)
);
INSERT INTO CategoryList VALUES (1, 'Comedy');
INSERT INTO CategoryList VALUES (1, 'Romance');
INSERT INTO CategoryList VALUES (2, 'Romance');
INSERT INTO CategoryList VALUES (2, 'Drama');
INSERT INTO CategoryList VALUES (3, 'Documentary');
INSERT INTO CategoryList VALUES (4, 'Comedy');
INSERT INTO CategoryList VALUES (5, 'Comedy');
INSERT INTO CategoryList VALUES (5, 'Action');
QUESTION
Find all movies that belong to at least both Comedy
and Romance
categories.
Solution
This question can be very tricky sometimes. It may seem that a query like this will be the answer:-
SELECT DISTINCT a.MovieName
FROM MovieList a
INNER JOIN CategoryList b
ON a.ID = b.MovieID
WHERE b.CategoryName = 'Comedy' AND
b.CategoryName = 'Romance'
which is definitely very wrong because it produces no result. The explanation of this is that there is only one valid value of CategoryName
on each row. For instance, the first condition returns true, the second condition is always false. Thus, by using AND
operator, both condition should be true; otherwise, it will be false. Another query is like this,
SELECT DISTINCT a.MovieName
FROM MovieList a
INNER JOIN CategoryList b
ON a.ID = b.MovieID
WHERE b.CategoryName IN ('Comedy','Romance')
and the result is still incorrect because it matches to record that has at least one match on the categoryName
. The real solution would be by counting the number of record instances per movie. The number of instance should match to the total number of the values supplied in the condition.
SELECT a.MovieName
FROM MovieList a
INNER JOIN CategoryList b
ON a.ID = b.MovieID
WHERE b.CategoryName IN ('Comedy','Romance')
GROUP BY a.MovieName
HAVING COUNT(*) = 2
Given schema,
CREATE TABLE Software
(
ID INT,
SoftwareName VARCHAR(25),
Descriptions VARCHAR(150),
CONSTRAINT sw_pk PRIMARY KEY (ID),
CONSTRAINT sw_uq UNIQUE (SoftwareName)
);
INSERT INTO Software VALUES (1,'PaintMe','used for photo editing');
INSERT INTO Software VALUES (2,'World Map','contains map of different places of the world');
INSERT INTO Software VALUES (3,'Dictionary','contains description, synonym, antonym of the words');
CREATE TABLE VersionList
(
SoftwareID INT,
VersionNo INT,
DateReleased DATE,
CONSTRAINT sw_uq UNIQUE (SoftwareID, VersionNo),
CONSTRAINT sw_fk FOREIGN KEY (SOftwareID) REFERENCES Software(ID)
);
INSERT INTO VersionList VALUES (3, 2, '2009-12-01');
INSERT INTO VersionList VALUES (3, 1, '2009-11-01');
INSERT INTO VersionList VALUES (3, 3, '2010-01-01');
INSERT INTO VersionList VALUES (2, 2, '2010-12-01');
INSERT INTO VersionList VALUES (2, 1, '2009-12-01');
INSERT INTO VersionList VALUES (1, 3, '2011-12-01');
INSERT INTO VersionList VALUES (1, 2, '2010-12-01');
INSERT INTO VersionList VALUES (1, 1, '2009-12-01');
INSERT INTO VersionList VALUES (1, 4, '2012-12-01');
QUESTION
Find the latest version on each software. Display the following columns: SoftwareName
,Descriptions
,LatestVersion
(from VersionNo column),DateReleased
Solution
Some SQL developers mistakenly use MAX()
aggregate function. They tend to create like this,
SELECT a.SoftwareName, a.Descriptions,
MAX(b.VersionNo) AS LatestVersion, b.DateReleased
FROM Software a
INNER JOIN VersionList b
ON a.ID = b.SoftwareID
GROUP BY a.ID
ORDER BY a.ID
(most RDBMS generates a syntax error on this because of not specifying some of the non-aggregated columns on the group by
clause) the result produces the correct LatestVersion
on each software but obviously the DateReleased
are incorrect. MySQL
doesn't support Window Functions
and Common Table Expression
yet as some RDBMS do already. The workaround on this problem is to create a subquery
which gets the individual maximum versionNo
on each software and later on be joined on the other tables.
SELECT a.SoftwareName, a.Descriptions,
b.LatestVersion, c.DateReleased
FROM Software a
INNER JOIN
(
SELECT SoftwareID, MAX(VersionNO) LatestVersion
FROM VersionList
GROUP BY SoftwareID
) b ON a.ID = b.SoftwareID
INNER JOIN VersionList c
ON c.SoftwareID = b.SoftwareID AND
c.VersionNO = b.LatestVersion
GROUP BY a.ID
ORDER BY a.ID
So that was it. I'll be posting another soon as I recall any other FAQ on MySQL
tag. Thank you for reading this little article. I hope that you have atleast get even a little knowledge from this.
UPDATE 1
Given Schema
CREATE TABLE userList
(
ID INT,
NAME VARCHAR(20),
CONSTRAINT us_pk PRIMARY KEY (ID),
CONSTRAINT us_uq UNIQUE (NAME)
);
INSERT INTO userList VALUES (1, 'Fluffeh');
INSERT INTO userList VALUES (2, 'John Woo');
INSERT INTO userList VALUES (3, 'hims056');
CREATE TABLE CONVERSATION
(
ID INT,
FROM_ID INT,
TO_ID INT,
MESSAGE VARCHAR(250),
DeliveryDate DATE
);
INSERT INTO CONVERSATION VALUES (1, 1, 2, 'hi john', '2012-01-01');
INSERT INTO CONVERSATION VALUES (2, 2, 1, 'hello fluff', '2012-01-02');
INSERT INTO CONVERSATION VALUES (3, 1, 3, 'hey hims', '2012-01-03');
INSERT INTO CONVERSATION VALUES (4, 1, 3, 'please reply', '2012-01-04');
INSERT INTO CONVERSATION VALUES (5, 3, 1, 'how are you?', '2012-01-05');
INSERT INTO CONVERSATION VALUES (6, 3, 2, 'sample message!', '2012-01-05');
QUESTION
Find the latest conversation between two users.
Solution
SELECT b.Name SenderName,
c.Name RecipientName,
a.Message,
a.DeliveryDate
FROM Conversation a
INNER JOIN userList b
ON a.From_ID = b.ID
INNER JOIN userList c
ON a.To_ID = c.ID
WHERE (LEAST(a.FROM_ID, a.TO_ID), GREATEST(a.FROM_ID, a.TO_ID), DeliveryDate)
IN
(
SELECT LEAST(FROM_ID, TO_ID) minFROM,
GREATEST(FROM_ID, TO_ID) maxTo,
MAX(DeliveryDate) maxDate
FROM Conversation
GROUP BY minFROM, maxTo
)