SQL join selecting the last records in a one-to-many relationship

Question

Suppose I have a table of customers and a table of purchases  Each purchase belongs to one customer  I want to get a list of all customers along with their last purchase in one SELECT statement  What is the best practice  Any advice on building indexes  Please use these table column names in your answer   customer  id  name purchase  id  customer id  item id  date  And in more complicated situations  would it be  performance-wise  beneficial to denormalize the database by putting the last purchase into the customer table  If the  purchase  id is guaranteed to be sorted by date  can the statements be simplified by using something like LIMIT 1

User · Answer

I found this thread as a solution to my problem.

But when I tried them the performance was low. Bellow is my suggestion for better performance.

With MaxDates as (
SELECT  customer_id,
                MAX(date) MaxDate
        FROM    purchase
        GROUP BY customer_id
)

SELECT  c.*, M.*
FROM    customer c INNER JOIN
        MaxDates as M ON c.id = M.customer_id

Hope this will be helpful.

User · Answer

If you re using PostgreSQL you can use DISTINCT ON to find the first row in a group   SELECT customer    purchase   FROM customer JOIN      SELECT DISTINCT ON  customer id       FROM purchase    ORDER BY customer id  date DESC   purchase ON purchase customer id   customer id   PostgreSQL Docs - Distinct On  Note that the DISTINCT ON field s  -- here customer id -- must match the left most field s  in the ORDER BY clause   Caveat  This is a nonstandard clause

User · Answer

Try this  It will help   I have used this in my project   SELECT    FROM customer c OUTER APPLY SELECT top 1   FROM purchase pi  WHERE pi customer id   c Id order by pi Id desc  AS  LastPurchasePrice

User · Answer

This is an example of the greatest-n-per-group problem that has appeared regularly on StackOverflow   Here s how I usually recommend solving it   SELECT c    p1   FROM customer c JOIN purchase p1 ON  c id   p1 customer id  LEFT OUTER JOIN purchase p2 ON  c id   p2 customer id AND       p1 date  lt  p2 date OR  p1 date   p2 date AND p1 id  lt  p2 id    WHERE p2 id IS NULL    Explanation  given a row p1  there should be no row p2 with the same customer and a later date  or in the case of ties  a later id    When we find that to be true  then p1 is the most recent purchase for that customer   Regarding indexes  I d create a compound index in purchase over the columns  customer id  date  id    That may allow the outer join to be done using a covering index   Be sure to test on your platform  because optimization is implementation-dependent   Use the features of your RDBMS to analyze the optimization plan   E g  EXPLAIN on MySQL     Some people use subqueries instead of the solution I show above  but I find my solution makes it easier to resolve ties

User · Answer

Without getting into the code first  the logic algorithm goes below   Go to the transaction table with multiple records for the same client   Select records of clientID and the latestDate of client s activity using group by clientID and max transactionDate     select clientID  max transactionDate  as latestDate     from transaction     group by clientID   inner join the transaction table with the outcome from Step 2  then you will have the full records of the transaction table with only each client s latest record     select   from     transaction t     inner join        select clientID  max transactionDate  as latestDate      from transaction       group by clientID  d     on t clientID   d clientID and t transactionDate   d latestDate     You can use the result from step 3 to join any table you want to get different results

User · Answer

Another approach would be to use a NOT EXISTS condition in your join condition to test for later purchases   SELECT   FROM customer c LEFT JOIN purchase p ON          c id   p customer id    AND NOT EXISTS        SELECT 1 FROM purchase p1      WHERE p1 customer id   c id      AND p1 id  gt  p id

User · Answer

Please try this   SELECT  c Id  c name   SELECT pi price FROM purchase pi WHERE pi Id   MAX p Id   AS  LastPurchasePrice  FROM customer c INNER JOIN purchase p  ON c Id   p customerId  GROUP BY c Id c name

User · Answer

You haven t specified the database  If it is one that allows analytical functions it may be faster to use this approach than the GROUP BY one definitely faster in Oracle  most likely faster in the late SQL Server editions  don t know about others    Syntax in SQL Server would be   SELECT c    p   FROM customer c INNER JOIN        SELECT RANK   OVER  PARTITION BY customer id ORDER BY date DESC  r                 FROM purchase  p ON  c id   p customer id  WHERE p r   1

User · Answer

You could also try doing this using a sub select  SELECT  c    p   FROM    customer c INNER JOIN                       SELECT  customer id                      MAX date  MaxDate             FROM    purchase             GROUP BY customer id           MaxDates ON c id   MaxDates customer id INNER JOIN         purchase p ON   MaxDates customer id   p customer id                     AND MaxDates MaxDate   p date   The select should join on all customers and their Last purchase date

User · Answer

Tested on SQLite   SELECT c    p    max p date  FROM customer c LEFT OUTER JOIN purchase p ON c id   p customer id GROUP BY c id   The max   aggregate function will make sure that the latest purchase is selected from each group  but assumes that the date column is in a format whereby max   gives the latest - which is normally the case   If you want to handle purchases with the same date then you can use max p date  p id    In terms of indexes  I would use an index on purchase with  customer id  date   any other purchase columns you want to return in your select     The LEFT OUTER JOIN  as opposed to INNER JOIN  will make sure that customers that have never made a purchase are also included

[sql] SQL join: selecting the last records in a one-to-many relationship

Examples related to sql

Examples related to select

Examples related to join

Examples related to indexing

Examples related to greatest-n-per-group