Retrieving the last record in each group - MySQL

Question

There is a table messages that contains data as shown below:

Id   Name   Other_Columns
-------------------------
1    A      A_data_1
2    A      A_data_2
3    A      A_data_3
4    B      B_data_1
5    B      B_data_2
6    C      C_data_1

If I run the query select * from messages group by name, I get this result:

1    A      A_data_1
4    B      B_data_1
6    C      C_data_1

What query will return the following result?

3    A      A_data_3
5    B      B_data_2
6    C      C_data_1

That is, the last record in each group should be returned.

At present, this is the query that I use:

SELECT *
FROM (SELECT * FROM messages ORDER BY id DESC) AS x
GROUP BY name

But this looks highly inefficient. Are there any other ways to achieve the same result?

User · Answer

Here is another way to get the last related record: use GROUP_CONCAT with ORDER BY to build an ordered list per group, and SUBSTRING_INDEX to pick one record from that list.

SELECT 
  MAX(`Id`) AS `Id`,
  `Name`,
  SUBSTRING_INDEX(
    GROUP_CONCAT(
      `Other_Columns` 
      ORDER BY `Id` DESC 
      SEPARATOR '||'
    ),
    '||',
    1
  ) Other_Columns 
FROM
  messages 
GROUP BY `Name`

The query above concatenates all Other_Columns values that belong to the same Name group. ORDER BY `Id` DESC puts them in descending Id order, joined with the given separator (I used ||). SUBSTRING_INDEX then picks the first item from that list, which is the value from the row with the highest Id. The separator must not occur in the data itself.

User · Answer

What about:

select *, max(id) from messages group by name

I have tested it on sqlite and it returns all columns and the max id value for all names.
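A quick way to check the SQLite behaviour described above is a small script with Python's built-in sqlite3 module. The sample rows mirror the question, and the snake_case column names are my own choice. SQLite documents that, when a query uses a single MAX() or MIN() aggregate, the bare columns come from the row holding that maximum or minimum; MySQL makes no such promise for non-aggregated columns, so treat this as a SQLite-only sketch.

```python
import sqlite3

# In-memory table mirroring the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages (id INTEGER PRIMARY KEY, name TEXT, other_columns TEXT)"
)
conn.executemany(
    "INSERT INTO messages VALUES (?, ?, ?)",
    [
        (1, "A", "A_data_1"), (2, "A", "A_data_2"), (3, "A", "A_data_3"),
        (4, "B", "B_data_1"), (5, "B", "B_data_2"), (6, "C", "C_data_1"),
    ],
)

# With a single MAX() aggregate, SQLite takes the bare columns
# from the row that holds the maximum id in each group.
rows = conn.execute(
    "SELECT *, MAX(id) FROM messages GROUP BY name ORDER BY name"
).fetchall()

for row in rows:
    print(row)
```

On MySQL the same statement is rejected when ONLY_FULL_GROUP_BY is enabled, and without that mode the server is free to pick the other columns from any row in the group.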

User · Answer

Solution by sub query (fiddle link):

select * from messages where id in
(select max(id) from messages group by Name)

Solution by join condition (fiddle link):

select m1.* from messages m1
left outer join messages m2
on (m1.id < m2.id and m1.name = m2.name)
where m2.id is null

The reason for this post is to give the fiddle links only. The same SQL is already provided in other answers.
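As a sanity check that the two forms agree, here is a minimal sketch that runs both against the question's sample data using Python's sqlite3 module. SQLite is only a stand-in for MySQL here, and the helper name make_db is made up for this example.

```python
import sqlite3

def make_db():
    """Build an in-memory copy of the question's sample table (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE messages (id INTEGER PRIMARY KEY, name TEXT, other_columns TEXT)"
    )
    conn.executemany(
        "INSERT INTO messages VALUES (?, ?, ?)",
        [
            (1, "A", "A_data_1"), (2, "A", "A_data_2"), (3, "A", "A_data_3"),
            (4, "B", "B_data_1"), (5, "B", "B_data_2"), (6, "C", "C_data_1"),
        ],
    )
    return conn

# Form 1: keep the rows whose id is the per-name maximum.
SUBQUERY = """
SELECT * FROM messages
WHERE id IN (SELECT MAX(id) FROM messages GROUP BY name)
ORDER BY id
"""

# Form 2: anti-join, keep the rows for which no later row with the same name exists.
ANTI_JOIN = """
SELECT m1.* FROM messages m1
LEFT OUTER JOIN messages m2
  ON m1.id < m2.id AND m1.name = m2.name
WHERE m2.id IS NULL
ORDER BY m1.id
"""

conn = make_db()
by_subquery = conn.execute(SUBQUERY).fetchall()
by_join = conn.execute(ANTI_JOIN).fetchall()
print(by_subquery)
print(by_join)
```

Both forms return the same rows on this data; which one is faster depends on the data distribution and indexes, as other answers discuss.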

User · Answer

An approach with considerable speed is as follows:

SELECT *
FROM messages a
WHERE Id = (SELECT MAX(Id) FROM messages WHERE a.Name = Name)

Result:

Id  Name  Other_Columns
3   A     A_data_3
5   B     B_data_2
6   C     C_data_1
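The correlated-subquery form can be tried out with a short script. This is only a sketch on SQLite through Python's sqlite3 module; the index on (name, id) is my assumption, since the answer does not say which indexes it relied on, and the alias b is added only to make the correlation explicit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages (id INTEGER PRIMARY KEY, name TEXT, other_columns TEXT)"
)
# Assumed index: lets the inner MAX(id) lookup be answered per name.
conn.execute("CREATE INDEX idx_messages_name_id ON messages (name, id)")
conn.executemany(
    "INSERT INTO messages VALUES (?, ?, ?)",
    [
        (1, "A", "A_data_1"), (2, "A", "A_data_2"), (3, "A", "A_data_3"),
        (4, "B", "B_data_1"), (5, "B", "B_data_2"), (6, "C", "C_data_1"),
    ],
)

# For every outer row, the inner query finds the largest id with the same name;
# only the row whose own id equals that maximum is kept.
last_rows = conn.execute(
    """
    SELECT a.* FROM messages a
    WHERE a.id = (SELECT MAX(b.id) FROM messages b WHERE b.name = a.name)
    ORDER BY a.id
    """
).fetchall()

for row in last_rows:
    print(row)
```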

User · Answer

UPD: 2017-03-31. Version 5.7.5 of MySQL enabled the ONLY_FULL_GROUP_BY switch by default (hence, non-deterministic GROUP BY queries became disabled). Moreover, the GROUP BY implementation was updated, and the solution might not work as expected anymore even with the switch disabled. One needs to check.

Bill Karwin's solution above works fine when the item count within groups is rather small, but the performance of the query becomes bad when the groups are rather large, since the solution requires about n*n/2 + n/2 IS NULL comparisons alone.

I made my tests on an InnoDB table of 18684446 rows with 1182 groups. The table contains test results for functional tests and has (test_id, request_id) as the primary key. Thus, test_id is a group and I was searching for the last request_id for each test_id.

Bill's solution has already been running for several hours on my Dell e4310 and I do not know when it is going to finish, even though it operates on a covering index (hence "Using index" in EXPLAIN).

I have a couple of other solutions that are based on the same ideas:

- if the underlying index is a BTREE index (which is usually the case), the largest (group_id, item_value) pair is the last value within each group_id, that is, the first for each group_id if we walk through the index in descending order;
- if we read values that are covered by an index, the values are read in the order of the index;
- each index implicitly contains the primary key columns appended to it (that is, the primary key is in the covering index). In the solutions below I operate directly on the primary key; in your case, you will just need to add the primary key columns to the result;
- in many cases it is much cheaper to collect the required row ids in the required order in a subquery and join the result of the subquery on the id. Since MySQL will need a single fetch based on the primary key for each row in the subquery result, the subquery will be put first in the join and the rows will be output in the order of the ids in the subquery (if we omit an explicit ORDER BY for the join).

"3 ways MySQL uses indexes" is a great article to understand some of the details.

Solution 1

This one is incredibly fast; it takes about 0.8 secs on my 18M+ rows:

SELECT test_id, MAX(request_id) AS request_id
FROM testresults
GROUP BY test_id DESC;

If you want to change the order to ASC, put it in a subquery, return the ids only, and use that as the subquery to join to the rest of the columns:

SELECT test_id, request_id
FROM (
    SELECT test_id, MAX(request_id) AS request_id
    FROM testresults
    GROUP BY test_id DESC) AS ids
ORDER BY test_id;

This one takes about 1.2 secs on my data.

Solution 2

Here is another solution that takes about 19 seconds for my table:

SELECT test_id, request_id
FROM testresults, (SELECT @group:=NULL) AS init
WHERE IF(IFNULL(@group, -1)=@group:=test_id, 0, 1)
ORDER BY test_id DESC, request_id DESC

It returns tests in descending order as well. It is much slower since it does a full index scan, but it is here to give you an idea of how to output the N max rows for each group.

The disadvantage of the query is that its result cannot be cached by the query cache.
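To get a feel for the quadratic growth of the LEFT JOIN approach, here is a rough sketch that counts the rows the join produces for a single group of n rows before the IS NULL filter is applied. It uses SQLite through Python's sqlite3 module, and the function name joined_row_count is made up for this example. The count it reports is n*(n-1)/2 + 1 joined rows, which is close to, but not the same quantity as, the comparison estimate quoted above; MySQL's optimizer may also short-circuit some of this work.

```python
import sqlite3

def joined_row_count(n):
    """Rows produced by the self LEFT JOIN for one group of n rows (no IS NULL filter)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO messages VALUES (?, 'A')", [(i,) for i in range(1, n + 1)]
    )
    # Each row pairs with every later row of the same group; the last row
    # has no partner and yields a single NULL-extended row.
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM messages m1 "
        "LEFT JOIN messages m2 ON m1.name = m2.name AND m1.id < m2.id"
    ).fetchone()
    conn.close()
    return count

for n in (10, 100, 1000):
    print(n, joined_row_count(n))
```

A tenfold increase in group size gives roughly a hundredfold increase in joined rows, which is why large groups hurt this approach.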

User · Answer

Is there any way we could use this method to delete duplicates in a table? The result set is basically a collection of unique records, so if we could delete all records not in the result set, we would effectively have no duplicates. I tried this, but MySQL gave a 1093 error:

DELETE FROM messages WHERE id NOT IN
  (SELECT m1.id
   FROM messages m1 LEFT JOIN messages m2
   ON (m1.name = m2.name AND m1.id < m2.id)
   WHERE m2.id IS NULL)

Is there a way to maybe save the output to a temp variable and then delete from NOT IN (temp variable)? @Bill, thanks for a very useful solution.

EDIT: I think I found the solution:

DROP TABLE IF EXISTS UniqueIDs;
CREATE Temporary table UniqueIDs (id Int(11));

INSERT INTO UniqueIDs
    (SELECT T1.ID FROM Table T1 LEFT JOIN Table T2 ON
    (T1.Field1 = T2.Field1 AND T1.Field2 = T2.Field2 # Comparison Fields
    AND T1.ID < T2.ID)
    WHERE T2.ID IS NULL);

DELETE FROM Table WHERE id NOT IN (SELECT ID FROM UniqueIDs);
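The two-step idea (collect the ids to keep, then delete everything else) can be sketched end to end as follows. This runs on SQLite through Python's sqlite3 module, which does not have MySQL's error 1093 restriction, so it only illustrates the staging approach rather than reproducing the error. The table layout and the name unique_ids are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE messages (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO messages (id, name)
    VALUES (1, 'A'), (2, 'A'), (3, 'A'), (4, 'B'), (5, 'B'), (6, 'C');

    -- Step 1: stage the ids of the last row in each name group.
    CREATE TEMPORARY TABLE unique_ids (id INTEGER PRIMARY KEY);
    INSERT INTO unique_ids
      SELECT m1.id
      FROM messages m1
      LEFT JOIN messages m2 ON m1.name = m2.name AND m1.id < m2.id
      WHERE m2.id IS NULL;

    -- Step 2: delete every row that was not staged.
    DELETE FROM messages WHERE id NOT IN (SELECT id FROM unique_ids);
    """
)

remaining = conn.execute("SELECT id, name FROM messages ORDER BY id").fetchall()
print(remaining)
```

Staging the ids in a separate table means the DELETE no longer reads from the table it modifies, which is what MySQL objects to in the single-statement version.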

User · Answer

Clearly there are lots of different ways of getting the same results; your question seems to be what is an efficient way of getting the last results in each group in MySQL. If you are working with huge amounts of data, and assuming you are using InnoDB with even the latest versions of MySQL (such as 5.7.21 and 8.0.4-rc), then there might not be an efficient way of doing this.

We sometimes need to do this with tables with even more than 60 million rows.

For these examples I will use data with only about 1.5 million rows, where the queries would need to find results for all groups in the data. In our actual cases we would often need to return data from about 2,000 groups (which hypothetically would not require examining very much of the data).

I will use the following tables:

CREATE TABLE temperature(
  id INT UNSIGNED NOT NULL AUTO_INCREMENT,
  groupID INT UNSIGNED NOT NULL,
  recordedTimestamp TIMESTAMP NOT NULL,
  recordedValue INT NOT NULL,
  INDEX groupIndex(groupID, recordedTimestamp),
  PRIMARY KEY (id)
);

CREATE TEMPORARY TABLE selected_group(id INT UNSIGNED NOT NULL, PRIMARY KEY(id));

The temperature table is populated with about 1.5 million random records and 100 different groups. The selected_group table is populated with those 100 groups (in our cases this would normally be less than 20% of all the groups).

As this data is random, multiple rows can have the same recordedTimestamp. What we want is a list of all of the selected groups in order of groupID, with the last recordedTimestamp for each group, and if the same group has more than one matching row like that, then the last matching id of those rows.

If hypothetically MySQL had a last() function which returned values from the last row in a special ORDER BY clause, then we could simply do:

SELECT
  last(t1.id) AS id,
  t1.groupID,
  last(t1.recordedTimestamp) AS recordedTimestamp,
  last(t1.recordedValue) AS recordedValue
FROM selected_group g
INNER JOIN temperature t1 ON t1.groupID = g.id
ORDER BY t1.recordedTimestamp, t1.id
GROUP BY t1.groupID;

which would only need to examine a few hundred rows in this case, as it doesn't use any of the normal GROUP BY functions. This would execute in 0 seconds and hence be highly efficient. Note that normally in MySQL we would see an ORDER BY clause following the GROUP BY clause; however, this ORDER BY clause is used to determine the order for the last() function. If it were after the GROUP BY, it would be ordering the groups. If no GROUP BY clause is present, then the last values will be the same in all of the returned rows.

However, MySQL does not have this, so let's look at different ideas of what it does have and prove that none of these are efficient.

Example 1

SELECT t1.id, t1.groupID, t1.recordedTimestamp, t1.recordedValue
FROM selected_group g
INNER JOIN temperature t1 ON t1.id = (
  SELECT t2.id
  FROM temperature t2
  WHERE t2.groupID = g.id
  ORDER BY t2.recordedTimestamp DESC, t2.id DESC
  LIMIT 1
);

This examined 3,009,254 rows and took ~0.859 seconds on 5.7.21 and slightly longer on 8.0.4-rc.

Example 2

SELECT t1.id, t1.groupID, t1.recordedTimestamp, t1.recordedValue
FROM temperature t1
INNER JOIN (
  SELECT max(t2.id) AS id
  FROM temperature t2
  INNER JOIN (
    SELECT t3.groupID, max(t3.recordedTimestamp) AS recordedTimestamp
    FROM selected_group g
    INNER JOIN temperature t3 ON t3.groupID = g.id
    GROUP BY t3.groupID
  ) t4 ON t4.groupID = t2.groupID AND t4.recordedTimestamp = t2.recordedTimestamp
  GROUP BY t2.groupID
) t5 ON t5.id = t1.id;

This examined 1,505,331 rows and took ~1.25 seconds on 5.7.21 and slightly longer on 8.0.4-rc.

Example 3

SELECT t1.id, t1.groupID, t1.recordedTimestamp, t1.recordedValue
FROM temperature t1
WHERE t1.id IN (
  SELECT max(t2.id) AS id
  FROM temperature t2
  INNER JOIN (
    SELECT t3.groupID, max(t3.recordedTimestamp) AS recordedTimestamp
    FROM selected_group g
    INNER JOIN temperature t3 ON t3.groupID = g.id
    GROUP BY t3.groupID
  ) t4 ON t4.groupID = t2.groupID AND t4.recordedTimestamp = t2.recordedTimestamp
  GROUP BY t2.groupID
)
ORDER BY t1.groupID;

This examined 3,009,685 rows and took ~1.95 seconds on 5.7.21 and slightly longer on 8.0.4-rc.

Example 4

SELECT t1.id, t1.groupID, t1.recordedTimestamp, t1.recordedValue
FROM selected_group g
INNER JOIN temperature t1 ON t1.id = (
  SELECT max(t2.id)
  FROM temperature t2
  WHERE t2.groupID = g.id AND t2.recordedTimestamp = (
      SELECT max(t3.recordedTimestamp)
      FROM temperature t3
      WHERE t3.groupID = g.id
    )
);

This examined 6,137,810 rows and took ~2.2 seconds on 5.7.21 and slightly longer on 8.0.4-rc.

Example 5

SELECT t1.id, t1.groupID, t1.recordedTimestamp, t1.recordedValue
FROM (
  SELECT
    t2.id,
    t2.groupID,
    t2.recordedTimestamp,
    t2.recordedValue,
    row_number() OVER (
      PARTITION BY t2.groupID ORDER BY t2.recordedTimestamp DESC, t2.id DESC
    ) AS rowNumber
  FROM selected_group g
  INNER JOIN temperature t2 ON t2.groupID = g.id
) t1 WHERE t1.rowNumber = 1;

This examined 6,017,808 rows and took ~4.2 seconds on 8.0.4-rc.

Example 6

SELECT t1.id, t1.groupID, t1.recordedTimestamp, t1.recordedValue
FROM (
  SELECT
    last_value(t2.id) OVER w AS id,
    t2.groupID,
    last_value(t2.recordedTimestamp) OVER w AS recordedTimestamp,
    last_value(t2.recordedValue) OVER w AS recordedValue
  FROM selected_group g
  INNER JOIN temperature t2 ON t2.groupID = g.id
  WINDOW w AS (
    PARTITION BY t2.groupID
    ORDER BY t2.recordedTimestamp, t2.id
    RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
  )
) t1
GROUP BY t1.groupID;

This examined 6,017,908 rows and took ~17.5 seconds on 8.0.4-rc.

Example 7

SELECT t1.id, t1.groupID, t1.recordedTimestamp, t1.recordedValue
FROM selected_group g
INNER JOIN temperature t1 ON t1.groupID = g.id
LEFT JOIN temperature t2
  ON t2.groupID = g.id
  AND (
    t2.recordedTimestamp > t1.recordedTimestamp
    OR (t2.recordedTimestamp = t1.recordedTimestamp AND t2.id > t1.id)
  )
WHERE t2.id IS NULL
ORDER BY t1.groupID;

This one was taking forever, so I had to kill it.
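The tie-breaking rule used throughout these examples (latest recordedTimestamp first, then the highest id among equal timestamps) is easy to verify on a toy data set. The sketch below runs the Example 1 form on SQLite through Python's sqlite3 module; the six sample rows are invented for illustration, timestamps are stored as text, and it says nothing about MySQL timings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE temperature (
      id INTEGER PRIMARY KEY,
      groupID INTEGER NOT NULL,
      recordedTimestamp TEXT NOT NULL,
      recordedValue INTEGER NOT NULL
    );
    CREATE INDEX groupIndex ON temperature (groupID, recordedTimestamp);
    CREATE TABLE selected_group (id INTEGER PRIMARY KEY);

    -- Invented sample data: rows 2 and 3 share the latest timestamp of group 1.
    INSERT INTO temperature VALUES
      (1, 1, '2018-01-01 00:00:00', 10),
      (2, 1, '2018-01-02 00:00:00', 11),
      (3, 1, '2018-01-02 00:00:00', 12),
      (4, 2, '2018-01-03 00:00:00', 20),
      (5, 2, '2018-01-01 00:00:00', 21),
      (6, 3, '2018-01-05 00:00:00', 30);

    -- Only groups 1 and 2 are selected; group 3 must not appear in the result.
    INSERT INTO selected_group VALUES (1), (2);
    """
)

rows = conn.execute(
    """
    SELECT t1.id, t1.groupID, t1.recordedTimestamp, t1.recordedValue
    FROM selected_group g
    INNER JOIN temperature t1 ON t1.id = (
      SELECT t2.id
      FROM temperature t2
      WHERE t2.groupID = g.id
      ORDER BY t2.recordedTimestamp DESC, t2.id DESC
      LIMIT 1
    )
    ORDER BY t1.groupID
    """
).fetchall()

for row in rows:
    print(row)
```

For group 1 the query picks id 3 rather than id 2, because both share the latest timestamp and the id acts as the tie-breaker.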

User · Answer

MySQL 8.0 now supports windowing functions, like almost all popular SQL implementations. With this standard syntax, we can write greatest-n-per-group queries:

WITH ranked_messages AS (
  SELECT m.*, ROW_NUMBER() OVER (PARTITION BY name ORDER BY id DESC) AS rn
  FROM messages AS m
)
SELECT * FROM ranked_messages WHERE rn = 1;

Below is the original answer I wrote for this question in 2009:

I write the solution this way:

SELECT m1.*
FROM messages m1 LEFT JOIN messages m2
 ON (m1.name = m2.name AND m1.id < m2.id)
WHERE m2.id IS NULL;

Regarding performance, one solution or the other can be better, depending on the nature of your data. So you should test both queries and use the one that performs better on your database.

For example, I have a copy of the StackOverflow August data dump. I'll use that for benchmarking. There are 1,114,357 rows in the Posts table. This is running on MySQL 5.0.75 on my Macbook Pro 2.40GHz.

I'll write a query to find the most recent post for a given user ID (mine).

First, using the technique shown by @Eric with the GROUP BY in a subquery:

SELECT p1.postid
FROM Posts p1
INNER JOIN (SELECT pi.owneruserid, MAX(pi.postid) AS maxpostid
            FROM Posts pi GROUP BY pi.owneruserid) p2
  ON (p1.postid = p2.maxpostid)
WHERE p1.owneruserid = 20860;

1 row in set (1 min 17.89 sec)

Even the EXPLAIN analysis takes over 16 seconds:

+----+-------------+------------+--------+----------------------------+-------------+---------+--------------+---------+-------------+
| id | select_type | table      | type   | possible_keys              | key         | key_len | ref          | rows    | Extra       |
+----+-------------+------------+--------+----------------------------+-------------+---------+--------------+---------+-------------+
|  1 | PRIMARY     | <derived2> | ALL    | NULL                       | NULL        | NULL    | NULL         |   76756 |             |
|  1 | PRIMARY     | p1         | eq_ref | PRIMARY,PostId,OwnerUserId | PRIMARY     | 8       | p2.maxpostid |       1 | Using where |
|  2 | DERIVED     | pi         | index  | NULL                       | OwnerUserId | 8       | NULL         | 1151268 | Using index |
+----+-------------+------------+--------+----------------------------+-------------+---------+--------------+---------+-------------+
3 rows in set (16.09 sec)

Now produce the same query result using my technique with LEFT JOIN:

SELECT p1.postid
FROM Posts p1 LEFT JOIN posts p2
  ON (p1.owneruserid = p2.owneruserid AND p1.postid < p2.postid)
WHERE p2.postid IS NULL AND p1.owneruserid = 20860;

1 row in set (0.28 sec)

The EXPLAIN analysis shows that both tables are able to use their indexes:

+----+-------------+-------+------+----------------------------+-------------+---------+-------+------+--------------------------------------+
| id | select_type | table | type | possible_keys              | key         | key_len | ref   | rows | Extra                                |
+----+-------------+-------+------+----------------------------+-------------+---------+-------+------+--------------------------------------+
|  1 | SIMPLE      | p1    | ref  | OwnerUserId                | OwnerUserId | 8       | const | 1384 | Using index                          |
|  1 | SIMPLE      | p2    | ref  | PRIMARY,PostId,OwnerUserId | OwnerUserId | 8       | const | 1384 | Using where; Using index; Not exists |
+----+-------------+-------+------+----------------------------+-------------+---------+-------+------+--------------------------------------+
2 rows in set (0.00 sec)

Here's the DDL for my Posts table:

CREATE TABLE `posts` (
  `PostId` bigint(20) unsigned NOT NULL auto_increment,
  `PostTypeId` bigint(20) unsigned NOT NULL,
  `AcceptedAnswerId` bigint(20) unsigned default NULL,
  `ParentId` bigint(20) unsigned default NULL,
  `CreationDate` datetime NOT NULL,
  `Score` int(11) NOT NULL default '0',
  `ViewCount` int(11) NOT NULL default '0',
  `Body` text NOT NULL,
  `OwnerUserId` bigint(20) unsigned NOT NULL,
  `OwnerDisplayName` varchar(40) default NULL,
  `LastEditorUserId` bigint(20) unsigned default NULL,
  `LastEditDate` datetime default NULL,
  `LastActivityDate` datetime default NULL,
  `Title` varchar(250) NOT NULL default '',
  `Tags` varchar(150) NOT NULL default '',
  `AnswerCount` int(11) NOT NULL default '0',
  `CommentCount` int(11) NOT NULL default '0',
  `FavoriteCount` int(11) NOT NULL default '0',
  `ClosedDate` datetime default NULL,
  PRIMARY KEY  (`PostId`),
  UNIQUE KEY `PostId` (`PostId`),
  KEY `PostTypeId` (`PostTypeId`),
  KEY `AcceptedAnswerId` (`AcceptedAnswerId`),
  KEY `OwnerUserId` (`OwnerUserId`),
  KEY `LastEditorUserId` (`LastEditorUserId`),
  KEY `ParentId` (`ParentId`),
  CONSTRAINT `posts_ibfk_1` FOREIGN KEY (`PostTypeId`) REFERENCES `posttypes` (`PostTypeId`)
) ENGINE=InnoDB;
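The ROW_NUMBER() form generalises naturally from "last row per group" to "last N rows per group" by changing the filter on rn. Below is a minimal sketch of that idea on the question's sample data. It runs on SQLite through Python's sqlite3 module (window functions need SQLite 3.25 or newer), so it is a stand-in for MySQL 8.0 rather than a test of it, and the function name top_n_per_group is made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages (id INTEGER PRIMARY KEY, name TEXT, other_columns TEXT)"
)
conn.executemany(
    "INSERT INTO messages VALUES (?, ?, ?)",
    [
        (1, "A", "A_data_1"), (2, "A", "A_data_2"), (3, "A", "A_data_3"),
        (4, "B", "B_data_1"), (5, "B", "B_data_2"), (6, "C", "C_data_1"),
    ],
)

# Number the rows of each name from newest (rn = 1) to oldest,
# then keep the first n of every group.
QUERY = """
WITH ranked_messages AS (
  SELECT m.*, ROW_NUMBER() OVER (PARTITION BY name ORDER BY id DESC) AS rn
  FROM messages AS m
)
SELECT id, name, other_columns
FROM ranked_messages
WHERE rn <= ?
ORDER BY id
"""

def top_n_per_group(conn, n):
    """Return the n most recent rows (by id) of every name group."""
    return conn.execute(QUERY, (n,)).fetchall()

print(top_n_per_group(conn, 1))
print(top_n_per_group(conn, 2))
```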

User · Answer

SELECT
  column1,
  column2
FROM
  table_name
WHERE id IN
  (SELECT
    MAX(id)
  FROM
    table_name
  GROUP BY column1)
ORDER BY column1;

User · Answer

select * from messages group by name desc

User · Answer

I arrived at a different solution, which is to get the IDs for the last post within each group, then select from the messages table using the result of the first query as the argument for a WHERE x IN construct:

SELECT id, name, other_columns
FROM messages
WHERE id IN (
    SELECT MAX(id)
    FROM messages
    GROUP BY name
);

I don't know how this performs compared to some of the other solutions, but it worked spectacularly for my table with 3+ million rows (4 second execution with 1200+ results).

This should work on both MySQL and SQL Server.

User · Answer

You can group with a count and also get the last item of each group, like this:

SELECT
    user,
    COUNT(user) AS count,
    MAX(id) AS last
FROM request
GROUP BY user

User · Answer

Another approach:

Find the property with the max m2_price within each program (n properties in 1 program):

select p.* from properties p
join (
    select program_id, max(m2_price) as max_price
    from properties
    group by program_id
) p2 on (p.program_id = p2.program_id and p.m2_price = p2.max_price)

User · Answer

I found the best solution at https://dzone.com/articles/get-last-record-in-each-mysql-group:

select * from `data` where `id` in (select max(`id`) from `data` group by `name_id`)

User · Answer

I hope the Oracle query below can help:

WITH Temp_table AS
(
    Select id, name, othercolumns, ROW_NUMBER() over (PARTITION BY name ORDER BY ID
    desc) as rank from messages
)
Select id, name, othercolumns from Temp_table where rank = 1

User · Answer

The query below will work fine as per your question:

SELECT M1.*
FROM MESSAGES M1,
(
 SELECT SUBSTR(Others_data,1,2), MAX(Others_data) AS Max_Others_data
 FROM MESSAGES
 GROUP BY 1
) M2
WHERE M1.Others_data = M2.Max_Others_data
ORDER BY Others_data;

User · Answer

Try this:

SELECT jos_categories.title AS name,
       joined.catid,
       joined.title,
       joined.introtext
FROM   jos_categories
       INNER JOIN (SELECT *
                   FROM   (SELECT `title`,
                                  catid,
                                  `created`,
                                  introtext
                           FROM   `jos_content`
                           WHERE  `sectionid` = 6
                           ORDER  BY `id` DESC) AS yes
                   GROUP  BY `yes`.`catid` DESC
                   ORDER  BY `yes`.`created` DESC) AS joined
         ON (joined.catid = jos_categories.id)

User · Answer

Hi @Vijay Dev, if your table messages contains Id as an auto-increment primary key, then to fetch the latest record based on the primary key your query should read as below:

SELECT m1.* FROM messages m1
INNER JOIN (SELECT max(Id) AS lastmsgId FROM messages GROUP BY Name) m2
ON m1.Id = m2.lastmsgId

User · Answer

You can have a look here as well:

http://sqlfiddle.com/#!9/ef42b/9

FIRST SOLUTION

SELECT d1.ID, Name, City FROM Demo_User d1
INNER JOIN
(SELECT MAX(ID) AS ID FROM Demo_User GROUP BY NAME) AS P ON (d1.ID = P.ID);

SECOND SOLUTION

SELECT * FROM (SELECT * FROM Demo_User ORDER BY ID DESC) AS T GROUP BY NAME;

User · Answer

I've not yet tested with a large DB, but I think this could be faster than joining tables:

SELECT *, Max(Id) FROM messages GROUP BY Name;

User · Answer

Here is my solution:

SELECT
   DISTINCT NAME,
   MAX(MESSAGES) OVER(PARTITION BY NAME) MESSAGES
FROM MESSAGE;

User · Answer

Here are two suggestions. First, if MySQL supports ROW_NUMBER(), it's very simple:

WITH Ranked AS (
  SELECT Id, Name, OtherColumns,
    ROW_NUMBER() OVER (
      PARTITION BY Name
      ORDER BY Id DESC
    ) AS rk
  FROM messages
)
  SELECT Id, Name, OtherColumns
  FROM Ranked
  WHERE rk = 1;

I'm assuming that by "last" you mean last in Id order. If not, change the ORDER BY clause of the ROW_NUMBER() window accordingly.

Second, if ROW_NUMBER() isn't available, this is often a good way to proceed:

SELECT
  Id, Name, OtherColumns
FROM messages
WHERE NOT EXISTS (
  SELECT * FROM messages AS M2
  WHERE M2.Name = messages.Name
  AND M2.Id > messages.Id
)

In other words, select messages where there is no later-Id message with the same Name.
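To illustrate the NOT EXISTS reading ("no later-Id message with the same Name"), here is a small sketch that runs the query before and after a newer row is added for one name. It uses SQLite through Python's sqlite3 module as a stand-in for MySQL; the extra row (7, 'C', 'C_data_2') is invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages (id INTEGER PRIMARY KEY, name TEXT, other_columns TEXT)"
)
conn.executemany(
    "INSERT INTO messages VALUES (?, ?, ?)",
    [
        (1, "A", "A_data_1"), (2, "A", "A_data_2"), (3, "A", "A_data_3"),
        (4, "B", "B_data_1"), (5, "B", "B_data_2"), (6, "C", "C_data_1"),
    ],
)

# Keep a row only if no other row with the same name has a larger id.
NOT_EXISTS = """
SELECT id, name, other_columns
FROM messages
WHERE NOT EXISTS (
  SELECT 1 FROM messages AS m2
  WHERE m2.name = messages.name
    AND m2.id > messages.id
)
ORDER BY id
"""

before = conn.execute(NOT_EXISTS).fetchall()

# Invented extra row: a newer message for name 'C' displaces row 6.
conn.execute("INSERT INTO messages VALUES (7, 'C', 'C_data_2')")
after = conn.execute(NOT_EXISTS).fetchall()

print(before)
print(after)
```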

User · Answer

Use your subquery to return the correct grouping, because you're halfway there.

Try this:

select
    a.*
from
    messages a
    inner join
        (select name, max(id) as maxid from messages group by name) as b on
        a.id = b.maxid

If it's not id you want the max of:

select
    a.*
from
    messages a
    inner join
        (select name, max(other_col) as other_col
         from messages group by name) as b on
        a.name = b.name
        and a.other_col = b.other_col

This way, you avoid correlated subqueries and/or ordering in your subqueries, which tend to be very slow and inefficient.

User · Answer

If you want the last row for each Name, then you can give a row number to each row, grouped by Name and ordered by Id in descending order.

QUERY

SELECT t1.Id,
       t1.Name,
       t1.Other_Columns
FROM
(
     SELECT Id,
            Name,
            Other_Columns,
    (
        CASE Name WHEN @curA
        THEN @curRow := @curRow + 1
        ELSE @curRow := 1 AND @curA := Name END
    ) + 1 AS rn
    FROM messages t,
    (SELECT @curRow := 0, @curA := '') r
    ORDER BY Name, Id DESC
) t1
WHERE t1.rn = 1
ORDER BY t1.Id;

User · Answer

SELECT * FROM table_name WHERE primary_key IN (SELECT MAX(primary_key) FROM table_name GROUP BY column_name)

User · Answer

How about this:

SELECT DISTINCT ON (name) *
FROM messages
ORDER BY name, id DESC;

I had a similar issue (on PostgreSQL though) with a table of 1M records. This solution takes 1.7s vs 44s for the one with LEFT JOIN. In my case I also had to filter the counterpart of your name field against NULL values, which improved performance by a further 0.2 secs.

User · Answer

We will look at how you can use MySQL to get the last record in each group of a GROUP BY. For example, if you have this result set of posts:

id   category_id  post_title
1    1            Title 1
2    1            Title 2
3    1            Title 3
4    2            Title 4
5    2            Title 5
6    3            Title 6

I want to be able to get the last post in each category, which are Title 3, Title 5 and Title 6. To get the posts by category you would use the MySQL GROUP BY keyword:

select * from posts group by category_id

But the results we get back from this query are:

id   category_id  post_title
1    1            Title 1
4    2            Title 4
6    3            Title 6

Here the GROUP BY returned the first record of each group, and MySQL does not guarantee which row of a group is used for the non-aggregated columns.

SELECT id, category_id, post_title
FROM posts
WHERE id IN (
    SELECT MAX(id)
    FROM posts
    GROUP BY category_id
);

This will return the posts with the highest IDs in each group:

id   category_id  post_title
3    1            Title 3
5    2            Title 5
6    3            Title 6

User · Answer

If performance is really your concern, you can introduce a new column on the table called IsLastInGroup of type BIT.

Set it to true on the rows which are last in their group and maintain it with every row insert/update/delete. Writes will be slower, but you'll benefit on reads. It depends on your use case, and I recommend it only if you're read-focused.

So your query will look like:

SELECT * FROM Messages WHERE IsLastInGroup = 1
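One possible way to keep such a flag up to date on insert is sketched below. It is only an illustration under my own assumptions: it uses SQLite through Python's sqlite3 module, the column is called is_last_in_group, the helper insert_message is hypothetical, and updates and deletes (which would also need to move the flag) are not handled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " is_last_in_group INTEGER NOT NULL DEFAULT 0)"
)

def insert_message(conn, name):
    """Insert a row and move the 'last in group' flag to it in one transaction."""
    with conn:  # commits on success, rolls back if either statement fails
        # Clear the flag on the previous last row of this group, if any.
        conn.execute(
            "UPDATE messages SET is_last_in_group = 0 "
            "WHERE name = ? AND is_last_in_group = 1",
            (name,),
        )
        # The new row becomes the last row of its group.
        cur = conn.execute(
            "INSERT INTO messages (name, is_last_in_group) VALUES (?, 1)",
            (name,),
        )
        return cur.lastrowid

for name in ["A", "A", "A", "B", "B", "C"]:
    insert_message(conn, name)

# Reads no longer need any grouping or self-join.
last_rows = conn.execute(
    "SELECT id, name FROM messages WHERE is_last_in_group = 1 ORDER BY id"
).fetchall()
print(last_rows)
```

An index on the flag column (or on name plus the flag) would probably be needed for the read to pay off on a large table; that is left out here to keep the sketch short.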

User · Answer

Hi, this query might help:

SELECT
  *
FROM
  message
WHERE
  `Id` IN (
    SELECT
      MAX(`Id`)
    FROM
      message
    GROUP BY
      `Name`
  )
ORDER BY
  `Id` DESC
