Get top n records for each group of grouped results

Question

The following is the simplest possible example, though any solution should be able to scale to however many n top results are needed:

Given a table like that below, with person, group, and age columns, how would you get the 2 oldest people in each group? (Ties within groups should not yield more results, but give the first 2 in alphabetical order.)

+--------+-------+-----+
| Person | Group | Age |
+--------+-------+-----+
| Bob    | 1     | 32  |
| Jill   | 1     | 34  |
| Shawn  | 1     | 42  |
| Jake   | 2     | 29  |
| Paul   | 2     | 36  |
| Laura  | 2     | 39  |
+--------+-------+-----+

Desired result set:

+--------+-------+-----+
| Shawn  | 1     | 42  |
| Jill   | 1     | 34  |
| Laura  | 2     | 39  |
| Paul   | 2     | 36  |
+--------+-------+-----+

NOTE: This question builds on a previous one - Get records with max value for each group of grouped SQL results - for getting a single top row from each group, and which received a great MySQL-specific answer from @Bohemian:

select *
from (select * from mytable order by `Group`, Age desc, Person) x
group by `Group`

Would love to be able to build off this, though I don't see how.

User · Answer

In other databases you can do this using ROW_NUMBER. MySQL versions before 8.0 don't support ROW_NUMBER, but you can use variables to emulate it:

SELECT
    person,
    groupname,
    age
FROM
(
    SELECT
        person,
        groupname,
        age,
        @rn := IF(@prev = groupname, @rn + 1, 1) AS rn,
        @prev := groupname
    FROM mytable
    JOIN (SELECT @prev := NULL, @rn := 0) AS vars
    ORDER BY groupname, age DESC, person
) AS T1
WHERE rn <= 2

See it working online: sqlfiddle

Edit: I just noticed that bluefeet posted a very similar answer: +1 to him. However, this answer has two small advantages:

It is a single query. The variables are initialized inside the SELECT statement.
It handles ties as described in the question (alphabetical order by name).

So I'll leave it here in case it can help someone.
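
For comparison, here is a minimal sketch of the ROW_NUMBER form that the answer says other databases support. It runs the query through Python's built-in sqlite3 module, assuming the bundled SQLite is 3.25 or newer (the first release with window functions). The table and column names follow the query above; the row for "Anna" is a made-up addition so that the alphabetical tie-break on person is visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (person TEXT, groupname INTEGER, age INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        ("Bob", 1, 32), ("Jill", 1, 34), ("Shawn", 1, 42),
        ("Anna", 1, 34),  # hypothetical row: ties with Jill on age
        ("Jake", 2, 29), ("Paul", 2, 36), ("Laura", 2, 39),
    ],
)

# Number the rows inside each group, oldest first, name as tie-breaker,
# then keep only the first two rows of every group.
query = """
SELECT person, groupname, age
FROM (
    SELECT person, groupname, age,
           ROW_NUMBER() OVER (
               PARTITION BY groupname
               ORDER BY age DESC, person
           ) AS rn
    FROM mytable
) AS t1
WHERE rn <= 2
ORDER BY groupname, age DESC, person
"""
top_two = conn.execute(query).fetchall()
for row in top_two:
    print(row)
```

Anna and Jill are both 34, so only one of them fits in the top 2 of group 1; the secondary sort on person picks Anna.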

User · Answer

Check this out:

SELECT
  p.Person,
  p.`Group`,
  p.Age
FROM
  people p
  INNER JOIN
  (
    SELECT MAX(Age) AS Age, `Group` FROM people GROUP BY `Group`
    UNION
    SELECT MAX(p3.Age) AS Age, p3.`Group` FROM people p3 INNER JOIN (SELECT MAX(Age) AS Age, `Group` FROM people GROUP BY `Group`) p4 ON p3.Age < p4.Age AND p3.`Group` = p4.`Group` GROUP BY `Group`
  ) p2 ON p.Age = p2.Age AND p.`Group` = p2.`Group`
ORDER BY
  `Group`,
  Age DESC,
  Person;

SQL Fiddle: http://sqlfiddle.com/#!2/cdbb6/15

User · Answer

Here is one way to do this, using UNION ALL (see SQL Fiddle with Demo). This works with two groups; if you have more than two groups, then you would need to specify the group number and add queries for each group:

(
  select *
  from mytable
  where `group` = 1
  order by age desc
  LIMIT 2
)
UNION ALL
(
  select *
  from mytable
  where `group` = 2
  order by age desc
  LIMIT 2
)

There are a variety of ways to do this; see this article to determine the best route for your situation:

http://www.xaprb.com/blog/2006/12/07/how-to-select-the-firstleastmax-row-per-group-in-sql/

Edit:

This might work for you too; it generates a row number for each record. Using an example from the link above, this will return only those records with a row number of less than or equal to 2:

select person, `group`, age
from
(
   select person, `group`, age,
      (@num:=if(@group = `group`, @num +1, if(@group := `group`, 1, 1))) row_number
  from test t
  CROSS JOIN (select @num:=0, @group:=null) c
  order by `Group`, Age desc, person
) as x
where x.row_number <= 2;

See Demo

User · Answer

Snuffin's solution seems quite slow to execute when you've got plenty of rows, and the Mark Byers/Rick James and Bluefeet solutions don't work in my environment (MySQL 5.6) because the order by is applied after the select is executed. So here is a variant of the Mark Byers/Rick James solutions that fixes this issue (with an extra nested select):

select person, groupname, age
from
(
    select person, groupname, age,
    (@rn:=if(@prev = groupname, @rn +1, 1)) as rownumb,
    @prev:= groupname
    from
    (
        select person, groupname, age
        from persons
        order by groupname, age desc, person
    ) as sortedlist
    JOIN (select @prev:=NULL, @rn :=0) as vars
) as groupedlist
where rownumb<=2
order by groupname, age desc, person;

I tried a similar query on a table with 5 million rows and it returned the result in less than 3 seconds.

User · Answer

Try this:

SELECT a.person, a.group, a.age FROM person AS a WHERE
(SELECT COUNT(*) FROM person AS b
WHERE b.group = a.group AND b.age >= a.age) <= 2
ORDER BY a.group ASC, a.age DESC

DEMO
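
A small runnable sketch of this counting approach, using Python's built-in sqlite3 module rather than MySQL. The helper name top_n and the n parameter are illustrative additions, and the group column is double-quoted because GROUP is a keyword. A row is kept when at most n rows of its group are at least as old as it is.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE person (person TEXT, "group" INTEGER, age INTEGER)')
conn.executemany(
    "INSERT INTO person VALUES (?, ?, ?)",
    [
        ("Bob", 1, 32), ("Jill", 1, 34), ("Shawn", 1, 42),
        ("Jake", 2, 29), ("Paul", 2, 36), ("Laura", 2, 39),
    ],
)

def top_n(connection, n):
    """Return the n oldest rows per group (hypothetical helper)."""
    query = """
    SELECT a.person, a."group", a.age
    FROM person AS a
    WHERE (SELECT COUNT(*)
           FROM person AS b
           WHERE b."group" = a."group" AND b.age >= a.age) <= ?
    ORDER BY a."group" ASC, a.age DESC
    """
    return connection.execute(query, (n,)).fetchall()

for row in top_n(conn, 2):
    print(row)
```

Because the inner count includes every row with an equal age, rows that tie exactly at the cutoff can all be dropped, so this form does not implement the alphabetical tie-break the question asks for.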

User · Answer

How about using a self-join:

CREATE TABLE mytable (person, groupname, age);
INSERT INTO mytable VALUES('Bob',1,32);
INSERT INTO mytable VALUES('Jill',1,34);
INSERT INTO mytable VALUES('Shawn',1,42);
INSERT INTO mytable VALUES('Jake',2,29);
INSERT INTO mytable VALUES('Paul',2,36);
INSERT INTO mytable VALUES('Laura',2,39);

SELECT a.* FROM mytable AS a
  LEFT JOIN mytable AS a2
    ON a.groupname = a2.groupname AND a.age <= a2.age
GROUP BY a.person
HAVING COUNT(*) <= 2
ORDER BY a.groupname, a.age DESC;

gives me:

a.person    a.groupname  a.age
----------  -----------  ----------
Shawn       1            42
Jill        1            34
Laura       2            39
Paul        2            36

I was strongly inspired by the answer from Bill Karwin to Select top 10 records for each category.

Also, I'm using SQLite, but this should work on MySQL.

Another thing: in the above, I replaced the group column with a groupname column for convenience.

Edit:

Following up on the OP's comment regarding missing tie results, I built on snuffin's answer to show all the ties. This means that if the last rows are ties, more than 2 rows can be returned, as shown below:

.headers on
.mode column

CREATE TABLE foo (person, groupname, age);
INSERT INTO foo VALUES('Paul',2,36);
INSERT INTO foo VALUES('Laura',2,39);
INSERT INTO foo VALUES('Joe',2,36);
INSERT INTO foo VALUES('Bob',1,32);
INSERT INTO foo VALUES('Jill',1,34);
INSERT INTO foo VALUES('Shawn',1,42);
INSERT INTO foo VALUES('Jake',2,29);
INSERT INTO foo VALUES('James',2,15);
INSERT INTO foo VALUES('Fred',1,12);
INSERT INTO foo VALUES('Chuck',3,112);

SELECT a.person, a.groupname, a.age
FROM foo AS a
WHERE a.age >= (SELECT MIN(b.age)
                FROM foo AS b
                WHERE (SELECT COUNT(*)
                       FROM foo AS c
                       WHERE c.groupname = b.groupname AND c.age >= b.age) <= 2
                GROUP BY b.groupname)
ORDER BY a.groupname ASC, a.age DESC;

gives me:

person      groupname   age
----------  ----------  ----------
Shawn       1           42
Jill        1           34
Laura       2           39
Paul        2           36
Joe         2           36
Chuck       3           112
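
To illustrate the "missing tie results" that prompted the edit, here is a rough sketch that runs the first self-join query on data containing a tie. It uses Python's built-in sqlite3 module (the answer itself was written against SQLite); the rows are a subset of the foo data above, and the variable name rows_kept is an illustrative choice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (person TEXT, groupname INTEGER, age INTEGER)")
conn.executemany(
    "INSERT INTO foo VALUES (?, ?, ?)",
    [
        ("Paul", 2, 36), ("Laura", 2, 39), ("Joe", 2, 36),  # Paul and Joe tie
        ("Bob", 1, 32), ("Jill", 1, 34), ("Shawn", 1, 42),
        ("Jake", 2, 29),
    ],
)

# Each row of a is joined to every row of its group that is at least as old
# (including itself); rows with more than two such matches are filtered out.
query = """
SELECT a.person, a.groupname, a.age
FROM foo AS a
LEFT JOIN foo AS a2
  ON a.groupname = a2.groupname AND a.age <= a2.age
GROUP BY a.person
HAVING COUNT(*) <= 2
ORDER BY a.groupname, a.age DESC
"""
rows_kept = conn.execute(query).fetchall()
for row in rows_kept:
    print(row)
```

Paul and Joe each match three rows (Laura, Paul, Joe), so both are filtered out and group 2 comes back with only Laura.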

User · Answer

SELECT
    p1.Person, p1.`GROUP`, p1.Age
FROM
    person AS p1
WHERE
(
    SELECT
        COUNT(DISTINCT(p2.age))
    FROM
        person AS p2
    WHERE
        p2.`GROUP` = p1.`GROUP`
        AND p2.Age > p1.Age
) < 2
ORDER BY
    p1.`GROUP` ASC, p1.age DESC

Reference: LeetCode
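
Counting distinct ages makes this behave like a dense rank: every row whose age is among the top 2 distinct ages of its group is returned, so ties can yield more than 2 rows per group. The sketch below shows that behaviour with Python's built-in sqlite3 module. It assumes a strict ">" comparison with "< 2", adds the made-up row "Joe" to create a tie, and appends person to the ORDER BY only to make the output order deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE person (person TEXT, "group" INTEGER, age INTEGER)')
conn.executemany(
    "INSERT INTO person VALUES (?, ?, ?)",
    [
        ("Bob", 1, 32), ("Jill", 1, 34), ("Shawn", 1, 42),
        ("Jake", 2, 29), ("Paul", 2, 36), ("Laura", 2, 39),
        ("Joe", 2, 36),  # hypothetical row: ties with Paul on age
    ],
)

# Keep a row when fewer than 2 distinct ages in its group are strictly greater.
query = """
SELECT p1.person, p1."group", p1.age
FROM person AS p1
WHERE (SELECT COUNT(DISTINCT p2.age)
       FROM person AS p2
       WHERE p2."group" = p1."group" AND p2.age > p1.age) < 2
ORDER BY p1."group" ASC, p1.age DESC, p1.person
"""
dense_top_two = conn.execute(query).fetchall()
for row in dense_top_two:
    print(row)
```

Group 2 returns three rows here (Laura, Joe, Paul), which differs from the question's requirement that ties should not yield extra results.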

User · Answer

I wanted to share this because I spent a long time searching for an easy way to implement this in a Java program I'm working on. This doesn't quite give the output you're looking for, but it's close. The MySQL function GROUP_CONCAT() worked really well for specifying how many results to return in each group. Using LIMIT or any of the other fancy ways of trying to do this with COUNT didn't work for me. So if you're willing to accept a modified output, it's a great solution.

Let's say I have a table called 'student' with student ids, their gender, and GPA, and I want the top 5 GPAs for each gender. Then I can write the query like this:

SELECT sex, SUBSTRING_INDEX(GROUP_CONCAT(cast(gpa AS char ) ORDER BY gpa desc), ',',5)
AS subcategories FROM student GROUP BY sex;

Note that the parameter '5' tells it how many entries to concatenate into each row.

And the output would look something like:

+--------+-----------------+
| Male   | 4,4,4,4,3.9     |
| Female | 4,4,3.9,3.9,3.8 |
+--------+-----------------+

You can also change the ORDER BY variable and order them a different way. So if I had the student's age I could replace the 'gpa desc' with 'age desc' and it will work. You can also add variables to the GROUP BY clause to get more columns in the output. So this is just a way I found that is pretty flexible and works well if you are OK with just listing results.

User · Answer

If the other answers are not fast enough, give this code a try:

SELECT
        province, n, city, population
    FROM
      ( SELECT  @prev := '', @n := 0 ) init
    JOIN
      ( SELECT  @n := if(province != @prev, 1, @n + 1) AS n,
                @prev := province,
                province, city, population
            FROM  Canada
            ORDER BY
                province   ASC,
                population DESC
      ) x
    WHERE  n <= 3
    ORDER BY  province, n;

Output:

+---------------------------+------+------------------+------------+
| province                  | n    | city             | population |
+---------------------------+------+------------------+------------+
| Alberta                   |    1 | Calgary          |     968475 |
| Alberta                   |    2 | Edmonton         |     822319 |
| Alberta                   |    3 | Red Deer         |      73595 |
| British Columbia          |    1 | Vancouver        |    1837970 |
| British Columbia          |    2 | Victoria         |     289625 |
| British Columbia          |    3 | Abbotsford       |     151685 |
| Manitoba                  |    1

User · Answer

In SQL Server, row_number() is a powerful function that gets the result easily, as below:

select Person, [group], age
from
(
    select *, row_number() over(partition by [group] order by age desc) rn
    from mytable
) t
where rn <= 2

User · Answer

There is a really nice answer to this problem at MySQL - How To Get Top N Rows per Each Group.

Based on the solution in the referenced link, your query would be like:

SELECT Person, `Group`, Age
   FROM
     (SELECT Person, `Group`, Age,
                  @group_rank := IF(@current_group = `Group`, @group_rank + 1, 1) AS group_rank,
                  @current_group := `Group`
       FROM `your_table`
       ORDER BY `Group`, Age DESC
     ) ranked
   WHERE group_rank <= n
   ORDER BY `Group`, Age DESC;

where n is the top n and your_table is the name of your table.

I think the explanation in the reference is really clear. For quick reference I will copy and paste it here:

"Currently MySQL does not support ROW_NUMBER() function that can assign a sequence number within a group, but as a workaround we can use MySQL session variables.

These variables do not require declaration, and can be used in a query to do calculations and to store intermediate results.

@current_country := country
This code is executed for each row and stores the value of country column to @current_country variable.

@country_rank := IF(@current_country = country, @country_rank + 1, 1)
In this code, if @current_country is the same we increment rank, otherwise set it to 1. For the first row @current_country is NULL, so rank is also set to 1.

For correct ranking, we need to have ORDER BY country, population DESC"
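
The per-row logic described in the quoted explanation can be traced outside the database. The following is a minimal Python sketch of that logic only, not of how MySQL evaluates session variables: current_group and group_rank stand in for @current_group and @group_rank, the function name top_n_per_group is an illustrative choice, and the sort adds person as a final tie-breaker (an assumption taken from the question, not from the quoted query).

```python
def top_n_per_group(rows, n):
    """rows: iterable of (person, group, age) tuples; returns the top n per group."""
    # Equivalent of ORDER BY group, age DESC, person.
    ordered = sorted(rows, key=lambda r: (r[1], -r[2], r[0]))

    current_group = None  # plays the role of @current_group (NULL before the first row)
    group_rank = 0        # plays the role of @group_rank
    result = []
    for person, group, age in ordered:
        # Same group as the previous row: increment the rank; otherwise restart at 1.
        group_rank = group_rank + 1 if group == current_group else 1
        current_group = group
        if group_rank <= n:  # the outer WHERE group_rank <= n
            result.append((person, group, age))
    return result


people = [
    ("Bob", 1, 32), ("Jill", 1, 34), ("Shawn", 1, 42),
    ("Jake", 2, 29), ("Paul", 2, 36), ("Laura", 2, 39),
]
for row in top_n_per_group(people, 2):
    print(row)
```

The sketch also shows why the ordering matters: the rank only restarts correctly when all rows of a group arrive consecutively and already sorted by age.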
