[sql] SQL: Group by minimum value in one field while selecting distinct rows

Here's what I'm trying to do. Let's say I have this table t:

key_id | id | record_date | other_cols
1      | 18 | 2011-04-03  | x
2      | 18 | 2012-05-19  | y
3      | 18 | 2012-08-09  | z
4      | 19 | 2009-06-01  | a
5      | 19 | 2011-04-03  | b
6      | 19 | 2011-10-25  | c
7      | 19 | 2012-08-09  | d

For each id, I want to select the row containing the minimum record_date. So I'd get:

key_id | id | record_date | other_cols
1      | 18 | 2011-04-03  | x
4      | 19 | 2009-06-01  | a

The only solutions I've seen to this problem assume that all record_date entries are distinct, but that is not this case in my data. Using a subquery and an inner join with two conditions would give me duplicate rows for some ids, which I don't want:

key_id | id | record_date | other_cols
1      | 18 | 2011-04-03  | x
5      | 19 | 2011-04-03  | b
4      | 19 | 2009-06-01  | a

This question is related to sql group-by max distinct min

The answer is


How about something like:

SELECT mt.*     
FROM MyTable mt INNER JOIN
    (
        SELECT id, MIN(record_date) AS MinDate
        FROM MyTable
        GROUP BY id
    ) t ON mt.id = t.id AND mt.record_date = t.MinDate

This gets the minimum date per ID, and then gets the values based on those values. The only time you would have duplicates is if there are duplicate minimum record_dates for the same ID.


I would like to add to some of the other answers here, if you don't need the first item but say the second number for example you can use rownumber in a subquery and base your result set off of that.

SELECT * FROM
(
    SELECT
        ROW_NUM() OVER (PARTITION BY Id ORDER BY record_date, other_cols) as rownum,
        *
    FROM products P
) INNER
WHERE rownum = 2

This also allows you to order off multiple columns in the subquery which may help if two record_dates have identical values. You can also partition off of multiple columns if needed by delimiting them with a comma


This a old question, but this can useful for someone In my case i can't using a sub query because i have a big query and i need using min() on my result, if i use sub query the db need reexecute my big query. i'm using Mysql

select t.* 
    from (select m.*, @g := 0
        from MyTable m --here i have a big query
        order by id, record_date) t
    where (1 = case when @g = 0 or @g <> id then 1 else  0 end )
          and (@g := id) IS NOT NULL

Basically I ordered the result and then put a variable in order to get only the first record in each group.


I could get to your expected result just by doing this in :

 SELECT id, min(record_date), other_cols 
  FROM mytable
  GROUP BY id

Does this work for you?


If record_date has no duplicates within a group:

think of it as of filtering. Simpliy get (WHERE) one (MIN(record_date)) row from the current group:

SELECT * FROM t t1 WHERE record_date = (
                                 select MIN(record_date)
                                 from t t2 where t2.group_id = t1.group_id)

If there could be 2+ min record_date within a group:

  1. filter out non-min rows (see above)

  2. then (AND) pick only one from the 2+ min record_date rows, within the given group_id. E.g. pick the one with the min unique key:

                    AND key_id = (select MIN(key_id)
                                  from t t3 where t3.record_date = t1.record_date
                                              and t3.group_id    = t1.group_id)
    

so

key_id | group_id | record_date | other_cols
1      | 18       | 2011-04-03  | x
4      | 19       | 2009-06-01  | a
8      | 19       | 2009-06-01  | e

will select key_ids: #1 and #4


The below query takes the first date for each work order (in a table of showing all status changes):

SELECT
    WORKORDERNUM,
    MIN(DATE)
FROM
    WORKORDERS
WHERE
    DATE >= to_date('2015-01-01','YYYY-MM-DD')
GROUP BY
    WORKORDERNUM

select 
    department, 
    min_salary, 
    (select s1.last_name from staff s1 where s1.salary=s3.min_salary ) lastname 
from 
    (select department, min (salary) min_salary from staff s2 group by s2.department) s3

To get the cheapest product in each category, you use the MIN() function in a correlated subquery as follows:

    SELECT categoryid,
       productid,
       productName,
       unitprice 
    FROM products a WHERE unitprice = (
                SELECT MIN(unitprice)
                FROM products b
                WHERE b.categoryid = a.categoryid)

The outer query scans all rows in the products table and returns the products that have unit prices match with the lowest price in each category returned by the correlated subquery.


This does it simply:

select t2.id,t2.record_date,t2.other_cols 
from (select ROW_NUMBER() over(partition by id order by record_date)as rownum,id,record_date,other_cols from MyTable)t2 
where t2.rownum = 1

Examples related to sql

Passing multiple values for same variable in stored procedure SQL permissions for roles Generic XSLT Search and Replace template Access And/Or exclusions Pyspark: Filter dataframe based on multiple conditions Subtracting 1 day from a timestamp date PYODBC--Data source name not found and no default driver specified select rows in sql with latest date for each ID repeated multiple times ALTER TABLE DROP COLUMN failed because one or more objects access this column Create Local SQL Server database

Examples related to group-by

SELECT list is not in GROUP BY clause and contains nonaggregated column .... incompatible with sql_mode=only_full_group_by Count unique values using pandas groupby Pandas group-by and sum Count unique values with pandas per groups Group dataframe and get sum AND count? Error related to only_full_group_by when executing a query in MySql Pandas sum by groupby, but exclude certain columns Using DISTINCT along with GROUP BY in SQL Server Python Pandas : group by in group by and average? How do I create a new column from the output of pandas groupby().sum()?

Examples related to max

Min and max value of input in angular4 application numpy max vs amax vs maximum mongodb how to get max value from collections Python find min max and average of a list (array) Max length UITextField How to find the highest value of a column in a data frame in R? MAX function in where clause mysql Check if all values in list are greater than a certain number How do I get the max and min values from a set of numbers entered? SQL: Group by minimum value in one field while selecting distinct rows

Examples related to distinct

Using DISTINCT along with GROUP BY in SQL Server How to "select distinct" across multiple data frame columns in pandas? Laravel Eloquent - distinct() and count() not working properly together SQL - select distinct only on one column SQL: Group by minimum value in one field while selecting distinct rows Count distinct value pairs in multiple columns in SQL sql query distinct with Row_Number Eliminating duplicate values based on only one column of the table MongoDB distinct aggregation Pandas count(distinct) equivalent

Examples related to min

Min and max value of input in angular4 application Python find min max and average of a list (array) How do I get the max and min values from a set of numbers entered? SQL: Group by minimum value in one field while selecting distinct rows How to exclude 0 from MIN formula Excel Using std::max_element on a vector<double> How can I get the max (or min) value in a vector? Most efficient way to find smallest of 3 numbers Java? Find the smallest amongst 3 numbers in C++ Obtain smallest value from array in Javascript?