[sql] Fetch the row which has the Max value for a column

Table:

UserId, Value, Date.

I want to get the UserId, Value for the max(Date) for each UserId. That is, the Value for each UserId that has the latest date. Is there a way to do this simply in SQL? (Preferably Oracle)

Update: Apologies for any ambiguity: I need to get ALL the UserIds. But for each UserId, only that row where that user has the latest date.

This question is related to sql oracle greatest-n-per-group

The answer is


Just had to write a "live" example at work :)

This one supports multiple values for UserId on the same date.

Columns: UserId, Value, Date

SELECT
   DISTINCT UserId,
   MAX(Date) OVER (PARTITION BY UserId ORDER BY Date DESC),
   MAX(Values) OVER (PARTITION BY UserId ORDER BY Date DESC)
FROM
(
   SELECT UserId, Date, SUM(Value) As Values
   FROM <<table_name>>
   GROUP BY UserId, Date
)

You can use FIRST_VALUE instead of MAX and look it up in the explain plan. I didn't have the time to play with it.

Of course, if searching through huge tables, it's probably better if you use FULL hints in your query.


(T-SQL) First get all the users and their maxdate. Join with the table to find the corresponding values for the users on the maxdates.

create table users (userid int , value int , date datetime)
insert into users values (1, 1, '20010101')
insert into users values (1, 2, '20020101')
insert into users values (2, 1, '20010101')
insert into users values (2, 3, '20030101')

select T1.userid, T1.value, T1.date 
    from users T1,
    (select max(date) as maxdate, userid from users group by userid) T2    
    where T1.userid= T2.userid and T1.date = T2.maxdate

results:

userid      value       date                                    
----------- ----------- -------------------------- 
2           3           2003-01-01 00:00:00.000
1           2           2002-01-01 00:00:00.000

I know you asked for Oracle, but in SQL 2005 we now use this:


-- Single Value
;WITH ByDate
AS (
SELECT UserId, Value, ROW_NUMBER() OVER (PARTITION BY UserId ORDER BY Date DESC) RowNum
FROM UserDates
)
SELECT UserId, Value
FROM ByDate
WHERE RowNum = 1

-- Multiple values where dates match
;WITH ByDate
AS (
SELECT UserId, Value, RANK() OVER (PARTITION BY UserId ORDER BY Date DESC) Rnk
FROM UserDates
)
SELECT UserId, Value
FROM ByDate
WHERE Rnk = 1

I think this should work?

Select
T1.UserId,
(Select Top 1 T2.Value From Table T2 Where T2.UserId = T1.UserId Order By Date Desc) As 'Value'
From
Table T1
Group By
T1.UserId
Order By
T1.UserId

This should be as simple as:

SELECT UserId, Value
FROM Users u
WHERE Date = (SELECT MAX(Date) FROM Users WHERE UserID = u.UserID)

Assuming Date is unique for a given UserID, here's some TSQL:

SELECT 
    UserTest.UserID, UserTest.Value
FROM UserTest
INNER JOIN
(
    SELECT UserID, MAX(Date) MaxDate
    FROM UserTest
    GROUP BY UserID
) Dates
ON UserTest.UserID = Dates.UserID
AND UserTest.Date = Dates.MaxDate 

In Oracle 12c+, you can use Top n queries along with analytic function rank to achieve this very concisely without subqueries:

select *
from your_table
order by rank() over (partition by user_id order by my_date desc)
fetch first 1 row with ties;

The above returns all the rows with max my_date per user.

If you want only one row with max date, then replace the rank with row_number:

select *
from your_table
order by row_number() over (partition by user_id order by my_date desc)
fetch first 1 row with ties; 

select   UserId,max(Date) over (partition by UserId) value from users;

Just tested this and it seems to work on a logging table

select ColumnNames, max(DateColumn) from log  group by ColumnNames order by 1 desc

I think something like this. (Forgive me for any syntax mistakes; I'm used to using HQL at this point!)

EDIT: Also misread the question! Corrected the query...

SELECT UserId, Value
FROM Users AS user
WHERE Date = (
    SELECT MAX(Date)
    FROM Users AS maxtest
    WHERE maxtest.UserId = user.UserId
)

Not being at work, I don't have Oracle to hand, but I seem to recall that Oracle allows multiple columns to be matched in an IN clause, which should at least avoid the options that use a correlated subquery, which is seldom a good idea.

Something like this, perhaps (can't remember if the column list should be parenthesised or not):

SELECT * 
FROM MyTable
WHERE (User, Date) IN
  ( SELECT User, MAX(Date) FROM MyTable GROUP BY User)

EDIT: Just tried it for real:

SQL> create table MyTable (usr char(1), dt date);
SQL> insert into mytable values ('A','01-JAN-2009');
SQL> insert into mytable values ('B','01-JAN-2009');
SQL> insert into mytable values ('A', '31-DEC-2008');
SQL> insert into mytable values ('B', '31-DEC-2008');
SQL> select usr, dt from mytable
  2  where (usr, dt) in 
  3  ( select usr, max(dt) from mytable group by usr)
  4  /

U DT
- ---------
A 01-JAN-09
B 01-JAN-09

So it works, although some of the new-fangly stuff mentioned elsewhere may be more performant.


check this link if your questions seems similar to that page then i would suggest you the following query which will give the solution for that link

select distinct sno,item_name,max(start_date) over(partition by sno),max(end_date) over(partition by sno),max(creation_date) over(partition by sno), max(last_modified_date) over(partition by sno) from uniq_select_records order by sno,item_name asc;

will given accurate results related to that link


SELECT a.userid,a.values1,b.mm 
FROM table_name a,(SELECT userid,Max(date1)AS mm FROM table_name GROUP BY userid) b
WHERE a.userid=b.userid AND a.DATE1=b.mm;

I don't have Oracle to test it, but the most efficient solution is to use analytic queries. It should look something like this:

SELECT DISTINCT
    UserId
  , MaxValue
FROM (
    SELECT UserId
      , FIRST (Value) Over (
          PARTITION BY UserId
          ORDER BY Date DESC
        ) MaxValue
    FROM SomeTable
  )

I suspect that you can get rid of the outer query and put distinct on the inner, but I'm not sure. In the meantime I know this one works.

If you want to learn about analytic queries, I'd suggest reading http://www.orafaq.com/node/55 and http://www.akadia.com/services/ora_analytic_functions.html. Here is the short summary.

Under the hood analytic queries sort the whole dataset, then process it sequentially. As you process it you partition the dataset according to certain criteria, and then for each row looks at some window (defaults to the first value in the partition to the current row - that default is also the most efficient) and can compute values using a number of analytic functions (the list of which is very similar to the aggregate functions).

In this case here is what the inner query does. The whole dataset is sorted by UserId then Date DESC. Then it processes it in one pass. For each row you return the UserId and the first Date seen for that UserId (since dates are sorted DESC, that's the max date). This gives you your answer with duplicated rows. Then the outer DISTINCT squashes duplicates.

This is not a particularly spectacular example of analytic queries. For a much bigger win consider taking a table of financial receipts and calculating for each user and receipt, a running total of what they paid. Analytic queries solve that efficiently. Other solutions are less efficient. Which is why they are part of the 2003 SQL standard. (Unfortunately Postgres doesn't have them yet. Grrr...)


SELECT a.* 
FROM user a INNER JOIN (SELECT userid,Max(date) AS date12 FROM user1 GROUP BY userid) b  
ON a.date=b.date12 AND a.userid=b.userid ORDER BY a.userid;

With PostgreSQL 8.4 or later, you can use this:

select user_id, user_value_1, user_value_2
  from (select user_id, user_value_1, user_value_2, row_number()
          over (partition by user_id order by user_date desc) 
        from users) as r
  where r.row_number=1

SELECT userid, MAX(value) KEEP (DENSE_RANK FIRST ORDER BY date DESC)
  FROM table
  GROUP BY userid

Wouldn't a QUALIFY clause be both simplest and best?

select userid, my_date, ...
from users
qualify rank() over (partition by userid order by my_date desc) = 1

For context, on Teradata here a decent size test of this runs in 17s with this QUALIFY version and in 23s with the 'inline view'/Aldridge solution #1.


If you're using Postgres, you can use array_agg like

SELECT userid,MAX(adate),(array_agg(value ORDER BY adate DESC))[1] as value
FROM YOURTABLE
GROUP BY userid

I'm not familiar with Oracle. This is what I came up with

SELECT 
  userid,
  MAX(adate),
  SUBSTR(
    (LISTAGG(value, ',') WITHIN GROUP (ORDER BY adate DESC)),
    0,
    INSTR((LISTAGG(value, ',') WITHIN GROUP (ORDER BY adate DESC)), ',')-1
  ) as value 
FROM YOURTABLE
GROUP BY userid 

Both queries return the same results as the accepted answer. See SQLFiddles:

  1. Accepted answer
  2. My solution with Postgres
  3. My solution with Oracle

I'm quite late to the party but the following hack will outperform both correlated subqueries and any analytics function but has one restriction: values must convert to strings. So it works for dates, numbers and other strings. The code does not look good but the execution profile is great.

select
    userid,
    to_number(substr(max(to_char(date,'yyyymmdd') || to_char(value)), 9)) as value,
    max(date) as date
from 
    users
group by
    userid

The reason why this code works so well is that it only needs to scan the table once. It does not require any indexes and most importantly it does not need to sort the table, which most analytics functions do. Indexes will help though if you need to filter the result for a single userid.


Use the code:

select T.UserId,T.dt from (select UserId,max(dt) 
over (partition by UserId) as dt from t_users)T where T.dt=dt;

This will retrieve the results, irrespective of duplicate values for UserId. If your UserId is unique, well it becomes more simple:

select UserId,max(dt) from t_users group by UserId;

I don't know your exact columns names, but it would be something like this:

    select userid, value
      from users u1
     where date = (select max(date)
                     from users u2
                    where u1.userid = u2.userid)

If (UserID, Date) is unique, i.e. no date appears twice for the same user then:

select TheTable.UserID, TheTable.Value
from TheTable inner join (select UserID, max([Date]) MaxDate
                          from TheTable
                          group by UserID) UserMaxDate
     on TheTable.UserID = UserMaxDate.UserID
        TheTable.[Date] = UserMaxDate.MaxDate;

First try I misread the question, following the top answer, here is a complete example with correct results:

CREATE TABLE table_name (id int, the_value varchar(2), the_date datetime);

INSERT INTO table_name (id,the_value,the_date) VALUES(1 ,'a','1/1/2000');
INSERT INTO table_name (id,the_value,the_date) VALUES(1 ,'b','2/2/2002');
INSERT INTO table_name (id,the_value,the_date) VALUES(2 ,'c','1/1/2000');
INSERT INTO table_name (id,the_value,the_date) VALUES(2 ,'d','3/3/2003');
INSERT INTO table_name (id,the_value,the_date) VALUES(2 ,'e','3/3/2003');

--

  select id, the_value
      from table_name u1
      where the_date = (select max(the_date)
                     from table_name u2
                     where u1.id = u2.id)

--

id          the_value
----------- ---------
2           d
2           e
1           b

(3 row(s) affected)

i thing you shuold make this variant to previous query:

SELECT UserId, Value FROM Users U1 WHERE 
Date = ( SELECT MAX(Date)    FROM Users where UserId = U1.UserId)

This will also take care of duplicates (return one row for each user_id):

SELECT *
FROM (
  SELECT u.*, FIRST_VALUE(u.rowid) OVER(PARTITION BY u.user_id ORDER BY u.date DESC) AS last_rowid
  FROM users u
) u2
WHERE u2.rowid = u2.last_rowid

select userid, value, date
  from thetable t1 ,
       ( select t2.userid, max(t2.date) date2 
           from thetable t2 
          group by t2.userid ) t3
 where t3.userid t1.userid and
       t3.date2 = t1.date

IMHO this works. HTH


Solution for MySQL which doesn't have concepts of partition KEEP, DENSE_RANK.

select userid,
       my_date,
       ...
from
(
select @sno:= case when @pid<>userid then 0
                    else @sno+1
    end as serialnumber, 
    @pid:=userid,
       my_Date,
       ...
from   users order by userid, my_date
) a
where a.serialnumber=0

Reference: http://benincampus.blogspot.com/2013/08/select-rows-which-have-maxmin-value-in.html


I see many people use subqueries or else window functions to do this, but I often do this kind of query without subqueries in the following way. It uses plain, standard SQL so it should work in any brand of RDBMS.

SELECT t1.*
FROM mytable t1
  LEFT OUTER JOIN mytable t2
    ON (t1.UserId = t2.UserId AND t1."Date" < t2."Date")
WHERE t2.UserId IS NULL;

In other words: fetch the row from t1 where no other row exists with the same UserId and a greater Date.

(I put the identifier "Date" in delimiters because it's an SQL reserved word.)

In case if t1."Date" = t2."Date", doubling appears. Usually tables has auto_inc(seq) key, e.g. id. To avoid doubling can be used follows:

SELECT t1.*
FROM mytable t1
  LEFT OUTER JOIN mytable t2
    ON t1.UserId = t2.UserId AND ((t1."Date" < t2."Date") 
         OR (t1."Date" = t2."Date" AND t1.id < t2.id))
WHERE t2.UserId IS NULL;

Re comment from @Farhan:

Here's a more detailed explanation:

An outer join attempts to join t1 with t2. By default, all results of t1 are returned, and if there is a match in t2, it is also returned. If there is no match in t2 for a given row of t1, then the query still returns the row of t1, and uses NULL as a placeholder for all of t2's columns. That's just how outer joins work in general.

The trick in this query is to design the join's matching condition such that t2 must match the same userid, and a greater date. The idea being if a row exists in t2 that has a greater date, then the row in t1 it's compared against can't be the greatest date for that userid. But if there is no match -- i.e. if no row exists in t2 with a greater date than the row in t1 -- we know that the row in t1 was the row with the greatest date for the given userid.

In those cases (when there's no match), the columns of t2 will be NULL -- even the columns specified in the join condition. So that's why we use WHERE t2.UserId IS NULL, because we're searching for the cases where no row was found with a greater date for the given userid.


Select  
   UserID,  
   Value,  
   Date  
From  
   Table,  
   (  
      Select  
          UserID,  
          Max(Date) as MDate  
      From  
          Table  
      Group by  
          UserID  
    ) as subQuery  
Where  
   Table.UserID = subQuery.UserID and  
   Table.Date = subQuery.mDate  

Below query can work :

SELECT user_id, value, date , row_number() OVER (PARTITION BY user_id ORDER BY date desc) AS rn
FROM table_name
WHERE rn= 1

The answer here is Oracle only. Here's a bit more sophisticated answer in all SQL:

Who has the best overall homework result (maximum sum of homework points)?

SELECT FIRST, LAST, SUM(POINTS) AS TOTAL
FROM STUDENTS S, RESULTS R
WHERE S.SID = R.SID AND R.CAT = 'H'
GROUP BY S.SID, FIRST, LAST
HAVING SUM(POINTS) >= ALL (SELECT SUM (POINTS)
FROM RESULTS
WHERE CAT = 'H'
GROUP BY SID)

And a more difficult example, which need some explanation, for which I don't have time atm:

Give the book (ISBN and title) that is most popular in 2008, i.e., which is borrowed most often in 2008.

SELECT X.ISBN, X.title, X.loans
FROM (SELECT Book.ISBN, Book.title, count(Loan.dateTimeOut) AS loans
FROM CatalogEntry Book
LEFT JOIN BookOnShelf Copy
ON Book.bookId = Copy.bookId
LEFT JOIN (SELECT * FROM Loan WHERE YEAR(Loan.dateTimeOut) = 2008) Loan 
ON Copy.copyId = Loan.copyId
GROUP BY Book.title) X
HAVING loans >= ALL (SELECT count(Loan.dateTimeOut) AS loans
FROM CatalogEntry Book
LEFT JOIN BookOnShelf Copy
ON Book.bookId = Copy.bookId
LEFT JOIN (SELECT * FROM Loan WHERE YEAR(Loan.dateTimeOut) = 2008) Loan 
ON Copy.copyId = Loan.copyId
GROUP BY Book.title);

Hope this helps (anyone).. :)

Regards, Guus


Use ROW_NUMBER() to assign a unique ranking on descending Date for each UserId, then filter to the first row for each UserId (i.e., ROW_NUMBER = 1).

SELECT UserId, Value, Date
FROM (SELECT UserId, Value, Date,
        ROW_NUMBER() OVER (PARTITION BY UserId ORDER BY Date DESC) rn
      FROM users) u
WHERE rn = 1;

select VALUE from TABLE1 where TIME = 
   (select max(TIME) from TABLE1 where DATE= 
   (select max(DATE) from TABLE1 where CRITERIA=CRITERIA))

Examples related to sql

Passing multiple values for same variable in stored procedure SQL permissions for roles Generic XSLT Search and Replace template Access And/Or exclusions Pyspark: Filter dataframe based on multiple conditions Subtracting 1 day from a timestamp date PYODBC--Data source name not found and no default driver specified select rows in sql with latest date for each ID repeated multiple times ALTER TABLE DROP COLUMN failed because one or more objects access this column Create Local SQL Server database

Examples related to oracle

concat yesterdays date with a specific time ORA-28001: The password has expired how to modify the size of a column How to create a blank/empty column with SELECT query in oracle? Find the number of employees in each department - SQL Oracle Query to display all tablespaces in a database and datafiles When or Why to use a "SET DEFINE OFF" in Oracle Database How to insert date values into table error: ORA-65096: invalid common user or role name in oracle In Oracle SQL: How do you insert the current date + time into a table?

Examples related to greatest-n-per-group

How to select the rows with maximum values in each group with dplyr? MAX function in where clause mysql Pandas get topmost n records within each group SQL Left Join first match only Select info from table where row has max date FORCE INDEX in MySQL - where do I put it? GROUP BY having MAX date How can I select rows with most recent timestamp for each key value? Select row with most recent date per user How to select id with max date group by category in PostgreSQL?