I have a table where each ID is repeated 3 times. there is a date in front of each id in each row.
I want to select entire row for each ID where date is latest.
There are total 370 columns in this table i want all columns to get selected when i select that row.
Sample -
ID Name Date Marks .. .. ..
1 XY 4/3/2017 27
1 fv 4/3/2014 98
1 jk 4/3/2016 09
2 RF 4/12/2015 87
2 kk 4/3/2009 56
2 PP 4/3/2011 76
3 ee 4/3/2001 12
3 ppp 4/3/2003 09
3 lll 4/3/2011 23
The Answer should be
ID Name Date Marks .. .. ..
1 XY 4/3/2017 27
2 RF 4/12/2015 87
3 lll 4/3/2011 23
I am attempting as below -
select distinct ID,*,max(date) as maxdate from table
Also i am trying this in Hive . so not sure if some sql functions dont work in Hive
Thanks
One way is:
select table.*
from table
join
(
select ID, max(Date) as max_dt
from table
group by ID
) t
on table.ID= t.ID and table.Date = t.max_dt
Note that if you have multiple equally higher dates for same ID, then you will get all those rows in result
Have you tried the following:
SELECT ID, COUNT(*), max(date)
FROM table
GROUP BY ID;
Here's one way. The inner query gets the max date for each id. Then you can join that back to your main table to get the rows that match.
select
*
from
<your table>
inner join
(select id, max(<date col> as max_date) m
where yourtable.id = m.id
and yourtable.datecolumn = m.max_date)
You can use a join to do this
SELECT t1.* from myTable t1
LEFT OUTER JOIN myTable t2 on t2.ID=t1.ID AND t2.`Date` > t1.`Date`
WHERE t2.`Date` IS NULL;
Only rows which have the latest date for each ID with have a NULL join to t2.
You can do this with a Correlated Subquery (That is a subquery wherein you reference a field in the main query). In this case:
SELECT *
FROM yourtable t1
WHERE date = (SELECT max(date) from yourtable WHERE id = t1.id)
Here we give the yourtable
table an alias of t1
and then use that alias in the subquery grabbing the max(date)
from the same table yourtable
for that id
.
Source: Stackoverflow.com