Fetch the rows which have the Max value for a column for each distinct value of another column

641

Table:

UserId, Value, Date.

I want to get the UserId, Value for the max(Date) for each UserId. That is, the Value for each UserId that has the latest date. Is there a way to do this simply in SQL? (Preferably Oracle)

Update: Apologies for any ambiguity: I need to get ALL the UserIds. But for each UserId, only that row where that user has the latest date.

9

441

This retrieves every row whose my_date equals the maximum my_date for its userid. If several rows for a userid share that maximum date, all of them are returned.

select userid,
       my_date,
       ...
from
(
select userid,
       my_date,
       ...,
       max(my_date) over (partition by userid) max_my_date
from   users
)
where my_date = max_my_date
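
To try the query without an existing table, here is a minimal self-contained sketch. The rows in the WITH clause are invented for illustration, and a value column is assumed in place of the elided "..." column list; only userid, my_date and users come from the query above.

```sql
-- Hypothetical sample data: the rows are made up; userid 1 has two rows
-- sharing its latest date so the tie behaviour is visible.
with users as (
  select 1 as userid, date '2024-01-01' as my_date, 10 as value from dual union all
  select 1, date '2024-03-01', 20 from dual union all
  select 1, date '2024-03-01', 30 from dual union all
  select 2, date '2024-02-15', 40 from dual
)
select userid,
       my_date,
       value
from
(
select userid,
       my_date,
       value,
       -- no order by inside over(): the max is taken over the whole partition
       max(my_date) over (partition by userid) max_my_date
from   users
)
where my_date = max_my_date
order by userid, value
```

With this data the query should return three rows: both 2024-03-01 rows for userid 1 (values 20 and 30) and the single row for userid 2.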

“Analytic functions rock”

Edit: With regard to the first comment …

“using analytic queries and a self-join defeats the purpose of analytic queries”

There is no self-join in this code. There is instead a predicate placed on the result of the inline view that contains the analytic function — a very different matter, and completely standard practice.

“The default window in Oracle is from the first row in the partition to the current one”

The windowing clause is only applicable in the presence of the order by clause. With no order by clause, no windowing clause is applied by default and none can be explicitly specified.
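
The difference can be seen side by side in a small sketch. The sample rows and the two column aliases are invented for illustration. When an order by is present and no windowing clause is given, Oracle documents the default window as RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, which turns the max into a running max.

```sql
-- Hypothetical sample data, invented for illustration.
with users as (
  select 1 as userid, date '2024-01-01' as my_date from dual union all
  select 1, date '2024-02-01' from dual union all
  select 1, date '2024-03-01' from dual
)
select userid,
       my_date,
       -- no order by: one value for the whole partition (2024-03-01 on every row)
       max(my_date) over (partition by userid) as max_whole_partition,
       -- order by present: default window ends at the current row, so this is
       -- a running max and here simply equals my_date on each row
       max(my_date) over (partition by userid order by my_date) as max_running
from   users
order by userid, my_date
```

Filtering on my_date = max_running would keep every row, which is why the answer's query leaves the order by out.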

The code works.

14

  • 46

    When applied to a table having 8.8 million rows, this query took half the time of the queries in some of the other highly voted answers.

    Apr 15, 2011 at 23:59

  • 5

    Anyone care to post a link to the MySQL equivalent of this, if there is one?

    – redolent

    Jan 10, 2015 at 2:35

  • 3

    Couldn’t this return duplicates? E.g., if two rows have the same user_id and the same date (which happens to be the max).

    – jastr

    Jun 15, 2016 at 19:30

  • 3

    @jastr I think that was acknowledged in the question

    Jun 17, 2016 at 15:47

  • 8

    Instead of MAX(...) OVER (...) you can also use ROW_NUMBER() OVER (...) (for the top-n-per-group) or RANK() OVER (...) (for the greatest-n-per-group).

    – MT0

    Jun 27, 2016 at 8:13
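
A minimal sketch of the ROW_NUMBER() variant mentioned in the last comment, for the case where exactly one row per userid is wanted even when the latest date is tied. The sample rows, the value column and the tie-breaker (highest value wins) are assumptions made for illustration; any deterministic ordering would do.

```sql
-- Hypothetical sample data: userid 1 has two rows on its latest date.
with users as (
  select 1 as userid, date '2024-01-01' as my_date, 10 as value from dual union all
  select 1, date '2024-03-01', 20 from dual union all
  select 1, date '2024-03-01', 30 from dual union all
  select 2, date '2024-02-15', 40 from dual
)
select userid,
       my_date,
       value
from
(
select userid,
       my_date,
       value,
       -- number the rows of each user from latest to earliest;
       -- "value desc" is an assumed tie-breaker so the result is deterministic
       row_number() over (partition by userid
                          order by my_date desc, value desc) rn
from   users
)
where rn = 1
order by userid
```

With this data the result should be one row per user: (1, 2024-03-01, 30) and (2, 2024-02-15, 40). Swapping row_number() for rank() with order by my_date desc alone would return tied rows again, like the MAX(...) OVER (...) version.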


170

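-- DATE and TABLE are reserved words in Oracle; my_date and users stand in for the real names
SELECT userid, MAX(value) KEEP (DENSE_RANK FIRST ORDER BY my_date DESC) AS latest_value
  FROM users
  GROUP BY userid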

4

  • 5

    In my tests using a table having a large number of rows, this solution took about twice as long as that in the accepted answer.

    Apr 15, 2011 at 23:16

  • I confirm it’s much faster than other solutions

    Sep 12, 2012 at 1:02

  • 5

    Trouble is, it does not return the full record.

    Aug 7, 2014 at 7:03

  • @user2067753 No, it doesn’t return the full record. You can use the same MAX()..KEEP.. expression on multiple columns, so you can select all the columns you need. But it is inconvenient if you want a large number of columns and would prefer to use SELECT *.

    Aug 7, 2014 at 19:54
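
A sketch of what the last comment describes: repeating the same KEEP expression once per column that is needed. The table name users, the column names my_date and label, the aliases and the sample rows are all assumptions made for illustration.

```sql
-- Hypothetical sample data, invented for illustration.
with users as (
  select 1 as userid, date '2024-01-01' as my_date, 10 as value, 'a' as label from dual union all
  select 1, date '2024-03-01', 20, 'b' from dual union all
  select 2, date '2024-02-15', 40, 'c' from dual
)
select userid,
       max(my_date) as latest_date,
       -- each KEEP expression looks only at the rows ranked first by my_date desc
       max(value) keep (dense_rank first order by my_date desc) as latest_value,
       max(label) keep (dense_rank first order by my_date desc) as latest_label
from   users
group by userid
order by userid
```

With this data the result should be (1, 2024-03-01, 20, 'b') and (2, 2024-02-15, 40, 'c'). One caveat: if several rows tie on the latest date, each MAX is taken independently over the tied rows, so latest_value and latest_label may come from different rows.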