SQL
Database Management
Programming
Data Analysis
SQL Select Statement

SQL select only rows with max value on a column

Master System Design with Codemia

Enhance your system design skills with over 120 practice problems, detailed solutions, and hands-on exercises.

Introduction

Selecting rows that contain the maximum value of a column — especially the maximum per group — is one of the most common SQL tasks. There are several approaches: subquery with MAX(), JOIN with an aggregated subquery, window functions (ROW_NUMBER, RANK), and correlated subqueries. The best approach depends on your database engine and whether you need ties included.

Simple Case: Global Maximum

Get all rows where a column equals the overall maximum:

sql
SELECT *
FROM orders
WHERE amount = (SELECT MAX(amount) FROM orders);

This returns all rows with the maximum amount. If multiple rows share the max value, all are returned.

Per-Group Maximum: The Common Problem

The more common scenario is finding the row with the maximum value in each group. For example, finding the most recent order for each customer:

sql
1-- Sample data
2-- orders: id, customer_id, amount, order_date
3-- 1, 101, 50, 2024-01-15
4-- 2, 101, 75, 2024-03-20
5-- 3, 102, 30, 2024-02-10
6-- 4, 102, 90, 2024-04-05
7-- 5, 101, 60, 2024-05-01

Method 1: Subquery with GROUP BY

sql
1SELECT o.*
2FROM orders o
3INNER JOIN (
4    SELECT customer_id, MAX(order_date) AS max_date
5    FROM orders
6    GROUP BY customer_id
7) latest ON o.customer_id = latest.customer_id
8        AND o.order_date = latest.max_date;

This joins the table with a subquery that finds the max date per customer. If two orders share the same max date for a customer, both are returned.

sql
1SELECT *
2FROM (
3    SELECT *,
4           ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_date DESC) AS rn
5    FROM orders
6) ranked
7WHERE rn = 1;

ROW_NUMBER() assigns 1 to the most recent order per customer. This guarantees exactly one row per group, even with ties — the database picks one arbitrarily.

Method 3: Window Function with RANK (Include Ties)

sql
1SELECT *
2FROM (
3    SELECT *,
4           RANK() OVER (PARTITION BY customer_id ORDER BY amount DESC) AS rnk
5    FROM orders
6) ranked
7WHERE rnk = 1;

RANK() assigns the same rank to tied values. If two orders have the same highest amount for a customer, both get rnk = 1.

FunctionTiesResult
ROW_NUMBER()Breaks ties arbitrarilyExactly one row per group
RANK()Same rank for tiesMultiple rows possible
DENSE_RANK()Same rank for ties, no gapsMultiple rows possible

Method 4: Correlated Subquery

sql
1SELECT *
2FROM orders o
3WHERE order_date = (
4    SELECT MAX(order_date)
5    FROM orders
6    WHERE customer_id = o.customer_id
7);

This is readable but can be slow on large tables because the subquery runs once per row in the outer query.

Method 5: LEFT JOIN Anti-Pattern

Find rows where no other row in the same group has a higher value:

sql
1SELECT o.*
2FROM orders o
3LEFT JOIN orders o2
4    ON o.customer_id = o2.customer_id
5    AND o.order_date < o2.order_date
6WHERE o2.id IS NULL;

If no row o2 exists with a later date for the same customer, then o has the max date.

Max Value with Additional Columns

Get the maximum amount and the corresponding order details per customer:

sql
1-- Wrong: this only gives max amount, not the full row
2SELECT customer_id, MAX(amount) FROM orders GROUP BY customer_id;
3
4-- Right: use a window function to get the full row
5SELECT customer_id, amount, order_date, id
6FROM (
7    SELECT *,
8           ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY amount DESC) AS rn
9    FROM orders
10) t
11WHERE rn = 1;

Top N Per Group

Get the top 3 orders by amount for each customer:

sql
1SELECT *
2FROM (
3    SELECT *,
4           ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY amount DESC) AS rn
5    FROM orders
6) ranked
7WHERE rn <= 3;

Database-Specific Syntax

MySQL 8.0+

sql
1-- MySQL 8.0+ supports window functions
2SELECT * FROM (
3    SELECT *, ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY amount DESC) AS rn
4    FROM orders
5) t WHERE rn = 1;

MySQL 5.7 (No Window Functions)

sql
1SELECT o.*
2FROM orders o
3INNER JOIN (
4    SELECT customer_id, MAX(amount) AS max_amount
5    FROM orders
6    GROUP BY customer_id
7) m ON o.customer_id = m.customer_id AND o.amount = m.max_amount;

PostgreSQL (DISTINCT ON)

sql
SELECT DISTINCT ON (customer_id) *
FROM orders
ORDER BY customer_id, order_date DESC;

DISTINCT ON is PostgreSQL-specific and returns the first row per group based on the ORDER BY.

Performance Considerations

sql
1-- For the window function approach, ensure an index exists:
2CREATE INDEX idx_orders_customer_date ON orders (customer_id, order_date DESC);
3
4-- For the JOIN approach:
5CREATE INDEX idx_orders_customer_amount ON orders (customer_id, amount);
MethodPerformanceTiesPortability
ROW_NUMBERFast with indexNo tiesSQL standard
JOIN + MAX subqueryFast with indexIncludes tiesAll databases
Correlated subquerySlow on large tablesIncludes tiesAll databases
LEFT JOIN anti-patternModerateIncludes tiesAll databases
DISTINCT ONFastNo tiesPostgreSQL only

Common Pitfalls

  • Ambiguous max with GROUP BY: SELECT customer_id, MAX(amount), order_date FROM orders GROUP BY customer_id does not return the order_date of the row with the max amount — it returns an arbitrary order_date. Use window functions or a JOIN to get correct associated columns.
  • Performance on large tables: Correlated subqueries execute once per outer row and can be extremely slow. Prefer JOIN or window function approaches with appropriate indexes.
  • Ties: Decide upfront whether you want one row per group (use ROW_NUMBER) or all tied rows (use RANK or JOIN with MAX).
  • NULL values: MAX() ignores NULLs. If the column can be NULL, rows with NULL are excluded from the maximum calculation. Use COALESCE if NULLs should be treated as a specific value.
  • Index usage: Window functions benefit from composite indexes that match the PARTITION BY and ORDER BY columns.

Summary

  • Use ROW_NUMBER() OVER (PARTITION BY group ORDER BY col DESC) for exactly one row per group (most versatile)
  • Use RANK() or JOIN with MAX() when tied maximum values should all be returned
  • Avoid correlated subqueries on large tables — they are slow
  • PostgreSQL's DISTINCT ON is the most concise syntax for this pattern
  • Always index the partition and order columns for window function queries

Course illustration
Course illustration

All Rights Reserved.