ORDER BY and LIMIT in SQL
Contents:
Why sorting and limiting show up in every analyst loop
A PM at Stripe pings you on a Tuesday. "Pull our top 20 merchants by gross processed volume for last quarter, ranked highest to lowest." That ask is ORDER BY plus LIMIT. A growth analyst at DoorDash asks for "the ten most recent failed deliveries in the Bay Area" — same shape. A data engineer at Snowflake screens you with "second-highest salary per department" — same shape, cleverer twist.
ORDER BY and LIMIT look elementary, which is exactly why interviewers use them as a tripwire. Anyone can type LIMIT 10. The candidates who get hired know what happens when LIMIT is used without ORDER BY, how NULL sorts in PostgreSQL versus MySQL, why OFFSET 1,000,000 tanks a dashboard, and why the obvious SQL Server syntax differs from the obvious PostgreSQL one.
The mental model is two sentences. ORDER BY decides row order. LIMIT decides how many ordered rows you keep. Without ORDER BY, LIMIT picks any rows — engines like PostgreSQL and BigQuery return whatever is cheapest today, possibly a different answer tomorrow.
ORDER BY: the basic syntax
ORDER BY accepts a column name and an optional direction. The defaults are ASC for ascending and DESC for descending; ASC is implicit if you omit the direction.
SELECT name, salary
FROM employees
ORDER BY salary DESC;Every row sorted by salary from highest to lowest. Telling the engine you only want the first ten rows lets a sort with a heap data structure short-circuit at ten, rather than sorting all 50 million — which is one reason LIMIT and ORDER BY belong together.
You can sort by an expression as well as a bare column. Total compensation, LOWER(name) for case-insensitive sorts, EXTRACT(month FROM order_date) for calendar position — anything that evaluates per row is fair game.
SELECT name, salary, bonus
FROM employees
ORDER BY salary + bonus DESC;A shorthand in legacy code is sorting by a SELECT position: ORDER BY 2 DESC refers to the second SELECT column. It saves typing but silently breaks when someone reorders the SELECT list. In production code, name the columns.
Sorting by multiple columns
Real reports almost always need a tiebreaker. "Sort employees by department alphabetically, within each department by salary highest to lowest" is two ORDER BY keys.
SELECT department, name, salary
FROM employees
ORDER BY department ASC, salary DESC;Left to right: department ascending is the primary sort, salary descending only matters when departments tie. The directions are independent. A common cohort pattern is ORDER BY cohort_month ASC, retention_rate DESC.
If the second key also ties, you fall back to whatever the engine returns, which is undefined. Add a third tiebreaker (often the primary key) when ordering needs to be deterministic across runs — without it, two users reloading a dashboard at the same time can see rows shuffle.
NULL behavior in ORDER BY
Dialect differences bite here. SQL does not define how NULL sorts; each engine picks a default.
PostgreSQL and Oracle place NULL last in ascending order, first in descending. SQL Server, SQLite, and MySQL do the opposite. Same ORDER BY manager_id ASC query, two different reports.
The portable fix is NULLS FIRST or NULLS LAST, supported in PostgreSQL, Oracle, BigQuery, Snowflake, Trino, and Spark SQL:
SELECT name, manager_id
FROM employees
ORDER BY manager_id ASC NULLS LAST;MySQL still does not support this clause as of 8.4. The portable workaround is a synthetic key:
-- MySQL: NULL at the end for ASC
SELECT name, manager_id
FROM employees
ORDER BY manager_id IS NULL, manager_id ASC;manager_id IS NULL returns 0 for non-NULL rows and 1 for NULL rows, pushing NULLs after everything else.
LIMIT and the dialect zoo
LIMIT caps the result set at N rows. Used after ORDER BY, it gives you the top N by the ordering key.
SELECT name, salary
FROM employees
ORDER BY salary DESC
LIMIT 10;The syntax is consistent across PostgreSQL, MySQL, SQLite, BigQuery, Trino, and Spark SQL. SQL Server is the outlier, using TOP after SELECT. ANSI SQL defines a third form, FETCH FIRST, supported by PostgreSQL, Oracle 12c+, DB2, and SQL Server 2012+:
-- SQL Server
SELECT TOP 10 name, salary FROM employees ORDER BY salary DESC;
-- Portable ANSI form
SELECT name, salary FROM employees
ORDER BY salary DESC
FETCH FIRST 10 ROWS ONLY;Interviewers at Microsoft and shops on SQL Server will accept TOP; most everyone else expects LIMIT. Flagging the dialect difference during a screen shows that you have written cross-warehouse SQL.
OFFSET and the pagination tax
OFFSET skips the first N rows of an ordered result. Combined with LIMIT, it is the textbook pagination recipe.
SELECT order_id, customer_id, created_at
FROM orders
ORDER BY created_at DESC
LIMIT 20 OFFSET 40;Skip 40, return the next 20. Rows 41 through 60. This works fine when N is small. It falls apart when N is large: the engine has to materialize the full sorted prefix before throwing the first 40 rows away. OFFSET 1000000 scans and discards a million rows on every page load.
The production fix is keyset pagination, sometimes called cursor or seek pagination. Instead of "give me page 3", you ask "give me the next 20 rows after the last one I saw." The last-seen value becomes a WHERE filter:
-- Keyset pagination: rows after the last seen created_at
SELECT order_id, customer_id, created_at
FROM orders
WHERE created_at < '2026-04-12 14:30:00'
ORDER BY created_at DESC
LIMIT 20;Because WHERE restricts the scan to rows below the cursor, an index on created_at lets the engine jump directly to the relevant range. The cost stays constant whether the cursor is on page 3 or page 30,000. The tradeoff: you can no longer jump to an arbitrary page by number, only advance one page at a time. Infinite scroll and "next page" buttons fit this shape exactly.
Worked interview examples
Top 5 customers by total revenue
SELECT customer_id, SUM(amount) AS total_revenue
FROM orders
GROUP BY customer_id
ORDER BY total_revenue DESC
LIMIT 5;GROUP BY collapses orders into one row per customer, ORDER BY ranks them, LIMIT trims to five. The aggregate alias total_revenue is valid in ORDER BY because ORDER BY runs after SELECT.
Second-highest salary
A classic interview question with three or four right answers. The LIMIT/OFFSET form is the shortest:
SELECT DISTINCT salary
FROM employees
ORDER BY salary DESC
LIMIT 1 OFFSET 1;DISTINCT collapses duplicates so ties at the top do not consume the second slot. The window-function alternative survives ties cleanly:
SELECT salary
FROM (
SELECT salary, DENSE_RANK() OVER (ORDER BY salary DESC) AS rnk
FROM employees
) ranked
WHERE rnk = 2;DENSE_RANK assigns rank 1 to the highest, rank 2 to the next distinct value, regardless of how many employees share each salary. Mention both forms in an interview and explain when each fits.
Latest event per user
SELECT user_id, event_type, event_time
FROM (
SELECT
user_id,
event_type,
event_time,
ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY event_time DESC) AS rn
FROM events
) ranked
WHERE rn = 1;ROW_NUMBER ranks events within each user, ORDER BY puts the latest first, the outer filter keeps rank 1. Solving this with ORDER BY + LIMIT alone is impossible without a correlated subquery — window functions are the canonical answer.
Common pitfalls
The first pitfall is LIMIT without ORDER BY. The result is "any 10 rows" — implementation-defined and free to change between runs, dialect versions, and the physical plans the optimizer picks. Reports that depend on a deterministic answer break silently when the table grows past a partition boundary and the scan order changes. Always pair LIMIT with ORDER BY when correctness matters; the only safe exception is exploratory SELECT * FROM table LIMIT 5 to peek at column shape.
A second trap is ORDER BY inside a subquery without an outer ORDER BY. The standard explicitly lets the optimizer ignore sub-query ordering, because a subquery returns a set, and sets do not have order. PostgreSQL strips it; BigQuery warns about it. If the final shape matters, ORDER BY belongs in the outermost SELECT. Rewrite SELECT * FROM (SELECT ... ORDER BY x DESC) sub LIMIT 5 as SELECT ... FROM table ORDER BY x DESC LIMIT 5.
A third pitfall is using OFFSET for deep pagination on production tables. With a small offset the cost is negligible; with a million-row offset, every page load reads and discards a million rows. This shows up as gradually worsening dashboard latency over months. Switch to keyset pagination on any user-facing list whose primary access pattern is sequential paging.
A fourth pitfall is forgetting that ORDER BY sees SELECT aliases but WHERE does not. ORDER BY total_revenue DESC works after SUM(amount) AS total_revenue. WHERE total_revenue > 100 fails — WHERE runs before SELECT. People who learn the ORDER BY trick first sometimes try it in WHERE and waste five minutes on the error message.
A fifth pitfall is mixed-case string sorts. Each engine has its own collation defaults; case-sensitive locales rank Zebra before apple. If the order matters to a stakeholder, normalize with LOWER(column) or pick an explicit collation.
Optimization tips
ORDER BY combined with LIMIT can be dramatically faster than ORDER BY alone, but only if the engine knows it can skip the full sort. A modern optimizer doing ORDER BY x DESC LIMIT 10 uses a bounded heap of size 10, looking at every row once and keeping a running top 10 — O(N log 10) instead of O(N log N). That speedup vanishes if you wrap the query in another SELECT or DISTINCT that hides the LIMIT, so keep LIMIT at the outermost level.
An index on the ORDER BY column is the single most impactful optimization for paginated queries. PostgreSQL and MySQL can walk a B-tree in order, returning rows pre-sorted without a separate sort step. This is what makes keyset pagination fast: the index seeks directly to the cursor position and walks forward.
For huge result sets, partitioning by the sort key turns global sorts into partition-local sorts. Snowflake's clustering keys, BigQuery's clustering, and Redshift's sort keys all play this role. Avoid ORDER BY RAND() for sampling — full table scan plus full sort. Use TABLESAMPLE or WHERE MOD(HASH(id), 100) < 5 instead.
If you sort by an expression often — say ORDER BY LOWER(name) — most engines let you create a functional index, and the optimizer treats it like a regular indexed column.
Related reading
- GROUP BY and HAVING in SQL
- SQL window functions interview questions
- DISTINCT vs GROUP BY in SQL
- NULL in SQL: the complete guide
If you want to drill ORDER BY and LIMIT patterns until they are automatic, NAILDD is shipping with 500+ SQL problems sorted by topic — pagination, ranking, top-N per group, NULL handling — each with the kind of detailed solution you would walk a junior teammate through.
FAQ
Can ORDER BY reference a column that is not in SELECT?
Yes, in most cases. ORDER BY can sort by any column from the underlying tables, even if SELECT does not return it. The exception is queries with DISTINCT or set operators (UNION, INTERSECT, EXCEPT): they only see the projected columns, because deduplication has already collapsed everything else. If the column survives into the logical row stream that ORDER BY sees, it is fair game.
Why does ORDER BY work with SELECT aliases but WHERE does not?
The logical execution order is FROM, WHERE, GROUP BY, HAVING, SELECT, ORDER BY, LIMIT. WHERE runs before SELECT, so SELECT aliases do not exist yet. ORDER BY runs after SELECT, so the aliases are already in scope. PostgreSQL bends this slightly by allowing some SELECT references in HAVING, but the safe rule is "no aliases in WHERE, aliases fine in ORDER BY."
What happens when LIMIT exceeds the row count of the table?
SQL returns every row that exists, with no error. LIMIT is an upper bound, not a contract. A 50-row table queried with LIMIT 1000 returns 50 rows. This makes LIMIT safe to use defensively: LIMIT 100 on a query you expect to return one row still returns that one row, but caps the blast radius if your assumption is wrong.
When should keyset pagination replace OFFSET?
Whenever the page size times the maximum page number gets large — roughly when OFFSET starts exceeding a few thousand. Below that, OFFSET is convenient and fast enough. Above that, every page load scans linearly with the offset, and queries that worked when the table was small slow to a crawl as it grows. Infinite-scroll lists, mobile feeds, and admin tools over multi-million-row tables are textbook keyset candidates.
Does ORDER BY guarantee a stable sort?
It depends on the engine. PostgreSQL's sort is not guaranteed stable — rows with equal sort keys can appear in any relative order, and that order can change between runs. For deterministic output, add a tiebreaker that uniquely identifies each row, typically the primary key: ORDER BY created_at DESC, id DESC. This matters when downstream consumers diff your output across runs or when a UI needs predictable row positions.