Window functions in SQL: overview
Contents:
What a window function actually does
A window function computes a value over a set of rows that are "related" to the current row, without collapsing those rows into one. That last part is the bit that trips people up. With GROUP BY you trade rows for a smaller summary. With a window function you keep every row and add a computed column next to it.
Picture a tiny orders table with five rows split across two departments — Sales and Marketing — with order amounts in dollars. A manager asks: "show me each order, and next to it the total revenue for that order's department." A GROUP BY department collapses the table to two rows. A window function keeps all five rows and tacks the department total onto each one. That trick is why every interviewer at Google, Meta, Stripe, Airbnb, and Snowflake leans on window functions when probing for SQL fluency.
Every modern engine supports them: PostgreSQL, MySQL 8+, BigQuery, Snowflake, Databricks SQL, Redshift, ClickHouse. The syntax is nearly identical across vendors. Small dialect differences exist (Snowflake and BigQuery support QUALIFY, PostgreSQL has the WINDOW clause), but the shape transfers verbatim.
The OVER clause
Every window function ends with OVER (...). Inside the parentheses you can specify up to three things: how to slice the table (PARTITION BY), how to order rows inside each slice (ORDER BY), and which rows inside the slice to actually look at (the frame).
function_name(...) OVER (
[PARTITION BY col1, col2, ...]
[ORDER BY col3 [ASC|DESC], ...]
[ROWS | RANGE BETWEEN ... AND ...]
)function_name is either a regular aggregate (SUM, AVG, COUNT, MIN, MAX) or a window-specific function (ROW_NUMBER, RANK, DENSE_RANK, NTILE, LAG, LEAD, FIRST_VALUE, LAST_VALUE). PARTITION BY is the window-function version of GROUP BY — it splits the rows into independent buckets and the function restarts within each bucket. ORDER BY defines order inside the partition; it is mandatory for ranking and offset functions and controls the default frame for aggregates.
A typical example: number each order inside its department, biggest first.
SELECT
order_id,
department,
amount,
ROW_NUMBER() OVER (
PARTITION BY department
ORDER BY amount DESC
) AS rank_in_dept
FROM orders;You will see this exact shape in every interview problem about top-N, deduplication, and "first event per user."
Ranking functions
SQL ships four ways to assign a number to each row inside a partition. They look identical from outside but behave differently when values tie.
ROW_NUMBER() gives every row a unique sequential number; ties are broken arbitrarily by the engine. RANK() assigns the same rank to ties and then skips — two rows tied for second both get rank 2, and the next row gets rank 4. DENSE_RANK() ties the equal rows together but does not skip; the next row gets rank 3. NTILE(n) divides the partition into n roughly equal groups, useful for quartiles or deciles.
The mental rule: pick ROW_NUMBER when you need an exact count, RANK for honest leaderboards with gaps, DENSE_RANK for consecutive bucket labels, and NTILE to split into N equal-ish chunks.
SELECT student, score,
ROW_NUMBER() OVER (ORDER BY score DESC) AS row_num,
RANK() OVER (ORDER BY score DESC) AS rnk,
DENSE_RANK() OVER (ORDER BY score DESC) AS dense_rnk
FROM scores;Offset functions: LAG, LEAD, FIRST_VALUE, LAST_VALUE
These pull a value from another row in the same window. LAG(col, n, default) returns the value n rows back. LEAD(col, n, default) returns it n rows forward. Default n is 1. If the offset falls outside the partition, the function returns NULL unless you pass an explicit default.
The canonical use is day-over-day change:
SELECT day, revenue,
LAG(revenue) OVER (ORDER BY day) AS prev_day_revenue,
revenue - LAG(revenue) OVER (ORDER BY day) AS delta
FROM daily_revenue;FIRST_VALUE(col) returns the value from the first row of the window. LAST_VALUE(col) returns the value from the last row — with a famous gotcha. When ORDER BY is present the default frame runs from the start of the partition to the current row, not to the end. So LAST_VALUE(...) returns the current row's value, which is almost never what you want. The fix is to spell the frame out:
LAST_VALUE(revenue) OVER (
ORDER BY day ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
)FIRST_VALUE does not have this problem because the default frame always starts at the partition's beginning.
Aggregates with OVER
Attach OVER (...) to any aggregate (SUM, AVG, COUNT, MIN, MAX) and you get one value per input row instead of one per group.
The running total:
SELECT day, revenue,
SUM(revenue) OVER (ORDER BY day) AS running_total
FROM daily_revenue;With ORDER BY day and no explicit frame, the default is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. The cumulative sum walks from the first day up to the current day.
A moving average over the trailing seven days. Note 6 PRECEDING, not 7 — the current row counts as one of the seven.
SELECT day, revenue,
AVG(revenue) OVER (
ORDER BY day ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
) AS moving_avg_7d
FROM daily_revenue;A share of total uses the empty-OVER trick: no PARTITION BY, no ORDER BY, window covers the entire result set.
SELECT department, amount,
ROUND(100.0 * amount / SUM(amount) OVER (), 1) AS pct_of_total
FROM orders;Frames: ROWS BETWEEN and RANGE BETWEEN
The frame defines which rows inside the partition the function actually looks at. This is where casual users get burned.
ROWS BETWEEN <start> AND <END>Where <start> and <end> are one of: UNBOUNDED PRECEDING, n PRECEDING, CURRENT ROW, n FOLLOWING, UNBOUNDED FOLLOWING.
The default frame depends on ORDER BY. Without ORDER BY the frame is the entire window. With ORDER BY and no explicit frame, the default is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. That is why SUM(x) OVER (ORDER BY day) gives a running total while SUM(x) OVER () gives the grand total.
ROWS counts physical rows: ROWS BETWEEN 2 PRECEDING AND CURRENT ROW is exactly three rows — the current one plus the two physically before it. RANGE counts by the value of the ORDER BY column, so ties on that column all enter the frame together. Use ROWS for predictable sliding windows. Use RANGE when your data has gaps and you want to define the window in real units rather than row counts.
The fence-post error is so common it has a name. A 7-day window is 6 PRECEDING AND CURRENT ROW. A 30-day window is 29 PRECEDING AND CURRENT ROW. Off by one and your trailing average secretly includes one extra day.
Five worked tasks
These five patterns cover roughly 90 percent of real-world window-function problems and the interview rounds at Stripe, Netflix, Uber, DoorDash, and Notion.
Top-N per group. Find the two biggest orders in each department. Use ROW_NUMBER, never RANK, because RANK can return more than N rows when there are ties.
WITH ranked AS (
SELECT order_id, department, amount,
ROW_NUMBER() OVER (PARTITION BY department ORDER BY amount DESC) AS rn
FROM orders
)
SELECT order_id, department, amount FROM ranked WHERE rn <= 2;In Snowflake or BigQuery the QUALIFY clause skips the CTE: ... QUALIFY ROW_NUMBER() OVER (...) <= 2.
Running total by month. The double SUM(SUM(...)) is valid: the inner aggregate runs under GROUP BY, then the windowed aggregate runs on top.
SELECT
DATE_TRUNC('month', order_date) AS month,
SUM(amount) AS monthly_revenue,
SUM(SUM(amount)) OVER (ORDER BY DATE_TRUNC('month', order_date)) AS ytd_revenue
FROM orders
WHERE order_date >= '2026-01-01'
GROUP BY DATE_TRUNC('month', order_date);7-day moving average of order count.
SELECT
order_date,
COUNT(*) AS daily_orders,
AVG(COUNT(*)) OVER (
ORDER BY order_date ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
) AS moving_avg_7d
FROM orders
GROUP BY order_date;Month-over-month change.
WITH monthly AS (
SELECT DATE_TRUNC('month', order_date) AS month, SUM(amount) AS revenue
FROM orders GROUP BY DATE_TRUNC('month', order_date)
)
SELECT
month, revenue,
LAG(revenue) OVER (ORDER BY month) AS prev_revenue,
ROUND(100.0 * (revenue - LAG(revenue) OVER (ORDER BY month))
/ LAG(revenue) OVER (ORDER BY month), 1) AS change_pct
FROM monthly;Share of total and share of group. Two windows in one query, computed in a single scan.
SELECT
order_id, department, amount,
ROUND(100.0 * amount / SUM(amount) OVER (PARTITION BY department), 1) AS pct_of_dept,
ROUND(100.0 * amount / SUM(amount) OVER (), 1) AS pct_of_total
FROM orders;Common pitfalls
The most common bug is calling a ranking or offset function with no ORDER BY. ROW_NUMBER() OVER (PARTITION BY user_id) is syntactically valid in most engines but produces nondeterministic row numbers — re-run the query and the numbers shuffle. Always specify the order, even when natural order seems obvious; the engine has no obligation to honour your intuition.
The second pitfall is reaching for RANK when you wanted ROW_NUMBER. A WHERE rnk <= 3 filter looks like "top 3" but actually returns four or five rows the moment two scores tie. Interview graders set up the table data precisely to catch this. If the requirement is exactly N rows per group, use ROW_NUMBER. If ties should share a place, use RANK and accept that the count is variable.
The third pitfall is the fence-post error in frame definitions. A trailing 7-day window is ROWS BETWEEN 6 PRECEDING AND CURRENT ROW, not 7 PRECEDING. Writing 7 PRECEDING silently produces an 8-row window. The bug is invisible until someone reconciles your dashboard against a hand-checked spreadsheet. Always count: n - 1 PRECEDING AND CURRENT ROW gives a window of size n.
The fourth pitfall is LAST_VALUE without an explicit frame. With ORDER BY and the default frame, LAST_VALUE(col) returns the current row's value because the frame ends at the current row. Spell out ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING to fix it. FIRST_VALUE does not have this problem because the frame always starts at the partition's beginning.
The fifth pitfall is putting a window function directly in a WHERE clause. Window functions are evaluated after WHERE, so the rank you computed cannot be filtered in the same statement. Wrap the query in a CTE or subquery and filter on the outer level, or use QUALIFY if your engine supports it.
Related reading
- SQL window functions interview questions
- Window functions in a data engineering interview
- GROUP BY vs PARTITION BY in SQL
- CTE WITH SQL guide
- How to calculate cumulative sum in SQL
If you want to drill window-function questions like these against a graded harness every day, NAILDD is launching with 500+ SQL problems built around exactly this pattern.
FAQ
What is the difference between ROW_NUMBER, RANK, and DENSE_RANK?
ROW_NUMBER always assigns unique sequential integers; ties are broken arbitrarily by the engine. RANK gives tied rows the same rank but skips the next integer (1, 2, 2, 4). DENSE_RANK gives tied rows the same rank without skipping (1, 2, 2, 3). For top-N problems where the count must be exact, use ROW_NUMBER. For leaderboards where ties share a place, use RANK. For consecutive bucket labels, use DENSE_RANK.
Can I use multiple window functions in one query?
Yes — a single SELECT can mix any number of window functions, each with its own OVER clause. If two functions share an identical window definition, most engines compute the sort once and reuse it. PostgreSQL and BigQuery support a named WINDOW clause: define WINDOW w AS (ORDER BY day) once and reuse OVER w across multiple functions, which tells the optimiser the windows are really the same.
Are window functions slow?
Window functions require sorting inside each partition, which is the main cost. On a billion-row table this is not free, but it is almost always faster than the alternative — correlated subquery, self-join, or repeated aggregation. Reuse the same PARTITION BY and ORDER BY across multiple functions so the engine sorts once; add an index on (partition_cols, order_cols) for read-heavy workloads; filter aggressively in a CTE before the window function runs, since window functions execute after WHERE.
What is the difference between ROWS and RANGE?
ROWS counts physical rows: ROWS BETWEEN 2 PRECEDING AND CURRENT ROW is exactly three rows. RANGE counts by the value of the ORDER BY expression: ties on the ORDER BY column all enter the frame together. For moving averages and rolling sums, ROWS is what you want. RANGE is useful when your time series has missing rows and you want to define the frame in real units rather than row counts.
Where can I use window functions in a query?
Window functions can appear in SELECT and ORDER BY. They cannot appear in WHERE, GROUP BY, or HAVING, because window functions are evaluated after those clauses. The standard workaround is to push the window function into a CTE or subquery, then filter the result on the outer level. Snowflake, BigQuery, Databricks SQL, and Teradata also support QUALIFY, which acts like WHERE for window-function results.
Do window functions work the same across engines?
The core syntax is portable: PARTITION BY, ORDER BY, ROWS BETWEEN, and all the standard functions behave identically in every modern engine. Dialect-specific extras include QUALIFY (Snowflake, BigQuery, Databricks SQL, Teradata), the named WINDOW clause (PostgreSQL, BigQuery, recent MySQL), and IGNORE NULLS / RESPECT NULLS on offset functions (Snowflake, BigQuery, Oracle). MySQL only added window functions in version 8.0, so legacy MySQL 5.7 still requires self-join workarounds.