Window functions in the DE interview

Train for your next tech interview
1,500+ real interview questions across engineering, product, design, and data — with worked solutions.
Join the waitlist

What a DE-level question actually looks like

The analyst version of window functions is ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY event_at) for deduplication. The data-engineer version is "explain why ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW is safer than the default RANGE, and what happens to your running total when two events share a timestamp." If you can't answer that on a whiteboard, the staff engineer from Snowflake, Databricks, Stripe, or Airbnb moves on to a less senior loop.

The reason this question keeps showing up: the default frame in SQL is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW whenever you supply ORDER BY without an explicit frame. With RANGE, all rows that share the same ordering key collapse into one window slice, silently inflating running totals at every duplicate timestamp. Three months later an analyst notices a 2 percent drift between the report and the ledger, and you spend a quarter rolling back partitions.

This post walks the parts of window functions DE interviewers at Snowflake, Databricks, Meta, Uber, and DoorDash actually probe: frames, ranking, analytic helpers, dedup patterns, and engine-specific edge cases.

Syntax and the frame clause

Every window function has three independent pieces and one easy-to-miss default. The skeleton looks like this:

SELECT
    user_id,
    event_date,
    amount,
    SUM(amount) OVER (
        PARTITION BY user_id
        ORDER BY event_date
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS running_total
FROM events;

Three pieces: PARTITION BY (groups the splits — optional), ORDER BY (ordering inside the partition — required for ranking and offset functions), and the frame clause (which rows feed the aggregate).

The default frame is the part most candidates miss. With ORDER BY present, the default is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. Without ORDER BY, it expands to ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING. That implicit switch is why SUM(amount) OVER () returns the full partition total, while SUM(amount) OVER (ORDER BY event_date) returns a running total. Same function, different math, silent default.

ROWS vs RANGE

ROWS counts physical rows in the partition. RANGE counts logical distance along the ORDER BY key — so any rows that tie on that key fall into the same bucket.

-- ROWS: exactly N physical rows
SUM(amount) OVER (
    ORDER BY event_date
    ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
)

-- RANGE: every row whose event_date sits inside the interval
SUM(amount) OVER (
    ORDER BY event_date
    RANGE BETWEEN INTERVAL '2 day' PRECEDING AND CURRENT ROW
)

The danger of default RANGE shows up on tied keys. If three rows share event_date = 2026-05-01, RANGE pulls all three into every running-total step for that date:

event_date  | amount | running_rows | running_range
2026-05-01  | 100    | 100          | 250
2026-05-01  | 50     | 150          | 250
2026-05-01  | 100    | 250          | 250
2026-05-02  | 30     | 280          | 280

For a true row-by-row running total, always pin the frame with ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. The visible cost is six extra keywords; the invisible cost of skipping them is a metric that drifts on high-volume days.

Engine support to memorize: Postgres 11+ and Spark 3+ support RANGE BETWEEN N PRECEDING with numeric and interval offsets. ClickHouse supports a narrower subset through RANGE BETWEEN INTERVAL. Snowflake supports the full ANSI grammar. BigQuery accepts RANGE with numeric offsets but not arbitrary intervals.

LAG, LEAD, FIRST_VALUE, LAST_VALUE

LAG(col, N) returns the value from N rows back inside the partition. LEAD(col, N) looks forward. Both default N to 1 and return NULL at the partition edge unless you pass a third default argument.

-- delta from the previous purchase per user
SELECT
    user_id,
    event_date,
    amount,
    amount - LAG(amount, 1, 0) OVER (
        PARTITION BY user_id ORDER BY event_date
    ) AS delta
FROM events;

FIRST_VALUE and LAST_VALUE look at the literal frame. With the default RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, LAST_VALUE returns the current row, because the window ends at CURRENT ROW. To get the actual last value in the partition, widen the frame manually:

LAST_VALUE(amount) OVER (
    PARTITION BY user_id
    ORDER BY event_date
    ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
) AS last_amount

The ranking family covers most analytical-warehouse questions:

SELECT
    user_id,
    amount,
    ROW_NUMBER()   OVER w AS rn,
    RANK()         OVER w AS rk,
    DENSE_RANK()   OVER w AS drk,
    NTILE(4)       OVER w AS quartile,
    PERCENT_RANK() OVER w AS pr,
    CUME_DIST()    OVER w AS cd
FROM events
WINDOW w AS (PARTITION BY user_id ORDER BY amount DESC);

ROW_NUMBER is always unique. RANK skips after ties (1, 1, 3). DENSE_RANK does not skip (1, 1, 2). NTILE(N) allocates rows into N roughly equal buckets. PERCENT_RANK and CUME_DIST give continuous-looking distributions. The trap follow-up is "what happens if I use RANK instead of ROW_NUMBER in a dedup CTE?" — covered next.

Dedup with ROW_NUMBER

The canonical pattern for collapsing duplicates to one row per business key is a CTE plus ROW_NUMBER:

WITH ranked AS (
    SELECT
        *,
        ROW_NUMBER() OVER (
            PARTITION BY event_id
            ORDER BY ingested_at DESC
        ) AS rn
    FROM raw_events
)
SELECT * FROM ranked WHERE rn = 1;

The PARTITION BY is your dedup key. The ORDER BY picks the winner: latest ingested_at, highest _version, longest payload — whatever the contract says. The most common DE-side mistake is using RANK instead of ROW_NUMBER. If two rows tie on the ordering key, RANK returns 1 for both, and WHERE rn = 1 keeps both rows, which defeats the point. Use ROW_NUMBER when you need exactly one survivor.

Second trap: you cannot reference a window function directly in WHERE. Wrap it in a CTE or subquery and filter on the alias. Snowflake, BigQuery, Databricks, and ClickHouse 23+ provide QUALIFY as a syntactic shortcut:

SELECT *
FROM raw_events
QUALIFY ROW_NUMBER() OVER (
    PARTITION BY event_id
    ORDER BY ingested_at DESC
) = 1;

Postgres does not have QUALIFY — there you stick with the CTE form. Mentioning QUALIFY and its engine support is a fast signal that you have written real warehouse code.

Train for your next tech interview
1,500+ real interview questions across engineering, product, design, and data — with worked solutions.
Join the waitlist

Running totals and moving averages

A safe running total per user:

SUM(amount) OVER (
    PARTITION BY user_id ORDER BY event_date
    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS lifetime_revenue

Pinning ROWS avoids the duplicate-timestamp drift. A trailing 7-day moving average in rows:

AVG(daily_amount) OVER (
    PARTITION BY metric_name ORDER BY event_date
    ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
) AS ma_7d

That is "the last 7 rows" — correct only if the source has exactly one row per day. If days can be missing, you want a calendar-based moving average:

AVG(daily_amount) OVER (
    PARTITION BY metric_name ORDER BY event_date
    RANGE BETWEEN INTERVAL '6 day' PRECEDING AND CURRENT ROW
) AS ma_7d_calendar

The first form averages 7 physical rows even if calendar days are missing. The second averages whatever rows exist in the last 7 calendar days. The distinction maps to whether you pad missing days with zero before the window function or after.

A common follow-up: count distinct users in a rolling 28-day window. Postgres does not support COUNT(DISTINCT ...) OVER:

-- not supported on Postgres
COUNT(DISTINCT user_id) OVER (
    ORDER BY day
    RANGE BETWEEN INTERVAL '27 day' PRECEDING AND CURRENT ROW
)

The standard workaround is a correlated subquery over a generated calendar:

SELECT
    d.day,
    (SELECT COUNT(DISTINCT e.user_id)
     FROM events e
     WHERE e.created_at >= d.day - INTERVAL '27 day'
       AND e.created_at <  d.day + INTERVAL '1 day') AS mau_28d
FROM (
    SELECT generate_series('2026-01-01'::DATE, '2026-05-01', '1 day') AS day
) d;

Snowflake supports COUNT(DISTINCT ...) OVER () without a frame, but not with a sliding frame — same workaround applies. ClickHouse provides uniqExact() OVER (...) and approximate uniqCombined() OVER (...), dramatically faster on billion-row tables.

Common pitfalls

The default RANGE frame on a tied ORDER BY key is the most expensive mistake you can make. Two rows on the same day, an unannotated running total, and the metric inflates at every tied seam. The fix is to write ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW every time you compute a running total — production has more ties than test data.

LAST_VALUE without an explicit frame is the related trap. The default frame ends at the current row, so LAST_VALUE(amount) OVER (ORDER BY event_date) returns the current row's amount, which is almost never what you want. If you mean "the last value in the partition," widen the frame to ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING.

Putting a window function in WHERE is illegal in standard SQL because WHERE runs before window functions. Wrap in a CTE or subquery, or use QUALIFY on engines that support it (Snowflake, BigQuery, Databricks, ClickHouse 23+). Naming which engines support QUALIFY is a frequent gating signal.

Using RANK for deduplication is a subtle bug that survives code review. RANK returns the same rank for tied rows, so WHERE rk = 1 keeps every tied winner. Use ROW_NUMBER and let the ORDER BY tie-breaker decide deterministically.

COUNT(DISTINCT ...) OVER (...) with a sliding frame fails on Postgres and on Snowflake when a frame is added. The workaround is a self-join on a daily distinct table, a correlated subquery, or on ClickHouse the faster uniqExact() OVER (...).

Optimization tips

The cheapest optimization is to compute the window once and reuse it. If you find yourself selecting three running totals partitioned the same way, fold them into a single named WINDOW and reference it three times:

SELECT
    user_id,
    event_date,
    SUM(amount)  OVER w AS running_sum,
    AVG(amount)  OVER w AS running_avg,
    COUNT(*)     OVER w AS running_count
FROM events
WINDOW w AS (
    PARTITION BY user_id
    ORDER BY event_date
    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
);

Most planners reuse the underlying sort, but the named clause makes intent legible and prevents accidental frame drift between copy-pasted OVER clauses.

Sort cost is usually the dominant factor. If the table is clustered or partitioned on the PARTITION BY key, the engine can skip the global sort. On Snowflake, clustering by user_id lets a PARTITION BY user_id ORDER BY event_date window scan nearly streaming. On Databricks Delta, OPTIMIZE ... ZORDER BY (user_id, event_date) does the same. On Postgres, a btree index aligned with the partition key keeps the sort within work_mem. Inspect with EXPLAIN ANALYZE and watch for WindowAgg followed by Sort (external) — that external sort is where queries fall off a cliff at 100M rows.

Minimize ORDER BY columns inside the window — extra columns "for stability" widen the sort key. If you need many windows partitioned differently across a 500M-row table, a precomputed staging table per partition scheme often beats six window functions in one query — the engine cannot reuse sorts across different PARTITION BY clauses.

If you want to drill DE window-function questions exactly like the ones above, NAILDD is launching with a queue of advanced SQL problems targeted at staff-level data engineering loops.

FAQ

What is the difference between PARTITION BY and GROUP BY?

GROUP BY collapses rows so the output has one row per group. PARTITION BY keeps every row and computes the aggregate inside each group as an extra column. For a per-customer total alongside every order, use SUM(amount) OVER (PARTITION BY customer_id). For one row per customer, use SELECT customer_id, SUM(amount) FROM orders GROUP BY customer_id. Same numbers, different shape.

Can a window function appear inside HAVING?

No. HAVING filters aggregated groups before window functions run. To filter by a window result, wrap the query in a CTE or subquery and apply the filter in an outer WHERE. Engines with QUALIFY (Snowflake, BigQuery, Databricks, ClickHouse 23+) provide a shortcut.

Is a window function or a subquery with MAX faster?

The window form is usually faster because it scans the source once and lets the planner reuse the sort. A subquery like LEFT JOIN (SELECT user_id, MAX(event_date) FROM ... GROUP BY user_id) t USING (user_id) scans the source twice and runs an extra join. On a 100M-row events table the window form often wins by an order of magnitude. The exception is when the join feeds a much narrower probe side. Read the EXPLAIN plan before assuming.

How do you compute percentiles inside a window?

In Postgres and BigQuery use PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY amount) OVER (PARTITION BY ...) for an interpolated median, or PERCENTILE_DISC(0.5) for an exact dataset value. Snowflake supports both. ClickHouse uses quantile(0.5)(amount) OVER (...) and quantileTDigest(amount) for approximate results in fixed memory. On Spark, percentile_approx(amount, 0.5) is the streaming-friendly form. Use exact for compliance, approximate for dashboards over billions of rows.

Can you nest one window function inside another?

Not directly — SUM(SUM(x) OVER (...)) OVER (...) is not legal SQL. Split them into two CTEs: the first computes the inner window, the outer references the result and applies the second window. This also prevents recomputing the same expensive window twice.

How do I dedup keeping the latest row when timestamps tie?

Use ROW_NUMBER with a multi-column ORDER BY containing a deterministic tie-breaker — for example, ORDER BY ingested_at DESC, source_id DESC. ROW_NUMBER always returns a unique rank, so WHERE rn = 1 keeps exactly one row per business key.