How to calculate funnel drop-off in SQL
Contents:
What funnel drop-off tells you that conversion does not
Overall funnel conversion is a single number — twelve percent of signups end in a first purchase, say. That headline hides where the damage happens. Drop-off rate breaks the funnel into stages and tells you that fifty percent abandoned at email verification, then another thirty percent abandoned before completing the profile, and only a small slice reached checkout. The bottleneck is rarely where the team thinks it is, and conversion rate alone will never tell you which step to fix first.
Picture the Monday morning where your PM at Stripe or DoorDash drops a question into Slack: "Our signup-to-purchase rate dropped four points last week — which step broke?" If you answer with one aggregate number, you have answered nothing. The PM needs to know whether the regression sits in identity verification, in payment method linking, or in the merchant-selection step. Funnel drop-off in SQL gives you that answer in one query, and it is the same query the analyst will run thirty more times before the quarter ends.
The metric also reframes product conversations. Tell a designer "checkout converts at eighteen percent" and they hear an abstract goal; tell them "forty-one percent of users who reached the address-entry step never advanced" and they see a specific screen with a specific problem.
The SQL formula
The definition is straightforward and worth memorizing exactly as written:
Drop-off Rate (stage N) = (users_at_stage_N - users_at_stage_N+1) / users_at_stage_N * 100%Read it once more: the denominator is the population at the current stage, not the very top of the funnel. That distinction is the most common interview trap and we will return to it in the pitfalls section. Assume an events table with the shape events(user_id, event_type, event_date) and a fixed ordered set of stages — signup, verify_email, profile_complete, first_purchase. The base query in Postgres reads like this:
WITH stages AS (
SELECT
COUNT(DISTINCT CASE WHEN event_type = 'signup' THEN user_id END) AS signup,
COUNT(DISTINCT CASE WHEN event_type = 'verify_email' THEN user_id END) AS verify,
COUNT(DISTINCT CASE WHEN event_type = 'profile_complete' THEN user_id END) AS profile,
COUNT(DISTINCT CASE WHEN event_type = 'first_purchase' THEN user_id END) AS purchase
FROM events
WHERE event_date >= CURRENT_DATE - INTERVAL '30 days'
)
SELECT
signup, verify, profile, purchase,
(signup - verify)::NUMERIC * 100 / NULLIF(signup, 0) AS drop_signup_to_verify_pct,
(verify - profile)::NUMERIC * 100 / NULLIF(verify, 0) AS drop_verify_to_profile_pct,
(profile - purchase)::NUMERIC * 100 / NULLIF(profile, 0) AS drop_profile_to_purchase_pct
FROM stages;The conditional COUNT(DISTINCT ...) collapses raw events into per-stage unique users in a single pass. The NULLIF(signup, 0) guard saves you from the embarrassing division-by-zero outage when a brand-new market launches with no signups yet — the metric returns NULL instead of crashing the dashboard. Run this against thirty days of data and the row that comes back has three drop-off percentages; the largest one is your current bottleneck.
A subtle point: COUNT(DISTINCT ... CASE WHEN ...) does not enforce ordering. A user who triggered first_purchase before verify_email because of a legacy guest-checkout path still appears in the purchase bucket. Whether you want that depends on the product. At an e-commerce shop like Shopify or a marketplace like Airbnb, guest purchases are real revenue and you keep them. At a SaaS like Notion or Linear, the same pattern indicates a data-quality bug worth filtering out.
Drop-off by acquisition segment
Channel-level drop-off is where the analysis earns its keep. The aggregate funnel might convert at twelve percent, but paid social converts at six percent and organic search at twenty-two — the average tells you nothing about either. Join the events table to users.acquisition_channel and re-bucket:
WITH stages_by_channel AS (
SELECT
u.acquisition_channel,
COUNT(DISTINCT CASE WHEN e.event_type = 'signup' THEN e.user_id END) AS signup,
COUNT(DISTINCT CASE WHEN e.event_type = 'verify_email' THEN e.user_id END) AS verify,
COUNT(DISTINCT CASE WHEN e.event_type = 'first_purchase' THEN e.user_id END) AS purchase
FROM events e
JOIN users u ON u.user_id = e.user_id
WHERE e.event_date >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY u.acquisition_channel
)
SELECT
acquisition_channel,
signup,
verify,
purchase,
(signup - verify)::NUMERIC * 100 / NULLIF(signup, 0) AS drop_to_verify_pct,
(verify - purchase)::NUMERIC * 100 / NULLIF(verify, 0) AS drop_verify_to_purchase_pct
FROM stages_by_channel
HAVING COUNT(DISTINCT CASE WHEN e.event_type = 'signup' THEN e.user_id END) > 100
ORDER BY drop_to_verify_pct DESC;The HAVING clause is doing real work — it filters out channels with too few signups to have a stable rate. A boutique partnership with thirty signups can show a fifty-percent drop one week and zero the next, and the noise drowns the signal from the channels that actually move the business. Pick a threshold that matches your traffic: one hundred for a fast-moving consumer app, five hundred for a slower B2B funnel.
The shape of the output is what marketing and growth teams want on a recurring report: one row per channel, two drop-off columns, ordered by where the bleeding is worst. Pair this with cost-per-acquisition by channel and you have a quarter's worth of allocation conversations in one query.
Time-bounded drop-off
A user who verified email and then made a first purchase eighteen months later is technically converted, but in any practical sense that user dropped out of the funnel and was reacquired by a future campaign. Time-bounded drop-off attaches a window — usually seven, fourteen, or thirty days — to each stage transition so the metric reflects intent rather than eventual return.
The trick is to pivot from event-row counts to one row per user, with MIN(event_date) per stage. Then you can compare timestamps directly:
WITH user_stages AS (
SELECT
user_id,
MIN(CASE WHEN event_type = 'signup' THEN event_date END) AS signup_date,
MIN(CASE WHEN event_type = 'verify_email' THEN event_date END) AS verify_date,
MIN(CASE WHEN event_type = 'first_purchase' THEN event_date END) AS purchase_date
FROM events
GROUP BY user_id
)
SELECT
COUNT(*) AS users,
COUNT(*) FILTER (WHERE verify_date IS NOT NULL) AS verified,
COUNT(*) FILTER (
WHERE purchase_date IS NOT NULL
AND purchase_date <= verify_date + INTERVAL '7 days'
) AS purchased_within_7d,
COUNT(*) FILTER (
WHERE purchase_date IS NOT NULL
AND purchase_date <= verify_date + INTERVAL '7 days'
)::NUMERIC * 100
/ NULLIF(COUNT(*) FILTER (WHERE verify_date IS NOT NULL), 0) AS verify_to_purchase_7d_pct
FROM user_stages
WHERE signup_date >= CURRENT_DATE - INTERVAL '90 days';Note the ninety-day floor on signup_date. Cohorts that signed up yesterday have not had time to convert within seven days, and including them tanks the rate artificially. The rule of thumb: your cohort window must be at least as old as your conversion window, plus a small buffer. If you measure seven-day conversion, only count users who signed up at least seven days ago.
Common pitfalls
The first and most expensive mistake is the window mismatch. A user who signed up this morning has not yet had time to verify email or complete a purchase, but a naive query counts them in the denominator anyway. The fix is the cohort approach: define a signup cohort with an end date that gives every member time to finish the funnel, and only measure drop-off within that closed cohort. Without this, your drop-off rate looks worse every Monday and recovers every Friday — a calendar artifact, not a product change.
The second pitfall is treating skipped stages as drop-off. A user who purchases without verifying email might be a bug — your signup flow is supposed to force verification — or might be a legitimate path through a partner integration. Decide which it is before you write the query, and either filter the bad data out or change the stage definition. The worst outcome is shipping a metric that quietly conflates two stories.
The third pitfall is ignoring event ordering. Real-world events arrive out of order — a delayed mobile push, a backfilled batch, an idempotent retry — and an analyst who uses COUNT(DISTINCT) per stage will count a user who fired the purchase event before the verify event as fully converted at every stage. If ordering matters to your product, switch to a per-user pivot with MIN(event_date) and enforce monotonicity in the WHERE clause. If ordering does not matter, document that decision in the query comment so the next analyst does not "fix" it.
The fourth pitfall is small-sample noise in segment buckets. A channel with thirty signups gives a drop-off rate that swings twenty points week over week purely from variance, and stakeholders will demand explanations for movement that is not real. Add a HAVING filter for minimum cohort size — one hundred or five hundred users — and surface anything below the threshold as "insufficient data" in the dashboard rather than as a percentage.
The fifth pitfall is conflating drop-off with conversion rate when reporting. Drop-off at stage N is the inverse of stage-conversion at N — report ten-percent drop-off in one chart and ninety-percent conversion in another, and every reader will misread one of them. Pick one direction, name the column accordingly (drop_to_verify_pct versus verify_conv_pct), and stay consistent across the report.
Optimization tips
The events table is almost always your biggest table, and funnel drop-off queries scan it at least once. Three things help. First, partition events by event_date — daily or monthly — and the planner will prune everything outside the thirty-day window automatically. On Snowflake or BigQuery, partition pruning is the difference between a query that costs cents and one that costs dollars. Second, create a composite index or clustering key on (event_type, event_date) so the CASE WHEN event_type = ... filters hit a narrow path rather than a full scan. Third, if the same funnel is reported daily, materialize the user_stages CTE — one row per user with MIN event dates per stage — and refresh it incrementally. The materialization turns a multi-minute query into a sub-second lookup, and incremental refresh keeps the storage footprint bounded.
For very large event tables, replace COUNT(DISTINCT user_id) with APPROX_COUNT_DISTINCT (BigQuery, Snowflake) or HLL_COUNT.MERGE patterns. Exact distinct counts are expensive at scale; HyperLogLog approximations are within one percent and ten to fifty times faster. Use exact counts for executive reports and approximate counts for self-service exploration.
Related reading
- SQL window functions interview questions
- How to calculate cart abandonment in SQL
- How to calculate bounce rate in SQL
- A/B testing peeking mistake
If you want to drill funnel and conversion SQL like this every day with real interview prompts, NAILDD is launching with 500+ SQL problems across exactly this pattern.
FAQ
What counts as a "high" drop-off rate?
It depends on the stage and the product category. In onboarding, a drop between signup and activation of thirty to fifty percent is broadly normal — verification, profile completion, and permission grants are real friction. In checkout, a drop of ten to thirty percent between cart and payment is the typical range; anything above forty signals a UX bug, a pricing surprise, or a payment-provider outage. Benchmark against your own historical baseline before benchmarking against industry numbers, since category and audience swamp any external average.
Drop-off rose week over week — what is the playbook?
Start with a drill-down by segment: channel, device, country, and app version. One of those dimensions usually owns the regression. From there, check the obvious external causes — a marketing campaign that flooded the top of funnel with low-intent traffic, a release that broke a specific OS version, a payment-processor incident that hit a country. If none of those explain it, scope a UX session-replay review of the offending stage and a quick A/B test of the suspected fix. Resist the temptation to ship a hotfix without measuring; you will spend a quarter unable to prove whether it worked.
How do I handle cross-session drop-off?
A user who abandoned in session one and returned in session three to complete the step is, strictly speaking, not a drop-off. The right model is per-user rather than per-session: pivot to one row per user with MIN(event_date) per stage and ask whether they ever reached the next stage within the conversion window. If your product is session-anchored — checkout that times out at thirty minutes, a payment flow that expires — then per-session drop-off is the correct metric and you should not collapse sessions.
Does this approach work for long B2B funnels?
Yes, with two adjustments. The time window stretches from days to weeks or months — a B2B funnel between demo-requested and contract-signed can run sixty to ninety days, so a seven-day window is meaningless. The stages are also typically defined in the CRM rather than in product events, so the source table is more likely salesforce_opportunities than events. The formula and cohort logic are identical.
Should mobile and desktop be reported separately?
Almost always yes. Mobile funnels drop off more heavily at every stage that requires text entry, payment-method capture, or document upload, and the gap widens at the bottom of the funnel. Reporting a blended rate hides a mobile-specific UX issue under a desktop-heavy average. Run the segmented query, surface mobile and desktop as separate rows, and let product teams own each independently.
Is drop-off the same as exit rate?
No. Drop-off is the percentage of users who reached stage N but failed to reach stage N+1 within the conversion window. Exit rate, as defined in web analytics, is the percentage of sessions that ended on a specific page — independent of whether that page ends any defined funnel. Exit rate suits content pages and SEO landings; drop-off is the right metric for conversion flows.