How to calculate Cart Abandonment in SQL

Sharpen SQL for your next interview
500+ SQL problems with worked solutions — joins, window functions, CTEs.
Join the waitlist

Why cart abandonment matters

The Baymard Institute pegs the industry-average shopping cart abandonment rate at roughly 70%. Seven out of every ten carts created on a typical e-commerce site never become orders. At Stripe, Shopify, DoorDash, and every consumer marketplace, this is the second-most-watched funnel metric after gross merchandise value, and the gap between your number and the benchmark translates directly into millions of dollars of unrealized revenue.

Picture an Amazon-sized retailer. The growth PM walks into Monday standup and says checkout revenue is down nine percent week-over-week, but no major release shipped. The first query to pull is abandonment by stage. If abandonment ballooned at the payment step on mobile but stayed flat on desktop, you have a payment integration regression on iOS. If abandonment spread evenly across stages, you probably have a traffic-mix shift. The same pattern shows up in product analyst interviews at Airbnb, Etsy, Uber Eats, and Stripe.

This post walks through cart abandonment end to end: the base formula, a stage-by-stage funnel, breakdowns by device and channel, the session-scoped variant interviewers love to probe, and traps that distort the number. Queries assume an events table with one row per user action and a users dimension. Postgres dialect; Snowflake, BigQuery, and Databricks need minor syntax swaps.

The SQL formula

The textbook definition every e-commerce analyst should recite cold:

Cart Abandonment Rate = (Carts Started - Carts Completed) / Carts Started × 100%

The fight is over what counts as a "cart started" and what counts as "completed." The strictest definition treats a cart as started the moment a shopper adds the first item, and completed the moment a purchase event fires for that shopper inside a defined window. Looser definitions treat any visit to the checkout page as a started cart, which moves the rate by ten to twenty points and is the single biggest reason your number does not match a competitor's. Name the definition out loud before you write any SQL.

Base query: site-wide abandonment

Start with the headline number — total carts started versus total carts completed over the last thirty days, no segmentation. This is what you put on the executive dashboard and what you compare against the Baymard benchmark.

SELECT
    COUNT(DISTINCT CASE WHEN event_type = 'add_to_cart' THEN user_id END) AS carts_started,
    COUNT(DISTINCT CASE WHEN event_type = 'purchase'    THEN user_id END) AS completed,
    (COUNT(DISTINCT CASE WHEN event_type = 'add_to_cart' THEN user_id END)
     - COUNT(DISTINCT CASE WHEN event_type = 'purchase'  THEN user_id END))::NUMERIC * 100
        / NULLIF(COUNT(DISTINCT CASE WHEN event_type = 'add_to_cart' THEN user_id END), 0)
        AS abandonment_rate_pct
FROM events
WHERE event_date >= CURRENT_DATE - INTERVAL '30 days';

NULLIF prevents a divide-by-zero crash on a day with no add_to_cart events. The ::NUMERIC cast prevents Postgres from truncating the result to zero. The COUNT(DISTINCT user_id) shape matters: a user who adds three items still counts as one cart started, and a user who retries payment twice still counts as one completed purchase. Counting events instead of distinct users overstates the funnel by twenty to forty percent at most retailers.

The query leaks one subtle assumption: a user who added on May 1 and purchased on May 28 counts as completed even if the original cart was abandoned and rebuilt mid-month. That is usually what you want for a thirty-day headline metric, but it is not what marketing means in the context of an email retargeting campaign. Always confirm the window with the requester.

Breakdown by checkout stage

The next question every PM asks: "Where in the funnel are we losing them?" The answer is a stage-by-stage funnel. Most e-commerce analytics stacks instrument five canonical events — add_to_cart, checkout_start, shipping_info, payment_info, purchase — and the query below collapses the funnel into one row per stage.

SELECT
    event_type AS stage,
    COUNT(DISTINCT user_id) AS users
FROM events
WHERE event_type IN ('add_to_cart', 'checkout_start', 'shipping_info', 'payment_info', 'purchase')
  AND event_date >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY event_type
ORDER BY CASE event_type
    WHEN 'add_to_cart'    THEN 1
    WHEN 'checkout_start' THEN 2
    WHEN 'shipping_info'  THEN 3
    WHEN 'payment_info'   THEN 4
    WHEN 'purchase'       THEN 5
END;

The output is a small table with users-per-stage in funnel order. Wrap it in a CTE and use LAG() to compute stage-to-stage drop. The biggest drop is usually between payment_info and purchase. That last leg captures every payment decline, three-D Secure failure, address validation error, and tax-shock abandonment. If payment_info → purchase retention is below seventy percent, your highest-leverage fix is probably on the payment processor side, not the page UX side. Stripe and Adyen both publish guidance on retry behavior and network token rotation.

Breakdown by device and channel

Site-wide abandonment is the headline; device and channel breakdowns are where the action lives. Mobile shoppers typically abandon ten to fifteen points more than desktop shoppers, and paid social audiences abandon ten to twenty points more than organic search. Both gaps are real, structural, and impossible to fix completely — but every percentage point matters when the underlying volume is large.

WITH cart_users AS (
    SELECT DISTINCT user_id
    FROM events
    WHERE event_type = 'add_to_cart'
      AND event_date >= CURRENT_DATE - INTERVAL '30 days'
),
purchase_users AS (
    SELECT DISTINCT user_id
    FROM events
    WHERE event_type = 'purchase'
      AND event_date >= CURRENT_DATE - INTERVAL '30 days'
)
SELECT
    u.device_type,
    COUNT(DISTINCT c.user_id) AS carts,
    COUNT(DISTINCT p.user_id) AS purchases,
    (COUNT(DISTINCT c.user_id) - COUNT(DISTINCT p.user_id))::NUMERIC * 100
        / NULLIF(COUNT(DISTINCT c.user_id), 0) AS abandonment_pct
FROM cart_users c
JOIN users u ON u.user_id = c.user_id
LEFT JOIN purchase_users p ON p.user_id = c.user_id
GROUP BY u.device_type
ORDER BY abandonment_pct DESC;

Swap u.device_type for u.acquisition_channel to get the channel cut, or extend the GROUP BY to both to surface the worst combinations. Paid social on mobile is almost always the worst single bucket — high impulse traffic, high friction at checkout, low intent. Organic search on desktop is almost always the best. If your data violates either pattern, double-check channel attribution before drawing a strategic conclusion.

A note on guest checkout: anonymous shoppers do not have a stable user_id until they enter an email. The query above silently underestimates abandonment by dropping every guest who started a cart and bounced before the email field. Replace user_id with COALESCE(user_id, session_id, device_id) and document the substitution.

Sharpen SQL for your next interview
500+ SQL problems with worked solutions — joins, window functions, CTEs.
Join the waitlist

Session-scoped abandonment

The strictest interview-grade definition scopes the funnel to a single session. A shopper who adds on Monday, leaves, and purchases the same item on Wednesday counts as one abandoned plus one completed. Most teams report the looser window-based number externally, but session-scoped abandonment is the cleanest measure of in-session friction and the version interviewers test for.

WITH session_flags AS (
  SELECT
    session_id,
    MAX(CASE WHEN event_type = 'add_to_cart' THEN 1 ELSE 0 END) AS added,
    MAX(CASE WHEN event_type = 'purchase'    THEN 1 ELSE 0 END) AS purchased
  FROM events
  WHERE event_date >= CURRENT_DATE - INTERVAL '30 days'
  GROUP BY session_id
)
SELECT
  SUM(added)                                              AS sessions_with_cart,
  SUM(CASE WHEN added = 1 AND purchased = 1 THEN 1 END)   AS sessions_completed,
  SUM(CASE WHEN added = 1 AND purchased = 0 THEN 1 END)::NUMERIC * 100
    / NULLIF(SUM(added), 0)                                AS session_abandonment_pct
FROM session_flags;

This number is usually three to five points higher than the user-window version because it cannot give credit to cross-session recovery. If you publish both, label the difference clearly. External benchmark comparisons require the looser version because that is what Baymard, Contentsquare, and Salesforce report.

Common pitfalls

The biggest pitfall is the time window. A shopper who adds an item and returns three days later to complete is missed unless your window is wide enough. Pick deliberately — twenty-four hours for in-session diagnostics, seven days for retargeting funnels, thirty days for executive dashboards — and document it in the query. The number is meaningless without the window.

The second pitfall is cross-device behavior. Shoppers regularly add on phone during a commute and complete on a laptop at home. Without identity resolution, both halves look like an abandoned cart plus a fresh purchase. If your team has a stitched identity table, use it on both sides of the funnel. Mobile abandonment will drop by five to ten points overnight — not because anything changed, but because the math finally reflects reality.

The third pitfall is inconsistent stage definitions. "Cart started" can mean adding the first item, clicking the cart icon, or opening the checkout page, and each produces a different abandonment number. Before publishing a comparison to last quarter, confirm the upstream definition has not silently changed because an engineer renamed an event. This costs analysts more credibility than any other class of mistake.

The fourth pitfall is mixing sessions and users in the same ratio. If the numerator counts sessions but the denominator counts users, you have sessions per user, not a rate. Pick one granularity for both halves and stick with it. Dashboards built with this mistake produce abandonment rates over one hundred percent on busy days, which is the surest way to lose trust in the metric.

The fifth pitfall is treating refunds and chargebacks as nonexistent. A purchase refunded forty-eight hours later should arguably count toward abandonment, not completion. Most teams ignore this because volume is small, but on luxury and high-AOV verticals refunds drag completed counts down by three to five percent. Be ready to produce "true abandonment net of refunds" in one query.

Optimization tips

For event tables exceeding a few hundred million rows, partition by event_date. On Snowflake that means a clustering key on the date column; on BigQuery, native date partitioning; on Postgres, declarative partitioning by month plus a B-tree index on event_type. Without partitioning, even the simplest abandonment query scans the whole table on every dashboard refresh.

Materialize the daily aggregate. A nightly job that produces (event_date, device_type, channel, stage, users) cuts dashboard query cost by fifty times or more and lets the interactive layer query thousands of rows instead of billions. A dbt incremental model on a modern lakehouse takes an afternoon and pays back in a week.

For session-scoped abandonment, compute session flags once at session-close and store them in a session_summary table. On BigQuery, replace COUNT(DISTINCT ...) with APPROX_COUNT_DISTINCT for headlines where you can tolerate one-percent error — often a ten-to-twenty-times speed-up. Reserve exact counts for finance-reported numbers.

If you want to drill SQL questions like this every day, NAILDD is launching with 500+ SQL problems across exactly this pattern.

FAQ

What is a good cart abandonment rate?

The Baymard average sits around seventy percent across all e-commerce. Apparel and general retail land in the sixty-five to seventy-five percent band. Travel and ticketing run eighty to ninety percent because purchase decisions take days and shoppers use the cart as a price-comparison tool. Anything below sixty percent at meaningful volume is exceptional. The cheapest sanity check is your own historical trend — category mix, AOV, and traffic source profile usually explain more than any industry benchmark.

Cart abandonment dropped five points week over week. What should I check?

Treat it as a signal that something changed, not necessarily an improvement. First, confirm no upstream event definitions were modified — a renamed add_to_cart event silently halves the denominator and looks like a huge win. Second, check whether traffic mix shifted: a paused paid campaign lifts abandonment-quality metrics without any UX change. Third, decompose by device and channel and look for a single bucket carrying the movement. If the win is real and concentrated, propagate the underlying change.

What window should I use?

For in-session diagnostics, twenty-four hours. For email-retargeting funnels, seven days. For executive dashboards, thirty days. For SaaS-adjacent verticals with annual purchase consideration, ninety days. The number drifts down as the window widens because more cross-session recoveries get captured. Always publish both the window and the number; the window-less number invites comparison errors.

How do I reduce cart abandonment?

The best-documented interventions are reducing checkout steps, supporting guest checkout, persisting cart state across sessions, sending recovery emails within an hour of abandonment, and displaying trust signals at the payment step. Baymard's research quantifies each. The biggest lever varies by site: removing forced account creation alone often recovers ten to fifteen points on consumer marketplaces. Always pair an intervention with a clean A/B test — most "best practices" are correlation, not causation, and your shoppers may behave differently from the survey sample.

Why is mobile abandonment so much higher than desktop?

Smaller screens, fat-finger errors on form fields, weaker autofill, and a higher rate of in-the-moment distraction all add up to ten to fifteen extra abandonment points on mobile. Apple Pay and Google Pay narrow the gap by five to eight points but never close it. The honest answer is that mobile abandonment is a structural property of the channel, and the fix is to optimize mobile checkout as a first-class surface, not a shrunk-down desktop page. Always decompose by device before benchmarking against a competitor.