CASE WHEN SQL cheat sheet

Sharpen SQL for your next interview
500+ SQL problems with worked solutions — joins, window functions, CTEs.
Join the waitlist

Why analysts live in CASE WHEN

CASE WHEN is the swiss-army knife of analytical SQL. It turns a continuous column into a segment (young / middle / senior), expresses a conditional metric as a one-line aggregation (AVG(CASE WHEN paid THEN 1.0 ELSE 0 END)), pivots a tall events table into a wide weekly report, and bends ORDER BY to put pending rows above completed ones. Once you read and write CASE WHEN fluently you stop reaching for client-side post-processing and solve problems directly in the query.

Imagine your PM at Stripe pings you at 5pm: split last week's payments into four bands by amount, broken out by country, and flag which countries had over 40 percent in the smallest band. Without CASE WHEN that is a tangle of subqueries, lookup joins, and hand-rolled mappings. With it the answer is one CTE and a GROUP BY 1. The same goes for live operational metrics, AB test slicing, churn cohorts, and RFM segmentation.

The pattern has sharp edges. CASE WHEN col = NULL THEN ... never matches because comparing to NULL returns NULL, not TRUE. The WHEN clauses are evaluated top to bottom and the first match wins, so a misordered ladder silently drops branches. All result expressions must share a type. COUNT(CASE WHEN cond THEN 1 ELSE 0 END) counts every row, not just rows where the condition holds. This cheat sheet walks through every pattern I reach for in interviews and production, and every mistake I have seen analysts make on a whiteboard at Meta or in a take-home for Notion.

Syntax

There are two flavors. The general form, also called searched CASE, evaluates a boolean expression in each branch:

CASE
    WHEN condition1 THEN result1
    WHEN condition2 THEN result2
    ELSE default_result
END

The simple form takes a single expression once and compares it to each branch value:

CASE column
    WHEN value1 THEN result1
    WHEN value2 THEN result2
    ELSE default
END

The searched form covers everything because the simple form cannot express ranges or compound conditions. I use the simple form only for short equality lookups.

Bucketing a numeric column

Splitting a continuous variable into named bands is the most common CASE usage in dashboards and interviews:

SELECT
    user_id,
    age,
    CASE
        WHEN age < 18 THEN 'teen'
        WHEN age < 35 THEN 'young'
        WHEN age < 55 THEN 'middle'
        ELSE 'senior'
    END AS age_group
FROM users;

The ladder is mutually exclusive only because the bands are in order — a user with age = 30 matches the < 35 branch and exits, so the < 55 branch never fires. The ELSE covers everyone past the last WHEN, including users with NULL ages. If you want unknown ages in their own bucket, branch on age IS NULL first.

Conditional aggregation

This is the pattern that earns CASE WHEN its rent. To compute a share or a filtered sum without a subquery you wrap the metric in a CASE inside the aggregate:

SELECT
    AVG(CASE WHEN total > 100 THEN 1.0 ELSE 0 END) AS big_order_share,
    SUM(CASE WHEN total > 100 THEN total ELSE 0 END) AS big_order_revenue,
    COUNT(CASE WHEN total > 100 THEN 1 END) AS big_order_count
FROM orders;

A few details that bite people in interviews. The 1.0 in AVG is deliberate: integer division in some engines truncates to zero, so force a float. The COUNT(CASE WHEN cond THEN 1 END) form omits the ELSE because COUNT ignores NULLs — that is how you count only matching rows. COUNT(CASE WHEN cond THEN 1 ELSE 0 END) counts every row, which is almost never what you want.

Pivot via CASE

Postgres, MySQL, and ClickHouse do not have a built-in PIVOT clause. CASE inside SUM is the standard workaround:

SELECT
    DATE_TRUNC('day', created_at) AS day,
    SUM(CASE WHEN status = 'paid'     THEN total ELSE 0 END) AS paid_revenue,
    SUM(CASE WHEN status = 'refunded' THEN total ELSE 0 END) AS refunded_revenue,
    COUNT(CASE WHEN status = 'paid'     THEN 1 END) AS paid_orders,
    COUNT(CASE WHEN status = 'refunded' THEN 1 END) AS refunded_orders
FROM orders
GROUP BY 1
ORDER BY 1;

The output gives you one row per day with parallel columns per status — exactly what a wide dashboard or downstream join expects. Past six branches, template the SQL instead of hand-typing columns; that is where copy-paste bugs creep in.

Custom ORDER BY

ORDER BY accepts any expression, so you can sort by an inline CASE that encodes your priority:

SELECT *
FROM orders
ORDER BY
    CASE status
        WHEN 'pending'   THEN 1
        WHEN 'paid'      THEN 2
        WHEN 'refunded'  THEN 3
        ELSE 4
    END,
    created_at DESC;

This produces pending first, then paid, then refunded, then everything else, with created_at DESC as the secondary sort. It is the cleanest way to get a non-alphabetical category order without a lookup join.

Multi-condition segmentation

Real-world segments combine several signals. CASE handles this with AND and OR inside each branch:

SELECT
    user_id,
    CASE
        WHEN age < 18 AND country = 'US'           THEN 'us_teen'
        WHEN age < 18                              THEN 'intl_teen'
        WHEN age >= 18 AND is_premium              THEN 'adult_premium'
        ELSE                                            'adult_free'
    END AS segment
FROM users;

Order matters. us_teen must come before intl_teen because both match a US user under 18 and CASE takes the first match. Useful trick: read your ladder aloud as if-elif-else and check that every intended segment can actually be reached.

Sharpen SQL for your next interview
500+ SQL problems with worked solutions — joins, window functions, CTEs.
Join the waitlist

CASE with a subquery

A branch can call out to a subquery — handy for comparing a row to a global statistic:

SELECT
    user_id,
    total,
    CASE
        WHEN total > (SELECT AVG(total) FROM orders) THEN 'above_avg'
        ELSE 'below_avg'
    END AS vs_average
FROM orders;

Most planners run the subquery once and reuse the result, so the cost is fixed. If the subquery is correlated, performance falls off a cliff — rewrite as a window function: total > AVG(total) OVER ().

NULL in CASE

NULL = NULL evaluates to NULL, which CASE treats as not-matched. Always use IS NULL:

-- wrong: always falls through to 'has_value'
CASE WHEN column = NULL THEN 'empty' ELSE 'has_value' END

-- correct
CASE WHEN column IS NULL THEN 'empty' ELSE 'has_value' END

A subtler trap: without an IS NULL branch, a NULL column makes every comparison return NULL, so the row falls through to ELSE. That quietly merges "unknown" into "default", which is almost never intended.

CASE vs COALESCE vs NULLIF

Task Best choice
Replace NULL with a default COALESCE(col, default)
First non-null from a list COALESCE(a, b, c)
Return NULL on match NULLIF(col, value)
Branchy multi-condition logic CASE WHEN

Reach for COALESCE and NULLIF first when they fit — they read more clearly. The moment you need more than two branches or a non-equality condition, drop to CASE.

CASE in window functions

You can put CASE inside SUM, AVG, or COUNT of a window expression. This is how you compute per-user cumulative or partitioned metrics with a filter applied:

SELECT
    order_id,
    user_id,
    SUM(CASE WHEN status = 'paid' THEN total ELSE 0 END)
        OVER (PARTITION BY user_id) AS user_paid_total,
    COUNT(CASE WHEN status = 'refunded' THEN 1 END)
        OVER (PARTITION BY user_id) AS user_refund_count
FROM orders;

The same idea generalizes to running totals with ORDER BY in the window.

Common pitfalls

The pitfall that hurts the most is forgetting ELSE when the metric will be aggregated. Without an ELSE, the unmatched rows return NULL. For SUM that is fine because SUM skips NULLs, but for AVG it changes the denominator — AVG(CASE WHEN x > 100 THEN 1 END) averages only the matching rows, which is always 1. The fix is to spell out the zero branch: AVG(CASE WHEN x > 100 THEN 1.0 ELSE 0 END).

A close second is misordered ladders. Because CASE takes the first matching branch, a more permissive condition placed above a more specific one silently swallows the specific case. Anyone under 18 matches age < 35, so the age < 18 branch must come first. When you write a ladder, go from narrowest to broadest and read it back as if-elif-else to confirm every branch can actually be hit.

Type drift across branches is a third trap. SQL engines vary in how forgiving they are, but a CASE WHEN x THEN 'text' ELSE 123 END is undefined behavior in the standard and outright fails in strict engines like BigQuery and Snowflake. Cast explicitly: CASE WHEN x THEN 'text' ELSE CAST(123 AS TEXT) END. The same applies when one branch returns NULL — sometimes the engine cannot infer the result type and you need CAST(NULL AS INT).

A fourth and underrated pitfall is using COUNT(CASE WHEN cond THEN 1 ELSE 0 END) when you mean to count only matching rows. COUNT counts everything that is not NULL, so by writing ELSE 0 you guarantee the count equals the row total. Drop the ELSE and COUNT does the right thing: COUNT(CASE WHEN cond THEN 1 END). Or use SUM(CASE WHEN cond THEN 1 ELSE 0 END), which gives the same number through a different path.

Finally, mind the difference between row-level filtering and conditional counting. WHERE status = 'paid' drops every non-paid row, so every aggregate in the query only sees paid rows. SUM(CASE WHEN status = 'paid' THEN 1 ELSE 0 END) counts paid rows but keeps the rest of the table intact for parallel metrics. For paid count, refunded count, and total side by side, only the CASE form works.

Performance notes

CASE itself is a constant-time expression, not a performance concern in any modern engine. What can be slow is using CASE in a WHERE clause in a way that defeats indexing. WHERE CASE WHEN country = 'US' THEN age > 18 ELSE age > 21 END runs but the planner often cannot push the condition through, so a sequential scan replaces the index lookup. Rewrite as two boolean branches joined by OR: WHERE (country = 'US' AND age > 18) OR (country <> 'US' AND age > 21). Same logic, planner-friendly.

In Postgres, FILTER (WHERE ...) and SUM(CASE WHEN cond THEN x ELSE 0 END) produce essentially identical plans, so pick whichever reads better in context. In Snowflake and BigQuery, conditional aggregation via CASE is the canonical idiom and is well optimized. For very wide pivots over hundreds of categories, consider whether a GROUP BY category followed by a downstream pivot in your BI tool is more maintainable than a 200-line CASE block.

If you want to drill SQL patterns like CASE WHEN every day with instant feedback, NAILDD ships with hundreds of analyst-grade SQL problems that lean directly on this pattern.

FAQ

Can you nest CASE inside CASE?

Yes, every modern engine supports nested CASE — the inner CASE returns a value the outer CASE matches against. The problem is readability. Two levels are fine, three are a code smell, four mean you should lift the inner logic into a CTE or a derived column. A CTE with an intermediate segment column almost always wins on review.

What is the difference between CASE and IIF?

IIF(condition, value_if_true, value_if_false) is shorthand for a two-branch CASE and exists in SQL Server, BigQuery, and a few other engines. It is not in Postgres or standard SQL. CASE is universal and supports any number of branches, so portable code stays on CASE. Use IIF only for a single boolean check in a single dialect where it improves readability.

CASE or PIVOT?

If your engine has a real PIVOT clause — SQL Server and Oracle do — it is more concise for pure transposition when you know the category values up front. Postgres, MySQL, ClickHouse, and most cloud warehouses have no native PIVOT, so SUM(CASE WHEN ... END) ... GROUP BY is the idiom. Even with PIVOT, CASE wins the moment you need a custom expression per column.

Can you use CASE inside GROUP BY?

Yes, grouping by a CASE expression is a clean way to aggregate by an inline segmentation without first materializing a derived table. The trick is to repeat the expression in SELECT or use the positional GROUP BY 1:

SELECT
    CASE WHEN age < 18 THEN 'teen' ELSE 'adult' END AS segment,
    COUNT(*) AS users
FROM users
GROUP BY 1;

For readability past two or three branches, lift the CASE into a CTE so the main query reads as GROUP BY segment.

Does CASE short-circuit?

Yes, in every major engine. CASE evaluates WHEN conditions top to bottom and stops at the first match, so you can rely on guarding a branch with an earlier WHEN. The most common use is preventing a divide-by-zero: CASE WHEN denominator = 0 THEN NULL ELSE numerator / denominator END. The right-hand side of the second branch is never evaluated when denominator = 0, so the division is safe.

What return type does CASE produce?

The engine picks a common supertype across all branch results. If every branch is INT, the result is INT. Mix INT and FLOAT and you get FLOAT. Mix INT and VARCHAR and you get an error in strict engines or an implicit cast you did not intend in lenient ones. Make the cast explicit any time branches return different declared types.