Quantiles and percentiles

Train for your next tech interview
1,500+ real interview questions across engineering, product, design, and data — with worked solutions.
Join the waitlist

What a percentile actually is

A percentile is the value below which a given share of your data sits. The 90th percentile of session length is the duration that 90% of your sessions are shorter than. The median is just the 50th percentile under a friendlier name.

Picture your PM at Stripe pinging you on a Friday: "what's the P99 checkout latency for the new flow, and is it within SLA". You cannot answer that with a mean. You need a single number that says "99% of checkouts complete in under X milliseconds", and that number is by definition the 99th percentile. Once you internalize that framing, percentiles stop being academic and start being the language of every service-level conversation.

The reason analysts lean on percentiles harder than averages is that real product data is rarely symmetric. Revenue per user has whales. Latency has tail events. Time-to-first-action has people who never act at all. Means hide this skew under a single misleading number; percentiles expose it.

Quantiles, quartiles, deciles, percentiles

These four words name the same idea sliced into different pieces. A quantile is the general concept — a cut point that divides data by share. Quartiles cut it into four parts (Q1, Q2, Q3). Deciles cut it into ten. Percentiles cut it into one hundred. The 50th percentile equals Q2 equals the fifth decile equals the median — pick the vocabulary that fits the audience.

Data: [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]

Q1 (25th percentile) = 27.5
Median (50th percentile) = 55
Q3 (75th percentile) = 82.5
P90 = 91
P99 = 99.1

One more term you will see in tickets: IQR, the interquartile range. It is just Q3 minus Q1, a measure of spread that ignores the tails. IQR sits at the heart of the Tukey rule for flagging outliers — anything below Q1 minus 1.5 IQR or above Q3 plus 1.5 IQR. That rule is what your box plot draws.

    Q1    median    Q3
    |       |        |
----+-------+--------+----
   25%     50%      75%
    |<--- IQR --->|

Computing percentiles in Python

NumPy is the workhorse. np.percentile accepts a percentile number from 0 to 100; np.quantile accepts the same input as a fraction from 0 to 1. Both interpolate by default, matching what Postgres calls PERCENTILE_CONT.

import numpy as np

data = [15, 20, 35, 40, 50, 60, 70, 80, 85, 95]

np.percentile(data, 50)    # 55.0  — median
np.percentile(data, 25)    # 36.25 — Q1
np.percentile(data, 75)    # 78.75 — Q3
np.percentile(data, 90)    # 86.5  — P90

# Several percentiles at once
np.percentile(data, [25, 50, 75, 90, 99])

# Quantile form (0 to 1)
np.quantile(data, 0.5)     # 55.0
np.quantile(data, 0.25)    # 36.25

Pandas wraps the same logic. The quantile method takes one number or a list, and describe spits out Q1, median, and Q3 — the fastest way to eyeball a fresh dataset.

import pandas as pd

df = pd.DataFrame({
    'user_id': range(1, 11),
    'revenue': [100, 200, 500, 800, 1200, 2000, 3500, 5000, 8000, 50000]
})

df['revenue'].quantile(0.5)            # median
df['revenue'].quantile(0.90)           # P90
df['revenue'].quantile([0.25, 0.5, 0.75, 0.9])

df['revenue'].describe()
# count       10.0
# mean      7130.0
# std      15000+
# min        100
# 25%        575
# 50%       1600
# 75%       4625
# max      50000

Group operations follow the same shape. If you want P90 revenue per acquisition channel, you group, then call quantile.

# P90 revenue per segment
df.groupby('segment')['revenue'].quantile(0.9)

# Several percentiles, one call
df.groupby('segment')['revenue'].describe(percentiles=[.25, .5, .75, .9, .95])

Computing percentiles in SQL

In Postgres the canonical function is PERCENTILE_CONT(fraction) WITHIN GROUP (ORDER BY column). The same syntax works in Snowflake, Redshift, and BigQuery standard SQL.

-- Quartiles and tail latency in one pass
SELECT
    PERCENTILE_CONT(0.25) WITHIN GROUP (ORDER BY latency_ms) AS q1,
    PERCENTILE_CONT(0.50) WITHIN GROUP (ORDER BY latency_ms) AS median,
    PERCENTILE_CONT(0.75) WITHIN GROUP (ORDER BY latency_ms) AS q3,
    PERCENTILE_CONT(0.90) WITHIN GROUP (ORDER BY latency_ms) AS p90,
    PERCENTILE_CONT(0.99) WITHIN GROUP (ORDER BY latency_ms) AS p99
FROM checkout_events;

-- Percentiles per segment
SELECT
    segment,
    PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY revenue) AS median_revenue
FROM users
GROUP BY segment;

Postgres also ships PERCENTILE_DISC, which returns one of the actual values rather than interpolating. Use CONT for continuous metrics — revenue, latency, duration. Use DISC when interpolation makes no sense, like with discrete ratings.

SELECT PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY val) FROM (VALUES (10), (20)) t(val);
-- 15.0 — interpolated

SELECT PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY val) FROM (VALUES (10), (20)) t(val);
-- 10  — nearest actual value

Window functions cover the bucketing flavor. NTILE(N) assigns each row to one of N equal-size buckets — perfect for "give me the top decile of accounts by ARR". PERCENT_RANK() returns the percentile rank of every row when you want a continuous score rather than a bucket label.

SELECT user_id, revenue,
    NTILE(4) OVER (ORDER BY revenue) AS quartile,
    ROUND(PERCENT_RANK() OVER (ORDER BY revenue) * 100, 1) AS pct_rank
FROM users;

ClickHouse uses a different family — quantile(0.5)(col) for an approximate median, quantileExact(0.5)(col) for an exact one, quantiles(0.25, 0.5, 0.75)(col) for several at once. MySQL before version 8 has no built-in function and forces a ROW_NUMBER() workaround.

Where percentiles show up in product analytics

The most common place is latency and SLA tracking. An average response time of 120 ms means almost nothing — one cron that hangs for ten seconds drags it up while the typical user is unaffected. A P50 of 80 ms with a P99 of 1800 ms tells the real story: most requests are snappy, but the top 1% are painfully slow. SLAs are written as percentile thresholds for exactly this reason.

response_times = [50, 80, 100, 120, 150, 200, 250, 500, 800, 2000]

p50 = np.percentile(response_times, 50)
p90 = np.percentile(response_times, 90)
p99 = np.percentile(response_times, 99)

print(f'P50: {p50} ms, P90: {p90} ms, P99: {p99} ms')

The second hot spot is revenue segmentation. "Top 10% of customers by lifetime value" is a P90 cut, and you express it cleanly with a window function plus a CASE statement, or with NTILE if you also want the bottom slices.

SELECT
    user_id,
    total_revenue,
    NTILE(10) OVER (ORDER BY total_revenue DESC) AS decile
FROM user_revenue;

The first decile here is your top 10% — the accounts Stripe or Notion call out by name in QBR slides. Deciles two through five are the healthy middle; the bottom half is where churn risk lives.

The third use case is outlier detection. The Tukey rule uses Q1, Q3, and IQR to flag suspicious values, and it sits at the core of the standard box plot.

q1 = df['revenue'].quantile(0.25)
q3 = df['revenue'].quantile(0.75)
iqr = q3 - q1

lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr

outliers = df[(df['revenue'] < lower) | (df['revenue'] > upper)]
print(f'Outliers flagged: {len(outliers)}')
Train for your next tech interview
1,500+ real interview questions across engineering, product, design, and data — with worked solutions.
Join the waitlist

Percentile vs mean

The mean answers "what would each user get if I divided the pie equally". The median answers "what does a typical user actually see". Those are different questions, and the second one is almost always what a stakeholder really wants when they ask for a number.

Metric Mean Median (P50) P90 / P99
Outlier-robust No Yes Yes
Describes The pooled total The typical case The tail
Use when Distribution is roughly normal Distribution is skewed SLAs, worst-case planning

For a real-world example, ship a pricing experiment at Airbnb. 80% of guests spend $80-$300 per booking; 5% spend over $5,000. The mean lands near $450; the median sits closer to $180. Reporting the mean makes the average guest sound four times richer than they are; the median tells the truth.

Common pitfalls

The most frequent mistake is using the mean for a skewed distribution and treating it as the typical value. Revenue, session length, time-to-purchase, latency — all have fat tails, and the mean is dominated by the upper end. Lead with the median and only report the mean alongside when the audience specifically wants the total-per-user implication.

The second trap is confusing "the 90th percentile" with "the top 10%". P90 is a value — a single number at the boundary. The top 10% is a set of rows — every observation above that boundary. When a stakeholder asks for "P90 revenue", clarify which they mean. Half the time they want the number P90 itself; half the time they want the average revenue of the top decile, which is a different metric.

The third pitfall lives in SQL. People reach for AVG when computing tail metrics because the syntax is shorter, then they pull a number that lies about latency. Whenever the metric you care about is a tail, use a percentile function — tail regressions hide perfectly inside average ms numbers and only surface when a customer escalates.

The fourth is mixing interpolation methods between tools. NumPy, Pandas, Postgres PERCENTILE_CONT, and ClickHouse quantile all interpolate slightly differently in edge cases. If you compute P90 in Python and the dashboard later shows a different number from Snowflake, that gap is almost always the interpolation rule, not a real data drift. Pick one tool as canonical for any given metric and document it.

The fifth is sampling-driven instability for high percentiles. A reliable P99 needs roughly one hundred times more data than a reliable P50, because by definition only 1% of your sample contributes. For a stable P99 you want tens of thousands of observations at minimum, or switch to a parametric model of the tail.

Interview questions

What is the median and how does it differ from the mean? The median is the 50th percentile — the value with half the data below it. It is robust to outliers; a single extreme observation barely moves it. The mean is sensitive to extremes; one whale shifts it noticeably. For skewed distributions like revenue or latency, the median better describes the typical observation while the mean better describes the pooled total.

What is IQR and how do you use it to flag outliers? IQR is Q3 minus Q1 — the spread of the middle 50% of the data. The Tukey rule labels anything below Q1 minus 1.5 times IQR or above Q3 plus 1.5 times IQR as an outlier. It draws the whiskers on a box plot, and because it relies on quartiles, it stays robust to the very outliers it is trying to detect.

Why does P99 matter for SLAs? P99 captures tail behavior that the mean hides. A service can have an average response time of 100 ms while still serving the slowest 1% of users in five seconds — and that 1% is who escalates to support. SLAs are written as percentile thresholds because they bound the experience of every user, not the average user.

How do you compute a median in SQL? In Postgres, Snowflake, BigQuery, and Redshift use PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY column). In older MySQL you use a self-join with ROW_NUMBER() to find the middle row. In ClickHouse it is quantile(0.5)(column) for an approximate answer or quantileExact(0.5)(column) for a precise one.

If you want to drill percentile questions like these every day under interview pressure, NAILDD is launching with hundreds of SQL and statistics problems built around exactly this pattern.

FAQ

Are quartiles, deciles, and percentiles the same thing?

They are all quantiles — the same idea cut into different numbers of pieces. Quartiles split the data into four parts (Q1, Q2, Q3). Deciles split it into ten. Percentiles split it into one hundred. The 25th percentile equals Q1, the 50th percentile equals the median equals Q2 equals the fifth decile, and so on.

How do I pick between PERCENTILE_CONT and PERCENTILE_DISC?

PERCENTILE_CONT interpolates between adjacent values, which is what you want for continuous metrics like revenue, latency, or duration. PERCENTILE_DISC returns one of the actual values in the data, which is what you want for discrete metrics like star ratings or integer counts where an interpolated value would not make physical sense.

How do I compute percentiles in ClickHouse?

ClickHouse uses the quantile family rather than SQL standard PERCENTILE_CONT. quantile(0.5)(revenue) returns an approximate median using a reservoir-sampling sketch. quantileExact(0.5)(revenue) returns the exact median but is slower on large tables. quantiles(0.25, 0.5, 0.75, 0.9, 0.99)(revenue) returns several percentiles in one tuple.

Why is my P99 so unstable between runs?

Because by definition only 1% of your sample feeds into the P99 estimate. With a thousand rows, ten rows determine the answer, and the next ten give a different number. Stable P99 estimates need tens of thousands of observations at minimum — or bootstrap a confidence interval around the percentile instead.

Can I use percentiles in an A/B test?

Yes, and you often should. When the metric of interest is a tail — latency, basket size in a long-tailed market, time-to-completion — comparing means hides exactly the effect you are testing for. The two cleanest ways to test a percentile are a quantile regression on the experiment data or a bootstrap of the percentile difference between groups. Both produce a confidence interval you can act on.