How to calculate Average Handle Time in SQL
Contents:
What AHT actually measures
Average Handle Time looks simple and rewards anyone who treats it that way with the wrong picture of their contact center. The formal definition is talk plus hold plus wrap-up, divided by calls. But ship a leaderboard ranking agents on AHT alone and you have started training your team to close calls without solving them, skip wrap-up notes, and push customers through the queue at the cost of every quality metric that pays the bills.
The pattern that works is to treat AHT as a speed gauge inside a larger basket: AHT for speed, CSAT for satisfaction, FCR for resolution quality, contact volume for throughput. A VP at Stripe, DoorDash, or Notion who asks on Monday "why is AHT up twenty percent" is almost never asking about handle time in isolation — they want to know whether it broke the unit economics. If AHT is up and CSAT held, agents are spending more time per customer and the customer is happier. If AHT is up and CSAT dropped, you have a queue health problem.
When an interviewer at Snowflake, Databricks, or Airbnb asks you to compute AHT, volunteer the three components in your first sentence. Do not just write AVG(handle_seconds). Walk them through talk + hold + wrap, explain why all three matter, and explain why you will report median and p90 alongside the mean.
The base AHT formula
For the rest of this post, assume a calls table with one row per call and the columns you would expect in any production telephony schema: call_id, agent_id, call_reason, channel, talk_seconds, hold_seconds, wrap_seconds, csat_score, and call_ended_at. The base daily rollup is straightforward — sum the three components per call, average across calls, and emit one row per day.
SELECT
DATE_TRUNC('day', call_ended_at) AS day,
COUNT(*) AS total_calls,
AVG(talk_seconds + hold_seconds + wrap_seconds) AS aht_seconds,
AVG(talk_seconds + hold_seconds + wrap_seconds) / 60.0 AS aht_minutes,
AVG(talk_seconds) AS avg_talk,
AVG(hold_seconds) AS avg_hold,
AVG(wrap_seconds) AS avg_wrap
FROM calls
WHERE call_ended_at >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY 1
ORDER BY 1;The query returns the components alongside the total. If AHT spikes from seven minutes to nine, you want to see immediately whether the extra two minutes lived in talk, hold, or wrap. A jump in hold usually points at a downstream dependency that went slow: a CRM lookup, a payment processor, a knowledge-base search. A jump in wrap usually points at a new ticket template or process change. A jump in talk usually points at a product launch where agents are explaining something unfamiliar.
The report is daily. Hourly is too noisy unless you run a tier-one carrier; weekly hides the day-of-week effect any real CX team needs to see — Monday and the first business day after a holiday will always run hot. Daily with a seven-day moving average overlaid scales from a small inbound team to a multi-thousand-seat operation.
AHT by agent
Daily rollups tell you whether the team is healthy in aggregate. The next slice down is per agent, and here the difference between mean and median starts to matter.
SELECT
agent_id,
COUNT(*) AS calls,
AVG(talk_seconds + hold_seconds + wrap_seconds) AS mean_aht,
PERCENTILE_CONT(0.5) WITHIN GROUP (
ORDER BY talk_seconds + hold_seconds + wrap_seconds
) AS median_aht,
PERCENTILE_CONT(0.9) WITHIN GROUP (
ORDER BY talk_seconds + hold_seconds + wrap_seconds
) AS p90_aht
FROM calls
WHERE call_ended_at >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY agent_id
HAVING COUNT(*) >= 50
ORDER BY mean_aht;The HAVING COUNT(*) >= 50 clause filters out agents with too few calls to draw a fair comparison. Without it, an agent who took two calls in thirty days — both four-hour escalations — leads the leaderboard as the slowest in the company, and managers spend half a week reorganizing around a sample of two.
Always include median and p90 next to the mean. AHT distributions are deeply right-skewed: most calls are short, a handful are very long, and the mean lags the median on every team. If an agent's mean AHT is twelve minutes and their median is six, half their calls are closing at twice the speed the mean implies and a long tail is dragging the average up. That tail is usually one call reason — a billing dispute, an integration debug, an escalation — routed disproportionately to that agent. The fix is rerouting, not coaching.
Top performers sit below the team median AHT and above the team median on CSAT. Outliers with low AHT and low CSAT are rushing; outliers with high AHT and high CSAT are over-investing. Both are coaching opportunities — in opposite directions.
AHT by call reason
The single biggest source of noise in AHT comparisons is call mix. A password reset takes ninety seconds. A refund dispute takes fifteen minutes. An enterprise integration debug takes an hour. Averaged together without controlling for reason, an agent who picks up the easy tickets looks faster than one who carries the long ones, and the team's collective AHT number swings whenever the call mix shifts by a few percentage points.
SELECT
call_reason,
COUNT(*) AS calls,
AVG(talk_seconds + hold_seconds + wrap_seconds) AS aht,
PERCENTILE_CONT(0.5) WITHIN GROUP (
ORDER BY talk_seconds + hold_seconds + wrap_seconds
) AS median_aht,
AVG(csat_score) AS avg_csat
FROM calls
WHERE call_ended_at >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY call_reason
ORDER BY calls DESC;Three patterns to look for. First, the AHT-to-CSAT ratio per reason. A reason where AHT is short and CSAT is high is a well-oiled flow. A reason where AHT is short and CSAT is low closes prematurely — agents get customers off the phone before the problem is solved, which surfaces days later as a reopen. A reason where AHT is long and CSAT is high is fine at low volume but expensive at high volume; that is a knowledge-base or self-service opportunity.
Second, variance within reason. If "billing dispute" averages twelve minutes but the p90 is forty, you have a small cluster of disputes swallowing agent time. Split by sub-reason and you usually find one or two scenarios — refund-with-chargeback, subscription-pause-and-restart, multi-account-merge — driving the tail. Build a playbook for those and the p90 drops.
Third, reason mix per agent. If one agent handles forty percent password resets and another handles forty percent integration debugs, you cannot compare their headline AHT numbers. Always report AHT within reason when comparing agents.
AHT by channel
Channel mix is the other major confounder. Voice and chat and email behave entirely differently, and the same agent on the same problem will produce very different AHT numbers depending on the medium.
SELECT
channel,
COUNT(*) AS calls,
AVG(talk_seconds + hold_seconds + wrap_seconds) AS aht_seconds,
AVG(talk_seconds + hold_seconds + wrap_seconds) / 60.0 AS aht_minutes,
AVG(csat_score) AS avg_csat
FROM calls
WHERE call_ended_at >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY channel
ORDER BY calls DESC;Healthy bands by channel for most consumer products: voice sits at five to ten minutes per call; chat is three to five minutes per session because agents handle two or three sessions concurrently, so per-session AHT is shorter even though per-agent throughput is higher; email is usually tracked as response time rather than handle time because the asynchronous nature breaks the "one continuous handle" assumption.
When building a unified scorecard across channels, do not collapse them into one AHT number. Report each channel as its own row and rank within channel. A chat-heavy agent and a voice-heavy agent will never compare apples-to-apples on a combined leaderboard.
Common pitfalls
When teams stand up AHT reporting for the first time, the most damaging mistake is to track only talk time. Talk time is the visible part — the duration of the live voice connection — and the easiest field to pull on most telephony platforms. Hold and wrap require more instrumentation, so teams ship a "talk-only AHT" as a placeholder and never replace it. A call where the agent puts the customer on hold for five minutes while looking up an order, then takes three minutes of wrap to log notes, is an eight-minute real handle dressed up as a three-minute talk. Multiply that across a thousand agents and the cost picture is off by twenty to thirty percent.
A second trap is letting outliers drag the average. One call that runs four hours because an agent forgot to close the ticket can move a team's daily mean by minutes. The defense is to report median and p90 next to the mean, and to cap implausible durations in the ETL layer — anything above ninety minutes flags for review before it lands in the reporting table.
A third pitfall is optimizing AHT in isolation. If AHT is down ten percent and FCR is also down ten percent, you have not made the contact center faster — you have made it shallower. Customers are getting off the phone faster and coming back more often. Always pair AHT changes with FCR and reopen rate. If AHT is dropping and reopens are climbing, agents are gaming the metric, knowingly or not.
A fourth trap is treating wrap time as optional. Some teams pressure agents to skip wrap-up notes to keep AHT low. Skipping wrap means the next agent picking up the same customer has no context, which means a longer call next time and a worse experience. Wrap is investment in future AHT.
A fifth pitfall is confusing AHT with utilization. AHT measures time inside a handle. Utilization measures the share of an agent's shift spent on handles versus idle. An agent with low AHT and low utilization is fast and underused — give them more calls. An agent with high AHT and high utilization is slow and saturated — coach them or rebalance the queue.
Optimization tips
For contact centers running tens of millions of calls per quarter, the queries above start to chew through partitions. The first lever is partitioning calls by call_ended_at — Snowflake, BigQuery, and Databricks push date predicates into a partition prune, so a thirty-day rollup reads thirty days instead of the full table. The second lever is materializing an agent_daily rollup nightly, one row per agent per day with precomputed sums of talk, hold, and wrap. Scorecards then read a few thousand rows instead of millions.
For percentile queries, PERCENTILE_CONT is expensive at scale. Switch to approximate percentiles — APPROX_PERCENTILE on Snowflake and Databricks, APPROX_QUANTILES on BigQuery — for an order-of-magnitude speedup. On Postgres-backed reporting databases, indexes on (agent_id, call_ended_at) and (call_reason, call_ended_at) cover the queries above.
Related reading
- SQL window functions interview questions
- How to calculate agent productivity in SQL
- How to calculate churn in SQL
- A/B testing peeking mistake
If you want to drill SQL questions like this every day, NAILDD is launching with 500+ SQL problems built around exactly this pattern.
FAQ
What is a healthy AHT for tech support?
It depends on product complexity and tier. Tier-one chat support runs three to five minutes per session because agents handle multiple sessions concurrently. Tier-two voice support on billing runs five to seven minutes. Tech support on developer tools runs eight to twelve minutes per call because each call requires loading logs, errors, and environment context. Enterprise support engineers debugging integrations at Snowflake or Databricks can run thirty-plus minutes per call and that is healthy — the calls are complex.
Is dropping AHT always a good thing?
No. AHT going down only matters if CSAT and FCR are holding or rising at the same time. A team that drops AHT from ten minutes to seven while CSAT falls from four-point-five to three-point-eight has made things worse, not better — they are pushing customers off the phone faster and the customers are noticing. Treat a drop in AHT with a drop in CSAT as an active alarm.
What is wrap-up time, exactly?
Wrap-up is the post-call work after the customer disconnects: updating the ticket, logging notes, sending follow-up email, classifying the call reason. Healthy wrap is one to three minutes and should be at most equal to talk time. Wrap consistently longer than talk usually points at a poor CRM workflow, and the fix is upstream of the agent.
How is AHT different by channel?
Voice AHT runs five to ten minutes per call as a single continuous handle. Chat AHT runs three to five minutes per session, but agents handle two to four sessions in parallel, so per-agent throughput is higher than voice. Email is typically measured as first response time and thread duration, not AHT, because the asynchronous nature breaks the continuous-handle assumption. Do not combine channels into a single AHT number on a leaderboard.
Should I report median or mean AHT?
Both, always, side by side. AHT distributions are right-skewed because a small number of very long calls drag the mean up while leaving the median unchanged. Mean alone reacts to outliers the median says are unrepresentative. Median alone misses expensive long-tail behavior. The pair is the right report; adding p90 gives a clean view of the tail.
How do I separate AHT from idle time?
Idle is the period between calls when an agent is logged in but not on a handle. It lives in utilization — the share of shift spent on handles versus idle — not AHT. An agent with low AHT and low utilization is a candidate for more volume; an agent with high AHT and high utilization is saturated and needs coaching or queue rebalancing.