
SLA Wait-Time, Erlang B, and Abandonment Calculator

Hit a contractual percentile target (e.g., 90/30: 90% of calls answered in 30 seconds). Built around Erlang C for queueing, Erlang B for loss systems, and Erlang A for caller abandonment, with agent-count solving for any SLA threshold.

Formulas verified by Ishfaq Ur Rahman, Industrial Engineer
For educational purposes only. Not professional capacity-planning advice.

Queue Parameters

  • Arrival rate (λ): average number of arrivals per time unit
  • Service rate (μ): average number of customers served per time unit
  • Utilization: ρ = λ/μ

SLA Target

  • Wait threshold: maximum acceptable wait time for customers
  • Target percentage: e.g., 80% means "80% of customers should wait ≤ 2 minutes"

Configure Your Queue Parameters

Enter your arrival rate, service rate, and SLA target to calculate the probability that customers wait less than your threshold.

Quick Tips

  • Arrival rate (λ): How many customers arrive per time unit
  • Service rate (μ): How many customers can be served per time unit
  • Keep μ > λ: Service rate must exceed arrival rate for a stable system
  • Try a preset example to see typical configurations

P(Wq ≤ t) = 1 − ρ × e^{−(μ − λ)t}

M/M/1 Wait Time CDF Formula
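A minimal sketch of the single-server formula above, in Python. The function name and the sample inputs (5 arrivals and 6 services per minute, a 2-minute threshold) are assumptions chosen for illustration, not values taken from the calculator's internals.

```python
import math

def mm1_wait_cdf(lam: float, mu: float, t: float) -> float:
    """P(Wq <= t) for a stable M/M/1 queue; lam, mu and t share one time unit."""
    if lam <= 0 or mu <= lam:
        raise ValueError("need 0 < lam < mu for a stable queue")
    rho = lam / mu  # utilization
    return 1.0 - rho * math.exp(-(mu - lam) * t)

# Hypothetical inputs: lam = 5/min, mu = 6/min, threshold t = 2 min.
print(f"P(Wq <= 2 min) = {mm1_wait_cdf(5, 6, 2):.3f}")
```

At t = 0 the expression reduces to 1 − ρ, the chance that an arrival finds the server idle.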

Your SLA Penalty Clause Doesn’t Care About Average Wait Time

Your call center signed a 90/30 SLA: 90% of calls answered within 30 seconds. The contract attaches a $2,000-per-percentage-point penalty if monthly compliance slips below 87%. Hit 86% in October and you owe $2,000. Hit 80% and you owe $14,000. The math isn’t optional anymore.

Average wait time is the wrong metric for that contract because the penalty is keyed to a percentile, not a mean. An average of 18 seconds sounds great until you realize 22% of callers wait longer than 30 seconds. The 90/30 SLA is running at 78% compliance, nine points under the breach threshold, and you owe $18,000 for the month. The mean hides the tail. SLA-bound work always lives in the tail.

This page solves three problems that contract-bound operations face every month: hitting a percentile target with Erlang C, sizing for Erlang B (when callers can’t queue and rejected requests are lost), and accounting for abandonment with Erlang A. For a general M/M/1 or M/M/c primer covering utilization, queue length, Little’s Law, and average wait time, see the general queueing theory and M/M/c primer. This tool is for the contract.

Three contract patterns dominate. 80/20 is the legacy benchmark from 1990s telephony: 80% of calls answered in 20 seconds. 90/30 is the modern enterprise standard. 95/30 or 95/15 show up in regulated industries (financial services, healthcare triage) where slow service becomes a compliance issue, not just a customer-experience one. Each tightening costs disproportionately more headcount, and the math tells you exactly how much.

Solving for Agent Count to Hit a Contractual SLA

Most ops leads work the problem backward: given a contractual percentile target and a known arrival rate, what’s the minimum agent count? Fix λ (arrival rate from your historical data, peak hour, not daily average), set the target P(W ≤ t) (e.g., 0.90 for 30 seconds at a 90/30 SLA), and iterate c upward. At each value of c, compute the Erlang C probability and the resulting P(W ≤ t). Stop when the probability clears the target. The calculator does this with a bisection search.
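The inverse solve described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the calculator's own code: the function names are invented here, Erlang C is obtained from the standard Erlang B recurrence, and the search is a plain upward scan on c rather than the bisection the calculator uses.

```python
import math

def erlang_b(c: int, a: float) -> float:
    """Blocking probability via the recurrence B(k) = a*B(k-1) / (k + a*B(k-1))."""
    b = 1.0
    for k in range(1, c + 1):
        b = a * b / (k + a * b)
    return b

def erlang_c(c: int, a: float) -> float:
    """Probability an arrival has to wait; treated as 1.0 when the queue is unstable."""
    if c <= a:
        return 1.0
    b = erlang_b(c, a)
    rho = a / c
    return b / (1.0 - rho * (1.0 - b))

def service_level(c: int, lam: float, mu: float, t: float) -> float:
    """P(W <= t) under Erlang C; lam, mu and t must share one time unit."""
    a = lam / mu
    if c <= a:
        return 0.0
    return 1.0 - erlang_c(c, a) * math.exp(-(c * mu - lam) * t)

def min_agents(lam: float, mu: float, t: float, target: float, c_max: int = 10_000) -> int:
    """Smallest c with P(W <= t) >= target, scanning upward from the stability bound."""
    c = math.floor(lam / mu) + 1
    while c <= c_max:
        if service_level(c, lam, mu, t) >= target:
            return c
        c += 1
    raise ValueError("target not reachable within c_max agents")

# 90 calls/hour, 10.9 calls/agent/hour, 30 seconds expressed in hours.
print(min_agents(lam=90, mu=10.9, t=30 / 3600, target=0.90))
```

The recurrence avoids computing large factorials directly, which keeps the calculation stable for large agent counts.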

Three numbers that practitioners get wrong here:

  • Peak-hour arrival rate, not daily average. If you receive 500 calls/day across a 9-hour window, the average is 56/hour, but the peak hour might handle 110 calls. Staff to peak.
  • Effective service rate, not gross talk time. Add wrap-up time, after-call work, and post-call notes to the average handle time. An agent who talks for 4 minutes and wraps for 90 seconds has μ ≈ 10.9/hour, not 15/hour.
  • Shrinkage. The c the model returns is the number of agents available right now. Schedule for c divided by (1 minus shrinkage). With 25% shrinkage (breaks, training, admin time, sick leave averaged across the workforce), 13 agents available means roughly 17 agents on the roster.
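The service-rate and shrinkage arithmetic can be checked with a short Python sketch. The helper names are hypothetical. One deliberate choice: the roster figure is rounded up, so 13 / 0.75 ≈ 17.3 comes out as 18 rather than the rounded 17 quoted in the text.

```python
import math

def effective_service_rate(talk_min: float, wrap_min: float) -> float:
    """Calls one agent can complete per hour once wrap-up time is included."""
    return 60.0 / (talk_min + wrap_min)

def rostered_agents(available: int, shrinkage: float) -> int:
    """Headcount to schedule so that `available` agents are on the phones.

    Rounds up, on the assumption that falling short of c is worse than one extra seat.
    """
    if not 0.0 <= shrinkage < 1.0:
        raise ValueError("shrinkage must be in [0, 1)")
    return math.ceil(available / (1.0 - shrinkage))

print(f"mu with wrap-up: {effective_service_rate(4.0, 1.5):.1f}/hour")
print(f"mu talk time only: {effective_service_rate(4.0, 0.0):.1f}/hour")
print(f"roster for 13 available at 25% shrinkage: {rostered_agents(13, 0.25)}")
```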

The cost of tightening the SLA target is convex. Going from 80/20 to 90/20 with a fixed λ might add 25% to your headcount. Going from 90/30 to 95/30 might add another 25%. The last 5 percentage points of compliance get expensive. Always compute compliance at c, c+1, c+2 and present the table to the contract owner before signing anything.

Erlang B: When Callers Hang Up If There’s No Available Server

Erlang C assumes infinite queue capacity: if all servers are busy, callers wait until one frees up. That’s a useful approximation for call centers with patient customers, but it’s wrong for systems where blocking is the actual outcome.

Erlang B handles the loss case. If all c servers are busy, the next arrival is rejected. In telephony that’s a busy signal. In modern systems it’s a 503 Service Unavailable, a “no available agent” message, or a hangup. The blocking probability with offered load A and c servers is B(c, A) = (A^c / c!) divided by the sum of A^i / i! from i = 0 to c.
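A short Python sketch of the blocking formula, using the standard recurrence instead of raw factorials. The function names are illustrative, and the sizing helper simply scans upward until the block-rate cap is met.

```python
def erlang_b(c: int, a: float) -> float:
    """B(c, A) computed as B(k) = A*B(k-1) / (k + A*B(k-1)), starting from B(0) = 1."""
    b = 1.0
    for k in range(1, c + 1):
        b = a * b / (k + a * b)
    return b

def min_servers_for_blocking(a: float, max_block: float) -> int:
    """Smallest c whose blocking probability is at or below max_block."""
    if not 0.0 < max_block < 1.0:
        raise ValueError("max_block must be strictly between 0 and 1")
    c = 0
    while erlang_b(c, a) > max_block:
        c += 1
    return c

# Offered load of 8.26 Erlangs with a contractual block rate of at most 2%.
print(min_servers_for_blocking(8.26, 0.02))
```

Unlike Erlang C, the loss model has no stability condition: any offered load yields a valid blocking probability, it just gets large when c is small.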

Where Erlang B is the right model: emergency-room bed allocation (no one waits in a hallway forever, they get diverted), legacy phone networks where there are physical trunk limits, modern API gateways with strict concurrency caps where excess requests get rejected with a backoff header, and high-end retail support where overflow callers get routed to an IVR or callback queue rather than left waiting.

The intuition difference: Erlang C asks “how long will I wait?” Erlang B asks “what fraction of arrivals get turned away?” If your contract specifies a maximum block rate (B ≤ 0.02 means at most 2% of calls rejected), Erlang C doesn’t model that constraint at all. It assumes everyone eventually gets served, which guarantees zero block rate by construction.

Practitioner shortcut: for the same offered load A, Erlang B always returns a higher probability of immediate service than Erlang C, because rejected callers don’t pile up in a queue and crowd out new arrivals. If your real system has callers who hang up after 10-15 seconds of waiting, the truth lies somewhere between the two formulas. Erlang A is the bridge.
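The ordering claimed above can be checked numerically. The sketch below (function names assumed for illustration) prints the probability of immediate service under both models for the same offered load.

```python
def erlang_b(c: int, a: float) -> float:
    """Loss-model blocking probability via the standard recurrence."""
    b = 1.0
    for k in range(1, c + 1):
        b = a * b / (k + a * b)
    return b

def erlang_c(c: int, a: float) -> float:
    """Queueing-model probability of waiting; requires c > a."""
    if c <= a:
        raise ValueError("Erlang C needs c > a")
    b = erlang_b(c, a)
    return b / (1.0 - (a / c) * (1.0 - b))

a = 8.26  # offered load in Erlangs
for c in (10, 12, 14):
    immediate_b = 1.0 - erlang_b(c, a)
    immediate_c = 1.0 - erlang_c(c, a)
    print(f"c={c}: immediate service, Erlang B {immediate_b:.3f} vs Erlang C {immediate_c:.3f}")
```

The gap narrows as c grows, because both blocking and waiting become rare when there is plenty of spare capacity.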

Erlang A: When Callers Wait, Then Abandon

Real call-center data almost never matches Erlang C exactly because real callers don’t wait indefinitely. After 30, 60, or 90 seconds, a fraction of them hang up. That fraction grows with wait time. Erlang A (sometimes called the M/M/c+M model in academic queueing literature, with origins in the Palm distribution work of the 1940s) extends Erlang C with an abandonment rate θ: each waiting caller abandons at rate θ per unit time, modeled as exponential patience.

The formula is messier than Erlang B or C because the abandonment process competes with the service process for each waiting caller. Numerical solution methods (continued fractions, Krishnamoorthi's 1963 derivation) replace the closed forms used in Erlang B and C. For practitioners, the operational interpretation matters more than the derivation.
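One simple numerical route, offered here as a sketch and not as the method any particular WFM tool uses, is to treat M/M/c+M as a birth-death chain and truncate it once the tail is negligible. The function name, tolerance, and state cap below are assumptions.

```python
def erlang_a(lam: float, mu: float, c: int, theta: float,
             tol: float = 1e-12, n_cap: int = 100_000) -> tuple[float, float]:
    """Steady-state M/M/c+M metrics from a truncated birth-death chain.

    lam, mu and theta must share one time unit. Returns (p_wait, p_abandon):
    the chance an arrival finds all agents busy, and the fraction of all
    arrivals that abandon before being served.
    """
    if theta <= 0:
        raise ValueError("theta must be positive; use Erlang C for theta = 0")
    weights = [1.0]   # unnormalised probability of n callers in the system
    total = 1.0
    for n in range(1, n_cap + 1):
        # Departure rate in state n: busy agents finishing plus queued callers giving up.
        death = min(n, c) * mu + max(n - c, 0) * theta
        w = weights[-1] * lam / death
        weights.append(w)
        total += w
        if n > c and w < tol * total:
            break  # remaining tail is negligible
    probs = [w / total for w in weights]
    p_wait = sum(probs[c:])
    queue_len = sum((n - c) * p for n, p in enumerate(probs) if n > c)
    p_abandon = theta * queue_len / lam  # abandonment flow divided by arrival flow
    return p_wait, p_abandon

# 90 calls/hour, 10.9 calls/agent/hour, 12 agents, theta = 0.69/min = 41.4/hour.
p_wait, p_abandon = erlang_a(lam=90, mu=10.9, c=12, theta=41.4)
print(f"P(wait) = {p_wait:.3f}, P(abandon) = {p_abandon:.3f}")
```

As θ shrinks toward zero the waiting probability approaches the Erlang C value, which is a useful sanity check on any implementation.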

Two parameters change the answer. The abandonment rate θ usually comes from your call-center data: if half of the callers left waiting have hung up by the 60-second mark, then under exponential patience θ = ln(2) / (1 minute) ≈ 0.69 per minute (about 0.012 per second). Most consumer call centers run at θ ≈ 0.5 to 1.0 per minute. The other parameter is the wait threshold callers tolerate before they start leaving (often 15 to 45 seconds), after which the abandonment rate climbs steeply.
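Estimating θ from a single observation is a one-line calculation if patience is assumed exponential, so that the fraction who have abandoned after waiting x is 1 − e^{−θx}. The helper name below is hypothetical; the units of θ follow whatever unit the wait is expressed in.

```python
import math

def abandonment_rate(fraction_gone: float, within: float) -> float:
    """theta such that 1 - exp(-theta * within) == fraction_gone (exponential patience)."""
    if not 0.0 < fraction_gone < 1.0 or within <= 0:
        raise ValueError("need 0 < fraction_gone < 1 and within > 0")
    return -math.log(1.0 - fraction_gone) / within

print(f"per minute: {abandonment_rate(0.5, 1.0):.3f}")   # 60 seconds expressed as 1 minute
print(f"per second: {abandonment_rate(0.5, 60.0):.4f}")  # the same observation in seconds
```

Keeping θ in the same time unit as λ and μ matters: a per-minute θ dropped into a per-hour model understates abandonment by a factor of 60.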

Why this matters for the SLA: an Erlang C model that ignores abandonment overstates capacity needs. The model thinks all those callers wait the full duration. In reality, half of them hung up and the queue is shorter than predicted. Conversely, abandonments are revenue-affecting on their own. A caller who hangs up doesn’t get served at all, which means they don’t count against the SLA percentile (because they weren’t answered) but they do count against your customer-retention numbers. Erlang A captures both effects. WFM tools that use Erlang A (NICE IEX, Genesys WFM, Verint Monet) typically run 10 to 20% leaner than tools that default to Erlang C.

SLA Contract Patterns: Penalties That Match the Math

How an SLA penalty is structured determines whether the math actually protects either party. Three common patterns, each with a real implementation gotcha.

Per-percentage-point penalties. Compliance below the target triggers a per-point penalty: $2,000 per point under 87% on a 90/30 SLA. The math is clean. The implementation gotcha is the measurement window. Monthly aggregates can hide a really bad week. A 30-day window with one disastrous Monday morning that breached for 4 hours can still hit 89% overall while costing 12,000 customers a poor experience. Negotiate measurement windows to be no longer than the period of business impact, often weekly, not monthly.
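The per-point clause reduces to a one-line function. The sketch below assumes the penalty scales linearly with the shortfall below the 87% breach threshold and that fractional points are billed pro rata; a real contract may round to whole points, so treat both as assumptions.

```python
def per_point_penalty(compliance_pct: float,
                      breach_pct: float = 87.0,
                      rate_per_point: float = 2000.0) -> float:
    """Penalty owed for one measurement window under a flat per-point clause."""
    shortfall = max(0.0, breach_pct - compliance_pct)
    return rate_per_point * shortfall

for pct in (89, 87, 86, 80):
    print(f"{pct}% compliance -> ${per_point_penalty(pct):,.0f}")
```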

Tiered breach thresholds. Tier 1 (compliance 85 to 89%) costs $2,000/point. Tier 2 (80 to 84%) costs $5,000/point. Tier 3 (below 80%) becomes a service credit equal to a month’s fee. The penalty discontinuities are intentional: the cost becomes punishing exactly where the customer experience falls off a cliff. The implementation gotcha is what counts as the “primary” call type. If your contract excludes after-hours and emergency callbacks, those calls effectively run uncapped from a penalty standpoint. Read the exclusions before quoting capacity.
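A tiered schedule can be sketched the same way, though the text leaves several details open. The version below assumes points are counted from the 90% target, that the tier's rate applies to every point of shortfall, and that the monthly fee is $50,000; all three are hypothetical choices made only to show the discontinuities.

```python
def tiered_penalty(compliance_pct: float,
                   target_pct: float = 90.0,
                   monthly_fee: float = 50_000.0) -> float:
    """Penalty under a three-tier schedule (interpretation assumed, see lead-in)."""
    if compliance_pct >= target_pct:
        return 0.0
    points_short = target_pct - compliance_pct
    if compliance_pct >= 85.0:        # Tier 1: $2,000 per point
        return 2000.0 * points_short
    if compliance_pct >= 80.0:        # Tier 2: $5,000 per point
        return 5000.0 * points_short
    return monthly_fee                # Tier 3: service credit of one month's fee

for pct in (88, 85, 84, 80, 79):
    print(f"{pct}% -> ${tiered_penalty(pct):,.0f}")
```

Under these assumptions the step from 85% to 84% triples the bill, which is the cliff the tiering is designed to create.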

Make-good provisions. Failure to hit the SLA two months in a row triggers free service for the third month. The maximum loss is fixed and easy to model financially, but it incentivizes the wrong behavior on the provider side: hitting the SLA in alternating months becomes the optimal strategy from a pure-cost standpoint. Real customers notice that pattern fast.

Workforce-management software in production today (NICE IEX, Verint Monet, Genesys Cloud WFM, Calabrio) computes Erlang C, Erlang B, or Erlang A internally and exposes the percentile target as a contract input. They don’t change the math. They just make it operational at 15-minute intervals across thousands of agents, with shrinkage, schedule adherence, and skill-routing layered on top.

SLA Wait-Time, Erlang B, and Erlang A Equations

The full set of formulas the calculator uses, with assumptions baked in:

Erlang C (probability of waiting at all)
C(c, A) = (A^c / c!) × P0 / (1 − ρ)
P0 = [ Σ_{k=0}^{c−1} (A^k / k!) + (A^c / c!) / (1 − ρ) ]^{−1}
A = λ/μ (offered load), ρ = A/c
P(wait ≤ t) under Erlang C
P(Wq ≤ t) = 1 − C(c, A) × e^{−(cμ − λ)t}
For t = 0: P(immediate service) = 1 − C(c, A)
Erlang B (probability of blocking, loss model)
B(c, A) = (A^c / c!) / Σ_{i=0}^{c} (A^i / i!)
Use when rejected arrivals are lost, not queued
Erlang A (with abandonment rate θ)
No closed form. Solved numerically (Krishnamoorthi 1963, continued-fraction methods)
Inputs: λ, μ, c, θ (per-unit-time abandonment rate)
Required c (inverse solve)
Bisection on c until P(Wq ≤ t_target) ≥ SLA%
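For hand or spreadsheet computation, the factorial forms above are usually replaced by two standard identities: a recurrence for Erlang B and a conversion from Erlang B to Erlang C. They are stated here as a convenience; whether the calculator itself uses them is not specified.

```latex
B(0, A) = 1, \qquad
B(k, A) = \frac{A \, B(k-1, A)}{k + A \, B(k-1, A)} \quad (k = 1, \dots, c)

C(c, A) = \frac{B(c, A)}{1 - \rho \, \bigl(1 - B(c, A)\bigr)}, \qquad \rho = \frac{A}{c} < 1
```

Because the denominator in the second identity is less than 1, C(c, A) ≥ B(c, A) for any stable system.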

Call Center 90/30 SLA: Full Worked Example

A B2B SaaS support team signed a 90/30 SLA on premium customers with a $2,000-per-percentage-point penalty below 87% compliance, measured weekly. Peak-hour arrival rate λ = 90 calls/hour (verified from CRM exports of the busiest hour over the last six weeks). Average handle time 5.5 minutes including wrap-up. Service rate per agent μ = 10.9/hour.

Offered load: A = 90/10.9 = 8.26 Erlangs. Stability requires c ≥ 9.

c = 11: ρ = 0.751. Erlang C ≈ 0.29. P(W ≤ 30s) ≈ 0.78. Below the 90% target and well below the 87% breach threshold. This staffing level breaches every week.

c = 12: ρ = 0.688. Erlang C ≈ 0.17. P(W ≤ 30s) ≈ 0.88. Below the 90% target but about one point above the 87% breach threshold. No penalty in steady state, but almost no margin: a week that lands at 86% costs $2,000 × (87 − 86) = $2,000.

c = 13: ρ = 0.635. Erlang C ≈ 0.09. P(W ≤ 30s) ≈ 0.94. Clears the 90% target and sits about seven points above the 87% breach threshold. Safe.

After 25% shrinkage, schedule 13 / 0.75 ≈ 17 agents on the roster during peak. The cost of the 13th agent (vs. 12) might be $5,000/week loaded. What it buys is margin: at c = 12 the model sits one point above the breach threshold, and every point lost to a heavier-than-planned week costs $2,000. At c = 13 the buffer is about seven points and the contractual 90% target is actually met. Whether that buffer is worth $5,000/week is the judgment call to put in front of the contract owner.

If abandonment is meaningful (say 30% of callers waiting longer than 60 seconds hang up), an Erlang A model would let you stay at c = 12 and still hit compliance because some long-wait callers no longer count against the percentile. Whether to plan around that reality depends on what the contract counts: if abandoned calls count as “not answered” in the breach math, you can’t use Erlang A for sizing. Read the contract before picking the model.
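The staffing table in this example can be recomputed directly from the stated inputs (λ = 90/hour, μ = 10.9/hour, t = 30 seconds). The Python sketch below does that; the function names are illustrative, and its output is the thing to trust over any hand-rounded figure.

```python
import math

def erlang_c(c: int, a: float) -> float:
    """Erlang C from the Erlang B recurrence; requires c > a."""
    b = 1.0
    for k in range(1, c + 1):
        b = a * b / (k + a * b)
    return b / (1.0 - (a / c) * (1.0 - b))

def service_level(c: int, lam: float, mu: float, t: float) -> float:
    """P(W <= t) under Erlang C; lam, mu and t share one time unit."""
    return 1.0 - erlang_c(c, lam / mu) * math.exp(-(c * mu - lam) * t)

LAM = 90.0          # peak-hour calls per hour
MU = 10.9           # calls per agent per hour
T = 30.0 / 3600.0   # 30 seconds expressed in hours

for c in (11, 12, 13):
    rho = LAM / (c * MU)
    pw = erlang_c(c, LAM / MU)
    sl = service_level(c, LAM, MU, T)
    print(f"c={c}: rho={rho:.3f}  Erlang C={pw:.3f}  P(W<=30s)={sl:.3f}")
```

Keeping t in hours matches the units of λ and μ; passing 30 instead of 30/3600 would silently report near-perfect compliance.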

Sources

Mitan, Erlang C Mathematics. Derivation of Erlang C and P(wait ≤ t) for staffing calculations.

Call Centre Helper, Erlang C Formula With Examples. Step-by-step staffing computation with shrinkage and SLA targeting.

ScienceDirect, Erlang B Formula (Telecommunications). Loss-model derivation and applications to trunk-line capacity.

Garnett, Mandelbaum & Reiman (2002), “Designing a Call Center with Impatient Customers,” Manufacturing & Service Operations Management 4(3): 208–227. Canonical academic treatment of Erlang A.

COPC, Contact Centre Performance Standards. Industry benchmarks for SLA targets, shrinkage factors, and staffing best practices.

Frequently Asked Questions

What's the difference between Erlang C and Erlang B?

Erlang C assumes infinite queue capacity: callers wait until a server frees up. Erlang B assumes zero queue: if all servers are busy, the next arrival is rejected (a busy signal, a 503 response, or a hangup). Same input variables, very different output. For the same offered load A and server count c, Erlang B always reports a higher probability of immediate service because rejected callers don't pile up and crowd out new arrivals. Use Erlang C for call centers where customers wait. Use Erlang B for systems with strict concurrency caps, trunk-line limits, or overflow routing.

How do I price an SLA penalty against staffing cost?

Two numbers carry the decision. Marginal staffing cost: what does adding one more agent (or one more server) cost per measurement period, fully loaded with benefits and shrinkage? Avoided penalty: how much breach exposure does that agent eliminate, given the percentile compliance shift the agent buys? Compute compliance at c, c+1, and c+2 servers. The agent that takes you above the breach threshold is usually a clear win against per-percentage-point penalties of $1,000+/week. The agent above that is the judgment call. Anything beyond the contract's ceiling penalty (e.g., a service credit cap) is just over-staffing.

Does my SLA need to account for abandonment?

It depends on what the contract counts. If the SLA is measured against answered calls only and abandoned calls are excluded from the denominator, abandonment improves your reported compliance because slow-wait callers self-eject before they breach. If the SLA counts abandoned calls as not-answered, abandonment hurts you twice: customers leave, and the metric breaches. Read the contract definition before picking a model. If abandonment is excluded from the math, plan with Erlang A and run leaner. If it's included, plan with Erlang C and overstaff slightly.

Why do call centers use Erlang C even though it assumes infinite queue?

Three reasons. First, Erlang C is closed-form and has been computed from staffing tables, and later in spreadsheets and WFM tools, for decades. Erlang A requires numerical methods and only became standard in modern WFM platforms in the last decade. Second, Erlang C is conservative: it overstates capacity needs, which means staffing decisions made from it tend to err on the safe side of the SLA. Third, abandonment data is often unreliable. If the call center can't accurately measure abandonment rate, Erlang A's leaner outputs are based on a parameter that isn't trustworthy. Most modern WFM tools (NICE IEX, Genesys, Verint) offer Erlang A as an option but default to Erlang C for new deployments.

What arrival rate should I use: peak hour or daily average?

Peak hour, almost always. SLA contracts are typically measured per interval (15 or 30 minutes is common). The daily average will be 30 to 60% lower than peak, and staffing to the average guarantees breaches during the busy hour. Pull arrival counts in 15-minute buckets across the busiest day of the week, take the 75th or 90th percentile of those buckets, and use that as your design lambda. If the contract allows averaging across the day, you can soften this, but 90% of contracts I've seen measure within an interval.
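Turning bucketed counts into a design λ is a small percentile calculation. The sketch below uses `statistics.quantiles` from the Python standard library; the function name and the sample counts are made up for illustration.

```python
import statistics

def design_lambda_per_hour(bucket_counts, percentile: int = 90,
                           bucket_minutes: int = 15) -> float:
    """Chosen percentile of per-bucket arrivals, scaled to an hourly rate."""
    if not 1 <= percentile <= 99:
        raise ValueError("percentile must be between 1 and 99")
    # n=100 yields 99 cut points; index p-1 is the p-th percentile.
    cuts = statistics.quantiles(bucket_counts, n=100, method="inclusive")
    return cuts[percentile - 1] * (60 / bucket_minutes)

# Hypothetical 15-minute arrival counts for one busy day (illustrative only).
counts = [8, 11, 14, 19, 23, 27, 25, 22, 18, 15, 12, 9]
print(f"design lambda: {design_lambda_per_hour(counts, 90):.1f} calls/hour")
print(f"daily average: {sum(counts) / len(counts) * 4:.1f} calls/hour")
```

In practice the input would cover several weeks of the busiest weekday, not a single day, so the percentile is not driven by one unusual interval.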

How is the suggested service rate calculated?

Bisection search. The calculator iterates over candidate service rates (or candidate agent counts), computes P(W ≤ t) for each, and converges on the minimum value that meets the target SLA percentile. This is the inverse of the forward Erlang C calculation. Internally it caps iterations to avoid stalling on edge cases (utilization approaching 1, infeasible targets), and it returns 'No solution' when the SLA can't be hit at any reasonable parameter setting.
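A bisection on service rate might look like the following sketch. It is not the calculator's source: the function names, the iteration cap, the upper bound on μ, and the choice to return `None` in place of the 'No solution' message are all assumptions.

```python
import math

def service_level(c: int, lam: float, mu: float, t: float) -> float:
    """P(W <= t) under Erlang C; returns 0.0 when the queue is unstable."""
    a = lam / mu
    if c <= a:
        return 0.0
    b = 1.0
    for k in range(1, c + 1):          # Erlang B recurrence
        b = a * b / (k + a * b)
    pw = b / (1.0 - (a / c) * (1.0 - b))
    return 1.0 - pw * math.exp(-(c * mu - lam) * t)

def required_service_rate(lam: float, c: int, t: float, target: float,
                          mu_cap: float = 1e6, tol: float = 1e-6,
                          max_iter: int = 200):
    """Smallest per-agent service rate meeting the target, or None if not found."""
    if not 0.0 < target < 1.0:
        return None
    lo = lam / c                        # stability boundary: service level is 0 here
    hi = 2.0 * lo
    while service_level(c, lam, hi, t) < target:
        hi *= 2.0                       # grow the bracket until the target is met
        if hi > mu_cap:
            return None
    for _ in range(max_iter):           # capped so edge cases cannot stall
        mid = (lo + hi) / 2.0
        if service_level(c, lam, mid, t) >= target:
            hi = mid
        else:
            lo = mid
        if hi - lo < tol:
            break
    return hi

mu_needed = required_service_rate(lam=90, c=12, t=30 / 3600, target=0.90)
print(f"required mu with 12 agents: {mu_needed:.2f} calls/hour")
```

Bisection works here because the service level rises monotonically with μ for a fixed c, so there is a single crossing point to bracket.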

How does shrinkage interact with SLA staffing?

The agent count the model returns is the number of agents available right now, taking calls. Shrinkage is everything that takes an agent off the phone: breaks, lunch, training, coaching, schedule adherence misses, sick leave averaged across the workforce. Most consumer call centers run 25 to 35% shrinkage. To staff for c agents available, schedule c divided by (1 minus shrinkage). At 25% shrinkage, 13 agents available means roughly 17 on the roster. WFM software handles this automatically, but if you're sizing manually, multiply by 1/(1 - shrinkage) before quoting headcount to the budget owner.

What's a typical SLA target across industries?

80/20 (80% of calls answered in 20 seconds) is the legacy telephony benchmark and still the default for most general consumer support. 90/30 is the modern enterprise standard, especially for B2B SaaS support and premium-tier consumer support. Financial services and healthcare triage often run 95/30 or 95/15 because regulatory and patient-safety considerations make slow answering its own compliance issue. Web-service SLAs (API response time, page-load) run at 95% or 99% of requests at sub-second thresholds because the cost of waiting in a UI is much higher than in a phone queue.

What does it mean when the model says my system is unstable?

Utilization (ρ) at or above 1.0. Arrivals are coming in at least as fast as servers can clear them, so the queue grows without bound and steady-state metrics don't apply. Three fixes: increase service rate (faster handling, better tooling, simpler workflows), decrease arrival rate (deflect traffic to self-service, IVR, async channels), or add servers. If a real system runs at ρ ≥ 1, no SLA percentile is achievable in steady state. The instability flag is the model telling you the system needs structural change, not a tweak to an SLA target.

Can I model multiple skill groups (e.g., billing vs tech support)?

Not in this calculator directly. Each skill group is its own queue with its own arrival rate, service rate, and agent pool, and the SLA math runs independently for each. To handle multi-skill operations, run the calculator once per skill group and sum the agent requirements. For shared skills (an agent who can handle billing and tech support), workforce-management software (NICE IEX, Genesys WFM) uses linear-programming-based allocation models that this tool doesn't replicate. Pooling skill groups into one queue underestimates wait time for the busier group.
