Your vendor contract promises a 99.9% uptime SLA. That sounds nearly perfect until you convert it: roughly 43 minutes of allowed downtime per month. A single bad deploy on a Friday afternoon can blow the budget before anyone pages on-call. Most teams sign SLAs without doing the per-month math, then act surprised when a 40-minute outage triggers service credits.
Enter your actual downtime and measurement period to see the uptime percentage, compare against common SLA tiers, and check how much error budget remains.
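The per-month conversion is a one-liner. A minimal sketch (the function name and defaults are illustrative, not part of any standard library):

```python
def allowed_downtime_minutes(sla_percent: float, window_days: int = 30) -> float:
    """Allowed downtime in minutes for a given SLA over a measurement window."""
    total_minutes = window_days * 24 * 60          # 43,200 min in a 30-day month
    return total_minutes * (1 - sla_percent / 100)

# 99.9% over a 30-day month allows ~43 minutes; 99.99% allows ~4.3
print(round(allowed_downtime_minutes(99.9), 1))    # 43.2
print(round(allowed_downtime_minutes(99.99), 2))   # 4.32
```

Note the window matters: the same three-nines percentage over a 31-day month allows about 44.6 minutes, so always pin the window before comparing numbers.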
The Gap Between Uptime Percentage and Real Availability
99.9% and 99.99% look almost identical on paper. In practice the gap is 43 minutes per month versus 4.3 minutes. Four minutes barely covers detection and escalation, let alone a rollback. Any team promising four nines needs automated failover, not a human paging tree.
The percentage also hides what went down. A login endpoint returning 503 for two minutes hits every user; an internal batch job failing for an hour hits nobody externally. The Google SRE Book frames this as “not all minutes are equal” — worth reading before you set a target.
Error Budgets: Spending Downtime Like Currency
An error budget flips the conversation. Instead of “avoid all downtime” you ask “how much can we spend this month and still hit the target?” A 99.95% SLA over 30 days gives about 21 minutes. Every deploy, config push, and maintenance window draws from that pool.
Budget low? Fewer risky deploys, more bake time on canary releases. Budget flush? Ship faster — you can absorb a brief regression. That loop is the core of Google-style SRE practice, and it only works if you actually track the number.
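Tracking the number can be as simple as subtracting measured downtime from the window's budget. A sketch of that bookkeeping (names are hypothetical):

```python
def error_budget_remaining(sla_percent: float, downtime_minutes: float,
                           window_days: int = 30) -> float:
    """Minutes of error budget left in the window; negative means the SLA is breached."""
    budget = window_days * 24 * 60 * (1 - sla_percent / 100)
    return budget - downtime_minutes

# 99.95% over 30 days gives ~21.6 minutes; a 15-minute incident leaves ~6.6
print(round(error_budget_remaining(99.95, 15), 1))   # 6.6
```

A remaining budget near zero is the signal to slow deploys; a healthy balance is permission to take calculated risks.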
Per-Month Windows vs Annual Averages — Why the Math Diverges
A 99.9% annual SLA allows roughly 8.7 hours across 365 days. You could burn all of it in January and still pass — but January users would not care about your annual average. Most SaaS contracts measure monthly because billing and credits land monthly. A 99.9% monthly target allows only ~43 minutes, with no carryover.
Some providers measure a rolling 720-hour window instead of the calendar month, so a late-month outage can count against two billing cycles. Before negotiating, confirm whether the window is calendar, rolling, or billing-cycle aligned.
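The double-counting effect of a rolling window is easiest to see in code. A sketch, assuming incidents are stored as `(start, end)` datetime pairs (the function and data are illustrative):

```python
from datetime import datetime, timedelta

def downtime_in_window(incidents, window_end, window_hours=720):
    """Sum incident minutes overlapping a rolling window ending at window_end."""
    window_start = window_end - timedelta(hours=window_hours)
    total = timedelta()
    for start, end in incidents:
        overlap_start = max(start, window_start)
        overlap_end = min(end, window_end)
        if overlap_end > overlap_start:
            total += overlap_end - overlap_start
    return total.total_seconds() / 60

# One 40-minute outage on Jan 30 counts against rolling windows ending
# anywhere from Jan 30 through late February.
incidents = [(datetime(2024, 1, 30, 12, 0), datetime(2024, 1, 30, 12, 40))]
print(downtime_in_window(incidents, datetime(2024, 1, 31)))   # 40.0
print(downtime_in_window(incidents, datetime(2024, 2, 25)))   # 40.0
```

With a calendar-month window, the same outage would count only in January.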
At-a-Glance Output: What Each Nines Tier Actually Buys You
| SLA Tier | Downtime / Month | Realistic For |
|---|---|---|
| 99% (two nines) | ~7.2 hours | Internal tools, staging |
| 99.9% (three nines) | ~43 minutes | Most SaaS products |
| 99.99% (four nines) | ~4.3 minutes | Payment, auth services |
| 99.999% (five nines) | ~26 seconds | Telecom, 911 systems |
If your result lands between tiers, you are likely over-promising or under-selling — committing to four nines without the redundancy, or running three nines but only advertising two.
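To see which tier your measured downtime actually earns, classify it against the thresholds in the table above. A minimal sketch (tier list and function name are illustrative):

```python
TIERS = [(99.999, "five nines"), (99.99, "four nines"),
         (99.9, "three nines"), (99.0, "two nines")]

def achieved_tier(downtime_minutes: float, window_days: int = 30) -> str:
    """Highest nines tier met by the measured downtime over the window."""
    uptime = 100 * (1 - downtime_minutes / (window_days * 24 * 60))
    for threshold, name in TIERS:
        if uptime >= threshold:
            return name
    return "below two nines"

print(achieved_tier(40))   # three nines: 40 min of downtime is ~99.907% uptime
```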
Troubleshooting Notes for SLA Negotiations
- Planned maintenance exclusions. Confirm maximum hours per month and whether the provider can declare maintenance retroactively.
- Partial degradation. A latency spike from 200ms to 3 seconds is not an “outage” by most definitions, but it is one for your users. Push for latency thresholds alongside up/down metrics.
- Compound SLA. Service A at 99.9% depending on Service B at 99.9% yields 99.8% combined — not 99.9%. Every dependency multiplies the risk.
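The compound-SLA point generalizes: serial dependencies multiply, assuming independent failures. A sketch (the function is illustrative):

```python
from math import prod

def compound_sla(*slas_percent: float) -> float:
    """Combined availability of serially dependent services, assuming independence."""
    return 100 * prod(s / 100 for s in slas_percent)

# Two 99.9% services in series: 0.999 * 0.999 = 0.998001, i.e. ~99.8%
print(round(compound_sla(99.9, 99.9), 2))        # 99.8
print(round(compound_sla(99.9, 99.9, 99.95), 2)) # 99.75
```

Chain five three-nines dependencies and you are down to roughly 99.5%, about 3.6 hours of allowed downtime per month.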
Mistakes that blow SLA reviews: comparing monthly targets against annual totals, forgetting rolling windows can double-count one incident, and assuming the provider’s monitoring agrees with yours on what counts as “down.”
Related tools: API Rate Limit Planner for sizing throughput so your service stays within budget, File Transfer Time Calculator when backup or replication speed affects recovery time, CIDR Subnet Calculator for the network layer underneath, and Password Entropy Estimator for credential hygiene on the services you are monitoring.
Uptime percentages and error budgets from this tool are planning estimates — they do not replace contractual SLA definitions, provider-side monitoring data, or legal review of service-credit terms.