Your vendor contract says 99.9% SLA uptime. Sounds nearly perfect. Then you convert it: about 43 minutes of allowed downtime in a 30-day month. A single bad deploy on a Friday afternoon can blow the budget before anyone pages on-call. Most teams sign SLAs without doing the per-month math, then act surprised when a 45-minute outage triggers service credits.
Enter your actual downtime and measurement period. The output shows uptime percentage, comparison against common SLA tiers, and how much error budget remains.
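The arithmetic behind the uptime figure can be sketched in a few lines of Python. This is an illustration only, not the tool's actual implementation; the function name `uptime_percentage` and its parameters are assumptions made for the example.

```python
def uptime_percentage(downtime_minutes: float, period_days: float) -> float:
    """Return uptime as a percentage of the measurement period."""
    total_minutes = period_days * 24 * 60
    if total_minutes <= 0:
        raise ValueError("period_days must be positive")
    if not 0 <= downtime_minutes <= total_minutes:
        raise ValueError("downtime must be between 0 and the period length")
    return 100.0 * (1 - downtime_minutes / total_minutes)


# 45 minutes down in a 30-day month (43,200 minutes in total)
print(f"{uptime_percentage(45, 30):.3f}%")  # 99.896%
```

A 45-minute outage in a 30-day month lands just under 99.9%, which is why it would breach a three-nines monthly target.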
The Gap Between Uptime Percentage and Real Availability
99.9% and 99.99% look almost identical on paper. In practice the allowance is about 43 minutes per month versus about 4. Four minutes barely covers detection and escalation, let alone a rollback. Any team promising four nines needs automated failover, not a human paging tree.
The percentage also hides what went down. A login endpoint returning 503 for two minutes hits every user. An internal batch job failing for an hour hits nobody externally. The Google SRE Book frames this as “not all minutes are equal,” which is worth reading before you set a target.
Error Budgets: Spending Downtime Like Currency
An error budget flips the conversation. Instead of “avoid all downtime” you ask “how much can we spend this month and still hit the target?” A 99.95% SLA over 30 days gives about 21.6 minutes. Every deploy, config push, and maintenance window draws from that pool.
Budget low? Fewer risky deploys, more bake time on canary releases. Budget flush? Ship faster, because you can absorb a brief regression. That loop is the core of Google-style SRE practice, and it only works if you actually track the number.
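Tracking the number can be as simple as the sketch below. The function names and the 25% threshold are hypothetical, chosen only to illustrate the budget-low versus budget-flush decision; pick a threshold that fits your own release process.

```python
def error_budget_minutes(sla_percent: float, period_days: float) -> float:
    """Total downtime the SLA target allows over the period, in minutes."""
    return (1 - sla_percent / 100) * period_days * 24 * 60


def budget_remaining(sla_percent: float, period_days: float,
                     downtime_minutes: float) -> float:
    """Minutes of budget left; a negative value means the target is missed."""
    return error_budget_minutes(sla_percent, period_days) - downtime_minutes


budget = error_budget_minutes(99.95, 30)   # about 21.6 minutes
left = budget_remaining(99.95, 30, 15)     # about 6.6 minutes after 15 min down

# Hypothetical policy: slow down once less than 25% of the budget is left.
if left < 0.25 * budget:
    print("budget low: hold risky deploys")
else:
    print("budget healthy: normal release cadence")
```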
Per-Month Windows vs Annual Averages: Why the Math Diverges
A 99.9% annual SLA allows roughly 8.76 hours across 365 days. You could burn all of it in January and still pass, but January users wouldn't care about your annual average. Most SaaS contracts measure monthly because billing and credits land monthly. A 99.9% monthly target allows only about 43 minutes in a 30-day month, with no carryover.
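To see the divergence concretely, the sketch below checks one hypothetical two-hour January outage against both windows. The helper names are assumptions for illustration.

```python
def allowed_downtime_minutes(sla_percent: float, window_days: float) -> float:
    """Downtime allowed by the target over the window, in minutes."""
    return (1 - sla_percent / 100) * window_days * 24 * 60


def within_allowance(outage_minutes: float, sla_percent: float,
                     window_days: float) -> bool:
    """True if the outage alone fits inside the window's allowance."""
    return outage_minutes <= allowed_downtime_minutes(sla_percent, window_days)


outage = 120  # hypothetical two-hour outage in January

# Annual window: about 525.6 minutes allowed, so the outage fits.
print(within_allowance(outage, 99.9, 365))  # True
# January alone (31 days): about 44.6 minutes allowed, so it does not.
print(within_allowance(outage, 99.9, 31))   # False
```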
Some providers measure a rolling 720-hour window instead of the calendar month, so a late-month outage can count against two billing cycles. Before negotiating, confirm whether the window is calendar, rolling, or billing-cycle aligned.
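The rolling-window effect is easier to see with dates. The following sketch assumes outages are recorded as non-overlapping `(start, end)` pairs; the function name and the sample incident are hypothetical, and a real provider may define the window differently.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(hours=720)  # rolling window length mentioned above


def downtime_in_window(outages, window_end):
    """Sum the outage time that overlaps the 720 hours ending at window_end.

    outages: list of (start, end) datetime pairs, assumed non-overlapping.
    """
    window_start = window_end - WINDOW
    total = timedelta()
    for start, end in outages:
        overlap_start = max(start, window_start)
        overlap_end = min(end, window_end)
        if overlap_end > overlap_start:
            total += overlap_end - overlap_start
    return total


# Hypothetical 30-minute outage late on 28 March.
outages = [(datetime(2024, 3, 28, 22, 0), datetime(2024, 3, 28, 22, 30))]

# The incident counts in a window evaluated at the end of March...
print(downtime_in_window(outages, datetime(2024, 3, 31, 23, 59)))  # 0:30:00
# ...and is still inside the window evaluated in mid-April.
print(downtime_in_window(outages, datetime(2024, 4, 15, 0, 0)))    # 0:30:00
# It only drops out once 720 hours have passed.
print(downtime_in_window(outages, datetime(2024, 5, 1, 0, 0)))     # 0:00:00
```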
At-a-Glance Output: What Each Nines Tier Actually Buys You
| SLA Tier | Downtime / Month | Realistic For |
|---|---|---|
| 99% (two nines) | ~7.2 hours | Internal tools, staging |
| 99.9% (three nines) | ~43 minutes | Most SaaS products |
| 99.99% (four nines) | ~4.3 minutes | Payment, auth services |
| 99.999% (five nines) | ~26 seconds | Telecom, 911 systems |
If your measured uptime doesn't match the tier you advertise, you're either over-promising (four nines without the redundancy) or under-selling (running three nines while advertising two).
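Mapping a measured uptime onto the tiers in the table can be sketched as a simple lookup. The tier labels come from the table; the function name and the "below two nines" fallback are assumptions for the example.

```python
TIERS = [  # (label, target percent), strictest first
    ("five nines", 99.999),
    ("four nines", 99.99),
    ("three nines", 99.9),
    ("two nines", 99.0),
]


def highest_tier_met(uptime_percent: float) -> str:
    """Return the strictest tier that the measured uptime satisfies."""
    for label, target in TIERS:
        if uptime_percent >= target:
            return label
    return "below two nines"


print(highest_tier_met(99.95))   # three nines
print(highest_tier_met(99.995))  # four nines
print(highest_tier_met(98.5))    # below two nines
```

A result of 99.95% sits between tiers: it satisfies three nines but should not be advertised as four.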
Troubleshooting Notes for SLA Negotiations
- Planned maintenance exclusions. Confirm maximum hours per month and whether the provider can declare maintenance retroactively.
- Partial degradation. A latency spike from 200ms to 3 seconds isn't an “outage” by most definitions, but it is one for your users. Push for latency thresholds alongside up/down metrics.
- Compound SLA. Service A at 99.9% depending on Service B at 99.9% yields about 99.8% combined (0.999 × 0.999), not 99.9%. Every dependency multiplies the risk.
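The compound SLA figure above comes from multiplying availabilities. A minimal sketch, assuming the services fail independently and all of them must be up for a request to succeed (the function name is hypothetical):

```python
from math import prod


def serial_availability(percents):
    """Combined availability in percent when every service must be up.

    Assumes independent failures, which real systems only approximate.
    """
    return 100 * prod(p / 100 for p in percents)


# Two services at 99.9% each
print(f"{serial_availability([99.9, 99.9]):.4f}%")  # 99.8001%
# Five chained dependencies at 99.9% each
print(f"{serial_availability([99.9] * 5):.3f}%")    # 99.501%
```

Correlated failures (a shared region or a shared DNS provider, for example) break the independence assumption, so treat the product as a rough planning estimate rather than a guaranteed floor.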
Mistakes that blow SLA reviews: comparing monthly targets against annual totals, forgetting rolling windows can double-count one incident, and assuming the provider’s monitoring agrees with yours on what counts as “down.”
Related on EverydayBudd's developer utilities hub: the API Rate Limit Planner for the production-reliability decisions that interact with availability targets, and the File Transfer Time Calculator for the network-side reliability math.
Uptime percentages and error budgets from this tool are planning estimates. They don't replace contractual SLA definitions, provider-side monitoring data, or legal review of service-credit terms.