Confidence Intervals for a Single Proportion

Compute a confidence interval for a single proportion using Wald (normal approximation) and Wilson (score) methods. Enter the number of successes, sample size, and confidence level to see interval bounds, margin of error, and method comparisons.

Last Updated: February 13, 2026

Inputs: Successes, n, and Confidence Level

A confidence interval for a proportion starts with counting successes out of total trials. An e-commerce site might track 312 purchases from 4,200 visitors; a medical researcher might record 47 adverse events among 1,500 patients. The sample proportion p-hat equals successes divided by n. Plug those numbers into a formula alongside your chosen confidence level, and you get a range estimating the true population proportion.

Confidence level controls interval width. At 95%, the method captures the true proportion in about 95 of every 100 repeated samples; this is a statement about the procedure, not a claim that this specific interval has a 95% probability of containing the truth. Raising the level to 99% stretches the interval; dropping it to 90% tightens the bounds. Pick a level that matches the stakes: medical device clearances might demand 99%, while a quick marketing poll might tolerate 90%.

Sample size n drives precision. Doubling n shrinks the standard error by a factor of roughly 1.4, which narrows the interval. But sample collection costs money and time, so practical work involves balancing acceptable margin of error against budget. Pre-study power analysis helps decide n before data collection begins, but the calculator works with whatever sample you already have.
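
As a rough illustration of the trade-off described above, the sketch below checks the square-root scaling of the standard error and estimates the sample size needed for a target margin of error. The function names and the worst-case guess p = 0.5 are assumptions made for this example, not part of the calculator.

```python
import math
from statistics import NormalDist

def standard_error(p_hat: float, n: int) -> float:
    """Wald standard error of a sample proportion."""
    return math.sqrt(p_hat * (1 - p_hat) / n)

def n_for_margin(margin: float, conf: float = 0.95, p_guess: float = 0.5) -> int:
    """Smallest n whose Wald margin of error is at most `margin`.

    p_guess = 0.5 is an assumed worst case (largest variance).
    """
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return math.ceil(z**2 * p_guess * (1 - p_guess) / margin**2)

# Doubling n from 500 to 1000 shrinks the standard error by sqrt(2).
ratio = standard_error(0.3, 500) / standard_error(0.3, 1000)
print(round(ratio, 3))

# Sample size for a 3-point margin at 95% confidence.
print(n_for_margin(0.03))
```

Because the planning formula uses the Wald margin, treat the result as a starting point rather than a guarantee when the expected proportion is extreme.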

Successes must be whole numbers between zero and n. Zero successes or n successes require special handling—some methods produce degenerate intervals in these edge cases. If your count sits at an extreme, the calculator may flag a warning or switch methods automatically.

Wald vs Wilson: When Methods Diverge

The Wald interval is the formula most textbooks introduce first: p-hat plus or minus z times the square root of p-hat times (1 − p-hat) divided by n. Simple to compute, easy to teach. But its coverage can fall short when p-hat is near zero or one, or when n is small. A study with 1 success out of 10 trials produces a 95% Wald interval of roughly [−0.09, 0.29], and a negative lower bound is impossible for a proportion.
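
The Wald formula can be written out in a few lines. This is a minimal sketch using only the Python standard library; the function name `wald_interval` is chosen for this example, and the bounds are deliberately left unclipped so the out-of-range behavior is visible.

```python
import math
from statistics import NormalDist

def wald_interval(successes: int, n: int, conf: float = 0.95):
    """Wald interval: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n). Bounds are not clipped."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

print(wald_interval(312, 4200))  # large n, well-behaved interval
print(wald_interval(1, 10))      # small n, extreme p_hat: lower bound is negative
```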

Wilson's score interval fixes these problems by inverting the score test: the standard error is evaluated at each hypothesized proportion rather than at p-hat. The formula looks more complicated, but the payoff is intervals that stay between zero and one and whose actual coverage sits closer to the stated confidence level. Simulation studies show Wilson outperforms Wald almost universally, especially when p-hat is extreme.
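
For comparison, here is a sketch of the Wilson score interval in its usual closed form (a shifted center plus or minus an adjusted half-width). The function name is an assumption for this example; the calculator's own implementation may differ in details such as rounding.

```python
import math
from statistics import NormalDist

def wilson_interval(successes: int, n: int, conf: float = 0.95):
    """Wilson score interval for a binomial proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    denom = 1 + z**2 / n
    # The center is pulled from p_hat toward 0.5.
    center = (p_hat + z**2 / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2))
    return center - half, center + half

print(wilson_interval(2, 50))   # roughly (0.011, 0.135)
print(wilson_interval(1, 10))   # lower bound stays above zero
```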

When n is large—say, thousands of observations—and p-hat sits near 0.5, Wald and Wilson give almost identical results. The divergence grows as p-hat approaches the boundaries or n shrinks. A survey of 50 people finding 2 positive responses will see substantially different intervals depending on the method. If your software offers both, run them side by side on edge cases to understand how they differ.

Agresti-Coull is a compromise: for a 95% interval, add two pseudosuccesses and two pseudofailures before computing a Wald-like interval. This simple adjustment often achieves coverage comparable to Wilson without changing the formula structure. Some practitioners default to Agresti-Coull because it's easy to explain while avoiding the worst Wald pitfalls.
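
The adjustment described above can be sketched as follows. This is the simple "add two successes and two failures" variant, which matches the 95% case; the more general form adds z²/2 to each count, and that generalization is not shown here. The function name is hypothetical.

```python
import math
from statistics import NormalDist

def plus_four_interval(successes: int, n: int, conf: float = 0.95):
    """Add 2 pseudosuccesses and 2 pseudofailures, then apply the Wald formula."""
    n_adj = n + 4
    p_adj = (successes + 2) / n_adj
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    margin = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    return p_adj - margin, p_adj + margin

print(plus_four_interval(2, 50))
```

Note that the adjusted interval can still stray slightly outside [0, 1] for very small samples, so implementations commonly truncate the bounds.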

Continuity and Small-n Caveats

Proportion data are inherently discrete—you can observe 5 successes or 6, not 5.3. The normal distribution underlying Wald and Wilson is continuous, so there's a mismatch. Continuity correction adds 0.5 to or subtracts 0.5 from the success count before calculating, bridging discrete counts to a continuous curve. Some textbooks recommend it; others argue Wilson already handles the discreteness adequately.

Small samples amplify every approximation error. With n = 15, a single extra success can shift p-hat by nearly 7 percentage points. Interval width balloons because standard error is inversely proportional to the square root of n. Below about n = 30, many statisticians suggest exact binomial methods (Clopper-Pearson), which invert the cumulative binomial distribution rather than leaning on normal theory.

Clopper-Pearson intervals are wider than Wilson for the same data because they guarantee coverage at or above the stated level. That conservatism can feel frustrating when you want a tight estimate, but for high-stakes applications—like estimating the failure rate of a safety system—overestimating uncertainty beats underestimating it.
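
To make "inverting the cumulative binomial distribution" concrete, the sketch below finds the Clopper-Pearson bounds by bisection on the binomial CDF. It is an illustration under stated assumptions, not the calculator's code: the helper names are invented here, and the direct summation is only suitable for small n (for large n, use a statistics library).

```python
import math

def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p), by direct summation (small n only)."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def _solve_decreasing(f, target, iters=100):
    """Bisection on [0, 1] for f(p) = target, where f decreases as p grows."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid   # p too small: CDF still above target
        else:
            hi = mid
    return (lo + hi) / 2

def clopper_pearson(successes: int, n: int, conf: float = 0.95):
    """Exact (Clopper-Pearson) interval with alpha/2 in each tail."""
    alpha = 1 - conf
    x = successes
    # Lower bound solves P(X >= x | p) = alpha/2, i.e. P(X <= x - 1 | p) = 1 - alpha/2.
    lower = 0.0 if x == 0 else _solve_decreasing(lambda p: binom_cdf(x - 1, n, p), 1 - alpha / 2)
    # Upper bound solves P(X <= x | p) = alpha/2.
    upper = 1.0 if x == n else _solve_decreasing(lambda p: binom_cdf(x, n, p), alpha / 2)
    return lower, upper

print(clopper_pearson(2, 50))
```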

If your sample is small and p-hat is extreme, examine the calculator's output critically. A Wilson interval of [0.02, 0.35] might look wide, but it honestly reflects the data. Resist the temptation to pick whichever method gives the narrowest result—choose the method that best fits your assumptions and risk tolerance.

Interpreting a Proportion Interval Correctly

A 95% confidence interval for a proportion does not mean there is a 95% probability the true value lies inside. Once you compute the bounds, the parameter is either captured or not—probability is 0 or 1. The 95% refers to long-run coverage: repeat the survey many times, and about 95 of 100 intervals will contain the truth. This distinction matters when communicating results to decision-makers.
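
Long-run coverage can be computed exactly rather than simulated: for a fixed n and an assumed true proportion, add up the binomial probabilities of every outcome whose interval contains the truth. The sketch below does this for the Wald and Wilson intervals; the helper names and the choice n = 20, p = 0.1 are assumptions for illustration.

```python
import math
from statistics import NormalDist

Z95 = NormalDist().inv_cdf(0.975)

def wald(x, n, z=Z95):
    p = x / n
    m = z * math.sqrt(p * (1 - p) / n)
    return p - m, p + m

def wilson(x, n, z=Z95):
    p = x / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return center - half, center + half

def coverage(method, n, p_true):
    """Exact coverage: total probability of outcomes x whose interval contains p_true."""
    total = 0.0
    for x in range(n + 1):
        lo, hi = method(x, n)
        if lo <= p_true <= hi:
            total += math.comb(n, x) * p_true**x * (1 - p_true)**(n - x)
    return total

print(round(coverage(wald, 20, 0.1), 3), round(coverage(wilson, 20, 0.1), 3))
```

In this small-sample setting the nominal 95% Wald interval covers the truth only about 88% of the time, while Wilson lands near 96%.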

A poll reports 48% support with a margin of error of 3 points. Readers often think "the true support is somewhere between 45% and 51%." That's roughly correct for practical purposes, but technically the interval is a property of the method, not a probability envelope around the unknown parameter. Bayesian credible intervals do offer that probability interpretation, but they require specifying a prior distribution.

When comparing two proportions, such as conversion rates between two landing pages, check whether the intervals overlap. Non-overlap generally implies a statistically significant difference; the check is conservative, so it corresponds to a stricter threshold than the stated confidence level, and a formal two-sample test is more precise. Overlapping intervals do not prove equality: the difference might be real but smaller than your precision can detect, and a formal test can still find significance when the intervals overlap slightly.
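
A small example shows why the overlap check is only a rough guide. The counts below (100 of 1,000 versus 130 of 1,000) are hypothetical, and the pooled two-proportion z-test is one common choice of formal test, used here as an assumption rather than as the calculator's method.

```python
import math
from statistics import NormalDist

def wald_interval(x, n, conf=0.95):
    p = x / n
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    m = z * math.sqrt(p * (1 - p) / n)
    return p - m, p + m

def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided pooled z-test of H0: p1 == p2. Returns (z, p_value)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p2 - p1) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

a = wald_interval(100, 1000)   # hypothetical page A
b = wald_interval(130, 1000)   # hypothetical page B
overlap = a[1] >= b[0] and b[1] >= a[0]
z, p_value = two_proportion_z_test(100, 1000, 130, 1000)
print(overlap, round(p_value, 3))
```

Here the two 95% intervals overlap, yet the formal test gives a p-value below 0.05.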

Context shapes interpretation. An interval of [0.08, 0.12] for defect rate might be excellent in consumer electronics but alarming in aerospace. Always pair the statistical result with domain knowledge about acceptable thresholds and practical consequences of estimation error.

Method Notes: What's Assumed

The binomial model underpins all standard proportion intervals. It assumes each trial is independent, with a constant probability of success. Drawing names from a hat without replacement violates independence once the pool shrinks noticeably. Surveying friends of friends can cluster responses, inflating apparent precision. Check whether your sampling design matches the model before trusting the output.

Normal approximation methods—Wald, Wilson, Agresti-Coull—work best when both np and n(1 − p) exceed about 5. Below that threshold, the binomial distribution is too skewed for a symmetric normal curve to mimic. The calculator may warn you, but vigilance on your part catches edge cases the software misses.
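
In practice the true p is unknown, so the rule of thumb is usually checked with p-hat plugged in, which reduces to counting successes and failures. A minimal sketch, with a hypothetical function name and the threshold of 5 taken from the text:

```python
def normal_approx_ok(successes: int, n: int, threshold: float = 5.0) -> bool:
    """True when n * p_hat and n * (1 - p_hat) both exceed the threshold.

    With p_hat = successes / n, these are simply the success and failure counts.
    """
    failures = n - successes
    return successes > threshold and failures > threshold

print(normal_approx_ok(312, 4200))  # plenty of successes and failures
print(normal_approx_ok(2, 50))      # too few successes
```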

Cluster sampling, stratified sampling, and other complex designs need design-based variance estimators. Standard formulas assume simple random sampling. Using the wrong variance estimator can underestimate uncertainty by a factor of two or more, producing falsely narrow intervals that mislead stakeholders.

If you have paired or matched data—say, pre-and-post measurements on the same subjects—proportion intervals for independent samples do not apply. You would need McNemar's test framework or a paired-proportion interval, which accounts for within-subject correlation. Applying the wrong model leads to incorrect inference.

Proportion CI Questions

Which method should I default to?

Wilson and Agresti-Coull are safer choices than Wald for general use. They maintain coverage across a wider range of p-hat and n values. Reserve Wald for large samples with moderate proportions, or for matching legacy reports that used that formula.

Why does my Wald interval go below zero?

Because the formula can produce negative lower bounds when p-hat is near zero and n is small. Clipping to zero is one fix, but switching to Wilson or exact methods is cleaner. A negative bound signals the method is unsuitable for your data.

How do I choose between 90%, 95%, and 99%?

Consider the cost of missing the true value. High-stakes safety applications often warrant 99%. Exploratory analyses can accept 90%. Convention and regulatory guidance also play a role—many journals expect 95% unless justified otherwise.

Can I compare two intervals by looking at overlap?

Rough rule: non-overlapping intervals indicate a significant difference, and the check is conservative relative to the stated level. Overlapping intervals do not establish that there is no difference; they only fail to rule one out. A formal two-sample proportion test is more accurate for significance.

What if I have zero successes?

Zero out of n presents a special case. Wald gives [0, 0], which is misleading: absence of successes doesn't mean the true rate is exactly zero. Wilson and exact methods produce intervals that start at zero and extend to a positive upper bound, honestly reflecting uncertainty.
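
With zero successes, the upper bounds simplify to closed forms, which makes the contrast easy to see. The sketch below assumes the standard two-sided versions of each method (alpha/2 in the upper tail); the function name is made up for this example.

```python
from statistics import NormalDist

def zero_success_upper_bounds(n: int, conf: float = 0.95):
    """Upper bounds when 0 successes are observed in n trials (lower bound is 0)."""
    alpha = 1 - conf
    z = NormalDist().inv_cdf(1 - alpha / 2)
    wald_upper = 0.0                            # p_hat = 0 gives zero standard error
    wilson_upper = (z**2 / n) / (1 + z**2 / n)  # Wilson formula simplified at p_hat = 0
    exact_upper = 1 - (alpha / 2) ** (1 / n)    # Clopper-Pearson: solves (1 - p)^n = alpha/2
    return wald_upper, wilson_upper, exact_upper

print(zero_success_upper_bounds(30))
```

For n = 30, both Wilson and the exact method put the upper bound between 11% and 12%, while Wald reports zero.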

Limitations & Assumptions

• Normal Approximation Conditions: The Wald interval relies on normal approximation, which requires both np and n(1 − p) to exceed roughly 5. When proportions are extreme or sample sizes are small, the Wald interval may have poor coverage. Wilson and exact methods are more robust in these scenarios.

• Simple Random Sampling Assumption: Standard formulas assume simple random sampling where each observation is independent. For complex sampling designs with clustering or stratification, standard formulas underestimate variance—design-based methods are required.

• Independence of Observations: The binomial model assumes each observation is independent—one outcome does not affect another. If observations are correlated, standard intervals underestimate true uncertainty.

• Fixed Sample Size and Constant Probability: Formulas assume sample size n was fixed before data collection and the true probability of success is constant across all trials.

Important Note: This calculator is for educational and informational purposes only. It demonstrates how confidence intervals for proportions work mathematically and is not intended for clinical decisions, regulatory compliance, or production testing. Professional applications require proper consideration of sampling design, multiple comparisons, and domain expertise. For medical studies, market research, or quality control, use professional statistical software and consult with qualified statisticians.

Frequently Asked Questions

Common questions about confidence intervals for proportions, Wald interval, Wilson score interval, sample proportion, margin of error, and how to use this calculator for homework and statistical inference practice.

What does a confidence interval for a proportion mean?

A confidence interval for a proportion gives a range of plausible values for the true population proportion based on your sample data. For example, a 95% CI of [0.40, 0.56] means that if we repeated the sampling process many times, 95% of the calculated intervals would contain the true population proportion. It's a way to express uncertainty about our estimate.

How is this different from a confidence interval for a mean?

A CI for a proportion deals with categorical data (success/failure, yes/no), while a CI for a mean deals with continuous numerical data. For proportions, we use the binomial distribution and normal approximations, while for means we typically use the t-distribution. The formulas and assumptions differ accordingly.

When is the normal approximation (Wald) unreliable?

The Wald interval can be unreliable when: (1) the sample size is small (n < 30), (2) the sample proportion is close to 0 or 1, or (3) n·p̂ < 5 or n·(1 − p̂) < 5, which violates the usual rule of thumb. In these cases, the interval may be too narrow or extend outside [0, 1]. The Wilson interval is more robust in such situations.

Why might the Wilson interval be preferred?

The Wilson (score) interval has better coverage properties than the Wald interval, especially for small samples and extreme proportions. Its bounds never fall outside [0, 1], so no clipping is needed, and its actual coverage probability is closer to the nominal confidence level across a wider range of scenarios. Many statisticians recommend Wilson as the default choice.

Can I use this calculator for A/B testing decisions?

This calculator computes a confidence interval for a single proportion, which is educational for understanding uncertainty. However, A/B testing typically requires comparing two proportions and involves additional considerations like multiple testing, sequential analysis, and practical significance. For production A/B testing, use dedicated statistical tools with proper power analysis.

What does 'margin of error' mean in this context?

The margin of error is half the width of the confidence interval. If your 95% CI is [0.42, 0.54], the margin of error is 0.06 (or 6 percentage points). It represents the maximum expected difference between the sample proportion and the true population proportion at the given confidence level. Larger samples generally produce smaller margins of error.

How do I interpret the z-critical value?

The z-critical value comes from the standard normal distribution and determines how wide the confidence interval is. For a 95% CI, z ≈ 1.96, meaning the interval extends about 1.96 standard errors in each direction from p̂. Higher confidence levels require larger z values: 90% uses z ≈ 1.645, and 99% uses z ≈ 2.576.
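
The critical values quoted above can be reproduced from the standard normal quantile function. This sketch uses `statistics.NormalDist` from the Python standard library; the wrapper name `z_critical` is an assumption for this example.

```python
from statistics import NormalDist

def z_critical(conf: float) -> float:
    """Two-sided critical value: the (1 - alpha/2) quantile of the standard normal."""
    return NormalDist().inv_cdf(1 - (1 - conf) / 2)

for level in (0.90, 0.95, 0.99):
    print(level, round(z_critical(level), 3))
```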

What assumptions does this calculator make?

The calculator assumes: (1) simple random sampling where each unit has equal probability of selection, (2) independent trials where one outcome doesn't affect another, (3) a fixed sample size n, and (4) a constant probability of success for each trial. Violations of these assumptions may affect the validity of the interval.
