Probability Calculations in One Toolkit
Compute probabilities, combinatorics, discrete distributions (Binomial, Geometric, Poisson, Hypergeometric), Bayes' theorem, and run Monte Carlo simulations.
Probability calculators turn abstract chance into concrete numbers—whether you need the odds of drawing two aces, the likelihood a medical test is correct, or the expected defects in a batch. A quality engineer analyzing a production line wanted to know: if the defect rate is 2%, what's the probability of finding exactly 3 defects in 100 items? She entered n=100, p=0.02, k=3 into the binomial mode and got 0.182. The common mistake is confusing "at least k" with "exactly k"—the first requires cumulative probability (CDF), the second uses the point probability (PMF). When reading results, check which you need: P(X=k) answers "exactly this many," while P(X≤k) or P(X≥k) answers "up to" or "at least."
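The quality-engineer example can be reproduced in a few lines. This is a minimal sketch using only the Python standard library; the function names `binomial_pmf` and `binomial_cdf` are illustrative and are not the calculator's own code.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p): exactly k successes."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binomial_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k): sum of the PMF from 0 through k."""
    return sum(binomial_pmf(i, n, p) for i in range(k + 1))

n, p, k = 100, 0.02, 3
exactly_3 = binomial_pmf(k, n, p)            # about 0.182, as in the text
at_most_3 = binomial_cdf(k, n, p)            # "up to 3"
at_least_3 = 1 - binomial_cdf(k - 1, n, p)   # "3 or more"

print(f"exactly 3: {exactly_3:.3f}")
print(f"at most 3: {at_most_3:.3f}")
print(f"at least 3: {at_least_3:.3f}")
```

The three printed values differ, which is the point: "exactly", "at most", and "at least" are three separate questions about the same distribution.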
Three operations cover most probability combinations. The complement P(A') = 1 − P(A) gives the probability something doesn't happen—often easier to compute than the direct probability. If there's a 30% chance of rain, there's a 70% chance of no rain.
Intersection P(A ∩ B) is the probability both events occur. For independent events, multiply: P(A) × P(B). Two independent coin flips both landing heads: 0.5 × 0.5 = 0.25. For dependent events, use the conditional formula: P(A ∩ B) = P(A) × P(B|A).
Union P(A ∪ B) is the probability at least one event occurs. The inclusion-exclusion formula prevents double-counting: P(A) + P(B) − P(A ∩ B). If P(rain) = 0.3 and P(traffic) = 0.4, and they're independent, P(rain or traffic) = 0.3 + 0.4 − 0.12 = 0.58.
Quick reference:
• Complement: P(A') = 1 − P(A)
• Independent intersection: P(A ∩ B) = P(A) × P(B)
• Union: P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
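The three rules can be written as one-line functions. A small sketch, with illustrative function names, that reruns the rain and traffic numbers from above:

```python
def complement(p_a: float) -> float:
    """P(A') = 1 - P(A)."""
    return 1 - p_a

def intersection_independent(p_a: float, p_b: float) -> float:
    """P(A and B), valid only when A and B are independent."""
    return p_a * p_b

def union(p_a: float, p_b: float, p_both: float) -> float:
    """P(A or B) by inclusion-exclusion."""
    return p_a + p_b - p_both

p_rain, p_traffic = 0.3, 0.4
p_both = intersection_independent(p_rain, p_traffic)   # 0.12
p_either = union(p_rain, p_traffic, p_both)            # 0.58

print(round(complement(p_rain), 2), round(p_both, 2), round(p_either, 2))
```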
Conditional probability P(A|B) answers: given B happened, what's the chance of A? The formula is P(A|B) = P(A ∩ B) / P(B). If 40% of customers buy coffee and 30% buy both coffee and pastry, then among coffee buyers, P(pastry|coffee) = 0.30/0.40 = 0.75.
Bayes' theorem flips the conditioning: P(A|B) = P(B|A) × P(A) / P(B). This is essential when you observe evidence (B) and want to update belief about a cause (A). Medical diagnosis uses this heavily: given a positive test result, what's the probability of actually having the disease?
The base rate matters enormously. A test with 99% accuracy can still produce mostly false positives if the disease is rare. If disease prevalence is 1% and the test has 99% sensitivity and 95% specificity, a positive result gives only about 17% probability of disease—far from the 99% many assume.
Bayes formula: P(Disease|Positive) = [P(Positive|Disease) × P(Disease)] / P(Positive). The denominator expands using the law of total probability.
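The diagnostic example works out as follows. This is a sketch under the stated numbers (1% prevalence, 99% sensitivity, 95% specificity); the function name `posterior_given_positive` is an assumption made for illustration.

```python
def posterior_given_positive(prior: float, sensitivity: float, specificity: float) -> float:
    """P(Disease | Positive) via Bayes' theorem.

    The denominator P(Positive) is expanded with the law of total probability:
    true positives among the sick plus false positives among the healthy.
    """
    false_positive_rate = 1 - specificity
    p_positive = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / p_positive

p = posterior_given_positive(prior=0.01, sensitivity=0.99, specificity=0.95)
print(f"P(disease | positive) = {p:.3f}")   # about 0.167
```

Raising the prior to 0.5 with the same test pushes the posterior above 0.95, which shows how strongly the base rate drives the answer.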
Events A and B are independent if knowing one happened tells you nothing about the other. Mathematically: P(A|B) = P(A), or equivalently, P(A ∩ B) = P(A) × P(B). Coin flips are independent; cards drawn without replacement are not.
To test independence, check if the multiplication rule holds. If P(A) = 0.3, P(B) = 0.4, and P(A ∩ B) = 0.12, then 0.3 × 0.4 = 0.12 matches—events are independent. If P(A ∩ B) = 0.20 instead, they're dependent (positively correlated—knowing one increases the other's probability).
Common confusion: mutually exclusive events (cannot both occur) are maximally dependent, not independent. If A and B are mutually exclusive, P(A ∩ B) = 0, so P(A|B) = 0 ≠ P(A) unless P(A) itself is zero.
Warning: Mutually exclusive ≠ independent. In fact, mutually exclusive events with nonzero probabilities are dependent by definition.
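The multiplication-rule check is easy to automate. In the sketch below the function name `are_independent` and the tolerance values are assumptions; a tolerance is used because floating-point products rarely match exactly.

```python
from math import isclose

def are_independent(p_a: float, p_b: float, p_both: float) -> bool:
    """True when P(A and B) matches P(A) * P(B) within floating-point tolerance."""
    return isclose(p_a * p_b, p_both, rel_tol=1e-9, abs_tol=1e-12)

print(are_independent(0.3, 0.4, 0.12))  # True: product rule holds
print(are_independent(0.3, 0.4, 0.20))  # False: dependent
print(are_independent(0.3, 0.4, 0.0))   # False: mutually exclusive, so dependent
```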
Probability often requires counting outcomes. Factorials (n!) count arrangements of n distinct items: 5! = 120 ways to arrange 5 books. Permutations (nPr) count ordered selections: how many ways to award gold, silver, bronze to 3 of 10 athletes? P(10,3) = 720.
Combinations (nCr) count unordered selections: how many 5-card hands from 52 cards? C(52,5) = 2,598,960. The formula C(n,r) = n! / [r!(n−r)!] divides out the arrangements within each selection.
With repetition allowed, formulas change. Permutations with repetition: n^r (like 4-digit PINs: 10^4 = 10,000). Combinations with repetition use the stars-and-bars formula: C(n+r−1, r).
Counting formulas:
• Permutation: P(n,r) = n! / (n−r)!
• Combination: C(n,r) = n! / [r!(n−r)!]
• With repetition: n^r (ordered), C(n+r−1,r) (unordered)
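Python's standard library covers all four counting formulas directly. A short sketch, assuming Python 3.8 or later (where `math.perm` and `math.comb` were added); the stars-and-bars numbers (5 types, 3 picks) are an illustrative choice not taken from the text.

```python
from math import comb, factorial, perm

arrangements = factorial(5)        # 5 books in a row: 120
podiums = perm(10, 3)              # gold/silver/bronze from 10 athletes: 720
poker_hands = comb(52, 5)          # 5-card hands: 2,598,960
pins = 10 ** 4                     # 4-digit PINs, repetition allowed: 10,000

# Unordered selection with repetition (stars and bars): C(n + r - 1, r).
n_types, r_picks = 5, 3
multisets = comb(n_types + r_picks - 1, r_picks)   # C(7, 3) = 35

print(arrangements, podiums, poker_hands, pins, multisets)
```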
Binomial: Fixed number of independent trials (n), constant success probability (p), counting successes (k). Example: 10 coin flips, probability of exactly 7 heads. Use when trials are fixed and outcomes are binary.
Poisson: Counting events in a fixed interval when events occur at a known average rate (λ) independently. Example: 5 calls per hour on average, probability of exactly 8 calls this hour. Use for rare events or counts without a fixed number of trials.
Geometric: Number of trials until the first success. Example: how many calls until the first sale if the conversion rate is 10%? The expected number of calls, counting the successful one, is 1/p = 10. Use when asking "how many tries until success."
Hypergeometric: Sampling without replacement from a finite population. Example: drawing 5 cards, probability of exactly 2 hearts. Use instead of binomial when sampling fraction is large (more than 5% of population).
Quick decision: Fixed trials + replacement → Binomial. Rates/counts per interval → Poisson. Trials to first success → Geometric. Without replacement → Hypergeometric.
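Each of the distribution examples above maps to a short point-probability function. This is a sketch with illustrative function names and parameter conventions (for the hypergeometric: population size, number of success items in the population, and draws); it is not the calculator's implementation.

```python
from math import comb, exp, factorial

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(exactly k successes in n independent trials)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k: int, lam: float) -> float:
    """P(exactly k events) when the average rate per interval is lam."""
    return exp(-lam) * lam**k / factorial(k)

def geometric_pmf(k: int, p: float) -> float:
    """P(first success happens on trial k), for k = 1, 2, 3, ..."""
    return (1 - p) ** (k - 1) * p

def hypergeometric_pmf(k: int, population: int, successes: int, draws: int) -> float:
    """P(exactly k success items in `draws` draws without replacement)."""
    return comb(successes, k) * comb(population - successes, draws - k) / comb(population, draws)

print(f"7 heads in 10 flips:        {binomial_pmf(7, 10, 0.5):.4f}")
print(f"8 calls when mean is 5:     {poisson_pmf(8, 5):.4f}")
print(f"first sale on call 10:      {geometric_pmf(10, 0.1):.4f}")
print(f"2 hearts in a 5-card hand:  {hypergeometric_pmf(2, 52, 13, 5):.4f}")
```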
What's the difference between PMF and CDF?
PMF (probability mass function) gives P(X=k)—the probability of exactly that value. CDF (cumulative distribution function) gives P(X≤k)—the probability of that value or less. For "at least k," compute 1 − P(X≤k−1).
When does P(A or B) = P(A) + P(B)?
Only when A and B are mutually exclusive (can't both happen). Otherwise you must subtract the intersection to avoid double-counting: P(A) + P(B) − P(A ∩ B).
How do I know if events are independent?
Check if P(A ∩ B) = P(A) × P(B). If equal, independent. If not, dependent. Physical intuition helps: does knowing one event's outcome change your belief about the other?
Why does my Bayes result seem counterintuitive?
Usually because base rates matter more than you expect. A 99% accurate test applied to a 1% prevalence disease yields many false positives. The posterior probability depends heavily on the prior probability, not just test accuracy.
What if my combinatorial numbers overflow?
Factorials grow astronomically fast (20! exceeds 2 quintillion). Use logarithms for large calculations, or exploit cancellation in C(n,r) by computing iteratively rather than factorials directly. Many calculators handle this internally.
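Both workarounds fit in a few lines. The sketch below shows one way to do each, using `math.lgamma` for the logarithmic route and an exact integer loop for the iterative route; the function names are illustrative.

```python
from math import comb, exp, lgamma

def log_comb(n: int, r: int) -> float:
    """ln C(n, r) via log-gamma, since ln(n!) = lgamma(n + 1)."""
    return lgamma(n + 1) - lgamma(r + 1) - lgamma(n - r + 1)

def comb_iterative(n: int, r: int) -> int:
    """C(n, r) built up one factor at a time, with no large factorials.

    After step i the running value equals C(n - r + i, i), which is always
    an integer, so the floor division is exact.
    """
    r = min(r, n - r)  # use the smaller side of the symmetry C(n, r) = C(n, n - r)
    result = 1
    for i in range(1, r + 1):
        result = result * (n - r + i) // i
    return result

print(comb_iterative(52, 5))          # 2598960
print(exp(log_comb(50, 25)))          # floating-point approximation of C(50, 25)
print(log_comb(10_000, 5_000))        # stays finite even though C(10000, 5000) overflows a float
```

Working in log space keeps intermediate values small; exponentiate only at the end, and only if the final result fits in a float.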
Sample spaces: the standard formulas assume well-defined sample spaces and known event probabilities. Real-world setups rarely give you these for free.
Independence: assumed in many shortcut formulas. Two events being "unrelated" intuitively doesn't make them independent in the formal sense. Check P(A∩B) = P(A)·P(B) before assuming.
Numerical limits: factorials overflow float64 starting at 171! (170! is the largest that fits), and large-n combinatorics hit the same limit. The page uses log-gamma to handle this, but extremely small probabilities can still underflow to zero.
Theory vs frequency: theoretical probability and observed frequency in finite samples are not the same thing. Don't expect 50 coin flips to give exactly 25 heads.
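A small Monte Carlo run makes the gap between theory and observed frequency concrete. This is a sketch: the trial count and the seed are arbitrary choices, and the function name `simulate_exact_heads` is illustrative.

```python
import random
from math import comb

def simulate_exact_heads(flips: int = 50, target: int = 25,
                         trials: int = 10_000, seed: int = 42) -> float:
    """Estimate P(exactly `target` heads in `flips` fair flips) by simulation."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    hits = 0
    for _ in range(trials):
        heads = sum(rng.random() < 0.5 for _ in range(flips))
        if heads == target:
            hits += 1
    return hits / trials

exact = comb(50, 25) / 2**50          # theoretical value, about 0.112
estimate = simulate_exact_heads()     # close to `exact`, but not identical

print(f"exact={exact:.4f}  simulated={estimate:.4f}")
```

Even the theoretical value says that exactly 25 heads turns up in only about 11% of 50-flip runs, and the simulated estimate wobbles around that figure from seed to seed.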
Note: For symbolic combinatorics and arbitrary-precision work, Stata's binomial() and poissonp() functions, Wolfram Alpha, or SymPy are the standard tools. Don't use this page for gambling decisions or financial risk modeling.
Independent and mutually exclusive are different concepts that get mixed up constantly. Independent: P(A ∩ B) = P(A)·P(B). Knowing B happened tells you nothing about A. Examples: repeated flips of the same coin, or separate dice rolls. Mutually exclusive: P(A ∩ B) = 0. Both can't happen at once, like drawing a heart and drawing a spade in a single draw. The kicker: if A and B are mutually exclusive and both have positive probability, they can't be independent, since A occurring forces B not to occur. So mutually exclusive events with non-zero probability are dependent. The intuition that mutually exclusive means independent is one of the most reliable wrong instincts in intro probability.
To find the probability that at least one of several events occurs, use the complement. P(at least one) = 1 − P(none). If each of n independent events has probability p, P(none) = (1 − p)^n, so P(at least one) = 1 − (1 − p)^n. The classic case: probability of at least one head in 10 coin flips is 1 − (1/2)^10 = 1023/1024 ≈ 0.999. The complement form avoids inclusion-exclusion across all the pairwise, triple-wise, and higher overlaps. For non-identical probabilities pᵢ that are still independent, P(none) = ∏(1 − pᵢ).
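The complement rule for independent events reduces to a single product. A minimal sketch, assuming Python 3.8 or later for `math.prod`; the function name `at_least_one` and the three-component example values are illustrative.

```python
from math import prod

def at_least_one(probabilities: list[float]) -> float:
    """P(at least one event occurs), assuming the events are independent."""
    p_none = prod(1 - p for p in probabilities)
    return 1 - p_none

ten_flips = at_least_one([0.5] * 10)          # 1023/1024, about 0.999
mixed = at_least_one([0.1, 0.2, 0.3])         # 1 - 0.9 * 0.8 * 0.7

print(f"{ten_flips:.4f}  {mixed:.3f}")
```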
Bayes' theorem states P(H | E) = P(E | H) · P(H) / P(E). The piece that trips people up: P(E) in the denominator comes from the law of total probability, P(E) = P(E | H)·P(H) + P(E | ¬H)·P(¬H). Take a disease-testing example with prevalence P(D) = 0.01, sensitivity P(+ | D) = 0.95, and false positive rate P(+ | ¬D) = 0.10. Then P(+) = 0.95·0.01 + 0.10·0.99 = 0.1085, so P(D | +) = (0.95·0.01)/0.1085 ≈ 0.088. A positive test only puts the disease probability at about 9%, because the base rate is so low. Base rates dominate, especially for rare conditions.
Joint P(A ∩ B) is the probability that both A and B happen. Conditional P(A | B) is the probability that A happens given that B already happened. They're related by P(A ∩ B) = P(A | B)·P(B). The conditional rescales the joint by the size of the conditioning event. Example: in a deck of 52, P(king ∩ heart) = 1/52, P(king | heart) = 1/13 (because conditioning on heart shrinks the sample space to 13 cards). Mixing these up is the most common mistake in early probability work.
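The card example can be checked by brute-force enumeration. This sketch builds a 52-card deck and counts outcomes with exact fractions; the helper `prob` and the lambda names are illustrative.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))  # 52 (rank, suit) pairs

def prob(event, space) -> Fraction:
    """Fraction of equally likely outcomes in `space` that satisfy `event`."""
    return Fraction(sum(1 for card in space if event(card)), len(space))

is_king = lambda card: card[0] == "K"
is_heart = lambda card: card[1] == "hearts"

p_joint = prob(lambda c: is_king(c) and is_heart(c), deck)   # 1/52
hearts_only = [c for c in deck if is_heart(c)]               # conditioning shrinks the space
p_conditional = prob(is_king, hearts_only)                   # 1/13

print(p_joint, p_conditional)
print(p_joint == p_conditional * prob(is_heart, deck))       # True
```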
In the Monty Hall problem there are three doors with a prize behind one, and you pick door 1. The host, who knows where the prize is, opens a goat door from {2, 3}. Switching wins 2/3 of the time, staying wins 1/3. The reason: your initial pick had probability 1/3 of being right, and the other two doors collectively had probability 2/3. The host's reveal doesn't change the original 1/3 versus 2/3 split; it just consolidates that 2/3 onto a single remaining door. Switching transfers your bet to the door that absorbed all the original probability mass from the doors you didn't pick. What people miss is that the host's choice is constrained by where the prize is, not made at random.
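A simulation is often the quickest way to convince a skeptic. The sketch below assumes the standard rules (the host always opens a goat door that is not your pick); the function names, trial count, and seed are illustrative choices.

```python
import random

def play(switch: bool, rng: random.Random) -> bool:
    """Play one round; return True if the final pick wins the prize."""
    doors = [0, 1, 2]
    prize = rng.choice(doors)
    pick = rng.choice(doors)
    # The host knows where the prize is and only opens a goat door
    # that the player did not pick.
    opened = rng.choice([d for d in doors if d != pick and d != prize])
    if switch:
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == prize

def win_rate(switch: bool, trials: int = 20_000, seed: int = 0) -> float:
    """Fraction of simulated rounds won with the given strategy."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(trials)) / trials

print(f"switch: {win_rate(True):.3f}")   # near 2/3
print(f"stay:   {win_rate(False):.3f}")  # near 1/3
```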
Probability is P, a number in [0, 1]. Odds are P / (1 − P), a number in [0, ∞). Odds of 3:1 mean the event is three times more likely to happen than not, so P = 3/4. Odds of 1:5 mean P = 1/6. The conversion: odds = P/(1−P), and P = odds/(1+odds). Bookmakers and logistic regression both work in odds (or log-odds) because the unbounded scale makes multiplication clean. In logistic regression, a one-unit increase in a feature with coefficient β multiplies the odds by e^β, regardless of starting probability.
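The two conversions, plus the e^β multiplication, can be sketched as follows. The function names and the example coefficient β = 0.7 are assumptions chosen for illustration.

```python
from math import exp

def prob_to_odds(p: float) -> float:
    """odds = P / (1 - P); requires P < 1."""
    return p / (1 - p)

def odds_to_prob(odds: float) -> float:
    """P = odds / (1 + odds)."""
    return odds / (1 + odds)

def shift_by_coefficient(p: float, beta: float) -> float:
    """New probability after multiplying the odds by e^beta."""
    return odds_to_prob(prob_to_odds(p) * exp(beta))

print(odds_to_prob(3))        # 0.75  (odds of 3:1)
print(odds_to_prob(1 / 5))    # about 0.1667  (odds of 1:5)

# The same beta multiplies the odds by the same factor at any starting point,
# even though the change in probability differs.
for start in (0.1, 0.5, 0.9):
    end = shift_by_coefficient(start, beta=0.7)
    print(start, round(end, 3), round(prob_to_odds(end) / prob_to_odds(start), 3))
```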
The law of total probability: if B₁, B₂, ..., Bₙ partition the sample space (mutually exclusive and cover everything), then P(A) = Σᵢ P(A | Bᵢ)·P(Bᵢ). It's the workhorse for problems where the route to A depends on which Bᵢ happened first. Example: 60% of emails are spam, 40% legitimate. 90% of spam contains "free." 5% of legitimate emails contain "free." Then P("free") = 0.90·0.60 + 0.05·0.40 = 0.56. Bayes' theorem uses this in the denominator: P(B_j | A) = P(A | B_j)·P(B_j) / Σᵢ P(A | Bᵢ)·P(Bᵢ).
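The spam example generalizes to any partition. A sketch, with illustrative function names, that takes parallel lists of priors P(Bᵢ) and likelihoods P(A | Bᵢ):

```python
def total_probability(priors: list[float], likelihoods: list[float]) -> float:
    """P(A) = sum over i of P(A | B_i) * P(B_i), for a partition B_1..B_n."""
    return sum(l * p for l, p in zip(likelihoods, priors))

def posteriors(priors: list[float], likelihoods: list[float]) -> list[float]:
    """P(B_j | A) for every j, using total probability as the denominator."""
    p_a = total_probability(priors, likelihoods)
    return [l * p / p_a for l, p in zip(likelihoods, priors)]

priors = [0.60, 0.40]          # spam, legitimate
likelihoods = [0.90, 0.05]     # P("free" | spam), P("free" | legitimate)

p_free = total_probability(priors, likelihoods)        # 0.56
p_spam_given_free = posteriors(priors, likelihoods)[0] # 0.54 / 0.56

print(f"P(free) = {p_free:.2f}, P(spam | free) = {p_spam_given_free:.3f}")
```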
Factorials grow faster than exponential. 10! ≈ 3.6 million, 20! ≈ 2.4 × 10¹⁸, 100! ≈ 9.3 × 10¹⁵⁷, and 171! exceeds the float64 range. Stirling's approximation says n! ≈ √(2πn)·(n/e)^n, which makes the roughly e^(n ln n − n) growth explicit. The practical implication: combinations like C(50, 25) need log-gamma computation, not naive factorial division. The page does this internally. For exact arbitrary-precision factorials, SymPy or Python's math.factorial (which returns arbitrary-precision integers in Python 3) handle it correctly without overflow.
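The growth figures and the float64 limit can be checked directly. This sketch compares Stirling's approximation with exact integer factorials; the helper names `stirling` and `fits_in_float` are illustrative.

```python
from math import e, factorial, pi, sqrt

def stirling(n: int) -> float:
    """Stirling's approximation: n! ~ sqrt(2*pi*n) * (n/e)^n."""
    return sqrt(2 * pi * n) * (n / e) ** n

def fits_in_float(n: int) -> bool:
    """True if n! can be converted to a float64 without overflowing."""
    try:
        float(factorial(n))
        return True
    except OverflowError:
        return False

for n in (10, 20, 100):
    ratio = stirling(n) / factorial(n)   # approaches 1 as n grows
    print(n, f"{factorial(n):.3e}", f"ratio={ratio:.5f}")

print(fits_in_float(170), fits_in_float(171))   # True False
print(len(str(factorial(1000))))                # exact integer, thousands of digits
```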