Logistic Regression Probability Curve Visualizer
Visualize the S-shaped sigmoid curve of a logistic regression model. Set the intercept and slope parameters, define an x-range, and see how predicted probabilities change across the feature space.
Configure Your Logistic Regression Model
Set the intercept and slope parameters to visualize the S-shaped probability curve. See how different coefficients affect the predicted probabilities across your chosen x-range.
Quick Tips:
1. Choose a preset or enter custom model parameters
2. Set the x-range to focus on your region of interest
3. Adjust the decision threshold (default: 0.5)
4. View the probability curve and key model insights
Positive Slope:
- Probability increases with x
- Higher x = more likely positive
- S-curve rises left to right
Negative Slope:
- Probability decreases with x
- Higher x = less likely positive
- S-curve falls left to right
What the Sigmoid Curve Actually Tells You
You trained a logistic model to predict whether a lead will convert. The output for a specific lead is 0.73. That number is not a score or a rank — it is a probability estimated by a sigmoid function that squashes the linear predictor into the 0–1 range. A logistic regression probability curve visualiser plots that S-shaped mapping across the full input range so you can see exactly where the transition from “almost certainly no” to “almost certainly yes” happens, and how steeply it occurs.
The mistake that confuses most first-time users: reading the linear predictor (log-odds) as though it were the probability. A log-odds value of 2.0 does not mean “200% chance” — it maps to a probability of about 0.88 through the sigmoid. The curve visualiser makes this distinction concrete by showing both the straight line (log-odds) and the S-curve (probability) side by side.
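A minimal Python sketch of that mapping, using only the standard library. The function name `sigmoid` is chosen here for illustration and is not taken from the tool.

```python
import math

def sigmoid(z: float) -> float:
    """Map a log-odds value z to a probability in the 0-1 range."""
    # Fine for moderate z; math.exp overflows for very large negative z (roughly z < -709).
    return 1.0 / (1.0 + math.exp(-z))

# Log-odds are unbounded; the resulting probabilities are not.
for z in (-2.0, 0.0, 2.0):
    print(f"log-odds {z:+.1f} -> probability {sigmoid(z):.2f}")
```

A log-odds of 2.0 prints as 0.88, the figure quoted above, and a log-odds of 0.0 lands exactly on 0.50.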
How the Confusion Matrix Reads Out From the Curve
The probability curve does not classify anything on its own — you need a threshold to convert probabilities into yes/no decisions. Draw a horizontal line at threshold = 0.5 across the S-curve. Every data point whose predicted probability falls above that line is classified positive; every point below is negative. The resulting counts of correct and incorrect decisions fill the four cells of the confusion matrix: TP, FP, TN, FN.
Sliding the threshold line up or down shifts where the S-curve crosses it, which changes the x-value boundary between the two classes. Move the threshold from 0.5 to 0.3 and you classify more observations as positive — TP goes up, but so does FP. Move it to 0.7 and FP drops, but FN rises because you are now demanding higher confidence before predicting positive. The curve visualiser lets you see this mechanically: the threshold line intersects the S-curve at a single x-value, and everything on one side gets one label.
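The threshold mechanics can be sketched in a few lines of Python. The data points, the coefficients (b0 = -3.0, b1 = 0.12), and the helper names `boundary_x` and `confusion_counts` are all illustrative assumptions, not output from the tool.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def boundary_x(threshold, b0, b1):
    """x-value where the S-curve crosses the threshold line (needs b1 != 0)."""
    return (math.log(threshold / (1.0 - threshold)) - b0) / b1

def confusion_counts(xs, ys, threshold, b0, b1):
    """Return (TP, FP, TN, FN) when P > threshold is labelled positive."""
    tp = fp = tn = fn = 0
    for x, y in zip(xs, ys):
        predicted_positive = sigmoid(b0 + b1 * x) > threshold
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive:
            fp += 1
        elif y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

# Hypothetical data: feature values and true labels (1 = positive).
xs = [5, 12, 18, 22, 24, 27, 30, 33, 38, 45]
ys = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]
b0, b1 = -3.0, 0.12  # illustrative coefficients

for t in (0.3, 0.5, 0.7):
    tp, fp, tn, fn = confusion_counts(xs, ys, t, b0, b1)
    print(f"threshold {t}: boundary x = {boundary_x(t, b0, b1):.1f}, "
          f"TP={tp} FP={fp} TN={tn} FN={fn}")
```

With this made-up data, lowering the threshold from 0.5 to 0.3 raises TP from 4 to 5 and FP from 1 to 3, while raising it to 0.7 drops FP to 0 and lifts FN from 1 to 2.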
This connection matters because stakeholders often ask “why did the model miss that case?” The answer is usually that the predicted probability was just below the threshold — visible on the curve as a point sitting barely under the horizontal line. Adjusting the threshold by a small amount would have caught it, at the cost of more false positives elsewhere.
Tuning the Decision Threshold for Precision or Recall
The default threshold of 0.5 treats false positives and false negatives as equally costly. In most real problems they are not. A disease screening tool should minimise missed cases (high recall), even if that means more healthy patients get follow-up tests (lower precision). A fraud alert system that pages an analyst at 2 a.m. should avoid false alarms (high precision), accepting that some low-confidence fraud slips through (lower recall).
To find the right threshold, compute precision and recall at several candidate values — 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8 — and build a table. Each row is a different confusion matrix. The operating point that best matches your cost structure is the one to deploy. There is no formula for this choice; it requires knowing the dollar cost (or human cost) of each error type.
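One way to build that table is sketched below in plain Python. The validation probabilities and labels are invented for illustration, and `precision_recall` is a hypothetical helper, not part of any library.

```python
def precision_recall(probs, labels, threshold):
    """Precision and recall when P > threshold is labelled positive."""
    tp = sum(1 for p, y in zip(probs, labels) if p > threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p > threshold and y == 0)
    fn = sum(1 for p, y in zip(probs, labels) if p <= threshold and y == 1)
    # NaN marks a metric that is undefined because its denominator is zero.
    precision = tp / (tp + fp) if tp + fp else float("nan")
    recall = tp / (tp + fn) if tp + fn else float("nan")
    return precision, recall

# Hypothetical validation-set probabilities and true labels.
val_probs = [0.08, 0.17, 0.31, 0.41, 0.47, 0.56, 0.65, 0.72, 0.83, 0.92]
val_labels = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]

print("threshold  precision  recall")
for t in (0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8):
    precision, recall = precision_recall(val_probs, val_labels, t)
    print(f"{t:9.1f}  {precision:9.2f}  {recall:6.2f}")
```

Each printed row corresponds to one confusion matrix. Attaching a cost to each false positive and false negative would turn the table into a direct comparison of operating points.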
A common trap: optimising the threshold on the training set. A model's predictions look better on the data it was fit to than on new data, so a threshold tuned there tends to underperform once deployed. Always tune on a held-out validation set, and report final metrics on a separate test set that was never used for any decision.
Odds Ratio vs. Probability — Two Views of the Same Model
The logistic model operates in three linked scales. The linear predictor z = β₀ + β₁x lives on the real line (−∞ to +∞). Exponentiate it and you get odds: e^z = P/(1−P), always positive. Apply the sigmoid and you get probability: P = 1/(1+e^(−z)), bounded 0–1. Each scale answers a different question.
Odds ratios are how researchers report logistic regression results in journals: “each additional year of experience multiplies the odds of promotion by 1.15” means e^(β₁) = 1.15. But decision-makers think in probabilities: “this candidate has a 68% chance of promotion.” The curve visualiser bridges the gap: you set the coefficients, and it shows the probability at any x-value, while the underlying odds ratio explains how much each unit of x shifts the odds.
One subtlety: the odds ratio is constant across all x-values (each unit increase in x multiplies odds by the same factor), but the probability change per unit of x is not constant. Near the midpoint of the S-curve, a 1-unit increase in x produces the largest probability jump. Near the tails, the same 1-unit increase barely moves the probability because the sigmoid flattens. This is why reporting “a 5-percentage-point increase in probability” requires specifying the baseline — the same coefficient produces different probability shifts at different starting points.
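The contrast between a constant odds ratio and a varying probability change can be checked numerically. This is a sketch with illustrative coefficients (b0 = -3.0, b1 = 0.12); `unit_step_effect` is a name introduced here.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def unit_step_effect(x, b0, b1):
    """Effect of moving from x to x + 1: (odds ratio, change in probability)."""
    z_before, z_after = b0 + b1 * x, b0 + b1 * (x + 1)
    odds_ratio = math.exp(z_after) / math.exp(z_before)
    prob_change = sigmoid(z_after) - sigmoid(z_before)
    return odds_ratio, prob_change

b0, b1 = -3.0, 0.12  # illustrative coefficients; midpoint at x = 25
for x in (5, 25, 45):
    ratio, dp = unit_step_effect(x, b0, b1)
    print(f"x = {x:2d}: odds ratio = {ratio:.3f}, probability change = {dp:+.4f}")
```

The odds ratio column prints the same value on every row, while the probability change is largest for the step starting at the midpoint and shrinks toward both tails.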
Logistic Curve Interpretation Mistakes
The slope β₁ is 0.5. Does that mean a 50% increase in probability per unit of x?
No. β₁ is in log-odds units, not probability units. A slope of 0.5 means each unit of x adds 0.5 to the log-odds — the equivalent probability change depends on where you are on the curve. At the midpoint it is roughly β₁/4 ≈ 12.5 percentage points; near the tails it is much smaller.
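The β₁/4 figure follows from the derivative of the sigmoid, dP/dx = β₁ · P · (1 − P), which peaks where P = 0.5. A short sketch, assuming an intercept of zero purely for illustration (the question only fixes the slope):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def curve_slope(x, b0, b1):
    """Instantaneous slope dP/dx of the probability curve: b1 * P * (1 - P)."""
    p = sigmoid(b0 + b1 * x)
    return b1 * p * (1.0 - p)

b0, b1 = 0.0, 0.5  # slope from the question; the zero intercept is an assumption
for x in (0, 4, 8):
    print(f"x = {x}: slope = {curve_slope(x, b0, b1):.4f} probability units per unit of x")
```

At x = 0 (the midpoint under this assumption) the slope is 0.125, and it falls off quickly as x moves into the tail.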
I set a negative intercept and the curve starts below 0.5. Is the model biased?
Not necessarily. A negative intercept (β₀ < 0) means the baseline log-odds at x = 0 are negative, so the probability at x = 0 is below 50%. This is expected when the positive class is uncommon at the reference point. The midpoint of the curve simply shifts to x = −β₀/β₁, which may be far from zero.
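A quick check of both quantities, using hypothetical coefficients with a negative intercept. The `midpoint` helper is introduced here and returns None for a zero slope, where no midpoint exists.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def midpoint(b0, b1):
    """x where P = 0.5, or None when the slope is zero (the curve is flat)."""
    if b1 == 0:
        return None
    return -b0 / b1

b0, b1 = -4.0, 0.05  # hypothetical coefficients with a negative intercept
print(f"P at x = 0: {sigmoid(b0):.3f}")
print(f"midpoint: x = {midpoint(b0, b1):.1f}")
```

Here the probability at x = 0 is under 2%, yet the curve still reaches 50% once x gets to 80. The low starting point reflects where the reference point sits, not a defect in the model.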
Two models have different curves but similar AUC. Which is better?
AUC measures overall ranking ability, not calibration. One model might assign well-separated probabilities (steep curve) while the other assigns probabilities bunched near 0.5 (flat curve), and both can still rank observations in the same order. If you need probabilities for risk scoring, prefer the model whose predicted probabilities match the observed outcome rates; a steeper curve is more decisive, but it is not automatically better calibrated. Check calibration plots alongside AUC.
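A small sketch of why AUC cannot tell such curves apart in the single-feature case: two curves with same-sign slopes order the observations identically. The data, both sets of coefficients, and the pairwise `auc` helper are assumptions made for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def auc(scores, labels):
    """Share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical data and two hypothetical curves sharing a midpoint at x = 25.
xs = [5, 12, 18, 22, 24, 27, 30, 33, 38, 45]
ys = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]
steep = [sigmoid(-7.5 + 0.30 * x) for x in xs]
flat = [sigmoid(-0.75 + 0.03 * x) for x in xs]

print(f"AUC steep = {auc(steep, ys):.2f}, AUC flat = {auc(flat, ys):.2f}")
print(f"probability range steep: {min(steep):.2f} to {max(steep):.2f}")
print(f"probability range flat:  {min(flat):.2f} to {max(flat):.2f}")
```

Both curves score the same AUC even though their probability ranges differ sharply, so deciding which set of probabilities to trust takes a calibration check against observed outcomes.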
Can I use this single-feature curve for a multi-feature model?
Only as a partial-effect visualisation. In a multi-feature model, the curve for one feature holds all other features at fixed values (often their means). Changing those held-out values shifts and reshapes the curve. The visualiser shows one slice of a higher-dimensional surface, not the full model.
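A sketch of one such slice for a two-feature model with no interaction term. The coefficients and the helper names `partial_curve` and `slice_midpoint` are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def partial_curve(x1_values, x2_fixed, b0, b1, b2):
    """Probability along x1 with the second feature held at x2_fixed."""
    return [sigmoid(b0 + b1 * x1 + b2 * x2_fixed) for x1 in x1_values]

def slice_midpoint(x2_fixed, b0, b1, b2):
    """x1 where this slice crosses P = 0.5 (needs b1 != 0)."""
    return -(b0 + b2 * x2_fixed) / b1

b0, b1, b2 = -3.0, 0.12, 0.5  # hypothetical two-feature coefficients
x1_values = [10, 20, 30, 40]
for x2_fixed in (0.0, 2.0):
    probs = partial_curve(x1_values, x2_fixed, b0, b1, b2)
    print(f"x2 = {x2_fixed}: midpoint at x1 = {slice_midpoint(x2_fixed, b0, b1, b2):.1f}, "
          f"P = {[round(p, 2) for p in probs]}")
```

Holding x2 at a different value acts like changing the intercept, so the same x1 range shows a different portion of the S-curve.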
Sigmoid, Log-Odds, and Probability Equations
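Three equations define the logistic model:
Linear predictor (log-odds): z = β₀ + β₁x
Sigmoid (probability): P = 1 / (1 + e^(−z))
Odds: P / (1 − P) = e^z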
Units note: β₀ and β₁ are in log-odds units. x can be in any measurement unit — the sigmoid converts everything to a dimensionless probability. The maximum slope of the probability curve occurs at the midpoint and equals β₁/4.
Lead-Scoring Model With Conversion Probability Curve
Scenario: A SaaS company models trial-to-paid conversion using a single feature: number of actions taken during the 14-day trial. The fitted logistic model has β₀ = −3.0 and β₁ = 0.12. Decision threshold is 0.5.
Step 1 — Midpoint.
Midpoint = −(−3.0) / 0.12 = 25 actions. At 25 actions the predicted conversion probability is exactly 50%. Below 25 the model predicts “will not convert”; above 25 it predicts “will convert.”
Step 2 — Probability at key points.
At 10 actions: z = −3.0 + 0.12×10 = −1.8, P = 1/(1+e^1.8) ≈ 0.14 (14%). At 40 actions: z = −3.0 + 0.12×40 = 1.8, P = 1/(1+e^(−1.8)) ≈ 0.86 (86%). The curve rises steeply between roughly 15 and 35 actions and flattens outside that range.
Step 3 — Odds ratio interpretation.
e^(β₁) = e^0.12 ≈ 1.127. Each additional trial action multiplies the odds of conversion by about 1.13, a 13% increase in odds per action, regardless of starting point. But the probability increase per action is largest near 25 actions (about 0.12/4 = 3 percentage points) and smaller near the extremes.
Step 4 — Business use.
The sales team targets leads with 15–25 actions — the steepest part of the curve where a nudge (webinar invite, feature walkthrough) could push the probability above the threshold. Leads below 10 actions are too cold; leads above 35 are likely converting on their own.
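The arithmetic in Steps 1 to 3 can be reproduced with a few lines of Python. The variable names are chosen here; the coefficients are the ones given in the scenario.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

b0, b1 = -3.0, 0.12  # fitted coefficients from the scenario

midpoint = -b0 / b1            # Step 1: x where P = 0.5
p_10 = sigmoid(b0 + b1 * 10)   # Step 2: probability at 10 actions
p_40 = sigmoid(b0 + b1 * 40)   # Step 2: probability at 40 actions
odds_ratio = math.exp(b1)      # Step 3: odds multiplier per extra action
max_slope = b1 / 4             # Step 3: steepest probability gain per action

print(f"midpoint = {midpoint:.0f} actions")
print(f"P(10 actions) = {p_10:.2f}, P(40 actions) = {p_40:.2f}")
print(f"odds ratio per action = {odds_ratio:.3f}")
print(f"max probability gain per action = {max_slope:.2f}")
```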
Frequently Asked Questions
What do the intercept and slope parameters mean?
The intercept (beta0) is the log-odds when x = 0. Changing it shifts the sigmoid curve left or right without changing its steepness. With a positive slope, a larger intercept shifts the curve left (higher probability at lower x values), while a smaller or more negative intercept shifts it right; with a negative slope the directions reverse. The slope (beta1) determines how steeply probability changes as x increases. A larger absolute slope means a sharper transition between low and high probability regions. Positive slopes create increasing probability curves; negative slopes create decreasing curves.
How do I interpret the probability output?
The probability P(y=1|x) represents the model's confidence that an observation belongs to the positive class given its x value. For example, P = 0.8 means the model predicts an 80% chance of the positive outcome. To make a binary prediction, compare this probability to your decision threshold (typically 0.5): if P > threshold, predict positive.
What is the midpoint and why does it matter?
The midpoint is the x value where probability equals exactly 0.5, where the model is maximally uncertain. It is calculated as x = -beta0 / beta1 when the slope is not zero. In a one-variable logistic model, that point marks the basic threshold crossing when your decision cutoff is 0.5. It gives you a clean reference for how the intercept and slope move the curve.
Why does my curve look almost flat?
A nearly flat curve occurs when the slope magnitude is very small (close to 0). This means x has little effect on the predicted probability and the model does not distinguish well between different x values. In practice, this might indicate that x is not a useful predictor for your outcome, or that you are viewing a narrow x-range where changes are subtle.
How should I choose the decision threshold?
The default threshold of 0.5 treats false positives and false negatives equally. In practice, adjust based on your costs: if missing a positive case is very costly (for example, disease screening), lower the threshold. If false positives are costly (for example, expensive interventions), raise it. Use ROC curves and domain knowledge to find the threshold that fits your application.
Can I use logistic regression with multiple features?
Yes. Real-world logistic regression typically uses multiple features: P(y=1|X) = 1 / (1 + e^(-(beta0 + beta1*x1 + beta2*x2 + ...))). This visualizer shows single-feature models for educational purposes. With multiple features, the decision boundary becomes a hyperplane rather than a single point, and visualization usually requires dimensionality reduction or partial dependence plots.
What's the relationship between log-odds and probability?
Log-odds (also called logit) is log(P / (1 - P)), where P is probability and log is the natural logarithm. The linear predictor beta0 + beta1*x directly gives log-odds, not probability. The sigmoid function converts log-odds to probability. Log-odds can be any real number, while probability is bounded between 0 and 1. Each unit increase in x changes log-odds by beta1.
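The two conversions are inverses of each other, which a short round trip makes visible. This is a sketch using the standard library; `logit` and `sigmoid` are helper names introduced here.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def logit(p):
    """Log-odds of a probability p, defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))

for p in (0.1, 0.5, 0.9):
    z = logit(p)
    print(f"P = {p}: log-odds = {z:+.3f}, back to probability = {sigmoid(z):.3f}")
```

Probabilities of 0.1 and 0.9 map to log-odds of equal size and opposite sign, and 0.5 maps to zero.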
How is this different from linear regression?
Linear regression predicts continuous values and can produce any output. Logistic regression predicts probabilities bounded between 0 and 1. Linear regression minimizes squared error; logistic regression maximizes likelihood. Linear regression is for 'how much?' questions, while logistic regression is for 'which category?' questions.
Does this tool train models from data?
No. This tool only visualizes a probability curve given user-provided coefficients (intercept and slope). It does not train models from data, estimate coefficients, evaluate model accuracy, or handle multiple features. Real logistic regression requires training data, coefficient estimation, and model evaluation in software built for that job.
Is this tool suitable for medical diagnosis or credit decisions?
No. Medical, credit, fraud, or compliance decisions need trained and validated models, documented thresholds, testing for bias and drift, and human oversight. This page is only a visual explainer, not a production decision system.