Point Estimate Calculator

Eight successes out of ten, so the true rate is 80%? Not so fast. The raw proportion x ∕ n is only one of several point estimates of a population proportion, and with small samples or rates near 0% and 100% it is the worst-behaved of them. This calculator shows the maximum-likelihood estimate beside the Laplace, Jeffreys and Wilson adjustments, and it is blunt that a single best guess always hides a range, never more so than when you have seen very few trials.

Successes x

Trials n

Confidence level for the Wilson estimate & interval

x is the number of times the thing happened; n is how many chances it had. The maximum-likelihood, Laplace and Jeffreys estimates don’t depend on the confidence level — only the Wilson estimate and the interval do.

Result

In plain English

A point estimate is a single best-guess number for something you can’t observe directly — here, the true proportion in a whole population, guessed from a sample. The obvious guess is the proportion you actually saw, x ∕ n. It’s a perfectly good estimate when the sample is large and the rate is nowhere near 0 or 1, but it stumbles at the edges: from 0 successes it returns 0, blandly asserting the event can never happen. The other estimates fix this by nudging the guess a little toward the middle, by an amount that fades away as the sample grows.

point estimate: A single value offered as the best guess of an unknown parameter, as opposed to an interval. Useful, but on its own it conceals how uncertain it is.
maximum likelihood (MLE): The estimate that makes the observed data most probable. For a proportion that is simply x ∕ n — the sample proportion. Unbiased, but high-variance and badly behaved at the boundaries.
Laplace (rule of succession): (x + 1) ∕ (n + 2): pretend you saw one extra success and one extra failure. The classic answer to “the sun has risen n days running — what’s the chance it rises tomorrow?” — never quite 1.
Jeffreys: (x + 0.5) ∕ (n + 1): a gentler nudge, from a Bayesian Beta(½, ½) prior. Often the best all-round compromise.
Wilson: (x + z² ∕ 2) ∕ (n + z²): the centre of the Wilson score interval. The size of the nudge grows with the confidence level you ask for.
shrinkage: All three adjustments pull the estimate toward ½ by adding “pseudo-counts.” The pull is large when n is tiny, which is exactly when the raw proportion needs help, and it vanishes as n grows, when the raw proportion no longer needs it.
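
The four formulas above are short enough to compute by hand, but a small sketch makes the comparison concrete. The function name `point_estimates` and the `confidence` parameter are illustrative choices, not part of the calculator itself; the critical value z is assumed to be the usual two-sided normal quantile.

```python
from statistics import NormalDist


def point_estimates(x: int, n: int, confidence: float = 0.95) -> dict[str, float]:
    """Return the four point estimates of a proportion listed above."""
    if n <= 0 or not 0 <= x <= n:
        raise ValueError("need n > 0 and 0 <= x <= n")
    # Two-sided critical value (assumption): about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return {
        "mle": x / n,                            # raw sample proportion
        "laplace": (x + 1) / (n + 2),            # one pseudo-success, one pseudo-failure
        "jeffreys": (x + 0.5) / (n + 1),         # half of each
        "wilson": (x + z**2 / 2) / (n + z**2),   # centre of the Wilson score interval
    }


for name, value in point_estimates(8, 10).items():
    print(f"{name:9s}{value:.4f}")
```

For 8 successes in 10 trials at 95% confidence this gives roughly 0.80, 0.75, 0.77 and 0.72: every adjusted estimate sits between the raw proportion and ½.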

Frequently asked

What is a point estimate?

A point estimate is a single number put forward as the best guess of an unknown population value — a proportion, a mean, a rate — calculated from sample data. It contrasts with an interval estimate, which gives a range. For a population proportion the natural point estimate is the sample proportion x ∕ n, but it is not the only one, and not always the best. The honest practice is to quote a point estimate and a confidence interval, because the point alone says nothing about how much it might be off.
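
As an illustration of quoting a point and an interval together, the sketch below computes the Wilson score interval around its centre. The half-width expression is the standard Wilson formula rather than anything stated on this page, and the name `wilson_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(x: int, n: int, confidence: float = 0.95) -> tuple[float, float, float]:
    """Return (centre, lower, upper) of the Wilson score interval."""
    # Two-sided critical value (assumption): about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    centre = (x + z**2 / 2) / (n + z**2)
    # Standard Wilson half-width, written in terms of counts rather than x / n.
    half_width = z / (n + z**2) * sqrt(x * (n - x) / n + z**2 / 4)
    return centre, centre - half_width, centre + half_width


centre, lower, upper = wilson_interval(8, 10)
print(f"estimate {centre:.3f}, interval {lower:.3f} to {upper:.3f}")
```

For 8 out of 10 the interval runs from about 0.49 to about 0.94, which shows how much a bare "80%" leaves unsaid.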

Is the sample proportion x ∕ n the best point estimate?

It is the maximum-likelihood estimate — the value that makes your data most likely — and it is unbiased, so on average it is right. But it has high variance with small samples and it behaves badly near 0 and 1, where it can land exactly on the boundary and claim impossibility or certainty. When n is small or the proportion is extreme, a shrunk estimate (Wilson or Jeffreys) is usually a better single guess. When n is large and the proportion is middling, all of them agree and x ∕ n is fine.

What do I report when there are zero successes (or all successes)?

Not 0 (or 1). Observing 0 successes in n trials does not prove the event is impossible — it just means it didn’t happen this time. The useful figure is the upper confidence bound: the rule of three says that with 95% confidence the true rate is below roughly 3 ∕ n. So 0 in 10 is “under about 30%,” 0 in 100 is “under about 3%.” Report that bound, or a shrunk point estimate such as Jeffreys, rather than a flat zero. All-successes is the mirror image.
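
The rule of three is an approximation to an exact bound: with zero successes, the one-sided 95% upper limit solves (1 − p)ⁿ = 0.05, and −ln 0.05 ≈ 3. A minimal sketch comparing the two follows; the function name is illustrative.

```python
def zero_success_upper_bound(n: int, confidence: float = 0.95) -> float:
    """Exact one-sided upper bound on p after 0 successes in n trials.

    Solves (1 - p)**n = 1 - confidence for p.
    """
    return 1 - (1 - confidence) ** (1 / n)


for n in (10, 100, 1000):
    rule_of_three = 3 / n
    exact = zero_success_upper_bound(n)
    print(f"n={n}: rule of three {rule_of_three:.4f}, exact {exact:.4f}")
```

At n = 10 the exact bound is about 0.26 against the rule's 0.30, so the shortcut is slightly conservative for small samples; the gap closes quickly as n grows.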

How do the Laplace, Jeffreys and Wilson estimates differ?

All three add “pseudo-counts” that pull the estimate toward ½, but by different amounts. Jeffreys adds half a success and half a failure, (x + 0.5) ∕ (n + 1), corresponding to a Bayesian Beta(½, ½) prior; it is usually the best compromise. Laplace adds a whole success and a whole failure, (x + 1) ∕ (n + 2), twice the Jeffreys nudge. Wilson, (x + z² ∕ 2) ∕ (n + z²), ties the nudge to the confidence level through z, so asking for 99% confidence shrinks more than 90%; at 95% (z ≈ 1.96) it adds about 1.92 of each, the largest nudge of the three. As n grows, all three converge on x ∕ n.
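
To see how Wilson's nudge scales with the confidence level, the sketch below computes z² ∕ 2, the number of pseudo-successes (and, equally, pseudo-failures) it adds. For comparison, Laplace always adds 1 of each and Jeffreys 0.5. The function name is hypothetical, and z is assumed to be the two-sided normal critical value.

```python
from statistics import NormalDist


def wilson_pseudo_counts(confidence: float) -> float:
    """Pseudo-successes (and, equally, pseudo-failures) Wilson adds: z**2 / 2."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z**2 / 2


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: {wilson_pseudo_counts(level):.2f} of each")
```

This gives roughly 1.35 at 90%, 1.92 at 95% and 3.32 at 99%.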
