t-Test Calculator
One-sample, two-sample (Welch or Student), or paired t-tests — from summary statistics or your raw data. You get the t statistic, df, p-value, the confidence interval, an effect size, and a plain reading of what the p-value actually licenses you to say.
Result
In plain English
A t-test asks one simple question: is the difference you can see real, or could it just be random luck? It weighs the size of the difference against how noisy the data are.
- t statistic: The difference between groups, measured in units of noise (the difference divided by its standard error). The further from 0, the harder it is to wave away as chance.
- p-value: If there were truly no difference, this is how often you'd see a gap at least this big just by luck. A small p (say under 0.05) means data like yours would be unusual under the “no difference” story; it is not the probability that the story is true.
- degrees of freedom (df): Roughly how much information your data carry; it grows with sample size.
- confidence interval: The range of values for the real size of the difference that are reasonably compatible with your data, at the chosen confidence level.
- Cohen's d: How big the difference is in practical terms (small / medium / large), measured in standard deviations. This is a separate question from whether it's statistically “significant.”
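To make those quantities concrete, here is a minimal sketch of how they could be computed for a Welch two-sample test from summary statistics, using only Python's standard library. The function names (`t_pdf`, `two_sided_p`, `welch_from_summary`) and the input numbers are illustrative assumptions, not the calculator's own code. The p-value comes from numerically integrating the t density, which is a simple stand-in for a proper t-distribution routine and loses accuracy for very large |t|. The confidence interval is left out because it needs a t quantile, which the standard library does not provide.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def two_sided_p(t, df, steps=2000):
    """Two-sided p-value: 1 - 2 * (area under the density from 0 to |t|).

    The area is approximated with Simpson's rule; steps must be even.
    """
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return max(0.0, 1.0 - 2.0 * total * h / 3.0)

def welch_from_summary(m1, s1, n1, m2, s2, n2):
    """Return (t, df, p, d) from each group's mean, SD and sample size."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2           # squared standard errors
    t = (m1 - m2) / math.sqrt(v1 + v2)            # difference in units of noise
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    # Cohen's d using the pooled standard deviation
    pooled_sd = math.sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2)
                          / (n1 + n2 - 2))
    d = (m1 - m2) / pooled_sd
    return t, df, two_sided_p(t, df), d

# Hypothetical inputs: means 10 and 8, SD 2 in both groups, 10 per group.
t, df, p, d = welch_from_summary(10.0, 2.0, 10, 8.0, 2.0, 10)
print(f"t = {t:.3f}, df = {df:.1f}, p = {p:.4f}, d = {d:.2f}")
```

With these made-up inputs the sketch gives a t of about 2.24 on 18 degrees of freedom, a p-value below 0.05, and a Cohen's d of 1.0.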
Frequently asked
What's the difference between a paired and a two-sample t-test?
A paired t-test compares two measurements on the same subjects (for example, before and after) and tests the average within-pair difference. A two-sample t-test compares two independent groups. Treating independent groups as paired invents a pairing that isn't there, and treating paired data as independent throws the pairing away; either mistake changes the answer.
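As a rough illustration of how much the choice can matter, the sketch below runs both calculations on the same made-up before/after scores. The data and the helper names `paired_t` and `welch_t` are assumptions for the example only.

```python
import math
from statistics import mean, stdev, variance

def paired_t(x, y):
    """Paired t statistic: a one-sample t-test on the differences y - x."""
    diffs = [b - a for a, b in zip(x, y)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))

def welch_t(x, y):
    """Two-sample (Welch) t statistic for y versus x, ignoring any pairing."""
    se = math.sqrt(variance(x) / len(x) + variance(y) / len(y))
    return (mean(y) - mean(x)) / se

# Hypothetical scores for the same five subjects, measured twice.
before = [12, 15, 11, 18, 14]
after = [14, 18, 12, 21, 15]

print(f"paired t     = {paired_t(before, after):.3f}")
print(f"two-sample t = {welch_t(before, after):.3f}")
```

Here every subject improves by 1 to 3 points, so the paired statistic is large (about 4.47). The two-sample version sees only two overlapping clouds of scores and returns a t of 1.0, because the subject-to-subject spread swamps the consistent within-subject change.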
Should I use Welch's or Student's t-test?
Welch's, by default. It doesn't assume the two groups have equal variances, and it performs nearly as well as Student's when they are equal, so there's rarely a reason to prefer the equal-variance version.
What does a significant t-test tell me?
That the observed difference is larger than you'd comfortably expect from chance alone — not that it is large or important. Always read the effect size (Cohen's d) and the confidence interval alongside the p-value.
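A quick way to see the gap between "significant" and "important" is to pair a very large sample with a tiny difference. The sketch below uses invented numbers and a hypothetical helper, `large_sample_summary`. It also swaps in the normal distribution for the t distribution, an approximation that is only reasonable because the degrees of freedom are so large.

```python
import math
from statistics import NormalDist

def large_sample_summary(mean_diff, sd, n_per_group):
    """Return (d, t, p) for two equal-sized groups sharing the same SD.

    Assumption: with very large df, the normal distribution is used as an
    approximation to the t distribution for the two-sided p-value.
    """
    d = mean_diff / sd                                  # Cohen's d
    t = mean_diff / (sd * math.sqrt(2 / n_per_group))   # difference / SE
    p = 2 * (1 - NormalDist().cdf(abs(t)))
    return d, t, p

# Hypothetical study: 100,000 per group, difference of 0.02 SDs.
d, t, p = large_sample_summary(mean_diff=0.02, sd=1.0, n_per_group=100_000)
print(f"d = {d:.2f}, t = {t:.2f}, p = {p:.2g}")
```

The p-value here is far below 0.001, yet the effect is one fiftieth of a standard deviation, which is well under the usual "small" benchmark of 0.2.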
What are the assumptions of a t-test?
That the data are roughly normal (or the sample is large enough for the central limit theorem to take over) and that observations are independent. For the two-sample test, Student's version also assumes the two groups have equal variances; Welch's version drops that assumption, which is why it is the safer default. The t-test is fairly robust to mild non-normality, but heavy skew, clear outliers or dependent observations call for a different approach, such as a nonparametric test.