unspurious.calculators

Nonparametric Tests Calculator

Mann–Whitney U, Wilcoxon signed-rank and Kruskal–Wallis — the rank-based tests for when your data aren't normal, are ordinal, or carry outliers a mean can't survive. Paste raw data; get the statistic, a tie-corrected p-value, an effect size and an honest reading.

Mann–Whitney U: two independent groups. The nonparametric alternative to the two-sample t-test.

In plain English

Nonparametric tests throw away the actual numbers and work with their ranks — who's biggest, second-biggest, and so on. That makes them robust to skew, outliers and non-normal shapes, at the cost of a little power when the data really are normal. They answer “does one group tend to sit higher than another?” without assuming any particular distribution.
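To make the "work with ranks" idea concrete, here is a minimal sketch of ranking in plain Python. The function name `average_ranks` and the sample values are illustrative assumptions, not the calculator's own code. It shows that replacing a moderate top value with an extreme outlier leaves the ranks unchanged.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of sorted positions i..j, made 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


# Made-up data: the second list swaps 5.2 for an extreme outlier.
print(average_ranks([3.1, 4.0, 4.0, 5.2]))    # [1.0, 2.5, 2.5, 4.0]
print(average_ranks([3.1, 4.0, 4.0, 250.0]))  # [1.0, 2.5, 2.5, 4.0]
```

The outlier moves the mean a long way but cannot move its own rank past "largest", which is where the robustness comes from.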

Mann–Whitney U
Compares two independent groups by rank. A U far from n₁n₂/2 (its expected value when the groups do not differ), in either direction, means one group's values systematically out-rank the other's. Also called the Wilcoxon rank-sum test.
Wilcoxon signed-rank
Compares two paired measurements (before/after). It ranks the absolute differences by size and checks whether the rank sums of the positive and negative differences balance out.
Kruskal–Wallis H
Extends Mann–Whitney to three or more groups: the rank-based counterpart of one-way ANOVA. A large H is evidence that at least one group tends to rank higher or lower than the others.
rank-biserial r
An effect size from −1 to +1: how lopsided the comparison is. Near 0 = groups overlap heavily; near ±1 = almost complete separation.
epsilon-squared (ε²)
The Kruskal–Wallis effect size, 0 to 1 — the share of rank variation explained by the grouping.
ties
Equal values share the average of the ranks they would otherwise occupy; the calculator then applies the standard tie correction to the variance.
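The definitions above can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not the calculator's implementation: the function names are hypothetical, the Mann–Whitney p-value uses the normal approximation without a continuity correction, zero differences are dropped in the signed-rank test, and the Kruskal–Wallis tie correction is applied by dividing H (the usual textbook form). Epsilon-squared is taken as H / (n − 1).

```python
import math
from collections import Counter


def average_ranks(values):
    """1-based ranks; tied values share the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def tie_term(values):
    """Sum of t^3 - t over each group of t tied values."""
    return sum(t**3 - t for t in Counter(values).values())


def mann_whitney(x, y):
    """Return (U for x, z, two-sided p, rank-biserial r)."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = list(x) + list(y)
    ranks = average_ranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1
    # Tie-corrected variance of U under the null hypothesis.
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term(pooled) / (n * (n - 1)))
    z = (u1 - n1 * n2 / 2) / math.sqrt(var_u) if var_u > 0 else 0.0
    p = math.erfc(abs(z) / math.sqrt(2))  # normal approximation, no continuity correction
    r = (u1 - u2) / (n1 * n2)  # positive when x tends to rank above y
    return u1, z, p, r


def signed_rank(before, after):
    """Return (W+, W-): rank sums of the positive and negative differences."""
    diffs = [a - b for b, a in zip(before, after) if a != b]  # zeros dropped
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus


def kruskal_wallis(groups):
    """Return (tie-corrected H, epsilon-squared); assumes not all values are equal."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum**2 / len(g)
        start += len(g)
    h = 12 / (n * (n + 1)) * total - 3 * (n + 1)
    h /= 1 - tie_term(pooled) / (n**3 - n)  # tie correction divides H
    return h, h / (n - 1)


# Made-up data, chosen so the arithmetic is easy to check by hand.
u, z, p, r = mann_whitney([1, 2, 2], [2, 3, 4])
print(u, round(r, 3))                                 # 1.0 -0.778
print(signed_rank([10, 12, 9, 11], [12, 15, 8, 11]))  # (5.0, 1.0)
h, eps2 = kruskal_wallis([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(round(h, 3), round(eps2, 3))                    # 7.2 0.9
```

In the first call, U = 1 out of a possible 9 pairs, so the rank-biserial r of about −0.78 says the first group sits well below the second. For samples this small an exact p-value is generally preferable to the normal approximation used here.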

Frequently asked

When should I use a nonparametric test instead of a t-test or ANOVA?

When the data are clearly non-normal (especially in small samples), ordinal, or carry outliers the mean can't handle. Because these tests use ranks, they're robust to skew and extreme values — but when the t-test's assumptions genuinely hold, it has slightly more power, so don't switch without a reason.

Does the Mann–Whitney test compare medians?

Only loosely. It tests whether one group tends to produce larger values than the other (stochastic dominance). It becomes a clean test of medians only when the two distributions have the same shape; otherwise read it as a general location/dominance test.
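A small sketch makes the distinction visible. The quantity Mann–Whitney responds to is the probability that a random value from one group exceeds a random value from the other (ties counted as half). The helper name `prob_superiority` and the data are made up for illustration: both samples have the same median, yet one clearly tends to produce larger values.

```python
from statistics import median


def prob_superiority(x, y):
    """Estimate P(X > Y) + 0.5 * P(X == Y) over all cross-group pairs."""
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    return wins / (len(x) * len(y))


# Made-up samples with identical medians but different shapes.
x = [5, 5, 5, 10, 11]
y = [3, 4, 5, 6, 7]

print(median(x), median(y))    # 5 5
print(prob_superiority(x, y))  # 0.7
```

A value of 0.5 would mean neither group tends to out-rank the other. Here it is 0.7 even though the medians match, so a significant Mann–Whitney result on data like these would not be evidence of a median difference.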

Kruskal–Wallis was significant — which groups differ?

It only tells you that at least one group differs. To find which, run a post-hoc test such as Dunn's, with a correction (e.g. Holm or Bonferroni) for the multiple comparisons — exactly as you'd follow a significant ANOVA with Tukey's HSD.
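As an illustration of the correction step, here is a minimal sketch of the Holm step-down adjustment in plain Python. The function name `holm_adjust` and the three pairwise p-values are hypothetical; in practice the raw p-values would come from a post-hoc test such as Dunn's.

```python
def holm_adjust(p_values):
    """Return Holm-adjusted p-values in the same order as the input."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Adjusted values must never decrease as the raw p-values increase.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values for three pairwise comparisons: A-B, A-C, B-C.
raw = [0.01, 0.04, 0.03]
print([round(p, 4) for p in holm_adjust(raw)])  # [0.03, 0.06, 0.06]
```

With these made-up numbers only the first comparison stays below 0.05 after adjustment. Bonferroni would multiply every p-value by 3 instead, which is simpler but never less conservative than Holm.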

Are nonparametric tests less powerful than parametric ones?

Slightly — but only when the parametric assumptions actually hold. You pay a small price in power for not assuming normality. When the data are skewed, ordinal or carry outliers, those assumptions fail, and the nonparametric test can be both more powerful and far more trustworthy. The “loss of power” worry is overstated; the real question is whether your data meet the parametric assumptions in the first place.