Grouped Data Standard Deviation Calculator

For when you only have a frequency table, not the raw numbers. Enter the class intervals (or their midpoints) and how many values fall in each, and get the grouped mean, variance and standard deviation (both sample and population) with every f·m and f·(m − x̄)² laid out. One caveat matters: grouping pretends every value sits at its class midpoint, so the results are close approximations, not exact figures.

Classes: intervals like 0-10, or bare midpoints, one per line or separated by spaces

Frequencies: how many values fall in each class, in the same order

Write classes as “0-10 10-20 …” (the midpoint is taken automatically) or just give midpoints “5 15 25 …”. Frequencies must line up one-for-one with the classes.
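To make the input format concrete, here is a minimal Python sketch of how such text might be turned into midpoints and frequencies. It is not the calculator's own code: the names parse_classes and parse_table are hypothetical, and the sketch assumes tokens are separated by whitespace and intervals are written lower-upper with a plain hyphen.

```python
def parse_classes(text):
    """Turn 'lower-upper' intervals or bare midpoints into a list of midpoints."""
    midpoints = []
    for token in text.split():  # split() handles spaces and line breaks alike
        # Look for a hyphen after the first character, so a leading
        # minus sign on a negative number is not mistaken for a separator.
        cut = token.find("-", 1)
        if cut == -1:
            midpoints.append(float(token))          # bare midpoint
        else:
            lower = float(token[:cut])
            upper = float(token[cut + 1:])
            midpoints.append((lower + upper) / 2)   # interval -> midpoint
    return midpoints


def parse_table(classes_text, freqs_text):
    """Parse both inputs and check they line up one-for-one."""
    midpoints = parse_classes(classes_text)
    freqs = [float(f) for f in freqs_text.split()]
    if len(midpoints) != len(freqs):
        raise ValueError("need exactly one frequency per class")
    return midpoints, freqs


print(parse_table("0-10 10-20 20-30", "2 5 8"))
```

Both "0-10 10-20 20-30" and "5 15 25" give the same midpoints here, and a mismatch between the number of classes and the number of frequencies raises an error rather than silently pairing the wrong values.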

In plain English

A grouped frequency table has already thrown away the individual values — you know that eight people scored “20 to 30”, but not whether they scored 21 or 29. To get a mean and spread anyway, we stand in for each class with its midpoint and weight it by its frequency. It is a sensible approximation, and usually a close one, but it is an approximation: the true standard deviation of the raw data would differ a little.

class midpoint (m): The middle of each interval — (lower + upper) ∕ 2 — used as a single stand-in for every value in that class.
grouped mean (x̄): The frequency-weighted average of the midpoints: Σ(f·m) ∕ Σf. Each class counts in proportion to how many values it holds.
grouped variance & SD: The sum of frequency-weighted squared distances of the midpoints from the grouped mean, Σf·(m − x̄)², divided by Σf (population) or by Σf − 1 (sample); the SD is its square root, back in the data's units.
sample vs population: Divide by Σf − 1 if the table is a sample of a larger group (the usual case), or by Σf if it is the whole population.
the midpoint assumption: Pretending every value sits exactly at the midpoint slightly understates spread within wide classes; narrow classes make the approximation better.
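
The definitions above can be sketched in a few lines of Python. This is an illustrative sketch rather than the calculator's implementation; the function name grouped_stats and the example table (four classes with frequencies 2, 5, 8, 5) are made up for the illustration.

```python
import math


def grouped_stats(midpoints, freqs):
    """Grouped mean, variance and SD from class midpoints and frequencies."""
    n = sum(freqs)                                            # Σf
    mean = sum(f * m for m, f in zip(midpoints, freqs)) / n   # Σ(f·m) / Σf
    ss = sum(f * (m - mean) ** 2 for m, f in zip(midpoints, freqs))  # Σf·(m − x̄)²
    return {
        "n": n,
        "mean": mean,
        "pop_var": ss / n,            # divide by Σf
        "sample_var": ss / (n - 1),   # divide by Σf − 1
        "pop_sd": math.sqrt(ss / n),
        "sample_sd": math.sqrt(ss / (n - 1)),
    }


# Hypothetical table: classes 0-10, 10-20, 20-30, 30-40 (midpoints 5, 15, 25, 35)
stats = grouped_stats([5, 15, 25, 35], [2, 5, 8, 5])
for name, value in stats.items():
    print(f"{name}: {value:.4f}")
```

For this table the grouped mean works out to 23 and the population variance to 86, which is easy to check by hand.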

Frequently asked

How do you find the standard deviation of grouped data?

Take each class midpoint m, multiply by its frequency f, and add up: the grouped mean is Σ(f·m) ∕ Σf. Then for each class compute f·(m − mean)², add those, and divide by Σf − 1 (sample) or Σf (population); the standard deviation is the square root. The calculator lays out every f·m and f·(m − mean)² so you can follow it line by line.
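
The line-by-line layout described above might look like the following Python sketch. The helper names worked_rows and format_table and the example table are hypothetical, and the column layout is only one plausible way to present the working.

```python
import math


def worked_rows(midpoints, freqs):
    """Return the grouped mean and one (m, f, f·m, f·(m − mean)²) row per class."""
    n = sum(freqs)
    mean = sum(f * m for m, f in zip(midpoints, freqs)) / n
    rows = [(m, f, f * m, f * (m - mean) ** 2) for m, f in zip(midpoints, freqs)]
    return mean, rows


def format_table(rows):
    """Lay the rows out as fixed-width text with a totals line."""
    lines = [f"{'m':>8}{'f':>8}{'f*m':>10}{'f*(m-mean)^2':>16}"]
    for m, f, fm, fd2 in rows:
        lines.append(f"{m:>8g}{f:>8g}{fm:>10g}{fd2:>16g}")
    total_f = sum(r[1] for r in rows)
    total_fm = sum(r[2] for r in rows)
    total_fd2 = sum(r[3] for r in rows)
    lines.append(f"{'total':>8}{total_f:>8g}{total_fm:>10g}{total_fd2:>16g}")
    return "\n".join(lines)


# Hypothetical table: midpoints 5, 15, 25, 35 with frequencies 2, 5, 8, 5
mean, rows = worked_rows([5, 15, 25, 35], [2, 5, 8, 5])
print(format_table(rows))

n = sum(r[1] for r in rows)
ss = sum(r[3] for r in rows)
print("mean          =", mean)
print("sample SD     =", math.sqrt(ss / (n - 1)))  # divide by Σf − 1
print("population SD =", math.sqrt(ss / n))        # divide by Σf
```

Keeping the per-class rows separate from the final division makes it easy to check each f·m and f·(m − mean)² against hand working before trusting the totals.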

Why is grouped standard deviation only an approximation?

Because the table no longer contains the individual values. Replacing every value in a class with the class midpoint assumes they are all bunched at the centre, when really they are spread across the interval. That assumption usually understates the within-class spread a little, so the grouped standard deviation is typically close to — but not exactly — the standard deviation you would get from the raw data. Narrower classes shrink the gap.
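
To see the size of the gap on one example, the Python sketch below groups a small made-up list of raw scores into classes of width 10 and compares the grouped population SD with the exact one. Everything here is an assumption for illustration: the data, the helper names, and the convention that a value on a class boundary goes into the upper class.

```python
import math
import statistics
from collections import Counter


def group_into_classes(values, width):
    """Count values per class [k*width, (k+1)*width); return midpoints and frequencies."""
    counts = Counter(math.floor(v / width) for v in values)
    bins = sorted(counts)
    midpoints = [(k + 0.5) * width for k in bins]
    freqs = [counts[k] for k in bins]
    return midpoints, freqs


def grouped_pop_sd(midpoints, freqs):
    """Population SD computed from midpoints and frequencies only."""
    n = sum(freqs)
    mean = sum(f * m for m, f in zip(midpoints, freqs)) / n
    ss = sum(f * (m - mean) ** 2 for m, f in zip(midpoints, freqs))
    return math.sqrt(ss / n)


# Made-up raw scores, for illustration only
raw_scores = [2, 9, 11, 14, 19, 21, 23, 25, 27, 29, 32, 38]

midpoints, freqs = group_into_classes(raw_scores, 10)
print("classes (midpoints):", midpoints)
print("frequencies:        ", freqs)
print("grouped SD:", round(grouped_pop_sd(midpoints, freqs), 3))
print("raw SD:    ", round(statistics.pstdev(raw_scores), 3))
```

Rerunning with a smaller width (say 5) shows how narrower classes change the gap for the same data.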

Should I divide by n or n − 1 for grouped data?

The same rule as for raw data. Divide by Σf − 1 (the total frequency minus one) when the data are a sample meant to estimate a larger population — the usual case — and by Σf when they are the whole population. The calculator shows both so you can pick the one your problem calls for; they converge as the total frequency grows.
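
Because both versions share the same numerator, the sample SD is always the population SD multiplied by √(Σf ∕ (Σf − 1)). The short Python sketch below (the function name is hypothetical) prints that factor for a few total frequencies to show how quickly the two converge.

```python
import math


def sample_over_population(total_frequency):
    """Ratio of sample SD to population SD for the same table: sqrt(n / (n - 1))."""
    n = total_frequency
    if n < 2:
        raise ValueError("sample SD needs a total frequency of at least 2")
    return math.sqrt(n / (n - 1))


for n in (5, 20, 100, 1000):
    print(f"total frequency {n:>4}: sample SD = {sample_over_population(n):.4f} x population SD")
```

With only a handful of values the choice matters noticeably; with hundreds it changes the result by a fraction of a percent.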

What is the difference between grouped and ungrouped standard deviation?

Ungrouped uses every individual value, so it is exact. Grouped works from a frequency table where the raw values are gone, so each class’s midpoint stands in for all of its members, which makes it an approximation that usually slightly understates the true spread. If you still have the original numbers, use the ungrouped calculation; grouping is only for when the table is all you have been given.

