
MATH · GRADE 7 · Statistics

Populations, Samples, and Random Sampling

200 trees, and you only get to inspect 20. How close can your estimate get?

In this lesson

What a population is

A population is a complete set of elements (people, animals, objects, or events) that are the focus of a statistical question. The population isn't a number; it's the group you're asking about.

Populations are defined by shared characteristics, and the curriculum names four common kinds:

  • By age: "all Grade 7 students in Alberta."
  • By location: "all lodgepole pines in the Crowsnest Pass stand."
  • By time: "all weather readings at the Calgary airport during March 2026."
  • By type: "all hockey sticks manufactured by a single brand."

A clear population is the foundation of any statistical question. "What percent of Grade 7 students walk to school?" is incomplete until you say WHICH Grade 7 students: all of Alberta, just Edmonton, or only the students in your school? The population pins down the scope.

Census vs sample

Once the population is named, the next question is whether you can collect data from every element. Two options:

  • A census collects data from the whole population. It's the ground truth, with no estimation involved.
  • A sample collects data from a subset. The sample stands in for the whole, and you estimate the population's properties from the sample's.

The rule of thumb: take a census when every element is within reach, and a sample when it isn't.

Counting the students in one Grade 7 class: census (30 kids, takes a minute). Counting the lodgepole pines in a 1000-hectare forest: sample (millions of trees, no feasible way to inspect each one). Counting infestation in a single backyard tree: census of one. The right choice depends on the population's size and accessibility, not on a fixed rule.

Representative samples

A representative sample has the same defining characteristics as the population. If the population is 40% urban and 60% rural, a representative sample is roughly 40/60 too. If the population includes equal numbers of Grade 6, 7, 8, and 9 students, a representative sample should too.

A sample is most useful when it's representative. Otherwise the estimate you compute from the sample doesn't tell you what you think it does about the population. The classic warning is a sample drawn only from one corner of the population: a survey of Grade 7 walking habits done only at one downtown school would miss rural students entirely, and the estimate would be biased.

Two random-sampling methods

The curriculum names two methods of drawing samples that aim for representativeness:

  • Simple random sampling. Every element of the population has an equal chance of being picked, and every possible group of N elements is equally likely to be the sample. Concretely: number every element, then pick N distinct numbers at random.
  • Systematic sampling. Pick every kth element after a random starting point. For a population of 200 and a sample of 20, pick every 10th element after a random start.

Both methods avoid the bias of "picking convenient elements." Either can produce representative samples. Systematic sampling is simpler to execute in the field (count out every kth tree); simple random sampling has stronger statistical guarantees but needs a way to generate random numbers.
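The two methods can be sketched in a few lines of Python. This is a minimal illustration, not how the widget is built: the function names, the fixed seed, and the choice to number the trees 1 to 200 are all assumptions made for the example.

```python
import random

def simple_random_sample(population, n, rng):
    """Pick n distinct elements; every group of n is equally likely."""
    return rng.sample(population, n)

def systematic_sample(population, n, rng):
    """Pick every k-th element after a random start among the first k."""
    k = len(population) // n      # sampling interval, e.g. 200 // 20 = 10
    start = rng.randrange(k)      # random starting position, 0 .. k-1
    return population[start::k][:n]

rng = random.Random(7)            # fixed seed (an assumption) so runs repeat
trees = list(range(1, 201))       # hypothetical labels: trees numbered 1..200

srs = simple_random_sample(trees, 20, rng)
sys_sample = systematic_sample(trees, 20, rng)

print("simple random:", sorted(srs))
print("systematic:   ", sys_sample)
```

The systematic sample always comes out evenly spaced (every 10th tree here), while the simple random sample can clump or leave gaps. Both still give every tree the same 1-in-10 chance of being inspected.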

Try it: estimate a rate you can't see

The widget below has a 200-element population: 200 lodgepole pines in one stand, some infested by the Mountain Pine Beetle, the rest healthy. Here's the catch — and it's the same catch every real survey faces: you can't see which is which. A tree's condition shows only after you inspect it, and the true infestation rate stays hidden until you commit an estimate.

Sample the population

200 lodgepole pines in one stand — some are infested by the Mountain Pine Beetle, the rest are healthy. You can't tell which from the road: a tree's condition shows only once you inspect it. Draw samples, then estimate the infestation rate for the whole stand.

A 20-by-10 grid of 200 dots representing trees in a lodgepole pine stand. Every dot starts grey (uninspected). The student picks a sample size (5 to 100) and a method (simple random or systematic), then taps 'Draw sample' — sampled trees reveal their condition, orange for infested or green for healthy, and stay revealed across draws. A history strip lists each draw's sample proportion. When ready, the student types an estimate of the stand-wide infestation percentage and locks it in; only then does the whole stand reveal and the true proportion appear, with a verdict comparing estimate to truth and crediting how few trees the student needed to inspect.
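Widget controls and readouts: Method (simple random or systematic). Legend: Infested, Healthy, Not yet inspected, In latest sample. Readouts: Population proportion (infested), shown as ? until you lock in an estimate; Latest sample proportion; Your estimate − truth.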

The stand keeps its secret: tree colours appear only where you've inspected, and the true proportion stays hidden until you commit an estimate — the same position every real survey is in.

Try this sequence:

  1. Set the sample size to 5 and draw a few times. The sample proportion bounces wildly from draw to draw — would you stake your estimate on any one of those numbers?
  2. Increase to 20 and draw a few more. The draws cluster closer together. The history strip is your evidence.
  3. When the draws stop surprising you, lock in an estimate. The stand reveals, and the verdict tells you how close you landed — and how few of the 200 trees you actually inspected.

Larger samples give steadier estimates. The variability across repeated draws shrinks as the sample grows — and noticing when it has shrunk enough to commit is exactly the judgement real surveyors are paid for.
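The same experiment can be run as a short simulation. This sketch assumes a stand where 60 of the 200 trees are infested; that 30% rate is invented for the example, since the widget keeps its real rate hidden. The helper names and the seed are also assumptions.

```python
import random

rng = random.Random(2026)         # fixed seed (an assumption) so runs repeat

# Hypothetical stand: 200 trees, 60 infested (True), 140 healthy (False).
stand = [True] * 60 + [False] * 140

def sample_proportion(population, n, rng):
    """Draw a simple random sample of size n and return the infested share."""
    sample = rng.sample(population, n)
    return sum(sample) / n

def spread(population, n, draws, rng):
    """Repeat the draw many times; return the lowest and highest proportion."""
    props = [sample_proportion(population, n, rng) for _ in range(draws)]
    return min(props), max(props)

for n in (5, 20, 100):
    low, high = spread(stand, n, 200, rng)
    print(f"n={n:3d}: sample proportions ranged from {low:.0%} to {high:.0%}")
```

The range printed for n = 5 should be much wider than the range for n = 100, which is the "bouncing" the history strip shows.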

Two samples, same population

A second key idea: two independent samples drawn from the same population produce similar but distinct estimates. They don't land on the same number, but both land close to the truth.

Two samples from the same population

Sample A: 5/19 infested ≈ 26%

Sample B: 6/19 infested ≈ 32%

Sample A and Sample B were each drawn at random (N = 19) from the same 80-tree stand. Both estimate the same true infestation rate; both land near it, but on different specific numbers. Their similarity is what makes sampling trustworthy: if two random samples disagreed wildly, you'd be right to mistrust the method.
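The arithmetic behind the two estimates is worth seeing once. A small sketch, using only the counts shown for Sample A and Sample B (the variable names are assumptions):

```python
# Counts copied from the two samples above: (infested, inspected).
samples = {"A": (5, 19), "B": (6, 19)}

estimates = {
    name: infested / inspected
    for name, (infested, inspected) in samples.items()
}
for name, estimate in estimates.items():
    print(f"Sample {name}: {estimate:.1%}")          # A: 26.3%, B: 31.6%

gap_points = abs(estimates["A"] - estimates["B"]) * 100
print(f"Gap: {gap_points:.1f} percentage points")    # 5.3
```

With 19 trees in each sample, a single tree is worth about 5.3 percentage points, so the whole gap between the two estimates comes from one infested tree.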

Where it shows up in real life

The Mountain Pine Beetle infestation of Alberta's lodgepole pine forests is the lesson's anchor, and a real, urgent example of why sampling matters. The province monitors infestation rates across roughly 6 million hectares of pine. Surveying every tree would take centuries; surveying nothing leaves the forest defenseless. Instead, Alberta Forestry biologists sample, typically through aerial-photo grids and ground-plot verifications, and use the sample rate to estimate the infestation level across the whole area. Management decisions (controlled burns, sanitation harvests, pheromone traps) get made from those sample-based estimates.

Other examples, on the prairie and beyond: Statistics Canada's Canadian Community Health Survey samples roughly 65,000 Canadians out of 40 million to estimate national health indicators such as chronic disease rates, smoking rates, and mental-health metrics. A quality check on a grain shipment samples a few cups out of a rail car holding thousands of bushels to estimate moisture content. A routine election poll samples about 1,000 people to estimate how Canadians will vote.

In every case, you can't reach the whole population. The sample is what stands in. And it works, most of the time, because of how the sample was drawn.

Worksheet

These aren't graded. Get them right, get them wrong. The goal is to apply the vocabulary cleanly: identify populations, pick the right method, recognize representative samples.

Practice · Not graded

MA.7.STA.1

Practice the idea


Which statement correctly distinguishes a population from a sample?

Common mistakes

Student says

'A bigger sample is always better. If I survey 1000 people instead of 100, I get a more accurate estimate, period.'

What it reveals

Treats sample size as the only factor in estimation quality. Size matters, but representativeness matters more: a biased sample of 1000 is worse than an unbiased sample of 100.

Targeted response

In the PopulationSampler widget, draw a sample of 100 using simple random or systematic; the proportion lands close to truth. Now imagine drawing 100 only from the corner of the grid. Same size, worse estimate. The sample's SOURCE matters more than its size. A biased big sample is biased; a representative small sample isn't.
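The corner-of-the-grid thought experiment can be sketched in code. Everything about the layout here is an assumption made for illustration: the stand is a 20-by-10 grid in which the five left-hand columns are heavily infested, and the "corner" sample is the left half of the grid. The real widget's layout is not known.

```python
import random

COLS, ROWS = 20, 10

def is_infested(col, row):
    # Hypothetical pattern: 60% infested in the 5 left-hand columns,
    # 20% infested everywhere else.
    return row < 6 if col < 5 else row < 2

# Each tree is stored as (column, infested?).
stand = [(col, is_infested(col, row)) for row in range(ROWS) for col in range(COLS)]

def proportion(trees):
    """Share of the given trees that are infested."""
    return sum(infested for _, infested in trees) / len(trees)

rng = random.Random(11)                        # fixed seed (an assumption)
truth = proportion(stand)                      # 60 of 200 trees = 30%
corner = [t for t in stand if t[0] < 10]       # 100 trees, left half only
random_100 = rng.sample(stand, 100)            # 100 trees from the whole stand

print(f"truth:         {truth:.0%}")
print(f"corner sample: {proportion(corner):.0%}")   # 40 of 100 = 40%
print(f"random sample: {proportion(random_100):.0%}")
```

Both samples inspect 100 trees, but the corner sample overstates the rate by ten percentage points because it over-represents the heavily infested columns. The random sample draws from everywhere, so it has no built-in lean in either direction.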

Student says

'A sample isn't reliable because it doesn't survey everyone.'

What it reveals

Treats any incomplete count as unreliable. Real sampling is reliable precisely because the math accounts for variability, and an unbiased sample's estimate converges to the truth as the sample grows.

Targeted response

The widget makes the convergence visible: sample 5 trees and the proportion bounces; sample 100 and the proportion stays within a few percentage points of truth on most draws. Sampling isn't a guess; it's an estimate with a known reliability profile. Most real-world estimates (election polls, health surveys, beetle infestation rates) are sample-based for exactly this reason.

Student says

Picks a sample from the most convenient subset: 'I asked my friends at lunch.' 'I posted on social media.'

What it reveals

Confuses 'easy to reach' with 'representative.' Convenience sampling systematically excludes whoever isn't easy to reach, and the exclusion is usually correlated with the question.

Targeted response

The first question to ask: who is the POPULATION? Then design a method that gives every member of the population an equal chance of being picked (or at least a chance). 'My friends at lunch' samples only your friend group; the population was Grade 7 students. The mismatch is the bias. Use simple random or systematic methods to draw from the whole population, not just the part nearest at hand.

Going further

The next lesson uses sample data to compute four summary statistics (mean, median, mode, and range) and asks what those numbers say about the population the sample came from. The sampling logic from this lesson is the foundation: if the sample was drawn well, the statistics estimate the population's; if the sample was biased, the statistics inherit the bias.

In Grade 9, you'll start computing measures of spread beyond range (variance, standard deviation) and meet the formal notion of a sampling distribution: the distribution of sample estimates you would get if the sample were drawn many times. The two-sample comparison strip above is a first glimpse of that idea: each new random sample produces an estimate, and the estimates cluster around the truth.