A Useful Display of a Normal Population

ROBERT W. JERNIGAN

Most courses in basic statistics are filled with discussions of the mysterious normal population. Statistical tests assume it, students find areas for it, and a theorem tells us that in the limit everything is it, but students still wonder what it is. Considering the importance and the repeated applications of normal populations, this lack of understanding is unfortunate.

Part of the problem is that most students have never seen a normal population, much less actually worked with one. To attack this problem I resolved to give my students direct exposure to at least an approximation of a normal population. Of course, any physical model must be a discrete approximation to the normal distribution. This is somewhat unfortunate, since one difficulty that students have is in understanding continuous probability distributions. But much insight and experience can still be gained by using a discrete approximation.

Several approximate normal population models are available commercially, from tags in a box to counting stripes on sunflower seeds, but I rejected these as too vague or too tedious. In the end, I constructed my own model of a normal population from 500 wooden beads, which are readily available in most hobby or teacher supply stores. The beads were grouped and stacked by colour on heavy wire poles mounted on a 3-foot-long piece of 2" x 4" timber. The beads were then numbered to conform approximately to a normal distribution with a mean of 50 and a standard deviation of 10. This selection allowed a range of numbers from 20 to 80, all positive and of magnitudes that are easily comprehended. Figure 1 shows the model population. The frequency of each number is listed in Table 1, taken from Li (1964).

Even if used as nothing more than a display, this model would help reinforce the idea of how a bell-shaped curve relates to a discrete population. In class, we derived descriptive statistics for the frequency distribution, many of them (percentiles, for example) by simply counting beads.
 
 

TABLE 1

Population frequency distribution
(each frequency applies to both bead numbers above it)

Bead numbers  20  21  22  23  24  25  26  27  28  29  30  31  32  33  34  35
              80  79  78  77  76  75  74  73  72  71  70  69  68  67  66  65
Frequency      1   0   0   1   1   1   1   1   2   2   2   3   3   4   5   6

Bead numbers  36  37  38  39  40  41  42  43  44  45  46  47  48  49  50
              64  63  62  61  60  59  58  57  56  55  54  53  52  51
Frequency      6   7   9  10  11  12  13  14  17  18  18  19  19  20  20
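The descriptive statistics found in class by counting beads can also be checked with a few lines of code. The following is a minimal sketch in Python, not part of the original classroom exercise. It uses the frequencies of Table 1 exactly as they are printed here (as printed they total 472 beads rather than 500, so the results describe the printed table), and the helper name percentile_by_counting is an illustrative assumption.

```python
import math
from statistics import fmean, pstdev

# Frequencies for bead numbers 20..50, copied from Table 1 as printed here.
# As printed they total 472 beads, not 500; nothing has been adjusted.
LOWER_HALF = [
    1, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 3, 3, 4, 5, 6,        # beads 20..35
    6, 7, 9, 10, 11, 12, 13, 14, 17, 18, 18, 19, 19, 20,   # beads 36..49
    20,                                                    # bead 50
]

# The table is symmetric: bead 100 - n shares the frequency of bead n.
freq = {20 + i: f for i, f in enumerate(LOWER_HALF)}
freq.update({100 - n: f for n, f in freq.items() if n < 50})

# One list entry per bead, in increasing order, like beads stacked on the poles.
beads = sorted(n for n, f in freq.items() for _ in range(f))


def percentile_by_counting(sorted_beads, p):
    """Smallest bead number with at least p percent of the beads at or below it."""
    k = max(1, math.ceil(len(sorted_beads) * p / 100))
    return sorted_beads[k - 1]


print("number of beads:", len(beads))
print("mean:", fmean(beads))
print("standard deviation:", round(pstdev(beads), 2))
for p in (25, 50, 75):
    print(f"{p}th percentile:", percentile_by_counting(beads, p))
```

Counting up the sorted list is the same operation as counting beads along the poles, which is why the percentile helper needs no interpolation.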
 

 

To demonstrate sampling and statistical inference we conducted sampling experiments. For example, the beads were removed and placed in a bowl, and each student was instructed to draw, with replacement, 10 samples of 5 beads each. The students calculated the mean of each of their 10 samples, and these values were arranged in the frequency distribution shown in Table 2.

The sample means let us demonstrate several of the well-known theorems of statistics. We noticed that their frequency distribution followed the characteristic bell-shaped curve, suggesting that the sample means were themselves a normal population. Next, computing the mean and standard deviation of the sample means, we obtained the values 50.24 and 4.41, very close to the theoretical values of 50 and 10/√5 = 4.472, respectively.
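The comparison with the theoretical standard deviation of the sample mean, σ/√n, can be imitated on a computer. The sketch below is only an illustration under stated assumptions: instead of the bead population it draws from a Normal(50, 10) distribution rounded to whole numbers, the seed is arbitrary, and the function names draw_bead and sample_mean are invented for the example.

```python
import math
import random
from statistics import fmean, stdev

random.seed(1)  # arbitrary seed, only so that reruns give the same numbers


def draw_bead():
    # Assumed stand-in for one draw with replacement from the bowl:
    # a Normal(50, 10) value rounded to the nearest whole bead number.
    return round(random.gauss(50, 10))


def sample_mean(size=5):
    return fmean([draw_bead() for _ in range(size)])


# 30 students x 10 samples each, as in the class experiment
means = [sample_mean() for _ in range(30 * 10)]

theoretical_sd = 10 / math.sqrt(5)  # sigma / sqrt(n)
print("mean of sample means:", round(fmean(means), 2))
print("sd of sample means:  ", round(stdev(means), 2))
print("sigma / sqrt(n):     ", round(theoretical_sd, 3))
```

As in the classroom, the simulated values will not match 50 and 4.472 exactly, since they rest on only 300 samples.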

The model proved most useful in illustrating hypothesis testing. Here I had the opportunity to demonstrate several concepts that students of elementary statistics find difficult, namely Type I and Type II errors and confidence intervals.

Each student was asked to use their samples to test the hypothesis that the true mean of the population was 50, assuming that the standard deviation was known to be 10. This test was made against the alternative that the mean was not 50, at the 5% level of significance. Each of 30 students performed this test on their first 5 sample means, giving 150 tests in all.
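The test each student carried out is the usual two-sided z test for a mean with known standard deviation. A minimal sketch follows. The function name z_test_mean and the five bead numbers in the usage lines are hypothetical and are not data from the class.

```python
import math
from statistics import NormalDist


def z_test_mean(sample, mu0=50.0, sigma=10.0, alpha=0.05):
    """Two-sided z test of H0: mean == mu0 with known sigma.

    Returns the z statistic and whether H0 is rejected at level alpha.
    """
    n = len(sample)
    z = (sum(sample) / n - mu0) / (sigma / math.sqrt(n))
    critical = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 when alpha = 0.05
    return z, abs(z) > critical


# Hypothetical sample of 5 bead numbers (not taken from the article)
z, reject = z_test_mean([47, 58, 52, 39, 61])
print(f"z = {z:.3f}, reject H0: {reject}")
```

With samples of 5 beads the standard error is 10/√5, so a sample mean must lie nearly 9 units from 50 before the hypothesis is rejected.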
 
 

TABLE 2

Frequency distribution for the sample means

Class interval  Frequency
39-40                   3
41-42                   4
43-44                  16
45-46                  27
47-48                  39
49-50                  49
51-52                  36
53-54                  40
55-56                  29
57-58                  11
59-60                   3
61-62                   1
63-64                   2

 

Even though the mean of the population was truly 50, and the students knew that they should not reject this true hypothesis, 10 out of 150, or 6.6%, of the sample means indicated that the assumption was false. Rejecting a true hypothesis is a Type I error, and 0.066 is the estimated probability of committing this error. The discrepancy between the stated significance level and the observed rate is explained by the fact that we are dealing with a finite number of samples from a discrete approximation to a normal distribution.
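How far an observed Type I error rate can stray from the stated 5% when only 150 tests are run can be explored by simulation. The sketch below makes several assumptions: draws come from a Normal(50, 10) distribution rounded to whole numbers rather than from the beads, the seed is arbitrary, and the names rejects_true_mean and rejection_rate are invented for the example.

```python
import math
import random
from statistics import NormalDist, fmean

random.seed(2)  # arbitrary seed for repeatability

CRITICAL = NormalDist().inv_cdf(0.975)  # two-sided test at the 5% level
SE = 10 / math.sqrt(5)                  # known sigma, samples of 5 beads


def rejects_true_mean():
    # Assumed stand-in for 5 draws with replacement from the bead population
    sample = [round(random.gauss(50, 10)) for _ in range(5)]
    return abs((fmean(sample) - 50) / SE) > CRITICAL


def rejection_rate(tests):
    return sum(rejects_true_mean() for _ in range(tests)) / tests


class_rate = rejection_rate(150)        # one class: 30 students x 5 tests
long_run_rate = rejection_rate(20000)   # far more tests than a class can run
binomial_se = math.sqrt(0.05 * 0.95 / 150)  # spread of a rate based on 150 tests

print("rate in one simulated class:", round(class_rate, 3))
print("rate over 20000 tests:      ", round(long_run_rate, 3))
print("standard error for 150 tests:", round(binomial_se, 3))
```

The binomial standard error gives a sense of scale: a rate based on 150 tests can easily differ from 0.05 by a percentage point or two through chance alone.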

The students were then asked to perform the same test on the five of their samples that had the largest means, thus adding a selection bias. The rejection percentage was now up to 10% = 15/150. This, of course, illustrated the danger of letting a group of data suggest a statistical test.

Type II errors were also demonstrated by testing the hypothesis that the population mean was 65, which is obviously false. On this test 7.3% = 11/150 of the samples failed to reject this hypothesis.
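The chance of a Type II error in this test can also be computed directly and set beside the observed 7.3%. The sketch below assumes that the test is the same two-sided 5% test with known standard deviation 10 and samples of 5, and that the sample mean is treated as exactly normal. Both are assumptions of the example rather than statements from the class records.

```python
import math
from statistics import NormalDist

mu_true, mu_null, sigma, n = 50, 65, 10, 5
se = sigma / math.sqrt(n)                # standard error of the sample mean
z_crit = NormalDist().inv_cdf(0.975)     # two-sided test at the 5% level

# H0: mean = 65 is not rejected when the sample mean falls in this range
lower = mu_null - z_crit * se
upper = mu_null + z_crit * se

# Sampling distribution of the mean when the true population mean is 50
sampling_dist = NormalDist(mu_true, se)
beta = sampling_dist.cdf(upper) - sampling_dist.cdf(lower)

print(f"H0 is retained for sample means from {lower:.2f} to {upper:.2f}")
print(f"probability of a Type II error: {beta:.3f}")
```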

Several 95% confidence intervals were also formed for the population mean from 5 different samples. Those calculations showed that 96.6% = 145/150 of these intervals contained the true population mean of 50.
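The coverage of such intervals can be examined in the same way. The sketch below assumes that the intervals take the form sample mean ± 1.96 × 10/√5, that is, with the standard deviation treated as known (the article does not say how the intervals were computed). It again draws from a rounded Normal(50, 10) distribution in place of the beads, with an arbitrary seed and illustrative function names.

```python
import math
import random
from statistics import NormalDist, fmean

random.seed(3)  # arbitrary seed for repeatability

# Half-width of a 95% interval with known sigma = 10 and samples of 5 (assumed form)
HALF_WIDTH = NormalDist().inv_cdf(0.975) * 10 / math.sqrt(5)


def interval_from_sample():
    # Assumed stand-in for 5 draws with replacement from the bead population
    sample = [round(random.gauss(50, 10)) for _ in range(5)]
    centre = fmean(sample)
    return centre - HALF_WIDTH, centre + HALF_WIDTH


def coverage(intervals, true_mean=50):
    hits = sum(low <= true_mean <= high for low, high in intervals)
    return hits / len(intervals)


class_cov = coverage([interval_from_sample() for _ in range(150)])
long_run_cov = coverage([interval_from_sample() for _ in range(20000)])

print("coverage in one simulated class:", round(class_cov, 3))
print("coverage over 20000 intervals:  ", round(long_run_cov, 3))
```

An interval misses 50 exactly when the corresponding two-sided test rejects the hypothesis that the mean is 50, so coverage and the Type I error rate are two views of the same experiment.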

The possibilities for sampling or probability experiments, applications, and demonstrations with this model are unlimited. With the aid of computers, similar and even more extensive sampling experiments could be carried out without the model population. There, one could demonstrate other concepts such as the central limit theorem, since one would not be limited to a single type of population from which to draw samples. But this population model has the advantage of providing a visual, in-class demonstration. It also provides the students with a "hands-on" approach to basic statistics. They have the opportunity to see statistics at work and gain some needed first-hand experience.

The American University, Washington D.C.
 
 
 
 

Reference

Li, Jerome C. R. (1964). Statistical Inference, Vol. 1. Edwards Brothers, Ann Arbor, Michigan.

