9. USING COMPUTER SOFTWARE TO HELP TEACH STATISTICS
Derek Robinson, University of Sussex
9.1 Introduction
Statistics is a subject with great vitality! For a number of years Statistics has been developing a greater role in school and college curricula. There are various reasons for this. As indicated most forcibly by several talks in this book, Statistics is becoming more important in a wide range of businesses, so that for students with a mathematical bent, Statistics is increasingly seen as the natural outlet for their talents in employment. In schools and colleges, the subject lends itself naturally to the type of practical work which has been encouraged in recent years. And at the "frontiers of knowledge", Statistics is seeing many exciting developments, often associated with advances in computing power.
This is a great background, but it does not necessarily help in the day-to-day business of getting through a syllabus. It has to be admitted that learning the use of Normal tables or what a statistical test is all about are not in themselves necessarily the most riveting of activities.
9.2 Using Computers in Teaching
Some years ago Adrian Bowman, of Glasgow University, and I became convinced that computers can play a role in making traditional classroom teaching more exciting. We were not thinking so much of the data analysis use of computers, provided by packages such as Minitab or Instat, though of course these can be extremely useful in taking the tedium out of calculations and allowing more realistic problems to be tackled. Rather, our aim was to develop software to help explain statistical ideas and methods. We have published three textbook/software packages: Introduction to Probability, Introduction to Statistics and Introduction to Regression and Analysis of Variance. In this article I will describe ways in which the software is used to motivate and explain ideas through the use of graphics, simulations, animation and a tutorial mode of presentation. Of course, in describing these approaches on paper much of the attraction of the method is lost, e.g. the colour, the animation and the scope for interaction. The reader is urged to try out the software to fill in these gaps!
9.3 Using Graphics to Summarise Data
This is often described as the "electronic blackboard" use of the computer, i.e. it is doing what teachers may already do on the board. The computer has the advantage that it produces diagrams more quickly and accurately, which leaves more time for students to try out different variations.
We start with a simple example relating to histograms, using program HIST in Introduction to Statistics. The program can run with your own data, but for this example we use the default data set, the concentration of a mineral in some rock samples.
The program draws a histogram, Figure 1a, which we see is skewed to the right. If the Normal distribution is being studied, students will, we hope, realise that the data do not look as if they can be modelled by this distribution. They may also know that this is a pity, since there is some useful theory about that distribution. Rather than lose hope, it seems sensible to see if a transformation of the data might produce something approximately Normal; for instance, taking logarithms is often helpful with data skewed to the right. We therefore choose the "Transform" option of the program and type "LOG(X)". The histogram of the transformed data is shown in Figure 1b. Certainly it appears to be more symmetrical, though there is a hint that the data are now skewed to the left.
Figure 1a: histogram of mineral concentrations
Figure 1b: histogram of transformed data
We could go on to try other transformations, such as taking square roots, until the transformed data do appear to have an approximately Normal distribution. Being able to see the changing shape of the histograms as different transformations are chosen greatly reinforces abstract discussion of the value of transformations. (Program HIST also demonstrates various other methods, such as the construction of ogives.)
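The effect of trying different transformations can also be explored numerically. The following is a minimal Python sketch, not the HIST program itself: it assumes simulated lognormal data as a stand-in for the rock-sample concentrations (which are not reproduced here) and uses the moment coefficient of skewness as a rough summary of histogram shape. All function names are illustrative.

```python
import math
import random
import statistics


def skewness(values):
    """Moment coefficient of skewness: near 0 for symmetric data, > 0 for a long right tail."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


def make_skewed_sample(n=2000, seed=1):
    """Simulated right-skewed data (lognormal); an assumption, not the HIST default data set."""
    rng = random.Random(seed)
    return [rng.lognormvariate(0.0, 0.8) for _ in range(n)]


def compare_transformations(data):
    """Skewness of the raw data and of two candidate transformations."""
    return {
        "raw": skewness(data),
        "sqrt": skewness([math.sqrt(x) for x in data]),
        "log": skewness([math.log(x) for x in data]),
    }


if __name__ == "__main__":
    for name, value in compare_transformations(make_skewed_sample()).items():
        print(f"{name:>4}: skewness = {value:+.2f}")
```

For data of this kind the square root reduces the skewness and the logarithm reduces it further, which mirrors what the changing histograms show on screen.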
9.4 Using Graphics to Teach Hypothesis Testing and Types of Error
In order to show that graphics can help demonstrate rather
more complex ideas, let us look at an aspect of teaching
hypothesis testing. We shall consider the case of Normal data
where the variance is known. Let us suppose that students have
been introduced to the ideas of null and alternative hypotheses,
type I errors and critical regions and we want now to start to
get them to think about type II error, power and sample size.
Figure 2a: hypothesis testing; critical regions
Figure 2b: shaded area is the power at μ = 41
In Introduction to Statistics, we start by considering simulations, but here we shall move on to the more formal stage. Figure 2a, taken from the output of program POWER, shows a test of H₀: μ = 40 against H₁: μ ≠ 40 for Normal data, with variance equal to 9, based on a sample of size 10. A 10% significance level has been chosen. Figure 2a shows the probability density function (pdf) of the sample mean under H₀ and the critical region of the test. We select the option to evaluate the power at μ = 41, say, and the computer produces the display shown in Figure 2b. The curve on the right is the pdf of the sample mean when μ = 41, and the shading shows the probability of rejecting H₀. The power is evaluated as 0.281.
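The power quoted above can be checked with a short calculation. The sketch below is not the POWER program; it assumes a two-sided test based on the sample mean with known variance, and the function name and defaults are illustrative.

```python
from math import sqrt
from statistics import NormalDist


def power_two_sided(mu, mu0=40.0, variance=9.0, n=10, alpha=0.10):
    """Probability that a two-sided test of H0: mu = mu0 rejects when the true mean is mu."""
    se = sqrt(variance / n)                     # standard error of the sample mean
    z = NormalDist().inv_cdf(1 - alpha / 2)     # critical value for the chosen level
    lower, upper = mu0 - z * se, mu0 + z * se   # reject if the sample mean falls outside (lower, upper)
    sampling = NormalDist(mu, se)               # distribution of the sample mean when the mean is mu
    return sampling.cdf(lower) + (1 - sampling.cdf(upper))


print(round(power_two_sided(41.0), 3))  # 0.281
print(round(power_two_sided(40.0), 3))  # 0.1, the significance level
```

At μ = 40 the "power" is simply the significance level, which is a useful check on the calculation.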
Several uses of this option with various values of μ soon give the idea of the meaning of the power of a test. The next step is to investigate the power for different values of μ. One option allows the powers to be plotted as they are evaluated and once this has been explored the student can move on to plot the complete power function over the range of interest, Figure 2c. The strengths and weaknesses of the test for detecting various changes in the value of μ from the null hypothesis value can now be discussed. Also, we can explore the effect of changing, say, the sample size. In Figure 2d, the second power curve shows the effect of increasing the sample size to 20. The student can now go on to consider the use of power in determining sample size. All these quite sophisticated ideas have been developed without introducing mathematical notation.
Figure 2c: power function for sample size 10
Figure 2d: power function for sample size 20 added
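The comparison of power functions for the two sample sizes, and the use of power to choose a sample size, can be sketched as follows. This is an illustration under the same assumptions as the example in the text (two-sided test, null value 40, variance 9, 10% level); the grid of means, the target power of 0.8 and the function names are choices made here, not taken from the program.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def power(mu, n, mu0=40.0, variance=9.0, alpha=0.10):
    """Power of the two-sided test at true mean mu, for a sample of size n."""
    z = STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = (mu - mu0) / sqrt(variance / n)  # true mean in standard-error units from mu0
    return STD_NORMAL.cdf(-z - shift) + (1 - STD_NORMAL.cdf(z - shift))


def power_table(means, sizes):
    """One row per mean: the power for each sample size in turn."""
    return [(mu, [power(mu, n) for n in sizes]) for mu in means]


def smallest_sample_size(mu, target, max_n=10_000):
    """Smallest n whose power at mu reaches the target, or None if max_n is not enough."""
    for n in range(1, max_n + 1):
        if power(mu, n) >= target:
            return n
    return None


if __name__ == "__main__":
    print("  mu   n=10   n=20")
    for mu, (p10, p20) in power_table([38, 39, 40, 41, 42], [10, 20]):
        print(f"{mu:4d}  {p10:.3f}  {p20:.3f}")
    print("n needed for power 0.8 at mu = 41:", smallest_sample_size(41.0, 0.80))
```

The table makes the same point as the two curves: away from the null value, doubling the sample size raises the power at every μ, while at μ = 40 both columns equal the significance level.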
9.5 Simulations
Monte Carlo simulation is a natural tool to use in probability and statistics. It should be appreciated, though, that to be interesting to students a degree of user control is needed to allow exploration and it is particularly useful if unexpected behaviour occurs under some conditions.
The Central Limit Theorem is a key result in Statistics, but a
proof is beyond the scope of first courses. Perhaps because of
this, students often misunderstand what is going on, the most
common mistake being to think that all large samples must have a
Normal distribution. The student needs to appreciate not only the
true meaning of the result but also when it is reasonable to use
the Normal approximation. Simulation provides a way of
illustrating the Theorem and showing what sample sizes are needed
to make the asymptotic result relevant. Program CLT, from Introduction
to Probability, allows a number of predetermined
distributions to be simulated or for the student to specify a
distribution by giving its pdf. Having chosen a distribution, the
student enters the sample size and number of simulations and the
computer then carries out the simulations, at the same time
constructing a histogram. In Figure 3a, the Uniform distribution
has been chosen, with sample size 1. We see that, subject to some
random variation, the histogram has similar shape to the Uniform
pdf. Now suppose that the student chooses sample size 2. The
resulting histogram, of the sample means of pairs of simulated
values, is shown in Figure 3b and we see that the shape is
already reminiscent of a Normal distribution, which is
superimposed for comparison. (The true pdf is actually a triangle.)
The student can now explore the shapes for other sample sizes and
draw conclusions about how large the sample size needs to be for
the Normal approximation to be reasonable.
Figure 3a: simulating a Uniform distribution
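The simulation described above is easy to imitate. The following minimal sketch is not the CLT program: it simulates sample means from the Uniform distribution on (0, 1), prints a crude text histogram, and compares the variance of the simulated means with the theoretical value 1/(12n). The function names, the number of bins and the seed are illustrative choices.

```python
import random
import statistics


def simulate_sample_means(sample_size, n_simulations, seed=0):
    """Means of n_simulations samples, each of sample_size Uniform(0, 1) values."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.random() for _ in range(sample_size)])
        for _ in range(n_simulations)
    ]


def text_histogram(values, bins=10, width=40):
    """Text histogram of values lying in [0, 1], one row per bin."""
    counts = [0] * bins
    for v in values:
        counts[min(int(v * bins), bins - 1)] += 1
    scale = width / max(counts)
    return "\n".join(
        f"{i / bins:.1f} {'#' * round(c * scale)}" for i, c in enumerate(counts)
    )


if __name__ == "__main__":
    for n in (1, 2, 10):
        means = simulate_sample_means(n, 5000)
        print(f"sample size {n}: variance {statistics.pvariance(means):.4f}, "
              f"theory {1 / (12 * n):.4f}")
        print(text_histogram(means))
```

With sample size 1 the rows are roughly equal in length; with sample size 2 the triangular shape mentioned above is already visible.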
Simulations can be rather unexciting if things always go right! So the student might go on to the outlier distribution, whose pdf is at the top of Figure 3c. The histogram obtained when the sample size is 10 is shown at the bottom of Figure 3c and, clearly, this is nowhere near a Normal. The student can be asked to think why the shape is like this and can again go on to explore if and when the Normal approximation does apply.
Figure 3b: simulating a Uniform distribution; histogram of sample means, size 2
At the very least, these demonstrations should serve to convince students of the plausibility of the Central Limit Theorem and may motivate some to explore it more deeply.
One spin-off from this type of simulation exercise is to make students aware of the variability of the shapes of histograms for small samples even when the underlying distribution is fixed. Students should not expect, for instance, that 50 observations from a Normal distribution should necessarily have a histogram with a perfect bell shape.
Figure 3c: simulating an outlier distribution; histogram of sample means, sample size 10
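The point about the variability of histogram shapes for small samples can be illustrated with a few lines of code. This sketch (an illustration, not one of the published programs) draws several samples of 50 observations from a standard Normal distribution and tabulates each into the same six bins; the cut points and the seed are arbitrary choices.

```python
import bisect
import random

CUTS = [-2.0, -1.0, 0.0, 1.0, 2.0]  # six bins; the two outer bins are open-ended


def histogram_counts(values, cuts=CUTS):
    """Number of values falling in each bin defined by the cut points."""
    counts = [0] * (len(cuts) + 1)
    for v in values:
        counts[bisect.bisect_right(cuts, v)] += 1
    return counts


def repeated_histograms(n_samples=5, sample_size=50, seed=0):
    """Bin counts for several independent samples from the standard Normal distribution."""
    rng = random.Random(seed)
    return [
        histogram_counts([rng.gauss(0.0, 1.0) for _ in range(sample_size)])
        for _ in range(n_samples)
    ]


if __name__ == "__main__":
    for counts in repeated_histograms():
        print(counts)
```

Each row comes from the same distribution, yet the counts typically differ noticeably from row to row and are seldom perfectly symmetric.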
9.6 Use of Animation
Animation is used in a number of programs in the Computer Illustrated Text series, for example in demonstrating the construction of histograms and stem-and-leaf plots and in rearranging data when carrying out nonparametric tests. The following example, making use of program TRNSPDF from Introduction to Probability, will, I hope, give some of the flavour of the value of this type of approach. The program concerns the transformation of distributions and illustrates how, for instance, the shape of the pdf of a random variable X can differ greatly from the pdf of X². The basic result relating the pdfs of X and Y = h(X) is:

\[ f_Y(y) = f_X(x)\,\left|\frac{dx}{dy}\right|, \qquad x = h^{-1}(y). \]
Figure 4a: lower graph is pdf, upper is transformation curve
The aim of this program is to make this result more intuitive. We shall suppose that the student has chosen X to have a quadratic pdf and he or she wants to find the pdf of πX². In Figure 4a, the pdf is shown at the bottom and the transformation is shown at the top. The first step is to choose an interval and see what happens to the corresponding probability when the transformation is made, Figure 4b. We see that the area under the pdf corresponding to the probability has "risen" to meet the transformation curve. Call the width of the slice δx and the width of the transformed slice δy. Since it must have the same area as before and its width has changed from δx to δy, the length of the slice must have changed by a factor δx/δy, hence the term |dx/dy| in the above formula. The probability slice now moves to the left until its base rests on the vertical y axis, Figure 4d. This is repeated for other slices until the approximate pdf of Y is fully constructed, Figure 4e. The resulting pdf is seen to be a different shape from the pdf of X. The width of the slices can now be reduced in order to produce a better approximation to the pdf of Y, if desired. The animation has motivated the mathematics.
Figure 4b: a probability "slice" chosen
Figure 4c: the probability slice is transformed
Figure 4d: the slice is plotted on the rotated graph
Figure 4e: graph of all probability slices
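The slice construction can be imitated numerically. The sketch below is not the TRNSPDF program. Since the text does not say which quadratic pdf was chosen, it assumes f_X(x) = 3x² on [0, 1], with the transformation Y = πX². Each slice keeps its probability, so its height on the y axis is that probability divided by the new width, and the result is compared with the change-of-variable formula. All names are illustrative.

```python
import math


def pdf_x(x):
    """Assumed quadratic pdf on [0, 1]: f_X(x) = 3 x^2 (an illustrative choice)."""
    return 3.0 * x * x if 0.0 <= x <= 1.0 else 0.0


def cdf_x(x):
    """Distribution function matching pdf_x: F_X(x) = x^3 on [0, 1]."""
    return min(max(x, 0.0), 1.0) ** 3


def h(x):
    """Transformation y = pi x^2, increasing on [0, 1]."""
    return math.pi * x * x


def pdf_y_exact(y):
    """f_Y(y) = f_X(x) |dx/dy| with x = sqrt(y / pi), so dx/dy = 1 / (2 sqrt(pi y))."""
    if not 0.0 < y <= math.pi:
        return 0.0
    x = math.sqrt(y / math.pi)
    return pdf_x(x) / (2.0 * math.sqrt(math.pi * y))


def pdf_y_by_slices(n_slices):
    """(y_left, y_right, height) for each slice; height = slice probability / new width."""
    slices = []
    for i in range(n_slices):
        x0, x1 = i / n_slices, (i + 1) / n_slices
        probability = cdf_x(x1) - cdf_x(x0)  # area of the slice, unchanged by the transformation
        y0, y1 = h(x0), h(x1)
        slices.append((y0, y1, probability / (y1 - y0)))
    return slices


if __name__ == "__main__":
    for y0, y1, height in pdf_y_by_slices(5):
        mid = (y0 + y1) / 2
        print(f"y in [{y0:.3f}, {y1:.3f}]: slice height {height:.3f}, formula {pdf_y_exact(mid):.3f}")
```

Increasing the number of slices brings the slice heights closer to the formula, which corresponds to reducing the slice width in the animation.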
9.7 Learning Routine Tasks
This is perhaps a rather unspectacular use of the computer but can be very useful for helping students learn routine tasks. We shall illustrate this approach using program CHISQ from Introduction to Statistics. The program goes through the steps of carrying out a χ² contingency table test. The (artificial) data shown at the top of Figure 5a relate to the preferences of different age groups for different coloured tablets (pink, orange or white).
We want to see whether there is significant evidence of a
difference between the age groups in respect of their colour
preferences. The program goes through the method of carrying out
the test, progressing in small steps; we shall illustrate just
two of them. In the first, Figure 5a, the calculations required
to evaluate the expected frequencies are shown in detail. These
are repeated for each entry in the table, so that the student can
see the formula given in the text in action.
Figure 5a: evaluating expected frequencies
Towards the end of the analysis, we arrive at Figure 5b. The
test statistic has been evaluated and found to be significant and
interest now concerns the nature of the difference between the
age groups. The term '5.38' in the calculation of the test
statistic is underlined, since it is the largest term in the sum,
and the corresponding observed and expected frequencies are
highlighted. We see that the observed frequency is 26, while the
expected frequency is 16.6, indicating an exceptionally high
preference for the colour pink among the youngest age group.
Figure 5b: investigating the nature of the association between the variables
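The two steps illustrated above, evaluating expected frequencies and finding the largest term in the test statistic, can be sketched in a few lines. The table of counts used here is invented for illustration and is not the data of Figure 5a; the function names are also illustrative. Each expected frequency is (row total × column total) / grand total, and each term of the statistic is (observed − expected)² / expected.

```python
def expected_frequencies(observed):
    """Expected count for each cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def contributions(observed, expected):
    """Term (O - E)^2 / E for each cell of the table."""
    return [
        [(o - e) ** 2 / e for o, e in zip(obs_row, exp_row)]
        for obs_row, exp_row in zip(observed, expected)
    ]


def largest_contribution(terms):
    """(value, row index, column index) of the largest term in the sum."""
    return max((value, i, j) for i, row in enumerate(terms) for j, value in enumerate(row))


# Invented counts: rows are three age groups, columns are pink, orange, white.
observed = [
    [30, 10, 10],
    [15, 20, 15],
    [10, 20, 20],
]

expected = expected_frequencies(observed)
terms = contributions(observed, expected)
statistic = sum(sum(row) for row in terms)
degrees_of_freedom = (len(observed) - 1) * (len(observed[0]) - 1)

if __name__ == "__main__":
    print(f"test statistic = {statistic:.2f} on {degrees_of_freedom} degrees of freedom")
    value, i, j = largest_contribution(terms)
    print(f"largest term {value:.2f}: observed {observed[i][j]}, expected {expected[i][j]:.1f}")
```

For this invented table the statistic exceeds 9.49, the upper 5% point of the χ² distribution with 4 degrees of freedom, and the largest term again points to the cell that contributes most to the association.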
9.8 Conclusions
The software is used in various ways on different courses and at different institutions. The simplest to organise is in classroom demonstrations integrated into a standard teaching session, with the teacher or lecturer operating the software. The programs are easy and flexible to operate, so it is often possible to use them to answer students' 'what if?' questions.
The software can also be used in a more student-centred way, if enough computers are available. For example, students can proceed through specified sections of the appropriate Computer Illustrated Text, using the programs as suggested in the book. Many teachers may prefer instead to prepare their own worksheets, using the software to amplify their own explanations of topics or to help tackle exercises. However the software is used, it is there to help the teacher, not to replace him or her.
We have seen that Statistics is a vital subject, rightly seen by students as relevant to their lives and a source of interesting practical material. I believe that even the theory can be motivated in an exciting way by using graphics.
9.9 References
Robinson D R and Bowman A W (1986) Introduction to Probability, Adam Hilger, Bristol.
Bowman A W and Robinson D R (1987) Introduction to Statistics, Adam Hilger, Bristol.
Bowman A W and Robinson D R (1990) Introduction to Regression and Analysis of Variance, Adam Hilger, Bristol.
[Software is available in PC and BBC B versions; Archimedes versions are also available by special arrangement with the publisher.]