9. USING COMPUTER SOFTWARE TO HELP TEACH STATISTICS

Derek Robinson, University of Sussex

9.1 Introduction

Statistics is a subject with great vitality! For a number of years Statistics has been developing a greater role in school and college curricula. There are various reasons for this. As indicated most forcibly by several talks in this book, Statistics is becoming more important in a wide range of businesses, so that for students with a mathematical bent, Statistics is increasingly seen as the natural outlet for their talents in employment. In schools and colleges, the subject lends itself naturally to the type of practical work which has been encouraged in recent years. And at the "frontiers of knowledge", Statistics is seeing many exciting developments, often associated with advances in computing power.

This is a great background, but it does not necessarily help in the day-to-day business of getting through a syllabus. It has to be admitted that learning the use of Normal tables or what a statistical test is all about are not in themselves necessarily the most riveting of activities.

9.2 Using Computers in Teaching

Some years ago Adrian Bowman, of Glasgow University, and I became convinced that computers can play a role in making traditional classroom teaching more exciting. We were not thinking so much of the data analysis use of computers, provided by packages such as Minitab or Instat, though of course these can be extremely useful in taking the tedium out of calculations and allowing more realistic problems to be tackled. Rather, our aim was to develop software to help explain statistical ideas and methods. We have published three textbook/software packages: Introduction to Probability, Introduction to Statistics, and Introduction to Regression and Analysis of Variance. In this article I will describe ways in which the software is used to motivate and explain ideas through the use of graphics, simulations, animation and a tutorial mode of presentation. Of course, in describing these approaches on paper much of the attraction of the method is lost, e.g. the colour, the animation and the scope for interaction. The reader is urged to try out the software to fill in these gaps!

9.3 Using Graphics to Summarise Data

This is often described as the "electronic blackboard" use of the computer, i.e. it is doing what teachers may already do on the board. The computer has the advantage that it can produce diagrams more quickly and accurately, which leaves more time for students to try out different variations.

We start with a simple example relating to histograms, using program HIST in Introduction to Statistics. The program can run with your own data, but for this example we use the default data set, the concentration of a mineral in some rock samples.

The program draws a histogram, Figure 1a, which we see is skewed to the right. If the normal distribution is being studied, students will, we hope, realise that the data do not look as if they can be modelled by this distribution. They may also know that this is a pity, since there is some useful theory about that distribution. Rather than lose hope, it seems sensible to see if a transformation of the data might produce something approximately normal; for instance, taking logarithms is often helpful with data skewed to the right. We therefore choose the "Transform" option of the program and type "LOG(X)". The histogram of the transformed data is shown in Figure 1b. Certainly it appears to be more symmetrical, though there is a hint that the data are now skewed to the left.
 

  Figure 1a: histogram of mineral concentrations

 
 

  Figure 1b: histogram of transformed data

 

We could go on to try other transformations, such as taking square roots, until the transformed data do appear to have an approximately Normal distribution. Being able to see the changing shape of the histograms as different transformations are chosen greatly reinforces abstract discussion of the value of transformations. (Program HIST also demonstrates various other methods, such as the construction of ogives.)
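The effect of such transformations can also be checked numerically. The following is only a sketch and is not program HIST: the rock-sample data are not reproduced here, so it assumes made-up right-skewed data (simulated lognormal values), and the helper names `skewness` and `text_histogram` are hypothetical. It compares the sample skewness of the raw, square-root and log values and prints a crude text histogram of each.

```python
import math
import random

def skewness(values):
    """Sample skewness: third central moment divided by the cubed standard deviation."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

def text_histogram(values, bins=8, width=40):
    """Return the lines of a crude text histogram with equal-width classes."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        index = min(int((v - lo) / step), bins - 1)  # the maximum goes in the last class
        counts[index] += 1
    scale = width / max(counts)
    return [f"{lo + i * step:8.2f} | " + "*" * round(c * scale)
            for i, c in enumerate(counts)]

# Assumption: 200 simulated lognormal values stand in for the mineral concentrations.
rng = random.Random(1)
raw = [rng.lognormvariate(0.0, 0.8) for _ in range(200)]

transforms = {
    "X": raw,
    "SQRT(X)": [math.sqrt(x) for x in raw],
    "LOG(X)": [math.log(x) for x in raw],
}
for name, data in transforms.items():
    print(f"{name}: skewness = {skewness(data):.2f}")
    print("\n".join(text_histogram(data)))
```

A skewness near zero suggests a roughly symmetrical histogram; a clearly positive value corresponds to the right skew seen in Figure 1a, and a negative value to a left skew of the kind hinted at in Figure 1b.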

9.4 Using Graphics to Teach Hypothesis Testing and Types of Error

In order to show that graphics can help demonstrate rather more complex ideas, let us look at an aspect of teaching hypothesis testing. We shall consider the case of Normal data where the variance is known. Let us suppose that students have been introduced to the ideas of null and alternative hypotheses, type I errors and critical regions and we want now to start to get them to think about type II error, power and sample size.
 

  Figure 2a: hypothesis testing; critical regions  
  Figure 2b: shaded area is the power at μ = 41

 

In Introduction to Statistics, we start by considering simulations, but here we shall move on to the more formal stage. Figure 2a, taken from the output of program POWER, shows a test of H0: μ = 40 against H1: μ ≠ 40 for Normal data, with variance equal to 9, based on a sample of size 10. A 10% significance level has been chosen. Figure 2a shows the probability density function (pdf) of the sample mean under H0 and the critical region of the test. We select the option to evaluate the power at μ = 41, say, and the computer produces the display shown in Figure 2b. The curve on the right is the pdf of the sample mean when μ = 41, and the shading shows the probability of rejecting H0. The power is evaluated as 0.281.
Several uses of this option with various values of μ soon give the idea of the meaning of the power of a test. The next step is to investigate the power for different values of μ. One option allows the powers to be plotted as they are evaluated and once this has been explored the student can move on to plot the complete power function over the range of interest, Figure 2c. The strengths and weaknesses of the test for detecting various changes in the value of μ from the null hypothesis value can now be discussed. Also, we can explore the effect of changing, say, the sample size. In Figure 2d, the second power curve shows the effect of increasing the sample size to 20. The student can now go on to consider the use of power in determining sample size. All these quite sophisticated ideas have been developed without introducing mathematical notation.
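For readers who want to check the figures, the power calculation can be sketched in a few lines. This is not program POWER: the function name `power_two_sided` and its parameters are hypothetical, and a two-sided alternative is assumed. Under that assumption the sketch gives a power of about 0.281 at a mean of 41 for sample size 10, in agreement with the value quoted above.

```python
from math import sqrt
from statistics import NormalDist

def power_two_sided(mu, mu0=40.0, variance=9.0, n=10, alpha=0.10):
    """Power of the two-sided test of H0: mean = mu0, Normal data, known variance."""
    se = sqrt(variance / n)                     # standard error of the sample mean
    z = NormalDist().inv_cdf(1 - alpha / 2)     # critical value (about 1.645 for 10%)
    lower, upper = mu0 - z * se, mu0 + z * se   # H0 is rejected outside (lower, upper)
    sample_mean = NormalDist(mu, se)            # pdf of the sample mean when the true mean is mu
    return sample_mean.cdf(lower) + (1 - sample_mean.cdf(upper))

print(f"power at mean 41, n = 10: {power_two_sided(41):.3f}")

# Power functions for the two sample sizes compared in Figures 2c and 2d.
for mu in (37, 38, 39, 40, 41, 42, 43):
    print(mu, round(power_two_sided(mu, n=10), 3), round(power_two_sided(mu, n=20), 3))
```

At the null value of 40 the power equals the significance level, and for every other value of the mean the curve for sample size 20 lies above the curve for sample size 10.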
 

  Figure 2c:  power function for sample size 10  

 
 

  Figure 2d: power function for sample size 20 added

 
 
9.5 Simulations

Monte Carlo simulation is a natural tool to use in probability and statistics. It should be appreciated, though, that to be interesting to students a degree of user control is needed to allow exploration and it is particularly useful if unexpected behaviour occurs under some conditions.

The Central Limit Theorem is a key result in Statistics, but a proof is beyond the scope of first courses. Perhaps because of this, students often misunderstand what is going on, the most common mistake being to think that all large samples must have a Normal distribution. The student needs to appreciate not only the true meaning of the result but also when it is reasonable to use the Normal approximation. Simulation provides a way of illustrating the Theorem and showing what sample sizes are needed to make the asymptotic result relevant. Program CLT, from Introduction to Probability, allows a number of predetermined distributions to be simulated, or the student can specify a distribution by giving its pdf. Having chosen a distribution, the student enters the sample size and number of simulations and the computer then carries out the simulations, at the same time constructing a histogram. In Figure 3a, the Uniform distribution has been chosen, with sample size 1. We see that, subject to some random variation, the histogram has similar shape to the Uniform pdf. Now suppose that the student chooses sample size 2. The resulting histogram, of the sample means of pairs of simulated values, is shown in Figure 3b and we see that the shape is already reminiscent of a Normal distribution, which is superimposed for comparison. (The true pdf is actually a triangle.) The student can now explore the shapes for other sample sizes and draw conclusions about how large the sample size needs to be for the Normal approximation to be reasonable.
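A simulation of this kind is easy to imitate. The sketch below is not program CLT; the function names (`simulate_sample_means`, `uniform`, `print_histogram`), the number of simulations and the seed are all assumptions made for illustration. It simulates sample means from the Uniform(0, 1) distribution for several sample sizes and prints a text histogram of each set of means.

```python
import random
import statistics

def simulate_sample_means(draw, sample_size, n_sims, seed=0):
    """Return n_sims sample means, each from sample_size values produced by draw(rng)."""
    rng = random.Random(seed)
    return [
        statistics.fmean(draw(rng) for _ in range(sample_size))
        for _ in range(n_sims)
    ]

def uniform(rng):
    """One value from the Uniform(0, 1) distribution."""
    return rng.random()

def print_histogram(values, bins=10, lo=0.0, hi=1.0, width=50):
    """Print a text histogram of values over equal-width classes on [lo, hi]."""
    counts = [0] * bins
    for v in values:
        index = min(int((v - lo) / (hi - lo) * bins), bins - 1)
        counts[index] += 1
    for i, c in enumerate(counts):
        left = lo + i * (hi - lo) / bins
        print(f"{left:5.2f} | " + "*" * round(c * width / max(counts)))

for size in (1, 2, 10):
    means = simulate_sample_means(uniform, size, n_sims=2000)
    print(f"sample size {size}: variance of the means = {statistics.variance(means):.4f}")
    print_histogram(means)
```

Because `draw` is an ordinary function, any other distribution can be substituted for `uniform`. For the Uniform(0, 1) distribution the variance of the sample mean is 1/(12n), so the printed variances should be close to 0.083, 0.042 and 0.008, subject to random variation.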
 

Figure 3a: simulating a Uniform distribution  

 
Simulations can be rather unexciting if things always go right! So the student might go on to the outlier distribution, whose pdf is at the top of Figure 3c. The histogram obtained when the sample size is 10 is shown at the bottom of Figure 3c and, clearly, this is nowhere near a Normal. The student can be asked to think why the shape is like this and can again go on to explore if and when the Normal approximation does apply.
 

  Figure 3b: simulating a Uniform distribution; histogram of sample means, size 2  

At the very least, these demonstrations should serve to convince students of the plausibility of the Central Limit Theorem and may motivate some to explore it more deeply. One spin-off from this type of simulation exercise is to make students aware of the variability of the shapes of histograms for small samples even when the underlying distribution is fixed. Students should not expect, for instance, that 50 observations from a Normal distribution should necessarily have a histogram with a perfect bell shape.
 

  Figure 3c: simulating an outlier distribution; histogram of sample means, sample size 10

 

9.6 Use of Animation

Animation is used in a number of programs in the Computer Illustrated Text series, for example in demonstrating the construction of histograms and stem-and-leaf plots and in rearranging data when carrying out nonparametric tests. The following example, making use of program TRNSPDF from Introduction to Probability, will, I hope, give some of the flavour of the value of this type of approach. The program concerns the transformation of distributions and illustrates how, for instance, the shape of the pdf of a random variable X can differ greatly from the pdf of X². The basic result relating the pdfs of X and Y = h(X) is:

  f_Y(y) = f_X(x) |dx/dy|, where y = h(x)

  Figure 4a: lower graph is pdf, upper is transformation curve

The aim of this program is to make this result more intuitive. We shall suppose that the student has chosen X to have a quadratic pdf and he or she wants to find the pdf of πX². In Figure 4a, the pdf is shown at the bottom and the transformation is shown at the top. The first step is to choose an interval and see what happens to the corresponding probability when the transformation is made, Figure 4b. We see that the area under the pdf corresponding to the probability has "risen" to meet the transformation curve. Call the width of the slice δx and the corresponding range of y values δy. In Figure 4c, the "probability slice" has turned the corner.

Since it must have the same area as before and its width is now δy rather than δx, the length of the slice must have changed by a factor δx/δy, hence the term |dx/dy| in the above formula. The probability slice now moves to the left until its base rests on the vertical y axis, Figure 4d. This is repeated for other slices until the approximate pdf of Y is fully constructed, Figure 4e. The resulting pdf is seen to be a different shape from the pdf of X. The width of the slices can now be reduced in order to produce a better approximation to the pdf of Y, if desired. The animation has motivated the mathematics.
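The slice construction can also be imitated numerically. The sketch below is not program TRNSPDF. The article does not state which quadratic pdf is used in Figure 4a, so the sketch assumes, purely for illustration, the pdf 3x² on the interval (0, 1) together with the transformation y = πx²; all function names are hypothetical. Each slice keeps its probability, so its new height is that probability divided by its new width, and this is compared with the value given by the |dx/dy| formula at the middle of the slice.

```python
import math

def pdf_x(x):
    """Assumed quadratic pdf of X on (0, 1): f_X(x) = 3x^2."""
    return 3 * x ** 2

def cdf_x(x):
    """Distribution function matching pdf_x: F_X(x) = x^3."""
    return x ** 3

def h(x):
    """Assumed transformation y = pi * x^2, which is increasing on (0, 1)."""
    return math.pi * x ** 2

def pdf_y_exact(y):
    """f_Y(y) = f_X(x) |dx/dy| with x = sqrt(y / pi) and dx/dy = 1 / (2 sqrt(pi y))."""
    x = math.sqrt(y / math.pi)
    return pdf_x(x) / (2 * math.sqrt(math.pi * y))

def slices(n_slices):
    """Transform n_slices equal-width probability slices of X into slices of Y.

    Each result is (y_left, y_right, height), where height is the probability
    of the slice divided by its new width, so that its area is unchanged.
    """
    result = []
    for i in range(n_slices):
        x_left, x_right = i / n_slices, (i + 1) / n_slices
        probability = cdf_x(x_right) - cdf_x(x_left)
        y_left, y_right = h(x_left), h(x_right)
        result.append((y_left, y_right, probability / (y_right - y_left)))
    return result

for y_left, y_right, height in slices(10):
    y_mid = (y_left + y_right) / 2
    print(f"y in [{y_left:.3f}, {y_right:.3f}]: "
          f"slice height {height:.4f}, formula {pdf_y_exact(y_mid):.4f}")
```

Increasing the number of slices brings the slice heights closer to the formula, which mirrors the option of reducing the slice width in the program.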

  Figure 4b: a probability "slice" chosen

  Figure 4c: the probability slice is transformed

  Figure 4d: the slice is plotted on the rotated graph

  Figure 4e: graph of all probability slices
 

 

9.7 Learning Routine Tasks

This is perhaps a rather unspectacular use of the computer but can be very useful for helping students learn routine tasks. We shall illustrate this approach using program CHISQ from Introduction to Statistics. The program goes through the steps of carrying out a χ² contingency table test. The (artificial) data shown at the top of Figure 5a relate to the preferences of different age groups for different coloured tablets (pink, orange or white).

We want to see whether there is significant evidence of a difference between the age groups in respect of their colour preferences. The program goes through the method of carrying out the test, progressing in small steps; we shall illustrate just two of them. In the first, Figure 5a, the calculations required to evaluate the expected frequencies are shown in detail. These are repeated for each entry in the table, so that the student can see the formula given in the text in action.
 

 
Figure 5a: evaluating expected frequencies    

Towards the end of the analysis, we arrive at Figure 5b. The test statistic has been evaluated and found to be significant and interest now concerns the nature of the difference between the age groups. The term '5.38' in the calculation of the test statistic is underlined, since it is the largest term in the sum, and the corresponding observed and expected frequencies are highlighted. We see that the observed frequency is 26, while the expected frequency is 16.6, indicating an exceptionally high preference for the colour pink among the youngest age group.
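The two steps just described, evaluating the expected frequencies and locating the largest term in the test statistic, can be sketched as follows. This is not program CHISQ, and the counts are hypothetical: the table of Figure 5a is not reproduced in the text, so the data below are invented for illustration and do not reproduce the values 16.6 and 5.38 quoted above. The function names are also assumptions.

```python
def expected_frequencies(observed):
    """Expected counts under no association: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def chi_squared_terms(observed):
    """The (O - E)^2 / E contribution of each cell to the test statistic."""
    expected = expected_frequencies(observed)
    return [
        [(o - e) ** 2 / e for o, e in zip(obs_row, exp_row)]
        for obs_row, exp_row in zip(observed, expected)
    ]

# Hypothetical counts (NOT the data of Figure 5a): rows are age groups,
# youngest first; columns are the colours pink, orange and white.
observed = [
    [26, 12, 12],
    [14, 18, 18],
    [10, 20, 20],
]

terms = chi_squared_terms(observed)
statistic = sum(sum(row) for row in terms)
dof = (len(observed) - 1) * (len(observed[0]) - 1)

# Locate the cell making the largest contribution to the statistic.
largest, row, col = max(
    (term, i, j) for i, term_row in enumerate(terms) for j, term in enumerate(term_row)
)

print(f"test statistic = {statistic:.2f} on {dof} degrees of freedom")
print(f"largest term = {largest:.2f} in row {row + 1}, column {col + 1}")
print(f"observed {observed[row][col]}, expected {expected_frequencies(observed)[row][col]:.1f}")
```

The statistic would then be compared with χ² tables; for 4 degrees of freedom the upper 5% point is 9.488.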
 

 Figure 5b: investigating the nature of the association  between the variables

 
9.8 Conclusions

The software is used in various ways on different courses and at different institutions. The simplest approach to organise is a classroom demonstration integrated into a standard teaching session, with the teacher or lecturer operating the software. The programs are easy and flexible to operate, so it is often possible to use them to answer students' 'what if?' questions.

The software can also be used in a more student-centred way, if enough computers are available. For example, students can proceed through specified sections of the appropriate Computer Illustrated Text, using the programs as suggested in the book. Many teachers may prefer instead to prepare their own worksheets, using the software to amplify their own explanations of topics or to help tackle exercises. However the software is used, it is there to help the teacher, not to replace him or her.

We have seen that Statistics is a vital subject, rightly seen by students as relevant to their lives and a source of interesting practical material. I believe that even the theory can be motivated in an exciting way by using graphics.

9.9 References

Robinson D R and Bowman A W (1986) Introduction to Probability, Adam Hilger, Bristol.

Bowman A W and Robinson D R (1987) Introduction to Statistics, Adam Hilger, Bristol.

Bowman A W and Robinson D R (1990) Introduction to Regression and Analysis of Variance, Adam Hilger, Bristol.

[Software is available in PC and BBC B versions; Archimedes versions are also available by special arrangement with the publisher.]
