CUSUM: Computer Simulation for Statistics Teaching

F. H. THOMAS and J. L. MOORE

Computer simulation can be used not only to introduce CUSUM (cumulative sum) techniques but also to improve students’ understanding of statistical inference and decision making.

While it is unlikely that a student will learn much of the basic ideas of statistical decision-making without acquiring a sound working knowledge of ways of calculating means, variances and other basic measures, there is always a risk that teachers will place too much emphasis on arithmetical processes. Undue attention to mechanical calculation can cause teachers and students alike to lose sight of the inferential side of statistics, the drawing of conclusions and the making of decisions.

Bissell (1974) has suggested three guidelines for the teaching of statistics to those who require of the subject either an elementary introduction or a broad appreciation. These guidelines are:

1. The practical relevance of the subject should be stressed, the students’ interests being guided and developed by the use of examples drawn from everyday situations.

2. A non-mathematical approach should be used, especially in the early stages.

3. The generation of data by a variety of means should be used as an aid to the absorption of statistical ideas.

The introduction of the computer into the school classroom has brought a new technique to teaching, the technique of simulation. In the context of teaching statistics, computer simulation enables students to generate data having the essential characteristic of variability and to follow processes run at a convenient speed which may be faster or slower than in the real instance. The student can now be placed in a decision-making role, a role in which he has to make statistical inferences based on data presented to him and then to take the appropriate actions. It is an essential feature of the interactive simulation that the next set of data generated and presented will depend on the student’s previous decision(s).

We believe that simulation via the computer offers numerous possibilities for presenting the basic ideas of statistical decision-making in a relevant, meaningful and yet non-mathematical fashion. It is the ability of the computer to interact with the student that distinguishes it from all other teaching aids.

To illustrate the use of the computer as an interactive teaching aid and to demonstrate that Bissell’s guidelines can be successfully applied to the teaching of statistics at school-level, we have developed a teaching package CUSUM (1977). This package is available from Central Program Exchange, The Polytechnic, Wolverhampton.

CUSUM: a computer simulation

Outline

The process simulated is the packing of matches into boxes, each containing a nominal 50 matches. Natural variation causes some boxes to contain more and some less than this figure and in addition the machine-setting is liable to suffer discrete integer-value "jumps" that move the mean up or down from the initial value of 50.

The simulation places a student or small group of students in a quality-control situation where they have responsibility for detecting and correcting under- or over-packing as quickly as possible. The student is shown how to keep records of "production" in the form of three graphs, one of which is a cusum chart.

Readers unfamiliar with cumulative sum techniques are referred to Woodward and Goldsmith (1964).

Target population

The simulation has been found suitable for fifth- and sixth-form pupils taking courses that include an appreciation of statistical techniques. It has also been used with Further Education students taking first-level statistics as a service subject for Business Studies etc. No prior knowledge of statistics is assumed or required.

Aims

The package has a number of aims falling within the general philosophy outlined in the introduction. These aims are:

(i) To present statistical ideas in an interesting and relevant manner without the need for extensive arithmetical calculation.

(ii) To develop an appreciation that statistics is concerned with making decisions and that often these decisions are based on inferences drawn from variable data.

(iii) To emphasize that in a practical context each decision results in consequences that may require further decisions or actions.

(iv) To show the value of the graphical presentation of data and in particular the value of the cusum chart for detecting changes in the value of a population mean.

Example

The student’s notes suggest that initially he takes samples of size 5 (this may be changed later) and that he keeps records in the form of three graphs.

Graph A is simply a plot of each item of each sample. Due to natural variation each item in a sample may deviate (positively or negatively) from the nominal mean. The total of these deviations is calculated for each sample and in graph B it is this figure which is plotted. In addition a running total is kept of these deviations and after each sample this cumulative sum is plotted giving the graph C, the cusum chart.

To illustrate these ideas suppose that the first three samples are

1. 50, 51, 52, 50, 50

2. 51, 52, 49, 49, 48

3. 51, 52, 50, 51, 48

From these the following deviations from 50 are obtained

Sample 1: 0, 1, 2, 0, 0; Total deviation 3; Cumulative sum (5 items) 3

Sample 2: 1, 2, -1, -1, -2; Total deviation -1; Cumulative sum (10 items) 2

Sample 3: 1, 2, 0, 1, -2; Total deviation 2; Cumulative sum (15 items) 4

and the three graphs A, B and C of Figure 1 are drawn.
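The arithmetic behind graphs B and C can be checked with a few lines of code. The following is a minimal sketch in Python, not part of the CUSUM package itself; the function names `sample_totals` and `cusum` are chosen here purely for illustration.

```python
from itertools import accumulate

TARGET = 50  # nominal number of matches per box

# The three samples from the worked example above.
samples = [
    [50, 51, 52, 50, 50],
    [51, 52, 49, 49, 48],
    [51, 52, 50, 51, 48],
]

def sample_totals(samples, target=TARGET):
    """Total deviation from the target for each sample (the points of graph B)."""
    return [sum(x - target for x in sample) for sample in samples]

def cusum(totals):
    """Running total of the per-sample deviations (the points of graph C)."""
    return list(accumulate(totals))

totals = sample_totals(samples)
chart = cusum(totals)
print(totals)  # [3, -1, 2]
print(chart)   # [3, 2, 4]
```

The printed values agree with the total deviations and cumulative sums tabulated above.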

In graphs A and B a rise in the population mean results in a general rise in the points of the graphs relative to the horizontal axis. The effect in graph C is quite different. If the average values of the samples are close to the proposed value of 50 then the cusum chart will be more or less horizontal. A rise in the population mean results in the cusum chart sloping upwards. Similarly a drop in value of the population mean lowers the general level of points in graphs A and B and causes a slope downwards in graph C. Thus in the cusum chart one is looking for change of gradient and the pupil soon discovers that this is more readily observable than the change in level of graphs A and B. After a while the pupil can stop drawing graphs A and B and rely only on the cusum chart. At this stage the teacher may care to suggest that the cusum chart can be plotted after each item rather than after each sample.
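The link between a shift in the mean and the gradient of the cusum chart can be explored numerically. The sketch below is illustrative only: it assumes a standard deviation of 1 and uses Python's built-in generator rather than the one in the package, and the names `cusum_path` and `average_slope` are hypothetical. With samples of size 5, a rise of one match in the mean should add about 5 to the cusum per sample on average.

```python
import random

TARGET = 50      # nominal box content
SAMPLE_SIZE = 5  # items per sample, as suggested in the student's notes

def cusum_path(means, sd=1.0, seed=1):
    """Cusum after each sample, where sample k is drawn with true mean means[k].

    The spread `sd` and the `seed` are assumptions made for this sketch.
    """
    rng = random.Random(seed)
    total, path = 0, []
    for mu in means:
        sample = [round(rng.gauss(mu, sd)) for _ in range(SAMPLE_SIZE)]
        total += sum(x - TARGET for x in sample)
        path.append(total)
    return path

def average_slope(path, start, end):
    """Average change in the cusum per sample between two sample indices."""
    return (path[end] - path[start]) / (end - start)

# 20 samples on target, followed by 20 samples after the mean jumps to 51.
path = cusum_path([50] * 20 + [51] * 20)
slope_on_target = average_slope(path, 0, 19)
slope_after_jump = average_slope(path, 19, 39)
print(slope_on_target, slope_after_jump)
```

The first slope should lie near zero and the second near 5, which is the change of gradient the pupil is asked to look for.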
 
 

Figure 1. Graphical recording of CUSUM samples.

Figure 2 shows a run during which no changes were made by the operator and the system was allowed to fluctuate naturally. The system, as revealed by the summary table provided by the program, ran correctly until sample 6 when the process mean changed to 49. It corrected itself by chance at sample 13 only to change once again at sample 19 when the mean became 51. The corresponding changes are barely observable on the top two graphs but the changes are clearly indicated on both versions of the cusum chart.

The program

The package is based on a computer program written around a random number generator (Moore and Thomas, 1979), which is used to produce a nearly normal distribution with mean P (initially set at 50) and fixed spread. The generator is also used to determine the number of samples supplied before a change in the value of P occurs. Once such a change has been detected, the pupil can exercise his option to change P by ±1, ±2 or ±3 and thus restore the mean to 50. He then continues sampling. The program has been so arranged that the student should have no difficulty in detecting and correcting for one change in P before the next random change occurs.
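The structure just described might be sketched as follows. This is a hypothetical Python stand-in, not the original BASIC program: the class name `MatchPacker`, the spread of 1, the run lengths of 4 to 8 samples between jumps and the jump sizes of up to ±3 are all assumptions made for illustration, and Python's built-in generator replaces the one described by Moore and Thomas (1979).

```python
import random

class MatchPacker:
    """Hypothetical stand-in for the simulated packing machine."""

    def __init__(self, seed, sample_size=5, spread=1.0, min_gap=4, max_gap=8):
        self.rng = random.Random(seed)
        self.mean = 50                  # process mean P, initially on target
        self.sample_size = sample_size
        self.spread = spread            # assumed standard deviation
        self.min_gap = min_gap          # assumed shortest run between jumps
        self.max_gap = max_gap          # assumed longest run between jumps
        self.samples_until_jump = self.rng.randint(min_gap, max_gap)

    def next_sample(self):
        """Return one sample of box contents, applying a random jump when due."""
        if self.samples_until_jump == 0:
            # Integer-valued jump in the machine setting (sizes are assumed).
            self.mean += self.rng.choice([-3, -2, -1, 1, 2, 3])
            self.samples_until_jump = self.rng.randint(self.min_gap, self.max_gap)
        self.samples_until_jump -= 1
        return [round(self.rng.gauss(self.mean, self.spread))
                for _ in range(self.sample_size)]

    def adjust(self, step):
        """Operator correction: shift the mean by ±1, ±2 or ±3."""
        if step not in (-3, -2, -1, 1, 2, 3):
            raise ValueError("step must be ±1, ±2 or ±3")
        self.mean += step

packer = MatchPacker(seed=7)
print(packer.next_sample())
```

In an interactive session the student would call `next_sample` repeatedly, plot the results, and call `adjust` when the cusum chart suggests that the mean has moved.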
 
 

Figure 2. Graphs illustrating a run of CUSUM during which changes in the process mean occurred at samples 6, 13 and 19.

 

The sequence of samples obtained by each student is determined by his choice of a starting seed (for the random number generator), input at the beginning of the simulation. The use of different seeds ensures that each student will obtain his own unique series, but any particular run can be repeated by giving the same starting seed on a subsequent occasion. This feature enables the teacher to "set up" a demonstration or a student to re-run a trial to determine the effect of a different decision at some stage.
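The effect of the starting seed can be illustrated with a short sketch. It is written in Python using the standard library generator, not the BASIC generator of the package, and the function name `run`, the seed values and the spread of 1 are assumptions made for the example.

```python
import random

def run(seed, n_samples=3, sample_size=5, mean=50, spread=1.0):
    """Generate n_samples samples from a generator started at the given seed."""
    rng = random.Random(seed)
    return [[round(rng.gauss(mean, spread)) for _ in range(sample_size)]
            for _ in range(n_samples)]

first = run(seed=1234)
repeat = run(seed=1234)   # same seed, so the same series of samples
other = run(seed=4321)    # a different seed starts a different series

print(first == repeat)  # True
print(first)
print(other)
```

Because the series depends only on the seed, a teacher can prepare a run in advance and a student can replay it to try a different decision at a chosen stage.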

After choosing to end the simulation, the student may request a summary of the changes that have taken place during the run. From this summary the student can assess his performance by noting for how long he allowed machine-induced changes of mean to remain uncorrected.

Time required

We suggest that students work in groups of two or three per terminal, sharing the work of taking the samples and plotting the three graphs. One or two runs of the simulation, lasting about half an hour in total, are usually sufficient to achieve the aims set for the package.

Summary

Computer simulations offer the teacher the chance to use more varied and, where appropriate, non-mathematical approaches to the teaching of statistics. The teaching package CUSUM, in which the student is responsible for the quality control of a simple one-dimensional process, can give students of a wide range of interests and ability an insight into the use of statistics as an industrial aid.

References
Bissell, A. F. (1974). Discussion on the Report of the Joint Committee on the Teaching of Statistics, J. Roy. Stat. Soc. Series A, 137, p. 416.
Moore, J. L. and Thomas, F. H. (1979). A random number generator for BASIC, Computer Education, 33, pp. 13-16.
Woodward, R. H. and Goldsmith, P. L. (1964). Monograph No. 3, Cumulative Sum Techniques, Oliver & Boyd, London.
