2. INDUSTRIAL STATISTICS
Roland Caulcutt, BP Chemicals Lecturer, European Centre for Total Quality Management, University of Bradford
2.1 Introduction
In the series of workshops for practising teachers that are written up in this book, the speakers who contributed fall into one of two groups. The first group contains the speakers who focus upon the use of statistical techniques by people employed in business, industry and government. The second group contains those speakers who address the pedagogical and curriculum issues associated with the learning of statistics in schools. I am most certainly in the first group.
For many years I have been associated with numerous companies in manufacturing industry, as an employee or as an external consultant. I have advised managers and researchers on the use of statistical techniques to improve the quality of products and to increase the efficiency of complex processes. Much of my experience is in the chemical industry, but my work has ranged widely across the food, drink, electrical, motor vehicles, packaging and distribution sectors. Thus I am able to report which statistical techniques are most widely used by non-statisticians within these industries.
Perhaps the best way for me to achieve this objective is to describe the content of some of the short courses I have run for managers, scientists and engineers during the past 20 years.
2.2 Basic Techniques for Data Analysis
Statistical techniques are used in industry by statisticians, managers, scientists, engineers, supervisors, technicians and operators. Different users have different objectives, of course, but the majority are attempting to take account of variability whilst assessing the past, controlling the present, or predicting and planning for the future. Many feel that they do not have sufficient data to draw valid conclusions. The others appear to have too much data. I have never met anyone who considered that he or she had just the right quantity of data.
So what statistical techniques do people in industry actually use, when they are assessing, controlling and predicting? They use quite a range of techniques, some of which are heavily used but only in certain industries. The most widely used statistical methods include:
b) confidence limits for the difference between two population means;
c) t-tests for one sample, two samples and matched samples;
d) simple regression and correlation;
e) cusums;
f) statistical process control charts;
g) process capability indices;
h) techniques for assessing the precision and accuracy of test methods or measurement systems;
i) confidence limits for population percentages and chi-squared tests.
The final item can be omitted for those who do not need to analyse qualitative data. Simple non-parametric methods can be added, though they are little used in industry and do not appear frequently in the research journals read by industrial scientists and engineers.
2.3 Teaching Basic Techniques in Industry
Teaching statistical techniques to managers and researchers in industry differs considerably from the teaching of GCSE or A-Level statistics in schools or colleges. Perhaps the greatest difference is in the time scale. Industrial clients are prepared to attend a short course that lasts only three, four or perhaps five days. During this time they wish to learn how to use the techniques and how to avoid misusing them. Some course members would also like to study the theory underlying the practical methods, but they realise that this is not possible in the time available. Some experience anxiety at the start of the course, fearing that their mathematical foundations may be inadequate.
If these fears are to be quickly allayed, care should be taken to choose a suitable starting point, and to use a problem-centred approach. I prefer to start the course by considering one of several realistic case studies that require either the use of confidence limits for a population mean or the assessment of process capability indices. Thus the course members are introduced to relatively simple techniques, using only simple formulae for the calculations, but achieving a useful objective. This would not be true if we spent, say, one day laying a foundation of probability theory, then a second day discussing sampling distributions.
Confidence limits for the population mean are obtained from the formula $\bar{x} \pm t\,s/\sqrt{n}$, in which $\bar{x}$ is the sample mean calculated from the data and n is the number of data values used in this calculation. Clearly 't' and 's' are not so easily explained. 's' is defined as a measure of spread or variability which emerges from the electronic calculator alongside the mean. 't' is obtained from the t-table with n-1 degrees of freedom. Course members readily accept that they will never understand how the values in the t-table are calculated, and it would not be helpful to spend valuable time introducing the t-distribution. Time is much better spent discussing the assumptions underlying the t-table and the formula we have used to calculate the confidence limits. To check the normal distribution assumption we draw a dot plot or histogram. To check the stability of the process we plot a run chart. We cannot check the random sampling assumption, but a practical exercise demonstrates very clearly how non-random sampling can lead to ridiculous conclusions.
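The calculation of confidence limits for a population mean can be sketched in a few lines of Python. This is only an illustration, not the course material itself: the function name `confidence_limits`, the dictionary `T_95` and the sample measurements are all hypothetical, and a 95% confidence level is assumed. The t-values are copied from a standard two-sided 95% t-table, mirroring the table lookup that course members would perform by hand.

```python
import math
import statistics

# Two-sided 95% t-values indexed by degrees of freedom (from a standard t-table).
# Assumption: a 95% confidence level; other levels need a different column.
T_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
        6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228,
        11: 2.201, 12: 2.179, 13: 2.160, 14: 2.145, 15: 2.131}


def confidence_limits(data):
    """Return (lower, upper) 95% confidence limits for the population mean."""
    n = len(data)
    df = n - 1  # degrees of freedom used to enter the t-table
    if df not in T_95:
        raise ValueError("this small table only covers 2 to 16 data values")
    mean = statistics.mean(data)   # the sample mean
    s = statistics.stdev(data)     # sample standard deviation (n-1 divisor)
    half_width = T_95[df] * s / math.sqrt(n)
    return mean - half_width, mean + half_width


# Hypothetical measurements, invented purely for illustration.
measurements = [70.2, 69.8, 71.1, 70.5, 69.9, 70.7, 70.3, 70.1]
lower, upper = confidence_limits(measurements)
print(f"95% confidence limits for the mean: {lower:.2f} to {upper:.2f}")
```

The limits are only as trustworthy as the assumptions discussed above, so a dot plot and a run chart of the data should still be inspected before the result is reported.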
All of this can be covered in a lecture/discussion lasting 60 minutes. A tutorial session follows, in which course members have the opportunity to practise the technique and to explore various complications such as outliers and skewness. They can also develop an acceptance of the t-table and the standard deviation, which gradually appear more "reasonable" with frequent use. For the tutorial I allow 90 minutes, leaving time for a second lecture/discussion before lunch. This could be devoted to confidence limits for the difference between two population means, an obviously useful technique for comparing two methods or treatments. When the interval does not contain zero, we can conclude, beyond reasonable doubt, that the treatments differ in their effect. The assumptions underlying this technique are discussed. Double dot plots and double run charts are used to check the assumptions. A reminder of the dangers of unrepresentative samples is appropriate at this point.
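The comparison of two treatments might be sketched as follows. The text does not say which form of the two-sample interval is taught, so this example assumes the pooled-variance version, which requires the two populations to have similar variability. It also assumes a 95% confidence level. The names `difference_limits`, `excludes_zero` and `T_95`, and the two data sets, are hypothetical.

```python
import math
import statistics

# Two-sided 95% t-values indexed by degrees of freedom (from a standard t-table).
T_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
        6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228,
        11: 2.201, 12: 2.179, 13: 2.160, 14: 2.145, 15: 2.131,
        16: 2.120, 17: 2.110, 18: 2.101, 19: 2.093, 20: 2.086}


def difference_limits(a, b):
    """Return (lower, upper) 95% limits for mean(a) - mean(b).

    Assumption: pooled-variance interval (similar spread in both groups).
    """
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2  # degrees of freedom for the pooled estimate
    if df not in T_95:
        raise ValueError("this small table only covers 1 to 20 degrees of freedom")
    # Pooled variance: the two sample variances weighted by their degrees of freedom.
    pooled_var = ((n_a - 1) * statistics.variance(a)
                  + (n_b - 1) * statistics.variance(b)) / df
    std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    diff = statistics.mean(a) - statistics.mean(b)
    half_width = T_95[df] * std_error
    return diff - half_width, diff + half_width


def excludes_zero(lower, upper):
    """True when zero lies outside the interval, i.e. the means appear to differ."""
    return lower > 0 or upper < 0


# Hypothetical results for two treatments, invented purely for illustration.
treatment_a = [12.1, 11.8, 12.4, 12.0, 12.3, 11.9]
treatment_b = [12.9, 13.1, 12.7, 13.0, 12.8, 13.2]
lower, upper = difference_limits(treatment_a, treatment_b)
print(f"95% limits for the difference: {lower:.2f} to {upper:.2f}")
print("Treatments differ" if excludes_zero(lower, upper) else "No clear difference")
```

If the double dot plot suggests that one group is far more variable than the other, the pooled form is not appropriate and a different interval would be needed.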
The course continues with two or three lectures per day, each followed immediately by a tutorial. Case studies and interactive group exercises are included to give a broader perspective and to provide practice in the choice of the most appropriate technique. It is hoped that, by the end of the course, the clients will be able to use a repertoire of simple statistical techniques, with safety, and be able to explain the benefits of so doing.
2.4 Industrial Statistics - The Future
Many years ago, at the end of a short course, a client said to me, 'Until you have attended a good course in statistics you don't know what you are missing'. I was pleased that he thought my course was good, but also disturbed to realise that there might be many thousands of people in industry who didn't know what they were missing. Since then many other clients have expressed similar opinions and I am convinced that there are vast numbers of industrial managers and researchers who do not realise how much they would benefit from a good course in statistics.
An unknown percentage of these people have already studied statistics as a subsidiary subject whilst reading for a degree in science, technology, business, etc. Clearly that study did not equip all of them to make use of statistics, nor did it give them the enthusiasm for further pursuit of a subject that, perhaps, did not appear relevant to their main interests. Contrast this state of affairs with that in Japan, where all school children are taught "the seven basic tools for quality improvement". Would it not be possible for British schools to demonstrate the relevance of statistics to other disciplines and the usefulness of statistical techniques in business and industry?
2.5 Conclusion
The focus on quality that developed in the late eighties, and continues to spread, offers an opportunity and a challenge to statisticians and to teachers of statistics in schools, colleges and universities. Simple statistical techniques are extremely useful in the never-ending pursuit of quality improvement. However, they need to be understood and used by people at all levels, from senior management to shop floor workers. I believe the foundation for this understanding will have to be laid in schools.
2.6 Bibliography
The following books give some indication of how statistical techniques are used in industry. Most of the books were written for scientists, managers, engineers and researchers, rather than statisticians.
Box, G.E.P., Hunter, W.G., and Hunter, J.S. (1978). Statistics for Experimenters, Wiley, New York.
Caulcutt, R. (1991). Statistics in Research and Development, Chapman and Hall.
Daniel, C. (1976). Applications of Statistics to Industrial Experimentation, Wiley, New York.
Duncan, A.J. (1974). Quality Control and Industrial Statistics, Irwin.
Neave, H.R. (1990). The Deming Dimension, SPC Press.
Oakland, J.S., and Followell, R.F. (1990). Statistical Process Control, Heinemann.