Do-it-Yourself Opinion-poll Sampling Experiments

FLAVIA R. JOLLIFFE

Sampling experiments based on opinion-poll type data might be an interesting alternative to coin-tossing experiments for social science students. This paper describes some which I have developed and used. Initially I prepared a population of slips of paper to represent potential voters for use in a classroom situation. Later I made part of this into an exercise for an assessed assignment.

The population consisted of 300 slips of paper, of which 43% (129 slips) were labelled Conservative, 45% (135 slips) were labelled Labour, 8% (24 slips) were labelled Liberal and 4% (12 slips) were labelled Others. These proportions are not dissimilar from the proportions of votes for the parties in recent General Elections [1]. It was thought that 300 would be large enough to be accepted by the students as a population, yet not so large as to be unmanageable.
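The slip counts quoted above follow directly from the percentages, and the arithmetic can be checked with a short sketch. This is only an illustration: the names `POPULATION_SIZE`, `PARTY_PERCENT` and `slip_counts` are my own and do not come from the original exercise.

```python
# Percentages and population size are taken from the text;
# the variable and function names are illustrative assumptions.
POPULATION_SIZE = 300
PARTY_PERCENT = {"Conservative": 43, "Labour": 45, "Liberal": 8, "Others": 4}


def slip_counts(size, percents):
    """Return the number of slips per party for a population of `size` slips."""
    if sum(percents.values()) != 100:
        raise ValueError("percentages must add up to 100")
    counts = {}
    for party, pct in percents.items():
        # Integer arithmetic avoids floating-point rounding surprises.
        if (size * pct) % 100 != 0:
            raise ValueError(f"{pct}% of {size} is not a whole number of slips")
        counts[party] = size * pct // 100
    return counts


counts = slip_counts(POPULATION_SIZE, PARTY_PERCENT)
print(counts)
```

The check for whole numbers of slips is a design choice: a population size such as 250 would not give a whole number of slips for every party, and it is probably better to find that out before cutting paper.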

Ages of the members of the four political groups represented by slips of paper were obtained as follows. It was assumed that for each group there was an underlying super-population of ages which was normally distributed with mean and standard deviation in years as shown in the table. The assumption of normality was made mainly for convenience, and is possibly not unreasonable. The means and standard deviations were derived from some age distributions obtained by Gallup in a sampling experiment (unpublished) and in this sense are real. (Consideration of how to obtain the 'true' values might be an interesting topic for class discussion!)
 
Party          Mean age (years)   Standard deviation of age (years)
Conservative   50.5               14.25
Labour         46.8               13.94
Liberal        49.3               10.28
Others         50.8               15.46

Using tables of random normal deviates z, ages for a group were then found as X = zs + µ, where µ is the mean and s the standard deviation of age in the super-population for that group. X was then rounded to the nearest whole number. If extremely low ages (say less than 15) or extremely high ages (say greater than 100) occur in such a simulation, it would be sensible to reject these as unrealistic, but of course the chance of such extreme values is very small with the given means and standard deviations. Naturally the means and standard deviations of age in the finite populations generated by this process are subject to sampling fluctuations in the usual way and cannot be specified in advance.
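The same construction can be sketched on a computer, with a pseudo-random generator standing in for the printed table of normal deviates. This is a minimal sketch, not the procedure actually used: the function name `generate_ages`, the seed and the exact rejection limits (15 and 100, the values suggested above) are assumptions made for illustration.

```python
import random

# Means and standard deviations (years) as given in the table.
AGE_PARAMS = {
    "Conservative": (50.5, 14.25),
    "Labour": (46.8, 13.94),
    "Liberal": (49.3, 10.28),
    "Others": (50.8, 15.46),
}


def generate_ages(n, mean, sd, rng, low=15, high=100):
    """Generate n ages as X = z*s + mu, rounded, rejecting unrealistic values."""
    ages = []
    while len(ages) < n:
        z = rng.gauss(0.0, 1.0)      # random standard normal deviate
        x = round(z * sd + mean)     # X = zs + mu, to the nearest whole number
        if low <= x <= high:         # reject ages below 15 or above 100
            ages.append(x)
    return ages


rng = random.Random(1)  # arbitrary seed, only to make the run repeatable
mean, sd = AGE_PARAMS["Conservative"]
conservative_ages = generate_ages(129, mean, sd, rng)
print(sum(conservative_ages) / len(conservative_ages))
```

The printed mean will generally differ a little from 50.5, which illustrates the point that the mean of the finite population is itself subject to sampling fluctuation.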

Slips of stiff paper 4" x 3" (about 10 cm by 7.5 cm) in size were used. On each slip was written the political party, the age, and a unique identifier (a number between 1 and 300) of the potential voter it was meant to represent. Slips were folded into four with information on the inside and placed in a carrier bag.

There are many exercises which can be done with such a population. For example, one can ask students to select samples, record the information written on the paper, and consider how to estimate the proportion in each party and the average age in each party. Discussion in connection with this could include the meaning of populations and samples, the distinction between sampling with replacement and without replacement, the idea of randomness, and an indication of how to choose sample size. Following on from this, one could introduce the difference between a statistic and a parameter, the idea of a sampling distribution and perhaps the idea of an interval estimate. All of this can be covered at the start of the statistics course.

Next one might ask students to choose samples of a fixed size, say 10, without replacement and find the sample proportion in each party. Students could also use random number tables to choose a sample of identifiers and the instructor could provide information on age and party from a central checklist. The two methods of obtaining the sample can be compared and discussed in relation to a more realistic situation. The exercise could be repeated with larger samples and the average age in each of the two large parties found. In fact two sub-populations consisting of the ages of the Conservative and the ages of the Labour supporters were also prepared and the average age in samples of a fixed size from these was found. In all cases it is sensible to record the results of the different students centrally and discuss them in class.
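Drawing a sample of 10 slips without replacement and finding the sample proportions might be mimicked as follows. This is a sketch under stated assumptions: the bag is represented simply as a list of party labels with the counts given earlier, and the function name `sample_proportions` and the seed are illustrative.

```python
import random
from collections import Counter

# The bag of 300 slips, represented by party label only (counts from the text).
population = (
    ["Conservative"] * 129 + ["Labour"] * 135 + ["Liberal"] * 24 + ["Others"] * 12
)


def sample_proportions(pop, n, rng):
    """Draw n slips without replacement and return the proportion in each party."""
    drawn = rng.sample(pop, n)  # random.sample draws without replacement
    tally = Counter(drawn)
    # Parties absent from the sample are reported with proportion 0.
    return {party: tally[party] / n for party in sorted(set(pop))}


rng = random.Random(2)  # arbitrary seed, only to make the run repeatable
props = sample_proportions(population, 10, rng)
print(props)
```

Repeating the call many times and recording the results corresponds to collecting the different students' samples centrally. With samples as small as 10, the two small parties will often not appear at all, which is itself a useful point for discussion.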

There are clearly some practical difficulties with these experiments. Preparation is time-consuming and bags of slips of paper tend to be awkward in use. Ideally several bags are needed so that all students have a chance to draw several samples. However, the set-up is easy to understand (no mysterious expensive equipment) and the exercise was useful to initiate discussion.

As it is instructive to let students sample from normal populations I prepared an assignment on similar lines as follows. Students were told to suppose that the (super) population of ages of either Labour or Conservative supporters had a normal distribution with a particular mean and standard deviation (exact details indicated for each student) and were asked to take a sample of at least fifteen ages from it by using a table of random normal deviates. They were then asked to take random samples of fixed size from the set of ages they had generated, sampling without replacement, and to find the means of the samples in order to start building up the sampling distribution of the mean empirically. Guidance was given in class as to what size of sample and how many samples to take. This last part of the assignment is a variation on an example presented in many introductory texts and was found to be useful in giving students an understanding of the sampling process.
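The two stages of the assignment can be sketched as below. The text does not state the sample size or the number of samples recommended in class, so the values used here (samples of 5, repeated 200 times) are assumptions chosen purely for illustration, as are the Labour parameters, the seed and the function name `sample_means`.

```python
import random
import statistics

rng = random.Random(3)  # arbitrary seed, only to make the run repeatable

# Stage 1: generate fifteen ages from a normal (super) population.
# The Labour mean and standard deviation from the table serve as an example.
mean_age, sd_age = 46.8, 13.94
ages = [round(rng.gauss(mean_age, sd_age)) for _ in range(15)]


def sample_means(values, sample_size, n_samples, rng):
    """Means of repeated samples drawn without replacement from `values`."""
    return [
        statistics.mean(rng.sample(values, sample_size))
        for _ in range(n_samples)
    ]


# Stage 2: build up the sampling distribution of the mean empirically.
# Sample size 5 and 200 repetitions are assumed values, not from the text.
means = sample_means(ages, 5, 200, rng)

print("mean of generated ages:", statistics.mean(ages))
print("mean of sample means:  ", statistics.mean(means))
print("SD of generated ages:  ", statistics.pstdev(ages))
print("SD of sample means:    ", statistics.stdev(means))
```

In a typical run the sample means should centre near the mean of the fifteen generated ages and show noticeably less spread than the ages themselves, which is the behaviour the assignment is meant to let students discover.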

Students were given different (super) populations. The reason for doing so, and for leaving some of the details of sampling open, was that students were being assessed on the basis of coursework; see [3]. In some instances there would be advantages in having all students do the same exercise. It is thought that students would gain most from this exercise if it is used as a back-up to the concepts it involves.

Clearly both the classroom exercise and that used for an assignment can be changed in various ways. Other variables can be used and more or less information can be put on the slips of paper. Students could be given complete information about the finite population and it could be used to illustrate many topics in an introductory statistics course, as in [2]. The whole process, including the sampling, could be done on a computer.

Brunel University
 
 
References

1. Butler, D. and Sloman, A. (1975). British Political Facts 1900–1975. 4th edition. Macmillan.

2. Hodges, Jr., J. L., Krech, D. and Crutchfield, R. S. (1975). STATLAB: An empirical introduction to statistics. McGraw-Hill.

3. Jolliffe, F. R. (1976). "A continuous assessment scheme for statistics courses for social scientists." Int. J. Math. Educ. Sci. Technol., Vol. 7, No. 1, 97–103.
