Measuring Uncertainty
P. G. MOORE

When analysing situations in which a choice must be made between alternative courses of action under conditions of uncertainty, decision makers and their advisers are often called upon to assess judgmental probability distributions for quantities whose true values are unknown to them. How can this judgement be taught?

Probabilities reflect a person’s uncertainties about the external world, present and future, and describe a relationship between that person and his environment. They may thus legitimately vary from one person to another. This can worry managers who point to items such as historical accounting data where there is one correct answer whoever carries out the work. Yet this is hardly a true analogy; indeed the current debate on inflation accounting, and the many different opinions that have been expressed, centres around the real difficulty encountered in summarising all the past activities and the future potential of some organisation in a single unidimensional set of figures.

In expressing concern with future outcomes one must distinguish between a forecast and a prediction. A forecast is linked with an outcome under defined conditions or assumptions and may be expressed in probabilistic terms. A prediction is a statement as to what will happen in some given sphere of activity. Thus one could forecast future unemployment in 1985 under a series of defined assumptions about taxes, subsidies, etc. Alternatively one could predict unemployment in 1985 by taking into account views as to changes that may be made in matters like taxation or subsidies, import controls, etc. The two approaches are rather different. Forecasting in a probabilistic sense is not a meaningless exercise, nor is it impossible to measure the relative abilities of one forecaster versus another even though at any one trial some defined event may or may not happen. For example, the weather forecaster for tomorrow could either give a prediction (rain or no rain) or alternatively a forecast, e.g. it will rain with probability 0.8. Over a period we could study his forecasts and the subsequent outcomes and judge them against another forecaster. What is true, moreover, is that with training and experience many such forecasters would improve, and hence training and experience in making forecasts is desirable.
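
One way to judge one probability forecaster against another over a period can be sketched as follows. The scoring rule used here, the Brier score (the mean squared difference between the forecast probability and the 0/1 outcome, lower being better), is an assumption chosen for illustration and is not prescribed in the text; the ten-day record and both sets of forecasts are invented.

```python
def brier_score(forecasts, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    if len(forecasts) != len(outcomes):
        raise ValueError("forecasts and outcomes must have the same length")
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)

# Hypothetical ten-day record: 1 = rain fell, 0 = no rain.
outcomes = [1, 1, 0, 1, 0, 0, 1, 1, 0, 1]

# Forecaster A gives probabilities; forecaster B gives flat predictions
# (rain or no rain), which amount to probabilities of 1 or 0.
forecaster_a = [0.8, 0.7, 0.2, 0.9, 0.3, 0.1, 0.6, 0.8, 0.4, 0.7]
forecaster_b = [1.0, 0.0, 0.0, 1.0, 1.0, 0.0, 1.0, 1.0, 0.0, 0.0]

score_a = brier_score(forecaster_a, outcomes)
score_b = brier_score(forecaster_b, outcomes)
print(f"A: {score_a:.3f}  B: {score_b:.3f}")
```

On this invented record the probabilistic forecaster scores better, because the flat predictions are penalised heavily on the three days on which they were wrong.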

The exercises in this article aim to help individuals determine how well they perform such assessments; whether they demonstrate any consistent biases; how they can improve their performance. In general, people tend to believe they are more certain than they really are, and hence tend to underestimate their degree of uncertainty. The five exercises given in this paper are laid out in a defined sequence. The first two exercises deal with verbal descriptions of uncertainty and tying them to the probability scale. Exercise 3 aims to see how consistent assessors are in true/false situations. Exercises 4 and 5 deal with situations where there are differing amounts of information available to those making an assessment.

No special knowledge is required for the exercises, although some elementary knowledge of probability and the zero-one scale is assumed. Since the exercises cannot be based on future events, they are built around questions whose correct answers are available but about which participants are unlikely to be completely certain. A brief commentary is given with each exercise. Readers who wish to study the subject further are referred to the papers given in the references, in which the principles, but not the details, behind the various exercises are discussed in more depth.

Exercise 1

Participants are asked to rank the following list of twelve verbal expressions, each of which conveys some sense of uncertainty, into their preferred order, the top of the list giving the highest degree of certainty, and the bottom item on the list the lowest degree of certainty.

    Probable       Uncertain      Unlikely       Hoped
    Certain        Possible       Credible       Expected
    Impossible     Doubtful       Conceivable    Likely

    When a class has all done this, the average rank for each expression can be computed and an overall ranked order obtained. Each participant can then be compared with the overall list. Everybody will presumably have put Certain at the top of their list and Impossible at the bottom, but there will be substantial variations in between. Alternatively, some particular pairs of expressions can be looked at, and an examination made to see how often they have been ranked AB and how often BA. It could be pointed out to participants that where resources are being allocated on such verbal assessments of uncertainty, misunderstandings could easily arise.
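
The class tabulation for this exercise might be sketched as below. The function names, the three-person class and the shortened list of five expressions are all hypothetical; a real class would use the full list of twelve.

```python
from collections import defaultdict

def average_ranks(rankings):
    """rankings: one list per participant, ordered from most to least certain.
    Returns {expression: average rank}, where rank 1 is the top of the list."""
    totals = defaultdict(float)
    for ranking in rankings:
        for position, expression in enumerate(ranking, start=1):
            totals[expression] += position
    n = len(rankings)
    return {expression: total / n for expression, total in totals.items()}

def pairwise_counts(rankings, a, b):
    """Return (times a was ranked above b, times b was ranked above a)."""
    a_first = sum(1 for r in rankings if r.index(a) < r.index(b))
    return a_first, len(rankings) - a_first

# Hypothetical class of three, using only five of the twelve expressions.
rankings = [
    ["Certain", "Probable", "Likely", "Possible", "Impossible"],
    ["Certain", "Likely", "Probable", "Possible", "Impossible"],
    ["Certain", "Likely", "Possible", "Probable", "Impossible"],
]

avg = average_ranks(rankings)
overall = sorted(avg, key=avg.get)          # overall ranked order
ab, ba = pairwise_counts(rankings, "Probable", "Likely")
print(overall)
print(f"Probable above Likely: {ab}, Likely above Probable: {ba}")
```

Each participant's own list can then be set beside `overall`, and the pair counts show how often two expressions were ranked AB and how often BA.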

    Exercise 2

    One member of a group is asked (privately and beforehand) a question such as ‘how do you rate the chance of Liverpool being top of the Football League first division next year?’ and asked to use one of the phrases in Exercise 1 to describe his view. Suppose he gives the answer ‘possible’. The rest of the group are now asked to put the word ‘possible’ on the zero-one scale of probability, using an interval of 0.05. A frequency distribution is compiled of the answers and shown to the group. Commonly the frequency distribution obtained has a fairly wide spread, showing that the use of expressions to describe uncertainty can lead to misunderstandings, and to very different concepts in the minds of the receivers of such information. As a follow-up to this exercise, the group could be asked individually to give their own assessments of the probability that Liverpool would be top next year. These could be compared with that given by the original respondent and the reasons for the variations in probability (e.g. information about football) discussed.
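
A minimal sketch of compiling the frequency distribution on the 0.05 grid is given below. The function name and the eight answers are invented for illustration.

```python
from collections import Counter

def frequency_distribution(assessments, step=0.05):
    """Tally assessments on a grid of width `step` over the zero-one scale.
    Returns {grid value: count}, including grid values nobody chose."""
    n_steps = round(1 / step)
    # Work in whole numbers of steps to avoid floating-point key mismatches.
    counts = Counter(round(a / step) for a in assessments)
    return {round(k * step, 2): counts.get(k, 0) for k in range(n_steps + 1)}

# Hypothetical answers from a group of eight for the word 'possible'.
answers = [0.2, 0.35, 0.5, 0.35, 0.6, 0.1, 0.5, 0.4]

dist = frequency_distribution(answers)
chosen = [value for value, count in dist.items() if count > 0]
for value in chosen:
    print(f"{value:4.2f}  {'*' * dist[value]}")
print("spread:", min(chosen), "to", max(chosen))
```

Printing one star per answer gives a rough histogram that can be shown to the group directly.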

    Exercise 3

    Participants are shown a series of questions (of which 10 are given below as specimens and others can be made up very easily). At least 20 such questions should be used. Each question can be written on a card and numbered. The cards are then passed round from participant to participant or, alternatively, they can be duplicated on two or three sheets and a copy issued to all concerned simultaneously.

    Each question has a ‘correct’ answer, namely (a) or (b), but the participant will commonly not be absolutely certain as to which is the true answer. He is therefore asked to assess his degree of uncertainty by marking the two possible answers with values p and 1 - p respectively, where p is his assessed probability that (a) is the correct answer. Thus, if he is certain that (b) is the correct answer, he will return 0, 1 as his assessment. If he is completely undecided and ambivalent between the two possible answers, he will return 0.5, 0.5 as his assessment. It is suggested that participants stick to the eleven values 0, 0.1, 0.2, ..., 0.9, 1 for their assessments.

    Possible questions:

    1. Which is the longer canal:
       (a) Suez
       (b) Panama?

    2. Which country has the greater population:
       (a) France
       (b) Federal Republic of Germany?

    3. Which is further (as the crow flies) from London:
       (a) Chicago
       (b) Bombay?

    4. The number of road accident deaths per annum in Great Britain is over 6000:
       (a) True
       (b) False.

    5. The percentage of children who go to University in the UK is over 30 per cent:
       (a) True
       (b) False.

    6. Over 50 per cent of the overseas visible trade of the UK is with European countries:
       (a) True
       (b) False.

    7. The percentage of stake money in football pools paid out as prizes is less than 35:
       (a) True
       (b) False.

    8. The Battle of Culloden was fought in:
       (a) 1682
       (b) 1746.

    9. The height of Mont Blanc is:
       (a) over 17,000 feet
       (b) under 17,000 feet.

    10. In boxing parlance, featherweight is heavier than bantamweight:
       (a) True
       (b) False.

    Answers: 1(a) 2(b) 3(a) 4(a) 5(b) 6(a) 7(a) 8(b) 9(b) 10(a)

    When the questions have been answered they are first sorted into groups according to the higher probability given for each question (0.6, 0.7, 0.8, 0.9 or 1.0), the 0.5 answers being put on one side for the moment. For each of the five groups the proportion of answers in which the alternative given the higher probability proved to be correct (the ‘hit-ratio’) is calculated. For the 0.5 group this is assumed to be 0.5. A graph is plotted of the ‘hit-ratio’ against the assessed probability. The group should be invited to comment on the form that the graph takes, as seen against the straight line from (0.5, 0.5) to (1.0, 1.0) that would occur if the forecasts were perfect in a proportional sense, i.e. 70 per cent of the forecasts labelled 0.7 were correct, etc. It is useful as well to put underneath each level of forecast (0.5, 0.6, etc.) the number of such forecasts made in total by the class. A discussion could follow of the U.S.A. system of forecasting rainfall and how weather forecasters might be judged (see the paper by Murphy and Winkler in the references).
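
The hit-ratio tabulation can be sketched as follows. The function name and the single participant's assessments are invented; the answer key is the one given for the ten specimen questions. In class the assessments of all participants would be pooled before the ratios are computed.

```python
from collections import defaultdict

def hit_ratios(assessments, answers):
    """assessments: p values, each the assessed probability that (a) is correct.
    answers: the correct answers, each 'a' or 'b'.
    Returns {higher probability: (hit-ratio, number of forecasts)}."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for p, correct in zip(assessments, answers):
        higher = round(max(p, 1 - p), 1)     # the larger of p and 1 - p
        totals[higher] += 1
        if higher == 0.5:
            continue                         # no favoured answer to check
        favoured = "a" if p > 0.5 else "b"
        if favoured == correct:
            hits[higher] += 1
    result = {}
    for level in sorted(totals):
        # The hit-ratio for the 0.5 group is taken to be 0.5 by convention.
        ratio = 0.5 if level == 0.5 else hits[level] / totals[level]
        result[level] = (ratio, totals[level])
    return result

answers = list("abaabaabba")                 # key for specimen questions 1 to 10
one_participant = [0.8, 0.3, 0.7, 0.5, 0.2, 0.9, 0.3, 0.0, 0.4, 1.0]  # invented

ratios = hit_ratios(one_participant, answers)
for level, (ratio, count) in ratios.items():
    print(f"assessed {level:.1f}  hit-ratio {ratio:.2f}  forecasts {count}")
```

Plotting the hit-ratio against the assessed level, with the count written underneath each level, gives the graph described above; points below the diagonal indicate overconfidence at that level.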

    Exercise 4

    The participants are given the following description: ‘A group of 100 professional people consists of 60 engineers and 40 lawyers. The following summaries relate to six different individuals drawn at random from the group. Give your assessed probability that in each situation the individual referred to is an engineer’.

    (a) Mr A is very active in politics and is a good public speaker. He is greatly concerned in his spare time with the financial problems that fall on the one-parent family and gives a lot of unpaid help at the local Citizens Advice Bureau.

    (b) Mr B is 35 years old, married with 3 children. He passed his ‘O’ levels in 9 subjects, is well respected by his colleagues and expected to do well in his career.

    (c) Mr C took a degree at Imperial College London before taking a job in Birmingham where he has been for the past 10 years. He finds his hours of work somewhat unpredictable and this inhibits his interest in the local Music Society’s activities.

    (d) Mr D lives in the suburbs of London and commutes to the City each day. He is keen on gardening and is a well known exhibitor of dahlias at local flower shows.

    (e) Mr E studied mathematics, physics and chemistry at ‘A’ level at school, going on to University where he obtained a good degree. He is good with his hands and spends much of his spare time tinkering with old cars.

    (f) The card relating to Mr F unfortunately got spoilt and is completely unreadable.

    The point here is to see how individuals use the information. Presumably, in the absence of any information beyond that given in the opening sentence, the assessed probability would be 0.6. This should apply to individual (f), and to individual (b), where the information is equally applicable to an engineer or a lawyer. Of the others, (a) and (d) seem to suggest a lawyer rather than an engineer, (c) and (e) the reverse. But these ‘feels’ should shift the probability below 0.6 in the first case and above 0.6 in the second. Commonly people seem to start from the 0.5/0.5 situation and move from that, rather than from the 0.6/0.4 prior situation.
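
The effect of starting from the 0.6 prior rather than from 0.5 can be illustrated with Bayes' theorem in odds form. This is a sketch only: the likelihood ratios below (how much more probable a description is for an engineer than for a lawyer) are invented figures standing in for a participant's ‘feel’, and the function name is hypothetical.

```python
def posterior_engineer(prior, likelihood_ratio):
    """Posterior probability of 'engineer' given a description.
    likelihood_ratio = P(description | engineer) / P(description | lawyer)."""
    odds = prior / (1 - prior) * likelihood_ratio
    return odds / (1 + odds)

# Invented likelihood ratios for three kinds of description.
uninformative = 1.0      # e.g. (b) or (f): fits either profession equally
suggests_lawyer = 0.5    # e.g. (a) or (d)
suggests_engineer = 3.0  # e.g. (c) or (e)

for label, lr in [("uninformative", uninformative),
                  ("suggests lawyer", suggests_lawyer),
                  ("suggests engineer", suggests_engineer)]:
    from_prior = posterior_engineer(0.6, lr)   # starting from 60 engineers in 100
    from_half = posterior_engineer(0.5, lr)    # the common but mistaken start
    print(f"{label:18s} from 0.6: {from_prior:.2f}   from 0.5: {from_half:.2f}")
```

With the same ‘feel’ for a description, the assessment that starts from 0.5 is lower every time than the one that starts from the stated 60/40 split.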

    Exercise 5

    The following question is posed to the group (as individuals): ‘Electronic components are being manufactured as a continuous process and a random sample from the output is tested from time to time to check on quality, using an accelerated test basis. Not more than 5 per cent defectives would be expected under the accelerated test in the long run. Two schemes of testing are considered. Under scheme A a random sample of 100 components from each day’s production is tested and the percentage of defectives recorded. Under scheme B a random sample of 50 components is tested each morning and the percentage of defectives recorded, the procedure being repeated for each afternoon’s production. The process is stopped and checked if the percentage of defectives found in a sample examined is 10 or greater. Would you expect to stop the process for checking (assuming that it is running normally with about 5 per cent defectives):
     

    (a) more frequently under scheme A
    (b) more frequently under scheme B
    (c) equally frequently under scheme A or scheme B?’
     
    Whilst those who have studied sampling will no doubt indicate (b) as the correct answer, many will suggest (c) and find difficulty in seeing why this is not the true situation. The exercise can be used to reinforce the elements of sampling theory.
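
For those who find (c) persuasive, the two stopping probabilities can be computed directly. The sketch below assumes that each component is defective independently with probability 0.05, so that the number of defectives in a sample is binomial; the function name is hypothetical.

```python
from math import comb

def prob_at_least(n, k, p):
    """P(X >= k) for X distributed as Binomial(n, p)."""
    return 1 - sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k))

p = 0.05                              # process running normally

# Scheme A: one sample of 100 per day, stop at 10 or more defectives (10 per cent).
stop_a = prob_at_least(100, 10, p)

# Scheme B: two samples of 50 per day, stop at 5 or more defectives (10 per cent).
stop_b = prob_at_least(50, 5, p)
stops_per_day_b = 2 * stop_b          # expected number of stops per day

print(f"Scheme A, per sample of 100: {stop_a:.3f}")
print(f"Scheme B, per sample of 50:  {stop_b:.3f}")
print(f"Scheme B, expected stops per day: {stops_per_day_b:.3f}")
```

Under these assumptions a sample of 50 reaches the 10 per cent level roughly one time in ten, against roughly one time in thirty-five for a sample of 100, because the percentage of defectives varies more in the smaller sample; scheme B also tests twice a day.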

    References

    Alpert, M. and Raiffa, H. (1969). Report on the training of probability assessors. Harvard Business School.

    Moore, P. G. and Thomas, H. (1976). The Anatomy of Decisions. Penguin.
    Moore, P. G. (1977). The Manager’s Struggles with Uncertainty. Journal of the Royal Statistical Society, Series A, 140, 129-148.
    Murphy, A. H. and Winkler, R. L. (1977). Reliability of subjective probability forecasts of precipitation and temperature. Applied Statistics, 26, 41-47.
    Tversky, A. (1974). Assessing Uncertainty. Journal of the Royal Statistical Society, Series B, 36, 148-159.
    Winkler, R. L. (1967). The assessment of prior distributions in Bayesian analysis. Journal of the American Statistical Association, 62, 776-800.
     