In the match programme for the Second Division game between Swansea
City and Grimsby (1), in a section headed Soccer Surprises, I really
was surprised to read the following paragraph:
We have at least one mathematician among our followers, Professor Hawkes of Swansea University; here’s one for your students. What are the odds against Chesterfield being drawn against Doncaster in the First Round of the F.A. Cup and against Oldham in the 2nd two seasons in succession? How about 3081-1? Oh, it happened! 1960-61 and 1961-62
Of course I couldn’t resist the challenge, and I offer my thoughts on the subject in this journal because I felt that many a schoolboy (not to mention a professor or two) would find the problem interesting. The academic point that it illustrates is that it can sometimes be very difficult to decide what event you are dealing with and that, even if you can define the event properly, you may have to make do with some rough bounds on the probability of that event instead of a precise probability.
Each year eighty clubs enter the draw for the first round of the F.A. Cup. Of these, forty-eight are the current members of the third and fourth divisions of the Football League and the remainder are non-league clubs which have won through some little-publicised system of qualifying rounds. As a result of the draw there are of course forty matches played and the winners go into the second round. The draw for the second round thus contains forty clubs, resulting in twenty matches whose winners go on into the third round. In the third round they are joined in the draw by the forty-four first and second division clubs, and thereafter the competition proceeds in straightforward fashion up to the final.
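The counts in this description can be tallied in a few lines. This is only a sketch of the arithmetic stated above; the variable names are illustrative and not taken from any source.

```python
# Sizes taken from the description of the competition; names are illustrative.
first_round_clubs = 80
league_clubs = 48                                   # third and fourth division clubs
non_league_clubs = first_round_clubs - league_clubs

first_round_matches = first_round_clubs // 2
second_round_clubs = first_round_matches            # one winner per match
second_round_matches = second_round_clubs // 2
third_round_clubs = second_round_matches + 44       # plus first and second division clubs

print(non_league_clubs, first_round_matches, second_round_matches, third_round_clubs)
```

So thirty-two non-league clubs enter the first round, and sixty-four clubs go into the third round draw.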
In the quoted paragraph we have an example of the well-known, and dangerous,
practice of saying "look, something with small probability has actually
happened - isn’t that surprising!" The truth is that events of small probability
are happening all the time but we tend only to notice the ones which have
some interesting feature such as symmetry or a run of consecutive events.
Furthermore, if you only notice the event after it has already happened,
you may have great difficulty in deciding what event you are really interested
in. To illustrate the point formally we consider the usual basic structure
of a probability space Ω whose elements ω are the elementary outcomes of a
random experiment. An event is defined to be a subset E ⊆ Ω of elementary
outcomes. If, when the experiment is performed, you observe the outcome ω
then you also observe every possible event E which contains ω. In the problem
at hand, which particular E are we talking about? Here is a list, by no means
exhaustive, of possible events of interest.
E1: "X plays Y and Z in the 1st and 2nd rounds in year n"
E2: "X plays the same teams in the 1st and 2nd rounds in year n as they did in year n - 1"
E3: "X plays Y and Z in the 1st and 2nd rounds in both year n - 1 and year n"
E4: "In year n at least one team played the same teams in the first two rounds as they did in year n - 1"
E5: "The event E4 occurred at least once in 40 years"
(As quoted X, Y, Z are Chesterfield, Doncaster and Oldham - but they could be any other teams.)
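The remark that a single observed outcome carries with it every event that contains it can be made concrete on a toy sample space. The sketch below is only an illustration: the four-element space and the function name `events_containing` are assumptions made for the example and have nothing to do with the real draw.

```python
from itertools import combinations

def events_containing(omega, sample_space):
    """Return every subset of sample_space (every event) that contains omega."""
    items = sorted(sample_space)
    events = []
    for size in range(len(items) + 1):
        for subset in combinations(items, size):
            if omega in subset:
                events.append(frozenset(subset))
    return events

sample_space = {"a", "b", "c", "d"}   # toy space, not the F.A. Cup draw
observed = events_containing("a", sample_space)
print(len(observed), "of", 2 ** len(sample_space), "events occurred")
```

Even with four elementary outcomes, half of all possible events occur whenever any one outcome is observed, which is why the event of interest has to be chosen with care.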
Before trying to assign probabilities to any of these events we may wish to specify some special conditions C which are assumed to hold, so that we calculate a conditional probability P(E|C). A short list, again incomplete, of relevant conditions is:
C1: "X, Y, Z are in the 1st round draw in year n"
C2: "X reached the 2nd round in year n - 1, and they and the teams they played are in the 1st round draw in year n"
C3: "X, Y, Z are in the 1st round draw in both year n - 1 and year n"
C4: "X is in the 1st round draw in both year n - 1 and year n"
In the article quoted an attempt seems to be made to find
P(E1|C1) = P(E2|C2)
These are equal because the probability P(E1|C1) does not actually depend on the particular teams which Y, Z happen to be, so you might as well take them to be the teams which X played against the previous season.
Presumably the argument goes:
p = 1/79 × 1/39 = 1/3081
Hence the odds are 3080 to 1 against [n.b. not 3081 to 1, because odds are given by (1 - p)/p]. However, this seems to make the unrealistic assumption that X and Z are sure to win their 1st round matches. I suggest that it is reasonable to take a probability of 1/2 of winning, so that the above should be
p = 1/79 × 1/39 × 1/2 × 1/2 = 1/12324
and it is this revised value of p that is used in what follows.
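The factor 1/79 for the first round can be checked by simulation. The sketch below assumes that the draw is equivalent to shuffling the eighty clubs and pairing them off in order, so that every opponent is equally likely; the function name, the club labels 0 and 1, the number of trials and the seed are all illustrative choices.

```python
import random

def estimate_pairing_probability(n_clubs=80, trials=20000, seed=1):
    """Estimate the chance that club 0 is drawn against club 1 in a random draw."""
    rng = random.Random(seed)
    clubs = list(range(n_clubs))
    hits = 0
    for _ in range(trials):
        rng.shuffle(clubs)
        # Positions (0, 1), (2, 3), ... in the shuffled list form the matches,
        # so the opponent sits at the position with the last bit flipped.
        position = clubs.index(0)
        opponent = clubs[position ^ 1]
        if opponent == 1:
            hits += 1
    return hits / trials

estimate = estimate_pairing_probability()
print(estimate, 1 / 79)
```

The estimate should lie close to 1/79 ≈ 0.0127. The same reasoning with forty clubs gives the factor 1/39 for the second round.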
The question actually asked seems more like
P(E3|C3) = (1/12324)² = 1/151,880,976
giving odds of more than 150 million to 1!
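The figures so far can be reproduced with exact fractions. This is a sketch under the assumptions already stated in the text (each match won with probability 1/2) plus one further assumption, that the two seasons are independent, which is what squaring the single-season probability amounts to. The helper name `odds_against` and the variable names are illustrative.

```python
from fractions import Fraction

def odds_against(p):
    """Odds against an event of probability p, i.e. (1 - p) / p."""
    return (1 - p) / p

p_programme = Fraction(1, 79) * Fraction(1, 39)               # ignores the need to win
p_with_wins = p_programme * Fraction(1, 2) * Fraction(1, 2)   # X and Z each win round 1
p_two_seasons = p_with_wins ** 2                              # same thing two years running

for label, prob in [
    ("programme", p_programme),
    ("with wins", p_with_wins),
    ("two seasons", p_two_seasons),
]:
    print(label, prob, "odds against:", odds_against(prob))
```

The three lines give odds of 3080, 12323 and 151,880,975 to 1 against respectively.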
So far I have been getting round the problem of non-league clubs, and of promotion and relegation between the 3rd and 2nd divisions, by assuming as part of the conditions that the relevant teams were actually in the 1st round draw (the unconditional probability P(E3) clearly has the trivial answer zero if you talk about the 1980/81 season, since Oldham were then in the 2nd division). Putting this condition in makes the odds smaller than the ones that we would really like to calculate (but can’t because it is too difficult). Thus P(E2|C4) is less than p/2 because X’s opponents in year n - 1 might not be in the 1st round in year n.
Now the particular named teams are of no special interest (they merely happened to be the ones involved in the noted situations). Therefore a more interesting probability is P(E4). This is difficult, but we can certainly say P(E4) ≤ 40p, since P(∪ Ai) ≤ Σ P(Ai) for any set of events Ai. On the other hand I would estimate (guessing somewhat here) that P(E4) > 10p, so that the odds are somewhere between 307 to 1 and 1231 to 1. My feeling is that the nearest thing to a sensible probability is perhaps P(E5), because the records could have been scanned for several years. This is given by P(E5) = 1 - (1 - P(E4))^40. Bearing in mind the inequalities given for P(E4), this leads to odds of between 30 to 1 and 7 to 1 against E5. So perhaps it is not so surprising after all!
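These bounds are easy to reproduce numerically. The sketch below takes p = 1/12324, the value consistent with the odds quoted for E4, and uses the formula for P(E5) exactly as given; the function names are illustrative.

```python
def odds_against(prob):
    """Odds against an event of probability prob, i.e. (1 - prob) / prob."""
    return (1 - prob) / prob

def at_least_once(prob, years=40):
    """P(E5) = 1 - (1 - P(E4))^years, as given in the text."""
    return 1 - (1 - prob) ** years

p = 1 / 12324
e4_low, e4_high = 10 * p, 40 * p     # guessed lower bound and union upper bound

print("E4 odds against:", odds_against(e4_high), "to", odds_against(e4_low))
print("E5 odds against:", odds_against(at_least_once(e4_high)),
      "to", odds_against(at_least_once(e4_low)))
```

The E4 odds come out at roughly 307 to 1 and 1231 to 1, and the E5 odds at roughly 7 to 1 and 30 to 1, in agreement with the figures quoted.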
Another Problem
On p.37 of report number 15 of the Association of Football Statisticians (2), I read:
University College of Swansea
References
1. Swansea City A.F.C., Programme for League Division Two match against Grimsby Town, 24 October, 1980.
2. Association of Football Statisticians. Report number 15, November 1980.