The purpose of this report is to make a tentative contribution to the knowledge of students’ understanding of two statistical concepts: the median and the mode.
Introduction
The Median
The Mode
Discussion and Conclusions
This article relates to a pilot study made by the author, and tries to answer the following questions connected with the mistakes students tend to make:
What are the difficulties?
How many students have difficulties of this kind?
How does the statistical vocabulary develop with respect to concept?
The pilot study concerns 95 students in the age range 17-21, with
a mathematical background of O-level or TEC Level One mathematics. All
the students were studying for Technician Education Council (TEC) qualifications.
Sixty-nine per cent were studying TEC standard unit TEC U75/012 (i.e. Engineering
Technician students), while the remaining thirty-one per cent were studying
TEC standard unit TEC U76/031 (i.e. Science Technician students). A "standard
unit" consists of general and specific objectives. The TEC units TEC U75/012
and TEC U76/031 are both Level Two units, having common objectives in
certain topic areas. The general objective, in this case, common to both
units is: "The expected learning outcome is that the student calculates
measures of central tendency and dispersion of numerical data". The items
quoted are from the tests constructed for these two units. They are of
multiple-choice format, as follows.
| Question: | Response a, b, c, d | (one of which may be right). |
| | e. | (blank, to be filled in if your answer is different from a, b, c, d). |
| | f. | (if you are unable to answer the item). |
Consider the information given in Table A below, concerning the number of broken biscuits found in some packets of chocolate fingers.
TABLE A
| Number of broken biscuits | Frequency |
| 2 | 5 |
| 4 | 6 |
| 6 | 7 |
| 9 | 2 |
| 11 | 1 |
1. What is the median of the distribution?
d. 8 (2%) e. (24%) f. (6%)
The percentages in brackets give the proportion of students choosing each response, and starred values are the correct answers. The most common response is an answer of 5, which is the median of the figures in the ‘Frequency’ column. The second most popular response is an answer of 6, which is the median of the figures in the ‘Number of broken biscuits’ column. Twenty per cent of the students achieved the correct answer (4), while a further fourteen per cent gave the ‘middle value’ of the ‘Frequency’ column.
The idea that the median is the middle value of something has clearly been grasped. What that something is remains in doubt. Is the difficulty here really one of understanding that a frequency table is only a summary of a list of data, so that an alternative representation of the data would be
2, 2, 2, 2, 2, 4, 4, 4, 4, 4, 4, 6, 6, 6, 6, 6, 6, 6, 9, 9, 11?
This is evidently part of the students’ difficulty.
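The step from frequency table to list can be sketched in a few lines of Python. This is an illustrative sketch only, not part of the original study: the frequencies are read off the list of 21 values given above, and the variable names are assumptions made for the example.

```python
from statistics import median

# Frequency table implied by the list above: value -> frequency.
# (table_a and the other names are illustrative assumptions.)
table_a = {2: 5, 4: 6, 6: 7, 9: 2, 11: 1}

# Expand the summary back into the raw list of 21 observations.
observations = [value for value, freq in table_a.items() for _ in range(freq)]

# The median the item asks for: the middle of the 21 observations.
correct_median = median(observations)

# The two most common wrong routes described in the text.
median_of_frequencies = median(table_a.values())  # 'Frequency' column
median_of_values = median(table_a.keys())         # 'Number of broken biscuits' column

print(correct_median, median_of_frequencies, median_of_values)  # prints: 4 5 6
```

The three numbers differ only in which set of figures the rule "take the middle value" is applied to, which is the doubt the responses reveal.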
An alternative item, asking for the median number of broken biscuits per packet, would perhaps remove for some students the nebulous nature of the word ‘distribution’. If this were the case, it would suggest a lack of understanding of the concept of distribution, which leads the student to ask: median value of what?
However, we can see a rule has been learned: the median is the middle value. So let us now consider another item investigating the students’ ideas of the median in a simpler context.
1.1 The median of the following set of numbers: 1,5,1,6,1,6,8 is:
a. 1 (17%) b. 4 (8%) c. 5 * (50%) d. 6 (21%)
e. (3%) f. (2%)
Once again we see that the majority of students use the rule that the median is a middle value. Twenty-one per cent give the middle value of the list as printed (6), while fifty per cent give the middle value of the ordered list (5). The latter are, of course, correct.
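The arithmetic behind each offered response can be checked with a short Python sketch. It is illustrative only. That responses a and b happen to equal the mode and the mean of the list is an observation made here, not a claim from the study about how students arrived at them.

```python
from statistics import mean, median, mode

numbers = [1, 5, 1, 6, 1, 6, 8]  # the list exactly as printed in item 1.1

# Middle entry of the list as printed (no sorting): the fourth of seven.
middle_of_unordered = numbers[len(numbers) // 2]

# Values that coincide with the four offered responses.
responses = {
    "a": mode(numbers),          # most common value
    "b": mean(numbers),          # arithmetic mean, 28 / 7
    "c": median(numbers),        # middle of the ordered list (correct)
    "d": middle_of_unordered,    # middle of the unordered list
}

print(responses)
```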
The intention, thus far, is to call the reader's attention to the students’ lack of understanding of (or feeling for) the median. What chance do these students have of using these ideas outside the college situation, to follow statistical arguments about pay scales, for example, when only fifty per cent of the sample manage to obtain the correct solution to a fundamentally simple use of the idea of a median?
Further discussion and conclusion will be found after
the section on the mode.
The following item was constructed based on Table A.
2. What is the mode of the distribution?
a. 11 (4%) b. 7 (13%) c. 6 * (68%) d. 1 (1%)
e. (9%) f. (4%)
The majority of the students appear to have learned the rule for the mode as the most popular or most common value, or, with respect to this frequency table, the value which occurs with the greatest frequency. Thirteen per cent appear to have applied the rule to the ‘Frequency’ column to obtain their answer, hence the clarification.
However, the students’ understanding of where the mode fits in the jigsaw of statistical measures is quite interesting. Consider the following item:
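The difference between the correct response and the thirteen per cent response can be sketched in Python. This is an illustrative sketch only: the frequencies are those implied by the list of values given in the section on the median, and the names are assumptions made for the example.

```python
# Frequency table: number of broken biscuits -> frequency.
# (table_a is an illustrative name, not taken from the original test.)
table_a = {2: 5, 4: 6, 6: 7, 9: 2, 11: 1}

# The mode is the VALUE that occurs with the greatest frequency.
mode_value = max(table_a, key=table_a.get)

# The error: reporting the greatest FREQUENCY itself.
largest_frequency = max(table_a.values())

print(mode_value, largest_frequency)  # prints: 6 7
```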
2.1 Which of the following is not a measure
of central tendency (i.e. a measure of location)?
a. mean (5%) b. standard deviation * (55%) c. median (5%) d. mode (24%)
e. (1%) f. (9%)
Nearly a quarter of the students feel that the mode is not a measure of central tendency! Now since at this stage the students are only familiar with measures of central tendency and dispersion, one might assume that the mode would be considered a measure of dispersion. This does not appear to be the case. Given the item:
2.2 Which of the following is a measure of dispersion?
a. frequency (27%) b. mode (0%) c. histogram (40%) d. standard deviation * (26%)
e. (3%) f. (3%)
From this item we can see that the mode is not considered
a measure of dispersion, but what is considered a ‘measure’ of dispersion?
Forty per cent suggest that the histogram is a measure of dispersion (or
spread). A histogram shows the spread of data, but is surely not a measure!
A similar argument holds for the frequency response. Spread can be seen
from a list of frequencies (one may even draw a histogram from them),
but they are not a measure of dispersion. It would appear that the idea
of measure is not clear to the students.
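The distinction between a measure and an illustration can be made concrete with a small Python sketch. It is illustrative only: the data are the broken-biscuit values listed earlier, the names are assumptions, and the population form of the standard deviation is assumed because the text does not say which form the TEC units require.

```python
from statistics import pstdev

# Frequency table: number of broken biscuits -> frequency (illustrative name).
table_a = {2: 5, 4: 6, 6: 7, 9: 2, 11: 1}
observations = [value for value, freq in table_a.items() for _ in range(freq)]

# A measure of dispersion condenses the spread into a single number.
# Assumption: population standard deviation (divide by n).
spread_measure = pstdev(observations)

# An illustration of spread: a crude text histogram, one row per value.
histogram_rows = [f"{value:>2} | {'*' * freq}" for value, freq in table_a.items()]

print(round(spread_measure, 2))
print("\n".join(histogram_rows))
```

The first result is one number that can be compared between data sets; the second is a picture from which spread can be judged by eye but not quoted as a value.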
‘Surface concepts’ seem to have been learned in connection with the median and the mode. Student responses reflect the basic rules of: the median is the middle value and the mode is the most common value.
This achieved, the students appear to have muddled views as to how to apply these rules. Why should such confusion exist? Why should students wonder: which set of numbers do I choose in this frequency table, which column, and so on? What does the concept of median mean to these students? What does the frequency table mean to them? To find the mode we look for the biggest frequency, but to find the median we need to extend our table, or collapse it. This process is only the first step towards finding the median, so it appears that finding the median is a much harder task than finding the mode. Is this a deterrent?
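One way of finding the median without writing out the full list is to extend the table with a running total of the frequencies. The Python sketch below is illustrative only: the function name is hypothetical, the table must be non-empty, and cumulative frequency is one reading of "extending" the table rather than a method prescribed in the text.

```python
def median_from_table(table):
    """Median of data summarised as {value: frequency}; table must be non-empty."""
    total = sum(table.values())
    # 1-based positions of the middle observation(s); equal when total is odd.
    lower_pos = (total + 1) // 2
    upper_pos = total // 2 + 1

    cumulative = 0
    lower = upper = None
    for value in sorted(table):
        cumulative += table[value]  # running total: the extra column
        if lower is None and cumulative >= lower_pos:
            lower = value
        if cumulative >= upper_pos:
            upper = value
            break
    return (lower + upper) / 2


# Frequencies implied by the broken-biscuit data listed earlier.
table_a = {2: 5, 4: 6, 6: 7, 9: 2, 11: 1}
print(median_from_table(table_a))  # prints: 4.0
```

With 21 observations the middle one is the 11th, and the running total first reaches 11 at the value 4. The mode, by contrast, needs only a single scan for the largest frequency.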
If one knows how to construct a frequency table, and one has the elementary idea of what a median is, one might be able to find the median intuitively given a frequency table. Intuition requires a certain amount of feeling for the ideas. Perhaps this feeling is lacking.
When we teach the mean, the median and the mode, we often refer to them as being measures of location, measures of position, or measures of central tendency. This appears to be encouraged in the TEC Units, thus giving the students a ‘peg’ on which to hang the ideas of mean, median and mode. We discuss the properties of these measures and so develop a vocabulary specific to statistics. The vocabulary will enable us to short cut, just as a good notation does. Progress to higher concepts becomes quicker. However, progress is only possible if a certain amount of fluency exists with basic concepts. Communication of ideas becomes harder if the teacher assumes a common understanding of a concept when there is no common understanding to begin with! Student ideas develop in a disjoint fashion, and confusion will be the final outcome.
Students’ intuition about generic terms such as central tendency and dispersion must not be allowed to cloud a definition which may contain certain subtleties. For example, we often talk of dispersion as being the amount of spread. Is it surprising that students are confused when we ask them to differentiate between measures and illustrations of spread, as in 2.2? The student’s ‘surface’ understanding of the basic vocabulary does not allow him to distinguish between a measure and an illustration. He has learned that dispersion is about spread, though, just as he has learned that medians are about middle values. This ‘surface’ understanding, mixed with the student’s intuition, leads to misunderstanding of simple problems and basic concepts.
The problems discussed in this article are connected
with those of statistics instruction at a basic level; they illustrate
the subtleties that exist within these basic concepts.
Reference
Easingwood, T. (1979). The Technician and Business Education Councils, Mathematics in School, 8, 3.