ADVOCACY ISSUES

Understanding Measurement Issues

The identification of learning problems usually involves assessment or measurement. This measurement process often involves determining two numbers: 1) an estimate of ability or potential; and 2) an estimate of actual achievement in academic skills.

To understand measurement, it is important to understand that we cannot measure anything exactly, any more than we can know precisely how many ounces of cereal are in a box. Any measurement is an estimate, and how good the estimate is depends on how good the instrument of measurement is.

Experience has shown that measuring mental events is not that different from measuring other biological characteristics, such as height. That is, people only vary so much, from the shortest to the tallest individual, with most people clustering around the middle or average. The degree to which people are dispersed (vary above and below the mean or average) is indicated by the "standard deviation." In educational assessment, most of the instruments we use share the same statistical properties, so we can calculate "standard scores" that are comparable between instruments. These standard scores commonly have a mean of 100 (the 50th percentile) and a standard deviation of 15 points. This means that the further someone is from the mean, counted in 15-point units or standard deviations, the more notably he or she differs from the average. (There is also another benchmark called the "standard error of measurement," but we aren't going to get into it here. Suffice it to say, it is a way of recognizing that no test measures what we want it to with perfect consistency, and of estimating how likely it is that a student would have gotten a different score on a different day.)
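As a rough illustration of the standard-score scale just described, the sketch below converts a score to standard-deviation units and to an approximate percentile. It assumes scores follow a normal (bell-shaped) distribution, which is only an approximation for any real test; the function names are made up for this example.

```python
import math

MEAN = 100.0  # mean of the standard-score scale described above
SD = 15.0     # standard deviation of that scale


def z_score(standard_score, mean=MEAN, sd=SD):
    """Distance from the mean, counted in standard-deviation units."""
    return (standard_score - mean) / sd


def percentile(standard_score, mean=MEAN, sd=SD):
    """Approximate percentile rank, ASSUMING a normal distribution of scores."""
    z = z_score(standard_score, mean, sd)
    # Cumulative normal probability, expressed via the error function.
    return 100.0 * 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


if __name__ == "__main__":
    for score in (70, 85, 100, 115, 130):
        print(score, round(z_score(score), 2), round(percentile(score), 1))
```

Under this assumption, a score of 100 sits at the 50th percentile, and a score of 115 (one standard deviation above the mean) sits at roughly the 84th percentile.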

The measurement process becomes more complicated when we use two tests and then compare their standard scores. (Standard scores don't have to be expressed like IQs; they can also be reported on other scales, such as a "z score" or a "T score." In education, however, we are usually talking about scores with a mean of 100 and a standard deviation of 15.) When assessing a learning disability, it is common to give an intellectual test (which yields an IQ score) and a test of academic achievement or other skills (which yields another standard score). As a rule of thumb, we can tolerate achievement or other skills falling as much as 15 points below the IQ, because that is still within the average range. However, as discrepancies get larger, we realize this may not be a random event, but that it actually means something! And what it means is that this student is having significant problems in the skills that have been assessed.
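The comparison just described can be sketched in a few lines. This is only an illustration of the 15-point rule of thumb, not a diagnostic procedure; the function names are invented for the example, and the conversions use the usual conventions of a z score (mean 0, standard deviation 1) and a T score (mean 50, standard deviation 10).

```python
MEAN = 100.0  # mean of the IQ-style standard-score scale
SD = 15.0     # standard deviation of that scale


def to_z(standard_score):
    """Convert an IQ-style standard score to a z score (mean 0, SD 1)."""
    return (standard_score - MEAN) / SD


def to_t(standard_score):
    """Convert an IQ-style standard score to a T score (mean 50, SD 10)."""
    return 50.0 + 10.0 * to_z(standard_score)


def discrepancy(iq, achievement):
    """Points by which achievement falls below IQ (negative if it is above)."""
    return iq - achievement


def exceeds_rule_of_thumb(iq, achievement, cutoff=SD):
    """True when achievement is MORE than `cutoff` points below IQ.

    The default cutoff of 15 points is the rule of thumb from the text,
    not an official eligibility standard.
    """
    return discrepancy(iq, achievement) > cutoff


if __name__ == "__main__":
    print(to_z(85), to_t(85))              # the same score on three scales
    print(exceeds_rule_of_thumb(100, 85))  # exactly 15 points: tolerated
    print(exceeds_rule_of_thumb(100, 80))  # 20 points: worth a closer look
```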

In estimating discrepancies between ability and actual functioning in a particular skill, one can quibble about which IQ score best represents a student's "ability." Unless there is good reason to believe otherwise, we assume the child is about average if he or she has an IQ of 100. The score used is usually the Full Scale IQ (FSIQ) on an intelligence test like the Wechsler Intelligence Scale for Children (WISC-III). Actually, a complex test like the WISC-III yields not just one score but several. However, we don't expect these scores to vary much from the mean (or from each other). We also don't expect the Verbal IQ (VIQ) and Performance IQ (PIQ) to differ all that much.

Sometimes, however, the child functions well above the mean (say at FSIQ 115, or one standard deviation above the mean), or he or she has a significant discrepancy between verbal and nonverbal ability (for example, the FSIQ is 115, but the PIQ is 130 and the VIQ is 100). In the first case, we would expect academic functioning also to be well above average, or roughly equal to 115. In the second case, we could argue that academic functioning should actually be closer to 130. In this second case, there may be reasons why the child doesn't do well on certain subtests of the IQ test (such as an impoverished or chaotic family environment, or a psychiatric diagnosis). Whatever the reasons, it is important to determine whether the child's actual ability is uneven, what areas are affected, and whether the FSIQ may actually be an underestimate of his or her true ability or potential.

There is a further complication. Some ability and achievement tests are coordinated when they are first developed, so that we don't have to use an arbitrary measure of discrepancy such as a difference of one standard deviation (15 points) or more (in California, for example, a discrepancy of 22 points is required between ability and achievement in order for a learning disability to be diagnosed). These tests are more highly correlated with one another, and so we expect a better fit between ability and achievement. In these cases, when looking at discrepancy in terms of standard deviations, we might be looking at differences in units of a lot less than 15 points. This becomes an issue in the area of advocacy, where it is important to set guidelines for measuring "learning disability" as carefully as possible.
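One way to see why highly correlated tests shrink the unit of discrepancy is the standard formula for the spread of a difference between two scores. If both tests have a standard deviation of 15 and correlate at r, the standard deviation of the difference (ability minus achievement) is 15 times the square root of 2(1 - r). The sketch below is an illustration under that assumption only. The correlation values are hypothetical, not figures from any published test manual, and the function names are invented for the example.

```python
import math

SD = 15.0  # standard deviation shared by both tests (assumed)


def sd_of_difference(r, sd=SD):
    """SD of (ability - achievement) when both tests share `sd` and correlate at r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("correlation must be between -1 and 1")
    return sd * math.sqrt(2.0 * (1.0 - r))


def discrepancy_in_sd_units(ability, achievement, r, sd=SD):
    """Express a point discrepancy in units of the difference score's own SD."""
    spread = sd_of_difference(r, sd)
    if spread == 0.0:
        raise ValueError("perfectly correlated tests have no spread of differences")
    return (ability - achievement) / spread


if __name__ == "__main__":
    # Hypothetical correlations, chosen only to show the trend.
    for r in (0.5, 0.6, 0.7, 0.8):
        print(r, round(sd_of_difference(r), 1))
```

With a correlation of 0.5 the unit works out to exactly 15 points; as the correlation rises above that, the unit drops below 15, so the same point difference counts as a larger discrepancy.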