Why Test Scores Result The Way They Do
By Dr. James Cox
JK Educational Associates, Inc.
March, 2000
One of the most common uses to which test scores are put is to evaluate the quality of educational programs or to compare scores from one testing point to another to assess "growth." If the scores don't support what we're hoping to see, a common conclusion is that some limitations exist within the instructional program and that "fixing" is in order.
When educators or community members draw the above conclusion, they are assuming that the only variable that affects test scores is program quality. This, however, is not the case. Scores result the way they do for six reasons: demographics; physical environment for testing; attitudes of the teachers and students toward the testing program; students' test taking skills; alignment of curriculum content with test content; and program quality. The purpose of this paper is to briefly discuss these six reasons and to remind all of us that test scores do not point to cause. Test scores report "what is." All they give us is a status report. We must answer "Why?" Discovering the "why" requires further investigation. Assuming cause from a score is irresponsible.
Demographics
Three significant issues come to mind when discussing demographics. These are English proficiency, mobility, and affluence or socio-economic status.
The first is an obvious point. When a student has limited English skills and tests in English, the resulting score will not measure what the student knows in the area being tested. For example, I do not know a language other than English, but I can read very well. If I were to take a reading test in any language other than English, I would score close to the chance level. Those who would then conclude that I lack reading skills or that the educational program being employed to teach me reading is weak are simply not using common sense. There must be an agenda other than to assess fairly my reading skills or the quality of the instructional program. I find it very hard to contain myself from screaming at those who degrade educational programs using scores from a school whose limited English population is relatively high.
The second issue with demographics is mobility. When a school population is highly mobile, there is a large number of students whose test scores are not a result of the educational program of the school they now attend. To combine these scores (be they high or low) with those from the stable population diminishes the accuracy of any conclusion drawn about the quality of the instructional program of that school.
And third, there is a predictable link between the affluence of a group of students and their resulting test scores. While this phenomenon is never to be interpreted as a cause and effect relationship, the relationship is there, nevertheless. In two California assessment programs of the 1980s and early 1990s, both since discontinued, the statistical correlation between the affluence of a school and that school's test scores was reported to be between .85 and .90. This relationship is so high that it sometimes is difficult for educators to keep cause and effect out of the thought process. Remember, as soon as we conclude that affluence is the cause of high test scores and poverty is the cause of low scores, we have just disempowered ourselves as educators to make a difference in the lives of poor kids. We must constantly remind ourselves that while the relationship may exist, we refuse to accept it for the kids we teach, and we will do our darndest to reverse that cycle of predictability.
Student demographics can affect teacher expectations. If the students are less affluent or less capable of speaking English, for example, we unfortunately tend to expect less.
Scores can change from one year to the next because the student population changes. For example, if more capable students are in the class the second year than the first, scores may be higher. Similarly, if less capable students are in the class, the scores may be lower. Districts that are experiencing an influx of limited and non-English-speaking students will undoubtedly experience lower scores.
The demographics of a school or district should never be used to "excuse" low or declining scores. It is one thing to use the demographic issue as a crutch and quite another to realize that although the demographics are contributing to the scores, the demographics will not stand in the way of providing a superior education for all students.
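The mobility point can be made concrete with a small sketch. The scores and the enrolled-all-year flag below are invented for illustration and do not come from any real school; the function name summarize is likewise hypothetical. The idea is only to report the average for the stable population separately from the average for everyone tested, so that a reader can see how much the combined figure is shaped by students the school's program has barely touched. The gap could run in either direction.

```python
from statistics import mean

# Hypothetical records: (score, enrolled at this school all year?)
records = [
    (72, True), (80, True), (68, True), (76, True),  # stable students
    (50, False), (58, False),                        # recently arrived students
]

def summarize(records):
    """Return mean scores for all, stable, and mobile students."""
    all_scores = [score for score, _ in records]
    stable = [score for score, is_stable in records if is_stable]
    mobile = [score for score, is_stable in records if not is_stable]
    return {
        "all": mean(all_scores),
        "stable": mean(stable) if stable else None,
        "mobile": mean(mobile) if mobile else None,
    }

summary = summarize(records)
print(summary)
```

In this made-up case the combined average sits well below the stable-population average, even though nothing about the instructional program differs between the two calculations.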
Physical Environment for Testing
Physical environment can affect the entire testing population, positively or negatively. An attractive environment, of course, is desirable. Lighting, room temperature, and comfortable seating all play a part in how the students feel. Testing conditions should be as close to classroom conditions as possible. Hauling students into the cafeteria with a microphone and a proctor does not bode well from an environmental perspective.
If tests are inappropriately administered or if something out of the ordinary occurs, it affects all the students, not just one or two. Giving poor directions, not allowing the allotted time, and helping students who are obviously having trouble are all examples of conditions that affect scores. And what if the lawn mower (or something equally distracting) is being used just outside the classroom during testing time? All of these conditions can have a disastrous effect on the overall test results from a school.
Attitudes of Teachers and Students Toward the Testing Program
High stakes testing, in which results are printed in the newspapers and praise and indictments run rampant throughout a school community, is not a positive situation for educators. Educators who are truly professional will never run from being held accountable for doing good work and producing significant results in students, but the history of large scale testing suggests that results are often used inappropriately to draw unwarranted conclusions. Thus, a positive attitude toward such testing programs must be built; it doesn't come automatically.
Occasionally, students are not motivated to do well on a standardized test. For many students, taking a test "which doesn't count for my grade" isn't accompanied by a thirst to excel.
Poor or ambivalent attitudes toward a testing program may (and probably will) result in lower scores than if a more positive attitude had existed. The obvious goal is for teachers to support the high stakes testing program (even if they may not agree with it) and for students to try their best.
Test Taking Skills
When any of us takes a test, the score we receive is a function of two things: the knowledge and/or skills we bring to the table and the degree to which we are capable of taking the test, i.e., test taking skills. Granted, some of the other stuff (like physical environment) could play a part, but when we think of one student, these two variables are key.
I'm going to depart a bit from the focus of this paper and talk about test taking skills in a way that we don't typically hear. This whole paper focuses upon why. Here I'd like to suggest what we ought to do. Usually when test taking skills enter the scene, it is because there's some big test looming and we've got to prepare the kids for it, so we do all kinds of stuff to try to "give them an edge." I'm going to take a different tack. I hope you'll buy it. Here it is:
Every one of us has at some point had our lives seriously affected by a score on a test; our lives, not just our academic lives. Every one of our flesh and blood kids, our own children, has had or will have their lives seriously affected by a score on a test. At the highest professional levels there are medical exams, bar exams, and CPA exams; not too many years ago my son had to take seven days of exams to become an architect in California. And for those who may not be pursuing highly professional careers but are going after some very worthy careers nonetheless, there are electricians, contractors, real estate agents, civil service, military, police or fire work, insurance, cosmetologists, technicians of all kinds, and the list goes on. For each of these lines of work, what is a common thread? They've got to pass a test to get there, and in many cases more than one.
This is the United States of America. We are a capitalistic, competitive society and we give tests. In fact we've created a culture of testamania. All of this having been said, I believe (and oh, how I want you to believe) that test taking skills are a life skill. Kids must learn how to take tests and I believe we are obligated to teach them. You see, testing is not just a school thing, it is a life thing; testing is part of our culture. Kids seem to believe that when they're through with school, there will be no more tests; they don't realize that the most important tests they will take are the ones they take when they're out of school. I would love to see educators approach the issue of test taking skills from this perspective rather than only being concerned about them "a month before the big game." You see, we typically want kids to have test taking skills so they can make us look good. In these days of high stakes testing, if the kids have test taking skills, then the adults (that's us) will look good. No, we want kids to have test taking skills so that when they take the most important exams of their lives, these skills will make them look good. I would like us to believe that test taking skills are to be taught, not caught.
Test taking skills are such a significant issue that I continuously promote the notion of a school staff coming together and actually planning what they will do to build test taking skills in their students over the long haul. It will take time, and kids need to practice. Remember, kids don't build test taking skills because of some high stakes test; they learn how to take tests on that 30-item social studies quiz on Thursday morning. This effort shouldn't encroach upon an already overloaded curriculum, but if handled in a low key, continuous fashion, you will have one fine group of test takers. And over the long haul, those skills will make you look good, but more importantly, they will make the kids look good.
Test taking skills enable students to get as high a score as they should, based upon knowledge, skills, and prior preparation. Contrary to popular opinion, test taking skills do not enable students to score higher than their knowledge and skills entitle them to. When students lack the skills to take a test effectively and efficiently, unless they are very lucky, they will score lower than they should. When large numbers of students lack the skills to test well, this factor will certainly affect the group score, the one that is reported in the newspapers.
Curriculum Content and Test Content Alignment
Assume that a school tests in grades one through six and tests in reading, math, language, science and social science. Each time the school gives the exam, thirty group scores are reported (six grades and five content areas). When the content of the test is matched with the content of the curriculum on these thirty occasions, the "common ground" will differ among the thirty matches. That is, the amount of overlap will not be the same. The greater the overlap, the more likely the students will do well. When there is less overlap between program content and test content, the students are being asked some questions which they really can't be expected to answer correctly.
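The arithmetic above can be sketched in a few lines. The count of thirty comes straight from the example (six grades times five content areas); the topic lists and the function name overlap_fraction are hypothetical, chosen only to show one simple way of expressing "common ground" as the share of tested topics that the curriculum also covers. A real alignment study would need a far more careful definition of a topic.

```python
from itertools import product

grades = range(1, 7)  # grades one through six
subjects = ["reading", "math", "language", "science", "social science"]

# Every grade/subject pair yields one reported group score.
group_scores = list(product(grades, subjects))

def overlap_fraction(test_topics, curriculum_topics):
    """Share of tested topics that the curriculum also covers (0.0 to 1.0)."""
    test_topics = set(test_topics)
    if not test_topics:
        return 0.0
    return len(test_topics & set(curriculum_topics)) / len(test_topics)

# Hypothetical topic lists for one of the thirty matches (grade 4 math).
test_topics = {"fractions", "decimals", "area", "probability"}
curriculum_topics = {"fractions", "decimals", "area", "long division"}

n_scores = len(group_scores)
grade4_math_overlap = overlap_fraction(test_topics, curriculum_topics)
print(n_scores, grade4_math_overlap)
```

Here one of the four tested topics is never taught, so students are being asked some questions they cannot reasonably be expected to answer, regardless of how well the taught material was delivered.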
Program Quality
Test scores will be caused, in part, by the quality of the educational program. When improvements occur in quality, all else being equal, test scores will go up. When the quality declines, so will the test scores.
Quality can be described as the interaction among three significant components. These are (1) materials and equipment; (2) activities, both instructional and support; (3) the people involved in implementing the program including their skills and experiences, their attitudes, and the general educational climate that they create. The quality of each of these can vary, which will affect achievement measures.
BOTTOM LINE:
Consumers of testing information, when reviewing scores, tend to attribute any change to program quality. As described above, program quality is only one of six reasons that scores result the way they do.
