Metrics, metrics, metrics. More and more of educational life is governed by metrics. Changes in policy are said to be justified by trends in metrics; measures of school performance are down, so something must change. Students are accepted to or rejected from college based on metrics. School closings are justified on the basis of metrics. Teachers are to be fired based on metrics. Not a day goes by without educators being exposed to or complaining about metrics. The fate of school systems across the country, we are told, hangs on the results of metrics.
Never before, so it seems, have metrics been so implicated in an overgrowth of our individual and collective anxiety, and never before has so much metrical energy been expended with so little accompanying clarity regarding the nature of our efforts to pass on the accumulated knowledge and wisdom of human cultures and prepare youth for the social transformations they must lead. With this I begin my series examining the origins of what might be called metric morality.
The word metric refers to a standard or system of measurement. Measurement occurs when a mathematical system represents and corresponds to quantitative changes in things or phenomena. Scientific measurement is not an idea, nor is it a convention. We study the relationship between quantitative and qualitative change using measurement, which allows us to determine, for example, at what point water is converted to steam when heat is applied. These analyses in turn help us both understand and alter our natural world, and in some instances, our social world too. These measurements do not themselves afford us control over the world, or by themselves change it; rather, knowledge of the world is gained by the application of mathematics to represent dynamic changes in natural and social phenomena.
We produce measurements of quantitative features of phenomena; as such, a measurement cannot be a means for grasping the essence of a phenomenon. A human being cannot be reduced to their current temperature, height, or the level of glucose in their blood. The essence of a human person or an aspect of social life (e.g., education) cannot be discerned solely from the results of measurement. Measurement is not the only source of knowledge about the world. Importantly, many aspects of our natural and social world do not come in degrees; that is, measurements regarding qualitative phenomena cannot be produced. Whether or not something exists in a quantitative form is a theoretical question that must be answered before metrication can take place.
The Fraudulent Nature of Metrics in Education
We are led to believe that the metrics developed by policy makers and testing companies are no different from those developed by scientists studying the natural world. But what stands as “metrics” in education bears little resemblance to a ruler, thermometer, or Geiger counter. A quick overview of why this is the case should suffice. The “metrics” now used to demonstrate “results” all share the following shortcomings, which will be elaborated in future posts:
- They confound properties of individuals, individual schools and individual school systems with the relations those individuals, individual schools and individual school systems have with their social contexts. A good example of this problem is the large correlation between test scores and measures of income and wealth, and the very small correlation between teacher characteristics, classroom practices and test scores.
- They never specify the actual object of measurement, and instead follow the long-discredited practice of defining the object of measurement “operationally”; that is, things and phenomena are defined by how they are “measured”. A good example is this: intelligence is the ability to do well on an intelligence test. Most identify this logic as circular and tautological. The same brain malfunction appears with test-based teacher evaluations, which define effective teaching as a teacher’s students’ “boosted” standardized test scores.
- They assume the flawed definition of measurement as “the assignment of numerals according to a rule.” This means anything goes. Related to this problem is the assumption that everything that exists must exist in some amount. This would mean, for example, that we accept the proposition that humans exist in their degree of human-ness. Some of us are more human than others. Thankfully, the testers will select the chosen ones!
- They confuse ranking with measurement. Producing a rank ordering of things or phenomena does not produce a measurement, i.e., the rank numbers do not correspond to an actual amount of the property that has been ranked. Ranking five people by their height does not, by itself, tell us the height of any of the five individuals. The Business First school rankings are, likewise, not a measurement of school quality, or anything else. They are mere rankings, nothing more. As such, they tell us very little, and confuse us much. No effort is made to even inform readers of the relative distance between ranks, which are rarely equidistant. This creates confusion as to the meaningfulness of ranked differences.
- They rely on the logic of cut scores, which are arbitrary in both the scientific and political sense. Cut scores are increasingly established in secret.
- They no longer even adhere to the established norms of “validity” and “reliability” for evaluating metrics in education. Although the research community has repeatedly cautioned policy makers, invalid and unreliable metrics are used, and used inappropriately; increasingly, test development and validation are carried out in secret.
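The point about ranking can be made concrete with a short sketch. The Python below is only an illustration: the school names and both sets of scores are invented, and `rank_schools` is a hypothetical helper, not anything used by the publishers of school rankings. It shows two very different sets of average scores producing the identical ranking, so the rank numbers alone cannot recover the amounts or the distances between them.

```python
def rank_schools(scores):
    """Return {school: rank}, where rank 1 is the highest average score."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {school: place for place, school in enumerate(ordered, start=1)}


# Hypothetical schools and scores, invented purely for illustration.
close_scores = {"School A": 95.5, "School B": 95.4, "School C": 95.3}
spread_scores = {"School A": 95.5, "School B": 70.0, "School C": 40.0}

close_ranks = rank_schools(close_scores)
spread_ranks = rank_schools(spread_scores)

# The rankings are identical although the underlying gaps differ enormously.
print(close_ranks == spread_ranks)
```

In the first set the schools are separated by a tenth of a point; in the second, by 25 points or more. A reader shown only "first, second, third" has no way to tell which situation they are looking at.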
Metric Morality: An Introduction
But despite these facts, bad feelings are created as those critical of high-stakes testing are defamed: they are said to be against facts, unreasonable and defenders of the “achievement gap.” Opposing the metrics currently used to demonstrate “results” is not only unthinkable but, according to some, unethical.
In suggesting teachers are immoral if they oppose the state’s metrics, Commissioner Elia in New York helps us see something that is often missed. With the state’s charge of immorality against those refusing to submit to autocratic irrationalism, Elia exposes the fears of a declining power that increasingly acts against wisdom, science and human dignity. She now acts against basic democratic norms. She is, with her liturgical ignorance regarding the value and validity of state tests and associated policies, exposing the impotence and autocratic nature of the regime she has been parachuted in to save. But most importantly here, allegiance to the testing regime is presented as a moral measure of conduct. This is now a political test and portends a political order based on moral worth (historically, such orders were slave societies). And it creates heightened difficulties for those presenting themselves as hard-nosed empiricists looking at data to make decisions so as to close achievement gaps.
But the insinuation that allegiance to the state’s testing program is itself a moral test provides an opening. The mass movement against high-stakes testing and in defense of public education is bringing to the fore the need to expose the origin and underlying logic of the corporate school reform program. Lynchpins of corporate reform, these metrics and the “science” they are based on are key blocks to imagining a new future. But it is not enough to point out that current metrics are neither reliable nor valid. Their framework of construction, their assumptions — all this must be interrogated to clear space for fresh thinking about public education and democracy.
Deeply rooted in American institutions is a distorted form of “behavioral science” that became institutionalized following the end of World War II. This distortion is known as behaviorism. It is the conceptual and practical basis for the metric distortions outlined above, and has profound implications for almost every aspect of education theory and practice today. Behaviorism instituted a “new” sense of morality rooted in the experimenter’s control of the environment to induce the “right behaviors” and promised to eliminate all human conflict. In this framework, tests do not produce understanding but rather stand as means to regulate; metrics thus became a favored means to establish control. Those not readily complying are subject to behavior modification. “Testing”, “measurement” and specific understandings of the “experiment” have thus become the accepted way of doing education research and evaluation and stand in fact as techniques of governing. This “psychology,” I believe, has become married to the corporate school “reform” project and is essential to understanding the rise of anti-public education.
Behaviorist assumptions and practices, as I hope to show in a future series of posts, are the root of:
- the fallacies inherent to the testing industry since the turn of the last century;
- an ever-increasing drug-like fixation on quantification, a mechanistic and reductionist mentality that deforms understandings of skill, thinking, teaching and learning;
- many educational technology efforts;
- an obsession with social control disguised as research-based pedagogy, now the prescribed methods of leading charter school chains.
(To be continued)
- This means important aspects of education never get attention. Bogus metrics divert us from serious investigations of key questions, like: What purposes do we want schools to serve? How might we determine if those purposes are being served?
- See my book, A Measure of Failure. Stevens, S. S. (1946). On the theory of scales of measurement. Science, 103, 677–680. Retrieved from http://www.mpopa.ro/statistica_licenta/Stevens_Measurement.pdf
- For example, schools might be placed in first, second and third place based on average test scores, such as: 95.5, 95.4, 95.3. We might be told a school “squeaked” by another school. This creates the corollary problem that false differences are created where they probably do not exist; real differences might likewise be obscured using such “methods.”
- See Glass, G. V. (2003). Standards and criteria redux (pp. 1–34). Available from http://www.markgarrison.net/wp-content/uploads/2013/04/Glass_StandardsCriteria_2003.pdf
- See Baker, E. L., Barton, P. E., Darling-Hammond, L., Haertel, E., Ladd, H. F., Linn, R. L., … Shepard, L. A. (2010). Problems with the use of student test scores to evaluate teachers. Washington, DC: Economic Policy Institute.
- Consider this: when scientists debate ways in which to test for a disease, does the morality of research workers stand as a basis for evaluating their various claims? If one investigator cautions that a given test should not be used to diagnose a given disease because it yields a high rate of false negatives, is it normal practice to declare him a sinner? If a doctor refuses to use such a test in screening a patient, are charges of immorality leveled against her? If a parent asks his physician a question about the possible negative effects of a proposed test for his child’s ailment, or requests alternative tests for his child, should the parent’s morality be impugned? If parents can refuse medical tests for their child, why can’t they refuse state tests?
- Closing these gaps, rhetoric suggests, is also increasingly rendered in moral terms.