The National Academies: Advisers to the Nation on Science, Engineering, and Medicine
NATIONAL ACADEMY OF SCIENCES NATIONAL ACADEMY OF ENGINEERING INSTITUTE OF MEDICINE NATIONAL RESEARCH COUNCIL

Workshop on Understanding and Promoting Knowledge Accumulation in Education:

Tools and Strategies for Education Research

Day 1 – June 30, 2003

Remarks by Dr. Kenneth Howe

KENNETH HOWE: I am not going to respond directly to Kenji’s remarks. By mutual agreement, which means what I gleaned from an email from Kenji, I am going to join him in creating a collage of perspectives, and here is my piece of the collage:

Questions concerning knowledge accumulation in educational research cannot be disentangled from questions concerning broader methodological frameworks. So I’ll address knowledge accumulation in the context of critically evaluating the two frameworks that are currently dominant in the conversation, what I call “classical experimentalism” and “mixed-methods experimentalism.”

I’ll end with a few remarks suggesting that the conversation needs to move beyond the limits set by these two frameworks to include consideration of an interpretive perspective.

Okay. So classical experimentalism. This is the view articulated by Campbell and Stanley in their seminal 1963 monograph, and it has recently enjoyed a resurgence. It is the basic view advocated by the Coalition for Evidence-Based Policy which, as far as I can tell, has been pretty much embraced by the U.S. Department of Education.

The general features of classical experimentalism include a heavy emphasis on internal validity and on establishing causal relationships. These are conjoined with a strong preference for quantitative methods, especially randomized experiments.

Qualitative methods have a rather lowly status in classical experimentalism. According to Campbell and Stanley, circa 1963, the case study is so flawed as a research design that it is well-nigh unethical to use it.

Regarding accumulation of knowledge, classical experimentalism endorses the idea that educational research can and ought to be cumulative. Indeed, according to Campbell and Stanley, reliance on experiment is “the only way of establishing a cumulative tradition.” The kind of knowledge that accumulates within this framework is a repertoire of treatments or interventions that are effective in producing desired educational ends. So we accumulate knowledge of better and better techniques for teaching math, for teaching science and so on.

Classical experimentalists on the current scene typically appeal to the perceived success of medical research to demonstrate how the accumulation of knowledge regarding successful treatments can be fostered by experimentalism.

I have five criticisms of classical experimentalism, some of which have been around since the mid ’70s and early ’80s, particularly as voiced by Lee Cronbach. I read this stuff in graduate school, and, so far as I can tell, nothing has changed with respect to the argument, except for the introduction of a medical analogy, something I’ll have a few words to say about.

First criticism. There is a real tradeoff between internal validity and external validity. Arguably, external validity should trump internal validity, not vice versa. The pervasive problem with randomized experiments in education is that the more you restrict the population and the treatment to achieve internal validity, the less applicability the object of study will have in the real world.

It is amazing to me how this problem just sort of rolls off certain people’s backs. Tom Cook, for example, says, “Randomized experiments are best when a causal question is simple, sharply focused and easily justified.” This is from a 2002 EEPA article, very recently published.

Then he goes on to say that when applied to schools, among the things that make random assignment most feasible are treatments that are short and that require no teacher training. This certainly looks to me like a case of the tail wagging the dog.

In general, prescribing experiments as the preferred methodology encourages the accumulation of knowledge about easy-to-manipulate, simplistic instructional approaches.

Second criticism. Randomization is oversold. Social research uses random assignment, not random selection. This typically restricts the population to those who volunteer, who are then assigned.
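
The gap between random assignment and random selection can be illustrated with a minimal simulation sketch. Everything in it is an assumption made for illustration and is not drawn from the talk or from any study: the hypothetical trait, the rule that only high-trait units volunteer, and the effect sizes. Under those assumptions, a coin-flip trial among volunteers recovers the volunteers' average effect well, yet that number overstates the average effect in the full population.

```python
import random
from statistics import mean


def simulate_volunteer_trial(n_population=20000, seed=0):
    """Randomly assign volunteers only; compare with the population-wide effect."""
    rng = random.Random(seed)

    # Hypothetical trait in [0, 1]; each unit's treatment effect is 10 * trait.
    traits = [rng.random() for _ in range(n_population)]
    population_effect = mean(10 * t for t in traits)

    # Assumed selection rule: only high-trait units volunteer for the study.
    volunteers = [t for t in traits if t > 0.7]

    treated, control = [], []
    for t in volunteers:
        baseline = 50 + rng.gauss(0, 5)
        if rng.random() < 0.5:  # random assignment by coin flip
            treated.append(baseline + 10 * t)
        else:
            control.append(baseline)

    # Difference in means: a sound estimate for volunteers, not for everyone.
    trial_estimate = mean(treated) - mean(control)
    return population_effect, trial_estimate


if __name__ == "__main__":
    pop, est = simulate_volunteer_trial()
    print(f"population average effect: {pop:.2f}")
    print(f"trial estimate (volunteers only): {est:.2f}")
```

In this sketch the trial is internally valid for the volunteers; the mismatch arises only when its estimate is generalized to units that never volunteered.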

Moreover, randomization is often ruled out on political or legal grounds. For example, in states with charter schools on the books - charter schools being one of my favorite objects of investigation - you can’t randomly assign districts to participate or not, yet the question of what effects charter schools are producing is an extremely important policy question that should be investigated and I think can be investigated convincingly.

Finally, the differential drop-off problem between treatment and control groups often compromises the primary virtue of random assignment. This problem has been prominent in the experiments on vouchers, for example.
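
A minimal sketch of the differential drop-off problem, under assumptions chosen purely for illustration: the treatment has no effect at all, and low scorers leave the treatment arm half the time while the control arm stays intact. The dropout rule and all numbers are hypothetical. Comparing everyone who was assigned shows roughly no difference, while comparing only those who stayed manufactures an apparent benefit.

```python
import random
from statistics import mean


def simulate_differential_attrition(n=20000, seed=1):
    """Random assignment with a null treatment and dropout in one arm only."""
    rng = random.Random(seed)
    treated_all, treated_completers, control_all = [], [], []

    for _ in range(n):
        outcome = rng.gauss(50, 10)  # the treatment does NOT change this outcome
        if rng.random() < 0.5:  # random assignment by coin flip
            treated_all.append(outcome)
            # Assumed dropout rule: scorers below 45 stay only half the time.
            if outcome >= 45 or rng.random() < 0.5:
                treated_completers.append(outcome)
        else:
            control_all.append(outcome)

    # Everyone as assigned: the two arms are comparable.
    full_diff = mean(treated_all) - mean(control_all)
    # Completers only: the treatment arm has lost many of its low scorers.
    completer_diff = mean(treated_completers) - mean(control_all)
    return full_diff, completer_diff


if __name__ == "__main__":
    full, completers = simulate_differential_attrition()
    print(f"difference, all assigned units: {full:.2f}")
    print(f"difference, completers only:    {completers:.2f}")
```

The point of the sketch is only that selective dropout can undo the balance that random assignment created; it says nothing about how large the problem was in any actual voucher experiment.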

Third criticism. The ability of randomized experiments, per se, to zero in on causation is oversold. Randomized experiments are neither sufficient nor necessary for causal inference. They are not sufficient because they often give only a very gappy account of causal relationships, such that the precise mechanisms are not understood. A more precise understanding of causal mechanisms requires elaborate substantive knowledge of various kinds.

Randomized experiments are not necessary to establish causal relationships because in cases where substantive background knowledge is available, robust causal relationships, at least in the statistical sense, can be established. That cigarette smoking causes cancer is a good example.

Fourth criticism. The medical research analogy is used selectively to oversell randomized experiments in educational research.

First, important differences between clinical medical research and educational research are glossed over. In particular, it is typically much easier to zero in on the treatment and to maintain its consistent administration in clinical medicine than in education.

Compare the treatment defined as “X milligrams of compound Y each morning” to the treatment defined as “Instruction in connected math five hours per week.”

In addition, the context of the administration of the treatment is also much less of a complicating factor in clinical medicine than it is in education. Compare Mr. Jones and Ms. Smith’s homes to Mr. Jones and Ms. Smith’s classrooms.

Second, non-randomized clinical trials are quite common in medical research, particularly outside of pharmaceutical research. Clinical medical research is divided into four phases ranging from Phase 1, exploratory research on safety and side effects, often performed on so-called normal healthy individuals, to Phase 4, which tracks the effects of interventions and treatments in their general use. This is medical research’s external validity question. Randomized clinical trials typically are employed in Phase 3, but they aren’t required in any phase.

So I did a little study - unscientific, mind you, but revealing, nonetheless. I went to the Medline website and looked under “Coronary Disease,” the treatment of which has got to be considered one of modern medicine’s success stories. As I expected, I found that the majority of the clinical studies described there were not randomized experiments. Indeed, I found one study in which the researchers claim that their non-randomized study was actually methodologically superior because it didn’t depend on a sample of volunteers, but rather was more representative of the entire population.

A third problem with the analogy with medical research is that the knowledge that has been accumulated through clinical trials is not the only or arguably even the most powerful determinant of the public health. Improved sanitation and the development of antibiotics are often cited as the most powerful developments historically.

Furthermore, socioeconomic status is highly correlated with health status, and there is a persistent gap in health status associated with SES. According to one commentator, “Low socio-economic position is as strong a risk factor for poor health outcomes as smoking.”

If closing the achievement gap is one of the overriding goals of educational research in the U.S., then maybe medical research doesn’t provide such a good model.

My final criticism of classical experimentalism is a sort of ad hominem. Donald Campbell - the same Donald Campbell of Campbell and Stanley - recanted his criticisms of qualitative methods, partly in response to growing dissatisfaction with experimentalist research and partly in response to developments in the philosophy of social science.

Here is what he had to say in his paper, “Qualitative Knowing in Action Research,” written in 1974, approximately a decade after the publication of the Campbell and Stanley monograph: “The polarity of quantitative experimental versus qualitative approaches to research on social action remains unresolved, if the resolution were to provide predominant justification of one over the other. Each pole is at its best in its criticisms of the other, not in the invulnerability of its own claims to descriptive knowledge. If we are to be truly scientific, we must reestablish the qualitative grounding of the quantitative.”

Okay. Let me move on now to the second view, the second kind of experimentalism, which I call “mixed-methods experimentalism.” This is basically the position that, in my view, is outlined in the NRC report.

In general agreement with Campbell’s later views, mixed-methods experimentalism assigns a significant role to qualitative methods. It also distinguishes among types of research questions and advocates tailoring qualitative and quantitative methods to particular kinds of questions.

Like classical experimentalism, however, mixed-methods experimentalism places quantitative experimental research methods and what-works questions at the top of the scientific hierarchy.

Regarding the accumulation of knowledge, educational research is, on this view, disjointed, of short-lived interest and non-cumulative, at least by comparison to disciplines such as physics, chemistry and biology.

The NRC report believes this is consistent with the existence of important lines of educational research that can and should be approached scientifically.

In my view, mixed-methods experimentalism is a significant advance over classical experimentalism, but I do have three general criticisms.

First, it is less congenial to qualitative methods and closer to classical experimentalism than it might seem. Qualitative methods are largely constrained to operate within a framework focused on accumulating knowledge about what works. One problem is that “what works” is elliptical for “what works to accomplish outcome O,” and the O’s are apparently off the table.

The role of research participants is limited to receiving the interventions. The interventions and their associated outcomes are laden with values: they are assumed, if only implicitly, to be more valuable than other possible interventions and outcomes. And the question of whether the outcomes are achieved is solely an instrumentalist one, solely one about means, not about ends.

It is not that qualitative methods can’t be fruitfully and appropriately used in this instrumental way, but their natural home is in a broadly interpretive framework that seeks to understand various practices from the insider’s perspective and to give voice to the actors participating in such practices.

The NRC report largely ignores the ascendance of interpretive-oriented philosophy of social science that followed in the wake of the demise of (?). Interpretive philosophy of social science - and here I am using the term very expansively - embraces a difference of kind between social science and natural science, such that human behavior, unlike atoms and molecules, can only be fully understood from the insider’s perspective, in terms of the meanings that actors employ.

The NRC report does little more than wave at this conception of social science. Instead, it embraces the principle of the unity of science, such that social science and natural science exhibit only a difference of degree. In particular, social science is more complex, because it involves many more relevant variables and investigators have less control over them. People don’t always do what you tell them to.

I don’t want to suggest that the authors of the NRC report were obligated to adopt an interpretive perspective. In light of its prominence in the various disciplines, however, as well as the fact that many educational researchers embrace some version of it, it does seem reasonable to have expected it to be addressed in a much more serious and comprehensive way.

One final criticism of the NRC framework is that the call for more randomized experiments in education seems to come out of the blue and to lack any real justification in what precedes this claim, which occurs in the conclusion of Chapter 5. Even if, all other things equal, randomized experiments are the best method to establish causal relationships, all other things rarely are equal, and it doesn’t follow that we need more randomized experiments to be conducted in education simply because few currently are.

Let me wrap up with a few key features of an interpretive methodological framework applied to educational research, and this is going to be thumbnail-type stuff.

One feature of such a framework is that it is dialogical, in the sense that it engages research participants and stakeholders in dialogue. There are both methodological and moral reasons for conducting research in this way.

Methodologically, it provides access to the kind of data relevant to understanding human behavior, as opposed to the kind of thin data that one would use to describe physical objects.

Morally, it provides participants the opportunity to have their views heard on the merits of various policies and practices as well as to have their views heard on how they see themselves affected by such policies and practices.

The second feature of an interpretive framework is that it examines the circumstances of disadvantaged students in terms of larger socio-cultural structures in order to determine what the sources of disadvantage and oppression might be.

The appropriate interventions here are like preventative medicine. Problems, for example, associated with poverty and racial discrimination are identified and eliminated before they result in the need for curative treatments.

Finally, the aims in an interpretive framework can be at odds with accumulating knowledge, in the sense that such research ultimately seeks to falsify descriptions of social reality earlier documented as true. Work on gender equity in schools is a candidate for a success story in this kind of self-falsifying line of research.
