A recent article in The New York Times asked a provocative question: Do clinical trials work? The article was written about clinical trials in medicine, especially cancer research, where it often happens that promising medications and procedures found to be effective in small studies turn out to be ineffective in large ones.
As an advocate of clinical trials (randomized experiments) in education, I found the article distressing. In education, as in medicine, larger and better-controlled experiments frequently fail to find positive effects of programs found to be effective in smaller and less-well-controlled studies. In fact, there is a clear relationship between study sample sizes and outcomes: The larger the study, the lower the reported impact of the treatment.
Some in both medicine and education are wondering if different research methods are needed that are more likely to show positive effects. In my view, this is foolish. The problem is not in the research methods. What we need to do is to identify why so many experiments show no impacts, and solve those problems.
One common problem in randomized experiments in education, for example, is that neither the experimental nor the control teachers can have been using the experimental method before the experiment begins. If the treatment is difficult to learn to use, this may mean that in a study of one year or less, teachers using the new method did not get good at it until near the end of the experiment. There are numerous experiments in education in which there were no impacts in the first year but significant impacts in the second. Yet a one-year study cannot detect second-year impacts, and so it shortchanges the treatment’s reported effect.
As someone who does a lot of meta-analyses of educational treatments, I routinely see another problem: many large, randomized evaluations assess weak treatments. For example, there are dozens of studies comparing a publisher’s new textbook to existing textbooks. Such studies invariably produce effect sizes near zero. This does not mean that the new texts are ineffective, but that they are no more effective than other texts. Technology studies also often evaluate ho-hum commercial software unlikely to make much difference. All too often, researchers carry out large and expensive evaluations of programs that are too poorly defined or too much like ordinary practice to show much impact. Wishful thinking runs up against harsh reality in large, well-controlled experiments.
The funding structure for research in education often leads experimenters to carry out large-scale, randomized evaluations of programs that are not fully ready for large-scale evaluation. New and truly innovative methods often need to be piloted, evaluated on a modest scale, and only then subjected to large-scale evaluation, but funding for small-scale formative evaluation is hard to obtain.
In education, as in medicine, there is a problem of too many disappointing findings in clinical trials. Yet the solution is not to abandon clinical trials. It is to create more powerful and effective treatments.