What Kinds of Studies Are Likely to Replicate?

Replicated scientists 03 01 18

In the hard sciences, there is a publication called the Journal of Irreproducible Results.  It really has nothing to do with replication of experiments, but is a humor journal by and for scientists.  The reason I bring it up is that to chemists and biologists and astronomers and physicists, for example, an inability to replicate an experiment is a sure indication that the original experiment was wrong.  To the scientific mind, a Journal of Irreproducible Results is inherently funny, because it is a journal of nonsense.

Replication, the ability to repeat an experiment and get a similar result, is the hallmark of a mature science.  Sad to say, replication is rare in educational research, which says a lot about our immaturity as a science.  For example, in the What Works Clearinghouse, about half of programs across all topics are represented by a single evaluation.  When there are two or more, the results are often very different.  Relatively recent funding initiatives, especially studies supported by Investing in Innovation (i3) and the Institute for Education Sciences (IES), and targeted initiatives such as Striving Readers (secondary reading) and the Preschool Curriculum Evaluation Research (PCER), have added a great deal in this regard. They have funded many large-scale, randomized, very high-quality studies of all sorts of programs in the first place, and many of these are replications themselves, or they provide a good basis for replications later.  As my colleagues and I have done many reviews of research in every area of education, pre-kindergarten to grade 12 (see www.bestevidence.org), we have gained a good intuition about what kinds of studies are likely to replicate and what kinds are less likely.

First, let me define in more detail what I mean by “replication.”  There is no value in replicating biased studies, which may well consistently find the same biased results (as when, for example, both the original studies and the replication studies used the same researcher- or developer-made outcome measures that are slanted toward the content the experimental group experienced but not what the control group experienced) (See http://www.tandfonline.com/doi/abs/10.1080/19345747.2011.558986.)

Instead, I’d consider a successful replication one that shows positive outcomes both in the original studies and in at least one large-scale, rigorous replication. One obvious way to increase the chances that a program producing a positive outcome in one or more initial studies will succeed in such a rigorous replication evaluation is to use a similar, equally rigorous evaluation design in the first place. I think a lot of treatments that fail to replicate are ones that used weak methods in the original studies. In particular, small studies tend to produce greatly inflated effect sizes (see http://www.bestevidence.org/methods/methods.html), which are unlikely to replicate in larger evaluations.

Another factor likely to contribute to replicability is use in the earlier studies of methods or conditions that can be repeated in later studies, or in schools in general. For example, providing teachers with specific manuals, videos demonstrating the methods, and specific student materials all add to the chances that a successful program can be successfully replicated. Avoiding unusual pilot sites (such as schools known to have outstanding principals or staff) may contribute to replication, as these conditions are unlikely to be found in larger-scale studies. Having experimenters or their colleagues or graduate students extensively involved in the early studies diminishes replicability, of course, because those conditions will not exist in replications.

Replications are entirely possible. I wish there were a lot more of them in our field. Showing that programs can be effective in just two rigorous evaluations is way more convincing than just one. As evidence becomes more and more important, I hope and expect that replications, perhaps carried out by states or districts, will become more common.

The Journal of Irreproducible Results is fun, but it isn’t science. I’d love to see a Journal of Replications in Education to tell us what really works for kids.

This blog was developed with support from the Laura and John Arnold Foundation. The views expressed here do not necessarily reflect those of the Foundation.

Advertisements

One thought on “What Kinds of Studies Are Likely to Replicate?

  1. Thank you for your blog post! I am a graduate student at UNC (Early Education and Family Support); Dr. Simeonsson, our professor for Child Development and Disability, assigned us to carry out three replications. Many of us were wondering why he gave us such an assignment, and now, after researching studies that I can replicate in my own prekindergarten classroom, I realize that there are not that many replicated studies. I have two completed (cognitive and communication) but am having difficulty selecting one for social-emotional. Do you have any suggestions? Again, thank you for your post and assistance.

    Like

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s