Evidence Means Different Things in ESSA and NCLB

Whenever I talk or write about the new evidence standards in the Every Student Succeeds Act (ESSA), someone is bound to ask how this is different from No Child Left Behind (NCLB). Didn’t NCLB also emphasize using programs and practices “based on scientifically-based research”?

Though they look similar on the surface, evidence in ESSA is very different from evidence in NCLB. In NCLB, “scientifically-based research” just meant that a given program or practice was generally consistent with principles that had been established in research, and almost any program can be said to be “based on” research. In contrast, ESSA standards encourage the use of specific programs and practices that have themselves been evaluated. ESSA defines strong, moderate, and promising levels of evidence for programs and practices with at least one significantly positive outcome in a randomized, matched, or correlational study, respectively. NCLB had nothing of the sort.

To illustrate the difference, consider a medical example. In a recent blog, I told the story of how medical researchers had long believed that stress caused ulcers. Had NCLB’s evidence provision applied to ulcer treatment, all medicines and therapies based on reducing or managing stress, from yoga to tranquilizers, might be considered “based on scientifically-based research” and therefore encouraged. Yet none of these stress-reduction treatments were actually proven to work; they were just consistent with current understandings about the origin of ulcers, which were wrong (bacteria, not stress, cause ulcers).

If ESSA were applied to ulcer treatment, it would demand evidence that a particular medicine or therapy actually improved or eliminated ulcers. ESSA evidence standards wouldn’t care whether a treatment was based on stress theory or bacteria theory, as long as there was good evidence that the actual treatment itself worked in practice, as demonstrated in high-quality research.

Getting back to education, NCLB’s “scientifically-based research” was particularly intended to promote the use of systematic phonics in beginning reading. There was plenty of evidence summarized by the National Reading Panel that a phonetic approach is a good idea, but most of that research was from controlled lab studies, small-scale experiments, and correlations. What the National Reading Panel definitely did not say was that any particular approach to phonics teaching was effective, only that phonics was a generically good idea.

One problem with NCLB’s “scientifically-based research” standard was that a lot of things go into making a program effective. One phonics program might provide excellent materials, extensive professional development, in-class coaching to help teachers use phonetic strategies, effective motivation strategies to get kids excited about phonics, effective grouping strategies to ensure that instruction is tailored to students’ needs, and regular assessments to keep track of students’ progress in reading. Another, equally phonetic program might teach phonics to students on a one-to-one basis. A third phonics program might consist of a textbook that comes with a free half-day training before school opens.

According to NCLB, all three of these approaches are equally “based on scientifically-based research.” But anyone can see that the first two, lots of PD and one-to-one tutoring, are way more likely to work. ESSA evidence standards insist that the actual approaches to be disseminated to schools be tested in comparison to control groups, not assumed to work because they correspond with accepted theory or basic research.

“Scientifically-based research” in NCLB was a major advance in its time, because it was the first time evidence had been mentioned so prominently in the main federal education law. Yet educators soon learned that just about anything could be justified as “based on scientifically-based research,” because there are bound to be a few articles out there supporting any educational idea. Fortunately, enthusiasm about “scientifically-based” led to the creation of the Institute of Education Sciences (IES) and, later, to Investing in Innovation (i3), which set to work funding and encouraging development and rigorous evaluations of specific, replicable programs. The good work of IES and i3 paved the way for the ESSA evidence standards, because now there are a lot more rigorously evaluated programs. NCLB never could have specified ESSA-like evidence standards, because at the time there would have been too few qualifying programs.

Sooner or later, policy and practice in education will follow medicine, agriculture, technology, and other fields in relying on solid evidence to the maximum degree possible. “Scientifically-based research” in NCLB was a first tentative step in that direction, and the stronger ESSA standards are another. If development and research continue or accelerate, successive education laws will have stronger and stronger encouragement and assistance to help schools and districts select and implement proven programs. Our kids will be the winners.


Evidence and the ESSA

The U.S. House of Representatives last week passed the new and vastly improved version of what is now being called the Every Student Succeeds Act (ESSA), the successor to No Child Left Behind (NCLB) and the Elementary and Secondary Education Act (ESEA). For people (such as me) who believe that evidence will provide salvation for education in our country, the House and Senate ESSA conference bill has a lot to like, especially in comparison to the earlier draft.

ESSA defines four categories of evidence based on their strength:

  1. “strong evidence” meaning supported by at least one randomized study;
  2. “moderate evidence” meaning supported by at least one quasi-experimental study;
  3. “promising evidence” meaning supported by at least one correlational study with pretests as covariates; and
  4. programs with a rationale based on high-quality research or a positive evaluation that are likely to improve student or other relevant outcomes and that are undergoing evaluation, often referred to as “strong theory” (though the bill does not use that term).

The top three categories effectively constitute proven programs, as I read the law. For example, seven competitive funding programs would give preference points to applications with evidence meeting one of those categories, and a replacement for School Improvement Grants requires local educational agencies to include “evidence-based interventions” in their comprehensive support and improvement plans.
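
For readers who think in rules, the four categories and the "top three count as proven" reading can be sketched as a small lookup. This is only an illustrative sketch, not language from the bill: the names `Tier`, `DESIGN_TO_TIER`, `classify`, and `is_proven` are hypothetical, and the sketch assumes every study passed in already showed a positive outcome (and, for correlational studies, used pretests as covariates).

```python
from enum import IntEnum
from typing import Iterable, Optional


class Tier(IntEnum):
    """Hypothetical labels for the four ESSA evidence categories (1 = strongest)."""
    STRONG = 1      # at least one randomized study
    MODERATE = 2    # at least one quasi-experimental study
    PROMISING = 3   # at least one correlational study with pretests as covariates
    RATIONALE = 4   # research-based rationale, evaluation under way ("strong theory")


# Assumed mapping from a supportive study's design to the tier it can justify.
DESIGN_TO_TIER = {
    "randomized": Tier.STRONG,
    "quasi-experimental": Tier.MODERATE,
    "correlational": Tier.PROMISING,
}


def classify(study_designs: Iterable[str],
             has_rationale_under_evaluation: bool = False) -> Optional[Tier]:
    """Return the strongest tier supported by the given studies, or None."""
    tiers = [DESIGN_TO_TIER[d] for d in study_designs if d in DESIGN_TO_TIER]
    if tiers:
        return min(tiers)  # lowest number = strongest evidence
    if has_rationale_under_evaluation:
        return Tier.RATIONALE
    return None


def is_proven(tier: Optional[Tier]) -> bool:
    """Only the top three categories are treated as 'proven' in this reading."""
    return tier is not None and tier <= Tier.PROMISING


if __name__ == "__main__":
    tier = classify(["correlational", "randomized"])
    print(tier.name, is_proven(tier))
```

The point of the sketch is the last function: a program that reaches only the fourth category gets a label but does not pass `is_proven`.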

One good thing about this definition is that for the first time, it unequivocally conveys government recognition that not all forms of evaluation are created equal. Another is that it plants the idea that educators should be looking for proven programs, as defined by rigorous, sharp-edged standards. This is not new to readers of this blog, but is very new to most educators and policy makers.

Another positive feature of ESSA as far as evidence is concerned is that it includes a new tiered-evidence provision called Education Innovation Research (EIR) that would effectively replace the Investing in Innovation (i3) program. Like i3, it is a tiered grant program that will support the development, evaluation and scale-up of local, innovative education programs based on the level of evidence supporting the programs, but without the limitation of program priorities established by the U.S. Department of Education. It is a real relief to see Congress value continued development and evaluation of innovations in education.

Of course, there are also some potential problems, depending on how ESSA is administered. First, the definition for “evidence-based” includes correlational studies, and these are of lower quality than experiments. Worse, if “strong theory” is widely used, then the whole evidence effort may turn out to make no difference, as any program on Earth can be said to have “strong theory.”

A strong theme throughout ESSA is moving away from federal control of education toward state and local control. Philosophically, I have no problem with this, but it could cause trouble in the evidence movement, which has been largely focused on policy in Washington. These developments create a strong rationale for the evidence movement to expand its focus to state and local leaders, not just federal, and that would be a positive development in itself.

In education policy, it’s easy for well-meaning language to be watered down or disregarded in practice. Early on in NCLB, for example, evidence fans were excited by the 110 mentions of “scientifically-based research,” but “scientifically-based” was so loosely defined that it ended up changing very little in school practice (though it did lead to the creation of the Institute of Education Sciences, which mattered a great deal).

So, recognizing that things could still go terribly wrong, I think it is nevertheless important to celebrate the potentially monumental achievement represented by ESSA. The evidence parts of the Act were certainly aided by the tireless efforts of numerous organizations that worked collectively to create scrupulously bipartisan coalitions in the House and Senate to support evidence in government. Just seeing both sides of the aisle and both sides of the Capitol collaborate in this crucial effort gives me hope that even in our polarized times, bipartisanship and bicameralism are still possible when children are involved. Congratulations to all who were responsible for this achievement.

Evidence-Based vs. Evidence-Proven

Way back in 2001, when we were all a lot younger and more naïve, Congress passed the No Child Left Behind Act (NCLB). It had all kinds of ideas in it, some better than others, but those of us who care about evidence were ecstatic about the often-repeated requirement that federal funds be used for programs “based on scientifically-based research (SBR),” particularly “based on scientifically-based reading research (SBRR).” SBR and SBRR were famously mentioned 110 times in the legislation.

The emphasis on research was certainly novel, and even revolutionary in many ways. It led to many positive actions. NCLB authorized the Institute of Education Sciences (IES), which has greatly increased the rigor and sophistication of research in education. IES and other agencies promoted training of graduate students in advanced statistical methods and supported the founding of the Society for Research on Educational Effectiveness (SREE), which has itself had considerable impact on rigorous research. The U.S. Department of Education has commissioned high-quality evaluations of a variety of interventions, such as computer-assisted instruction, early childhood curricula, and secondary reading programs. IES funded development and evaluation of numerous new programs, and the methodologies promoted by IES are essential to Investing in Innovation (i3), a larger effort focused on development and evaluation of promising programs in K-12 education.

The one serious limitation of the evidence movement up to the present is that while it has greatly improved research and methodology, it has not yet had much impact on practices in schools. Part of the problem is just that it takes time to build up enough of a rigorous evidence base to affect practice. However, another part of the problem is that from the outset, “scientifically-based research” was too squishy a concept. Programs or practices were said to be “based on scientifically-based research” if they generally went along with accepted wisdom, even if the specific approaches involved had never been evaluated. For example, “scientifically-based reading research” was widely interpreted to support any program that included the five elements emphasized in the 2000 National Reading Panel (NRP) report: phonemic awareness, phonics, vocabulary, comprehension, and fluency. Every reading educator and researcher knows this list, and most subscribe to it (and should do so). Yet since NCLB was enacted, National Assessment of Educational Progress reading scores have hardly budged, and evaluations of specific programs that just train teachers in the five NRP elements have had spotty outcomes, at best.

The problem with SBR/SBRR is that just about any modern instructional program can claim to incorporate the standards. “Based on…” is a weak standard, subject to anyone’s interpretation.

In contrast, government is beginning to specify levels of evidence far more specific than “based on scientifically-based research.” For example, the What Works Clearinghouse (WWC), the Education Department General Administrative Regulations (EDGAR), and i3 regulations have sophisticated definitions of proven programs. These typically require comparing a program to a control group, using fair and valid measures, appropriate statistical methods, and so on.

The more rigorous definitions of “evidence-proven” mean a great deal as education policies begin to encourage or provide incentives for schools to adopt proven programs. If programs only have to be “based on scientifically-based research,” then just about anything will qualify, and evidence will continue to make little difference in the programs children receive. If more stringent definitions of “evidence-proven” are used, there is a far greater chance that schools will be able to identify what really works and make informed choices among proven approaches.

Evidence-based and evidence-proven differ by just one word, but if evidence is truly to matter in policy, this is the word we have to get right.