School Improvement Grants Embrace Evidence of Effectiveness


Despite the exciting gains made by evidence-based reform in recent years, its progress so far has been limited to the development, evaluation, and scale-up of proven programs outside of mainline education policy or funding. Title I, Title II, Race to the Top, School Improvement Grants, and other large, influential funding sources for reform have hardly been touched by the growth of proven, replicable programs sponsored primarily by the Institute of Education Sciences and Investing in Innovation (i3). Until the evidence movement crosses over from R&D to the real world of policy and practice, it will remain the domain of academics and policy wonks, not a real force for change.

The recently passed Omnibus budget, however, contains a first modest step across the R&D/policy divide: a new provision in the congressional authorization of School Improvement Grants (SIG). Up until now, SIG schools (ones that have suffered from very low achievement levels for many years) had to choose among four models, all of which require major changes in staffing. Each SIG school is expected to develop its own model of reform, usually with the help of consultants. The problem has been that each of the hundreds of schools receiving (substantial) SIG funding has to create its own never-before-tested path to reform, and then try to implement it with quality in a school that has just experienced a substantial turnover of its leadership and staff.

The “Fifth Option” recently introduced by Congress adds a new alternative. SIG schools can choose to adopt a “proven whole-school reform model” that meets at least a moderate level of evidence support, which includes having been tested against a control group in at least one rigorous experiment. The fifth option will let schools keep their leaders and staffs, but adopt a schoolwide approach that has been used in many similar schools and found to be effective.

The Omnibus bill was passed too late in the year to apply this fifth option to the 2014-2015 school year, and the U.S. Department of Education, as well as individual states, has a lot of work to do to prepare new regulations and supports for schools applying for SIG funds under this new option in 2015-2016.

However, the fifth option makes an important statement that has not been made previously. In a major school improvement (not R&D) funding program, the fifth option says “use what works.” Wisely, it does not mandate the use of any specific programs, but by highlighting evidence-proven approaches, it puts the government behind the idea that federal funding should whenever possible be used to help educators use programs with strong evidence of effectiveness. This could be the start of something beautiful.


Success in Evidence-Based Reform: The Importance of Failure

As always, Winston Churchill said it best: “Success consists of going from failure to failure without loss of enthusiasm.” There is a similar Japanese saying: “Success is being knocked down seven times and getting up eight.”

These quotes came to my mind while I was reading a recently released report from the Aspen Institute, “Leveraging Learning: The Evolving Role of Federal Policy in Education Research.” The report is a useful scan of the education research horizon, intended as background for the upcoming reauthorization of the Education Sciences Reform Act (ESRA), the legislation that authorizes the Institute of Education Sciences (IES). However, the report also contains brief chapters by various policy observers (including myself), focusing on how research might better inform and improve practice and outcomes in education. A common point of departure in some of these chapters was that while the randomized experiments (RCTs) emphasized for the past decade by IES and, more recently, Investing in Innovation (i3) are all well and good, the IES experience is that most randomized experiments evaluating educational programs find few achievement effects. Several cited testimony by Jon Baron that “of the 90 interventions evaluated in randomized trials by IES, 90% were found to have weak or no positive effects.” As a response, the chapter authors proposed various ways in which IES could add to its portfolio more research that is not RCTs.

Within the next year or two, the problem Baron was reporting will take on a great deal of importance. The results of the first cohort of Investing in Innovation grants will start being released. At the same time, additional IES reports will appear, and the Education Endowment Foundation (EEF) in the U.K., much like i3, will also begin to report outcomes. All four of the first cohort of scale-up programs funded by i3 (our Success for All program, Reading Recovery, Teach for America, and KIPP) have had positive first-year findings in i3 or similar evaluations recently, but this is not surprising, as they had to pass a high evidence bar to get scale-up funding in the first place. The much larger number of validation and development projects were not required to have such strong research bases, and many of these are sure to show no effects on achievement. Kevan Collins, Director of the EEF, has always openly said that he’d be delighted if 10% of the studies EEF has funded find positive impacts. Perhaps in the country of Churchill, Collins is better placed to warn his countrymen that success in evidence-based reform is going to require some blood, sweat, toil, and tears.

In the U.S., I’m not sure if policymakers or educators are ready for what is about to happen. If most i3 validation and development projects fail to produce significant positive effects in rigorous, well-conducted evaluations, will opinion leaders celebrate the programs that do show good outcomes and value the knowledge gained from the whole process, including knowledge about what almost worked and what to avoid doing next time? Will they support additional funding for projects that take these learnings into account? Or will they declare the i3 program a failure and move on to the next set of untried policies and practices?

I very much hope that i3 or successor programs will stay the course, insisting on randomized experiments and building on what has been learned. Even if only 10% of validation and development projects report clear, positive achievement outcomes and capacity to go to scale, there will be many reasons to celebrate and stay on track:

1. There are currently 112 i3 validation and development projects (plus 5 scale-ups). If just 10% of these were found to be effective and scalable, that would be 11 new programs. Adding this to the scale-up programs and other programs already positively reviewed in the What Works Clearinghouse, this would be a substantial base of proven programs. In medicine, the great majority of treatments initially evaluated are found not to be effective, yet the medical system of innovation works because the few proven approaches make such a big difference. Failure is fine if it leads to success.

2. Among the programs that do not produce statistically significant positive outcomes on achievement measures, there are sure to be many that show promise but do not quite reach significance. For example, any program whose evaluation shows a student-level positive effect size of, say, +0.15 or more should be worthy of additional investment to refine and improve its procedures and its evaluation to reach a higher standard, rather than being considered a bust.

3. The i3 process is producing a great deal of information about what works and what does not, what gets implemented and what does not, and the match between schools’ needs and programs’ approaches. These learnings should contribute to improvements in new programs, to revisions of existing programs, and to the policies applied by i3, IES, and other funders.

4. As the findings of the i3 and IES evaluations become known, program developers, grant reviewers, and government leaders should get smarter about what kinds of approaches are likely to work and to go to scale. Because of this, one might imagine that even if only 10% of validation and development programs succeed in RCTs today, higher and higher proportions will succeed in such studies in the future.

Evidence-based reform, in which promising scalable approaches are ultimately evaluated in RCTs or similarly rigorous evaluations, is the best way to create substantial and lasting improvements in student achievement. Failures of individual evaluations or projects are an expected, even valued part of the process of research-based reform. We need to be prepared for them, and to celebrate the successes and the learnings along the way.

As Churchill also said, “Success is not final, failure is not fatal: it is the courage to continue that counts.”