Evidence and Policy: If You Want to Make a Silk Purse, Why Not Start With…Silk?

Everyone knows that you can’t make a silk purse out of a sow’s ear. This proverb goes back to the 1500s. Yet in education policy, we are constantly trying to achieve stellar results using school and classroom programs of unknown effectiveness, or even those known to be ineffective, even though proven effective programs are readily available.

Note that I am not criticizing teachers. They do the best they can with the tools they have. What I am concerned about is the quality of those tools: the programs and professional development teachers receive to help them succeed with their children.

An excellent case in point was School Improvement Grants (SIG), a major provision of No Child Left Behind (NCLB). SIG provided major grants to schools scoring in the lowest 5% of their states. For most of its existence, SIG required schools seeking funding to choose among four models. Two of these, school closure and charterization, were rarely selected. Instead, most SIG schools selected either “turnaround” (replacing the principal and at least 50% of the staff), or the most popular, “transformation” (replacing the principal, using data to inform instruction, lengthening the school day or year, and evaluating teachers based on the achievement growth of their students). However, a major, large-scale evaluation of SIG by Mathematica showed no achievement benefits for schools that received SIG grants, compared to similar schools that did not. Ultimately, SIG spent more than $7 billion, an amount that we in Baltimore, at least, consider to be a lot of money. The tragedy, however, is not just the waste of so much money, but the dashing of so many hopes for meaningful improvement.

This is where the silk purse/sow’s ear analogy comes in. Each of the options among which SIG schools had to choose was composed of components that either lacked evidence of effectiveness or actually had evidence of ineffectiveness. If the components of each option are not known to be effective, then why would anyone expect a combination of them to be effective?

Evidence on school closure has found that this strategy diminishes student achievement for a few years, after which student performance returns to where it was before. Research on charter schools by CREDO (2013) has found an average effect size of zero for charters. The exception is “no-excuses” charters, such as KIPP and Success Academies, but these charters only accept students whose parents volunteer, not whole failing schools. Turnaround and transformation schools both require a change of principal, which introduces chaos and, as far as I know, has never been found to improve achievement. The same is true of replacing at least 50% of the teachers. Lots of chaos, no evidence of effectiveness. The other required elements of the popular “transformation” model have been found to have either no impact (e.g., benchmark assessments to inform teachers about progress; Inns et al., 2019), or small effects (e.g., lengthening the school day or year; Figlio et al., 2018). Most importantly, to my knowledge, no one ever did a randomized evaluation of the entire transformation model, with all components included. We did not find out what the joint effect was until the Mathematica study. Guess what? Sewing together swatches of sows’ ears did not produce a silk purse. With a tiny proportion of $7 billion, the Department of Education could have identified and tested out numerous well-researched, replicable programs and then offered SIG schools a choice among the ones that worked best. A selection of silk purses, all made from 100% pure silk. Doesn’t that sound like a better idea?

In later blogs I’ll say more about how the federal government could ensure the success of educational initiatives by ensuring that schools have access to federal resources to adopt and implement proven programs designed to accomplish the goals of the legislation.

References

Figlio, D., Holden, K. L., & Ozek, U. (2018). Do students benefit from longer school days? Regression discontinuity evidence from Florida’s additional hour of literacy instruction. Economics of Education Review, 67, 171-183.

Inns, A., Lake, C., Pellegrini, M., & Slavin, R. (2019). A synthesis of quantitative research on programs for struggling readers in elementary schools. Available at www.bestevidence.org. Manuscript submitted for publication.

This blog was developed with support from the Laura and John Arnold Foundation. The views expressed here do not necessarily reflect those of the Foundation.

Transforming Transformation (and Turning Around Turnaround)

At the very end of the Obama Administration, the Institute of Education Sciences (IES) released the final report of an evaluation of the outcomes of the federal School Improvement Grant program. School Improvement Grants (SIG) are major investments to help schools with the lowest academic achievement in their states to greatly improve their outcomes.

The report, funded by the independent and respected IES and carried out by the equally independent and respected Mathematica Policy Research, found that SIG grants made essentially no difference in the achievement of the students in schools that received them.

Bummer.

In Baltimore, where I live, we believe that if you spend $7 billion on something, as SIG has so far, you ought to have something to show for it. The disappointing findings of the Mathematica evaluation are bad news for all of the usual reasons. Even if there were some benefits, SIG turned out to be a less-than-compelling use of taxpayers’ funds. The students and schools that received it really needed major improvement, but improved very little. The findings undermine faith in the ability of very low-achieving schools to turn themselves around.

However, the SIG findings are especially frustrating because they could have been predicted, were in fact predicted by many, and were apparent long before this latest report. There is no question that SIG funds could have made a substantial difference. Had they been invested in proven programs and practices, they would have surely improved student outcomes just as they did in the research that established the effectiveness of the proven programs.

But instead of focusing on programs proven to work, SIG forced schools to choose among four models that had never been tried before and were very unlikely to work.

Three of the four models were so draconian that few schools chose them. One involved closing the school, and another, conversion to a charter school. These models were rarely selected unless schools were on the way to doing these things anyway. Somewhat more popular was “turnaround,” which primarily involved replacing the principal and 50% of the staff. The least restrictive model, “transformation,” involved replacing the principal, using achievement growth to evaluate teachers, using data to inform instruction, and lengthening the school day or year.

The problem is that very low achieving schools are usually in low achieving areas, where there are not long lines of talented applicants for jobs as principals or teachers. A lot of school districts just swapped principals between SIG and non-SIG schools. None of the mandated strategies had a strong research base, and they still don’t. Low achieving schools usually have limited capacity to reform themselves under the best of circumstances, and SIG funding required replacing principals, good or bad, thereby introducing instability in already tumultuous places. Further, all four of the SIG models had a punitive tone, implying that the problem was bad principals and teachers. Who wants to work in a school that is being punished?

What else could SIG have done?

SIG could have provided funding to enable low-performing schools and their districts to select among proven programs. This would have maintained an element of choice while ensuring that whatever programs schools chose would have been proven effective, used successfully in other low-achieving schools, and supported by capable intermediaries willing and able to work effectively in struggling schools.

Ironically, SIG did finally introduce such an option, but it was too little, too late. In 2015, SIG introduced two additional models, one of which was an Evidence-Based, Whole-School Reform model that would allow schools to utilize SIG funds to adopt a proven whole-school approach. The U.S. Department of Education carefully reviewed the evidence and identified four approaches with strong evidence and the ability to expand that could be utilized under this model. But hardly any schools chose to utilize these approaches because there was little promotion of the new models, and few school, district, or state leaders to this day even know they exist.

The old SIG program is changing under the Every Student Succeeds Act (ESSA). In order to receive school improvement funding under ESSA, schools will have to select from programs that meet the strong, moderate, or promising evidence requirements defined in ESSA. Evidence for ESSA, the free web site we are due to release later this month, will identify more than 90 reading and math programs that meet these requirements.

This is a new opportunity for federal, state, and district officials to promote the use of proven programs and build local capacity to disseminate proven approaches. Instead of being seen as a trip to the woodshed, school improvement funding might be seen as an opportunity for eager teachers and administrators to do cutting edge instruction. Schools using these innovative approaches might become more exciting and fulfilling places to work, attracting and retaining the best teachers and administrators, whose efforts will be reflected in their students’ success.

Perhaps this time around, school improvement will actually improve schools.

An Exploded View of Comprehensive School Reform

Recently, I had to order a part for an electric lawnmower. I enjoyed looking at the exploded view (similar to the one above) on the manufacturer’s web site. What struck me about it was that so many of the parts were generic screws, bolts, springs, wheels, and so on. With a bit of ingenuity, I’m sure someone (not me!) could track down generic electric motors, mower blades, and other more specialized parts, and build their very own do-it-yourself lawn mower.

There are just a few problems with this idea.

  1. It would cost a lot more than the original mower.
  2. It would take a lot of time that could possibly be used for better purposes.
  3. It wouldn’t work and you’d end up with an expensive pile of junk to discard.

Why am I yammering on about exploded views of lawn mowers? Because the idea of assembling lawn mowers from generic parts is a lot like what all too many struggling schools do in the name of whole school reform.

In education, the equivalent do-it-yourself idea using generic parts is the idea that if you choose one program for reading and another for behavior and a third for parent involvement and a fourth for tutoring and a fifth for English learners and a sixth for formative assessment and a seventh for coaching, the school is bound to do better. It might, but this piecemeal approach is really hard to do well.

The alternative to assembling all of those generic parts is to adopt a comprehensive school improvement model. These are models that have coordinated, well worked-out, well-supported approaches to increasing student success. Our own Success for All program is one of them, but there are others for elementary and secondary schools. After years of encouraging schools receiving School Improvement Grants (SIG) to assemble their own comprehensive reforms (remember the lawn mower?), the U.S. Department of Education finally offered SIG schools the option of choosing a proven whole-school approach. In addition to our Success for All program, the U.S. Department of Education approved three other comprehensive programs based on their evidence of effectiveness: Positive Action, the Institute for Student Achievement, and New York City’s small high schools of choice approach. These all met the Department’s standards because they had at least one randomized experiment showing positive outcomes on achievement measures, but some had a lot more evidence than that.

Comprehensive approaches resemble the fully assembled lawn mower rather than the DIY exploded view. The parts of the comprehensive models may be like those of the do-it-yourself SIG models, but the difference is that the comprehensive models have a well-thought-out plan for coordinating all of those elements. Also, even if a school used proven elements to build its own model, those elements would not have been proven in combination, and each might compete for the energies, enthusiasm, and resources of the beleaguered school staff.

This is the last year of SIG under the old rules, but it will continue in a different form under ESSA. The ESSA School Improvement provisions require the use of programs that meet strong, moderate, or promising evidence standards. Assembling individual proven elements is not a terrible idea, and is a real improvement on the old SIG because it at least requires evidence for some of the parts. But perhaps broader use of comprehensive programs with strong evidence of effectiveness for the whole school-wide approach, not just the parts, will help finally achieve the bold goals of school improvement for some of the most challenging schools in our country.

Rx for School Improvement

School Improvement Grants, or SIG, are supposed to be strong medicine for the most difficult ailments in the American school system: schools performing in the lowest 5% of their states. SIG provides proportional funding to states, which then hold competitions among low-achieving elementary and secondary schools. SIG funding is quite substantial, yet evaluations of SIG recipients find modest impacts on achievement. Most SIG schools gain about as much as non-SIG schools in the same state on state achievement measures.

Part of the problem with SIG is that until recently, schools had to choose among four models. All required draconian changes in governance (such as closure or charterization) or personnel (firing the principal and/or half of the staff). Worse, the grants were for only three years, so many schools spent most of that time recovering from SIG-inflicted disruptions. Not to mention that none of the solutions had any evidence of effectiveness.

Last year, SIG changed for the better. Schools could choose among three additional models, including a proven whole-school reform model, in which schools could implement an externally-developed model with at least one large, randomized study indicating positive achievement effects. This and one other model were continued into the current year, after which SIG will transition into a remodeled School Improvement program under the Every Student Succeeds Act (ESSA).

The proven whole-school option should have been a major advance, but it is not yet making much of a difference. Few SIG schools applied under this option for the 2015-2016 school year. One problem is that only four proven programs qualified: two elementary (our Success for All program and Positive Action), and two secondary (Institute for Student Achievement and New York City’s small high schools program). Things may pick up this year, but none of us see any indication yet that this will be the case. More likely, schools will once again choose among the four original models, because they are familiar and known to reliably bring in the grants.

As ESSA requires states and districts to transition to the new School Improvement program, perhaps things will be different. ESSA does not require any particular models, but does require that schools receiving School Improvement funding use programs that meet “strong,” “moderate,” or “promising” standards of evidence of effectiveness.

ESSA could have the same problem as the previous SIG proven whole-school reform option. Not many programs will meet the evidence standards at first. With the traditional four SIG models swept away and more federal emphasis on research, perhaps there will be more use of proven programs in low performing schools, but perhaps not.

Here is an additional idea that could greatly increase the use of proven programs in low performing schools. Consistent with ESSA regulations, School Improvement leaders at the federal and state levels might encourage qualifying schools to either adopt proven whole-school models, as in the current whole-school reform model, or to build their own model, using proven components. For example, a qualifying school might adopt a proven reading program, a proven math program, and a proven tutoring approach for struggling readers. Because there are several proven models of each kind, this would give schools much more flexibility and choice. Coordinating multiple programs takes some care, but the coordination itself would be part of the School Improvement plan. Imagine, for example, that the U.S. Department of Education created a recommended list of components that could be fulfilled by one partner organization (a proven whole-school program) or by several providers of proven approaches. Here is a simple checklist that might be suggested for an elementary school:
 Proven reading program
 Proven math program
 Proven tutoring program for struggling readers
 Proven social-emotional learning/behavior management approach

ESSA allows for a considerable range of evidence, from “strong” (at least one randomized study) to “promising” (one correlational study with statistical controls for pretests). The law is what it is, but I wonder if states and local districts, and perhaps even the U.S. Department of Education, might encourage schools to choose programs that meet the highest standards. The four programs approved by the U.S. Department of Education that meet the current SIG whole-school reform model in the current law had to meet what amounts to the “strong” evidence standard. For schools in major trouble, why would we encourage use of weaker evidence? Stronger evidence increases certainty of effectiveness, and certainty is the goal.

We have spent many years and billions of dollars failing to turn around thousands of schools that demonstrably need a lot of help. These are places where proven programs should make a substantial difference. Can anyone think of a reason we shouldn’t try?

New Year’s Resolutions for Evidence-Based Reform

I want to wish all of my readers a very happy New Year. The events of 2015, particularly the passage of the Every Student Succeeds Act (ESSA), give us reason to hope that evidence may finally become a basis for educational decisions. The appointment of John King, clearly the most evidence-oriented and evidence-informed Secretary of Education ever to serve in that office, is another harbinger of positive change. Yet these and other extraordinary events only create possibilities for major change, not certainties. What happens next depends on our political leaders, on those of you reading this blog, and on others who believe that children deserve proven programs. In recognition of this, I would suggest a set of New Year’s resolutions for us all to collectively achieve in the coming year.

1. Get beyond D.C. The evidence movement in education has some sway within a half-mile radius of K Street and in parts of academia, but very little in the rest of the country. ESSA puts a lot of emphasis on moving power from Washington to the states, and even if this were not true, it is now time to advocate in state capitols for use of proven programs and evidence-informed decisions. In the states and even in Washington, evidence-based reform needs a lot more allies.

2. Make SIG work. School Improvement Grants (SIG) written this spring for implementation in fall 2016 continue to serve the lowest performing 5% of schools in each state. Schools can choose among six models, the four original ones (i.e., school closure, charterization, transformation, and turnaround) plus two new ones: proven, whole-school reforms, and state-developed models, which may include proven programs. SIG is an obvious focus for evidence, since these are schools in need of sure-fire solutions, and the outcomes of SIG with the original four models have been less than impressive. Also, since SIG is already well underway, it could provide an early model of how proven programs could transform struggling schools. But this will only happen if there is encouragement to states and schools to choose the proven program option. Perceived success in SIG would go a long way toward building support for use of proven programs more broadly. (School Improvement will undergo significant changes the following year pursuant to ESSA and this merits its own blog, but it’s important to note here that states will be required to include evidence-based interventions as part of their plan, so moving towards evidence now may help ease their transition later.)

3. Celebrate successes of Investing in Innovation (i3). The 2010 cohort, the first and largest cohort of i3 grantees, is beginning to report achievement outcomes from third-party evaluations. As in any set of rigorous evaluations, studies that did not find significant differences are sure to outnumber those that did. We need to learn everything we can from these evaluations, whatever their findings, but there is a particular need to celebrate the findings of those studies that did find positive impacts. These provide support for the entire concept of evidence-based reform, and give practicing educators programs with convincing evidence that are ready to go.

4. In as many federal discretionary funding programs as possible, provide preference points for applications proposing to implement proven programs. ESSA names several (mostly small) funding programs that will provide preference points to proposals that commit to using proven programs. This list should be expanded to include any funding program for which proven programs exist. Is there any reason not to encourage use of proven programs? It costs nothing, does not require use of any particular program, and makes positive outcomes for children a lot more likely.

5. Encourage use of proven programs in formula funding, such as Title I. Formula funding is where the big money goes, and activities funded by these resources need to have as strong an evidence base as possible. Incentives to match formula funding, as in President Obama’s Leveraging What Works proposal, would help, of course, but are politically unlikely at the moment. However, plain old encouragement from Washington and state departments of education could be just as effective. Who can argue against using Title I funds, for example, to implement proven approaches? Will anyone stand up to advocate for ineffective or unproven approaches for disadvantaged children, once the issue is out in the open?

These resolutions are timely, because, at least in my experience, both government and the field adjust to new legislation in the first year, and then whatever sticks stays the same for many years. Whatever does not stick is hard to add in later. The evidence elements of ESSA will matter to the extent our leaders make them matter, right now, in 2016. Let’s do whatever we can to help them make the right choices for our children.

When Will We Reach Our 1962 Moment in Education?

When I was in college, I had an ancient 1957 Chevy. What a great car. Stylish, dependable, indestructible.

My 1957 Chevy was beautiful, but it had no seatbelts, no airbags, and no recourse if the brakes went out. It got about 13 miles to the gallon, polluted the atmosphere, and was not expected to last more than 100,000 miles. Due to development, evaluation, and public-spirited policy, all these problems have been solved. Automotive design has been revolutionized by embracing policies based on innovation and evidence.

Not that I remember 1957 very well, but I was thinking about it as a model for where we are today in evidence-based reform in education, as distinct from medicine. In 1957, drug companies could make any claims they liked about medications. There was research, but physicians routinely ignored it. However, change was on the way. In 1962, the Kefauver-Harris Amendment required that all drug applications to the Food and Drug Administration (FDA, established in 1927) demonstrate “substantial evidence” of safety and effectiveness. These standards continue to evolve, but today it is unthinkable that drug companies could make misleading claims about unproven medicines.

In 1957, the progress toward evidence-based reform in medicine would have been clear, but the policy world was not yet ready. For one thing, the American Medical Association fought tooth and nail against the evidence standards, as did most drug companies. Yet evidence prevailed because despite the power and money of the AMA and the drug companies, millions of ordinary citizens, not to mention the majority of physicians, knew that prescribing medications of unknown safety and effectiveness was just plain wrong. Everyone takes medicine, or we have relatives who do, and we want to know what works and what doesn’t. Specifically, a European drug called Thalidomide taken by pregnant mothers caused massive and widespread birth defects, and this swept away the opposition to drug testing standards.

In education, we have not reached our 1962 moment. Publishers and software developers are free to make any claims they like about the effectiveness of their products, and educators have difficulty sorting effective from ineffective products. Yet the handwriting is on the wall. Rigorous evaluations of educational programs are becoming more and more common. Many of these evaluations are being paid for by the companies themselves, who want to be on the right side of history when and if our 1962 moment arrives.

In education, our 1962 will probably not involve an equivalent of the FDA or a prohibition on the use of untested products. Unlike medicine, few educational products are likely to be harmful, so experimentation with new approaches is a lot safer. What is more likely, I believe, is that there will be incentives and encouragement from various levels of government for schools to adopt proven programs. In particular, I think it is very likely that Title I and other federal programs will begin insisting on a strong evidence base for investments of federal dollars.

To reach our 1962 moment will require sustained investment in development, evaluation, and scale-up of proven programs in all subjects and grade levels, and a change of policies to encourage the use of proven programs.

I hope our 1962 moment is coming soon. To bring it closer, we have a lot of work to do, in innovation, evaluation, policy, and practice. Government, foundations, innovators, researchers, and anyone who knows the transformative potential of education should be working toward the day when we no longer have to guess what works and what doesn’t. This is the time to build up our stock of proven, replicable programs of all kinds. It is also the time to try policy experiments such as Investing in Innovation (i3), SIG evidence-proven whole-school models, and Leveraging What Works, because when our 1962 comes, we will need to know how to build support for the whole evidence movement. I hope that, like my beloved 1957 Chevy, we’re driving steadily toward our 1962 and beyond, confident that every new year will bring better outcomes for all.

Helping Struggling Schools

Illustration by James Bravo

There are a lot of schools in the U.S. that need to be achieving much better outcomes. However, there is a much smaller group of schools in which achievement levels are appalling. The solutions for garden-variety low-achieving schools are arguably different from those for schools with the very worst levels of performance.

In recent years, a key resource for very low-achieving schools has been the School Improvement Grants (SIG) program. SIG provides substantial funding to schools in the lowest 5 percent of their states on achievement measures. Up until this year, schools receiving SIG funds had to choose among four models.

This year, three new models were added to SIG. These include an option to use an evidence-proven whole-school reform model; such a model has to have been successfully evaluated in a study meeting What Works Clearinghouse (WWC) standards. This is potentially a significant advance. Perhaps, if many schools choose proven models, this option will enhance the effectiveness and image of the much-maligned SIG program. (Full disclosure: Our Success for All program is one of four approaches approved thus far by the U.S. Department of Education for use under the evidence-proven whole-school option for SIG.)

However, stop for a moment and consider what’s going on here. The lowest-achieving schools in America are being offered substantial funding (currently, up to $2 million over five years) to turn themselves around. These schools need the very best programs provided by organizations with a proven track record. Of all schools, why should these very needy schools receive unproven approaches, perhaps invented from scratch for the SIG proposal and therefore never even piloted before? When your car breaks down, do you tow it to a mechanic who has never fixed a car before? When you have medical problems, do you want to be someone’s first patient?

There should always be a fund separate from ordinary Title I to provide intensive assistance to schools in the lowest 5 percent of their states on state assessments. However, instead of focusing SIG on governance, personnel, and untested prescriptions, as it has done up to now, SIG (or its successor) should focus on helping schools select and effectively implement proven programs. In addition to the four “evidence-proven whole-school reform” models identified recently, SIG schools might be funded to implement a mix of reading approaches, math approaches, tutoring models, and social-emotional approaches, for example, each of which has convincing evidence of effectiveness.

The recent changes in SIG, allowing proven whole-school reforms, are a big step in the right direction, but additional steps in the same direction are needed to make this crucial investment a model of wise use of federal funds to solve serious problems in education.