Money and Evidence

Many years ago, I spent a few days testifying in a funding equity case in Alabama. At the end of my testimony, the main lawyer for the plaintiffs drove me to the airport. “I think we’re going to win this case,” he said. “But will it help my clients?”

The lawyer’s question has haunted me ever since. In Alabama, then and now, there are enormous inequities in education funding between rich and poor districts, due to differences in property tax receipts. There are corresponding differences in student outcomes. The same is true in most states. To a greater or lesser degree, most states and the federal government provide some funding to reduce inequalities, but in most places poor districts still have to tax themselves at a higher rate to produce education funding that is significantly lower than that of their wealthier neighbors.

Funding inequities are worse than wrong; they are repugnant. When I travel in other countries and try to describe our system, it usually takes me a while to get people outside the U.S. even to understand what I am saying. “So schools in poor areas get less than those in wealthy ones? Surely that cannot be true.” In fact, it is true in the U.S., but in all of our peer countries, national or at least regional funding policies ensure basic equality in school funding, and in most cases I know about they then add additional funding on top of equalized funding for schools serving many children in poverty. For example, England has long had equal funding, and the Conservative government added “Pupil Premium” funding in which each disadvantaged child brings additional funds to his or her school. Pupil Premium is sort of like Title I in the U.S., if you can imagine Title I adding resources on top of equal funding, which it does in only a few U.S. states.

So let’s accept the idea that funding inequity is a BAD THING. Now consider this: Would eliminating funding inequities eliminate achievement gaps in U.S. schools? This gets back to the lawyer’s question. If we somehow won a national “case” that required equalizing school funding, would the “clients” benefit?

More money for disadvantaged schools would certainly be welcome, and it would certainly create the possibility of major advances. But the impact of significant additional funding depends on what schools do with the added dollars. Of course you’d have to increase teachers’ salaries and reduce class sizes to draw highly qualified teachers into disadvantaged schools. But you’d also have to spend a significant portion of new funds to help schools implement proven programs with fidelity and verve.

Again, England offers an interesting model. Twenty years ago, achievement in England was very unequal, despite equal funding. Children of immigrants from Pakistan and Bangladesh, Africans, Afro-Caribbeans, and other minorities performed well below White British children. The Labour government implemented a massive effort to change this, starting with the London Challenge and continuing with a Manchester Challenge and a Black Country Challenge in the post-industrial Midlands. Each “challenge” provided substantial professional development to school staffs, as well as organizing achievement data to show school leaders that other schools with exactly the same demographic challenges were achieving far better results.

Today, children of Pakistani and Bangladeshi immigrants are scoring at the English mean. Children of African and Afro-Caribbean immigrants are just below the English mean. Policy makers in England are now turning their attention to White working-class boys. But the persistent and substantial gaps we see as so resistant to change in the U.S. are essentially gone in England.

Today, we are getting even smarter about how to turn dollars into enhanced achievement, due to investments by the Institute of Education Sciences (IES) and the Investing in Innovation (i3) program in the U.S. and by the Education Endowment Foundation (EEF) in England. In both countries, however, we lack the funding to put into place what we know how to do on a large enough scale to matter, but this need not always be the case.

Funding matters. No one can make chicken soup out of chicken feathers, as we say in Baltimore. But funding in itself will not solve our achievement gap. Funding needs to be spent on specific, high-impact investments to make a big difference.


Accelerating the Pace of Innovation

The biggest problem in evidence-based reform in education is that there are too few replicable programs with strong evidence of effectiveness available to educators. The evidence provisions of the Every Student Succeeds Act (ESSA) encourage the use of programs that have strong, moderate, or promising evidence of effectiveness, and they require School Improvement efforts (formerly SIG) to include approaches with evidence that meets these definitions. There are significant numbers of programs that do meet these definitions, but not enough to give educators multiple choices of proven programs for each subject and grade level. The Institute of Education Sciences (IES), the Investing in Innovation (i3) program, the National Science Foundation (NSF), and England’s Education Endowment Foundation (EEF) have all been supporting rigorous evaluations of replicable programs at all levels, and this work (and work funded by others) is progressively enriching offerings of programs that are both proven to be effective and ready for widespread dissemination. However, progress is slow. The large-scale randomized experiments demanded by these funders are expensive and may take many years to complete. As in any scientific field (such as medicine), most experiments do not show positive outcomes for innovative treatments. At a time when demand is starting to pick up, the supply needs to keep pace.

Given that money is not being thrown at education research by Congress or other funders, how can promising innovations be evaluated, made ready for dissemination, and taken to scale? First, existing funders need to be supported adequately to continue the good work they are doing. Grants for Education Innovation and Research (EIR) will pick up where i3 ends, and IES needs to maintain its leadership in supporting development and evaluation of promising programs in all subjects and grade levels. The National Science Foundation should invest far more in creating, evaluating, and disseminating proven STEM approaches. All of this work, in fact, is in need of increased funding and publicity to build political and public support for the entire enterprise.

However, there are several additional avenues that might be pursued to increase the number of proven, ready-to-disseminate approaches. One promising model is low-cost randomized evaluations of interventions supported by government or other funding. Both IES and the Laura and John Arnold Foundation are offering support for such studies. For example, imagine that a school district is introducing a digital textbook to its schools but can only afford to provide the program to 30 schools each year. If the district finds 60 schools willing to receive the program and randomly assigns half of them to start in a given year, then it is spending no more on digital textbooks than it planned to spend. If state test scores can be obtained and used as pre- and post-tests, then the measurement costs nothing. The only costs of studying the effects of the digital textbooks might be the costs of data analysis, perhaps some questionnaires or observations to find out what schools did with the digital textbooks, and a report. Such a study would be very inexpensive, might produce results within a year or two, and would be evaluating something that is appealing to schools and ready to go.
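
The analysis in such a design can be almost as simple as the design itself: randomize the 60 willing schools, then compare test-score gains between the two groups. The sketch below is illustrative only; the school names, score distributions, and program effect are all invented for the example, not drawn from any real study.

```python
import random
import statistics

# Illustrative sketch of the low-cost randomized design described above:
# 60 willing schools, 30 randomly assigned to receive the digital textbook
# in year one (treatment), 30 waitlisted (control). State test scores serve
# as free pre- and post-measures. All numbers here are invented.
random.seed(1)

schools = [f"school_{i:02d}" for i in range(60)]
random.shuffle(schools)
treatment, control = schools[:30], schools[30:]

def simulate_scores(group, program_effect):
    """Fake pre/post state test score means for each school in the group."""
    scores = []
    for _ in group:
        pre = random.gauss(500, 20)                     # baseline school mean
        post = pre + random.gauss(5, 10) + program_effect
        scores.append((pre, post))
    return scores

t_scores = simulate_scores(treatment, program_effect=8.0)  # hypothetical gain
c_scores = simulate_scores(control, program_effect=0.0)

# Simple gain-score comparison; a real evaluation would covariate-adjust
# posttests on pretests and account for school-level clustering.
t_gain = statistics.mean(post - pre for pre, post in t_scores)
c_gain = statistics.mean(post - pre for pre, post in c_scores)
estimated_effect = t_gain - c_gain
print(f"Estimated program effect: {estimated_effect:.1f} points")
```

Because assignment is random, the waitlisted schools are a fair comparison group, and the district has spent nothing extra on the program itself; the only added cost is the analysis.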

Beyond these existing strategies, others might be considered to speed up the proven programs process. One example might be to build on Small Business Innovation Research (SBIR) grants. At $1 million over two years, these grants, limited to for-profit companies, are often too small to develop and evaluate promising approaches (usually, technology applications). IES or other funders might proactively look for promising SBIR projects and encourage them to apply for larger funding to complete development and do rigorous evaluations. One advantage of SBIRs is that they are usually created by small, ambitious, undercapitalized companies, which are motivated to take their programs to scale.

Another strategy that might work could be to fund “aggregators” whose job would be to identify promising approaches from any source, help assemble partnerships if necessary, and then help prepare applications for funding. This could help young innovators with great ideas combine their efforts, create more complete and powerful innovations, and subject them to rigorous evaluations. In addition to SBIR-funded projects, promising program elements might be found in projects funded by private foundations or agencies outside of education. They might be components of IES or i3 projects that produced promising but not conclusive outcomes in their evaluations, perhaps due to insufficient sample size. Aggregators might link up programs with broad reach but limited technology with brash technology start-ups in need of access to markets. If the goal is finding promising but incomplete efforts and helping them reach effectiveness and scale, every source should be fair game.

Government has made extraordinary progress in promoting the development, rigorous evaluation, and scale-up of proven programs. However, its success has led to a demand for proven programs that it cannot fulfill at the usual pace. Current grant programs at IES and i3/EIR should continue, but in addition we need innovative strategies capable of greatly accelerating the pace of development, evaluation, and scale-up.

Educationists and Economists

I used to work part time in England, and I’ve traveled around the world a good bit speaking about evidence-based reform in education and related topics. One of the things I find striking in country after country is that at the higher levels, education is not run by educators. It is run by economists.

In the U.S., this is also true, though it’s somewhat less obvious. The main committees in Congress that deal with education are the House Education and the Workforce Committee and the Senate Health, Education, Labor, and Pensions (HELP) Committee. Did you notice the words “workforce” and “labor”? That’s economists. Further, politicians listen to economists, because they consider them tough-minded, data-driven, and fact-friendly. Economists see education as contributing to the quality of the workforce, now and in the future, and this makes them influential with politicians.

A lot of the policy prescriptions that get widely discussed and broadly implemented are the sorts of things economists love to dream up. For example, they are partial to market incentives, new forms of governance, rewards and punishments, and social impact bonds. Individual economists, and the politicians who listen to them, take diverse positions on these policies, but the point is that economists rather than educators often set the terms of the debates on both sides. As one example, educators have been talking about the long-term impacts of quality preschool for 30 years, but when Nobel Prize-winning economist James Heckman took up the call, preschool became a top priority of the Obama Administration.

I have nothing against economists. Some of my best friends are economists. But here is why I am bringing them up.

Evidence-based reform is creating a link between educationists and economists, and thereby to the politicians who listen to them, that did not exist before. Evidence-based reform speaks the language that economists insist on: randomized evaluations of replicable programs and practices. When an educator develops a program, successfully evaluates it at scale, and shows it can be replicated, this gives economists a tangible tool they can show will make a difference in policy. Other research designs are simply not as respected or accepted. But an economist with a proven program in hand has a measurable, powerful means to affect policy and help politicians make wise use of resources.

If we want educational innovation and research to matter to public policy, we have to speak truth to power, in the language of power. And that language is increasingly the language of rigorous evidence. If we keep speaking it, our friends the economists will finally take evidence from educational research seriously, and that is how policy will change to improve outcomes for children on a grand scale.

R&D That Makes a Difference

Over the course of my career, I’ve written a lot of proposals. I’ve also reviewed a lot, and mostly I’ve seen funded projects crash and burn, or produce a scholarly article or two never to be heard from again.

As evidence becomes more important in educational policy and practice, I think it’s time to rethink the whole process of funding for development, evaluation, and dissemination.

Here’s how the process works now at the federal level. The feds put out a Request for Proposals (RFP) in the Federal Register. It specifies the purpose of the grant, who is eligible, funding available, deadlines, and most importantly, the criteria on which the proposals will be judged. Proposal writers know that they must follow those criteria very carefully to make it easy for readers to know that each criterion has been satisfied.

The problem with the whole proposal system lies in the perception that each proposal starts with a perfect score (usually 100), and is then marked down for any deficiencies. To oversimplify, reviewers nitpick, and if there is much left after the nits have been picked, the proposal wins.

What this system rewards is enormous care and OCD-level attention to detail. It does not reward creativity, risk, insight, or actual utility for schools. Yet funding grants that do not move practice forward at any significant scale does little good in an applied field like education (in related fields such as psychology, purely basic research might justify such approaches, but in education this is a hard argument to make). Maybe our collective inability to do research that affects practice on a broad scale explains some of the lack of enthusiasm our political leadership has for research.

So what would I propose as an alternative? I’m so glad you asked. I’d propose that RFPs be explicitly structured to ask not, “Why shouldn’t we fund this proposal,” but, “Why should we?” That is, proposal writers should be asked to make a case for the potential importance of their work. Here’s a model set of evaluation standards to illustrate what I mean.

A. Significance
1. What are you planning to create?
2. What national problem does your proposed program potentially solve?
3. What outcomes do you expect to achieve, and why are these important?
4. Based on prior research by yourself and others, what is the likelihood that your program will produce the outcomes you expect?
5. What is the likelihood that, if your program is successful, it will work on a significant scale? What is your experience with working at scale or scaling up proven programs in educational settings?
6. In what way is your program creative or distinctive? How might it spark new thinking or development to solve longstanding problems in education?

B. Capabilities
1. Describe the organizational capabilities of the partners to this proposal, as well as the capabilities of the project leadership. Consider capabilities in the following areas:
a. development
b. roll-out, piloting
c. evaluation
d. reporting
e. scale-up
f. communications, marketing
2. Timelines, milestones

C. Evaluation
1. Research questions
2. Design, analysis

D. Impact
Given all you’ve written so far, summarize in one page why this project will make a substantial difference in educational practice and policy.

If we want research and development to produce useful solutions to educational problems, we have to ask the field for just that, and reward those able to produce, evaluate, and disseminate such solutions. Ironically, the federal funding stream closest to the ideal I’ve described is the Investing in Innovation (i3) program, which Congress may be about to shut down. i3 is at least focused on pragmatic solutions rather than theory-building and it has high standards of evidence. But if i3 survives or if it is replaced by another initiative to support innovation, development, evaluation, and scale-up of proven programs, I’d argue that it needs to focus even more on pragmatic issues of effectiveness and scale. Reviewers should be exclaiming, “I get it!” rather than “I gotcha!”

The Evidence or The Morgue

Many years ago, when I was a special education teacher, I had a summer job at a residential school for emotionally disturbed children. The school happened to be located in a former tuberculosis sanitarium. Later, I heard from teachers elsewhere who had also worked in schools housed in one-time sanitaria.

How did it come about, one might ask, that tuberculosis sanitaria across the country became available for use as schools? The answer is that researchers cured the disease. The sanitaria were no longer needed for their original purpose, so they were turned into schools.

One feature of the former sanitaria is that they all had morgues. We used ours to store curriculum materials, because it had very sturdy and useful sliding horizontal cabinets. This arrangement led to a certain amount of macabre humor, but the morgue reminded us that what the sanitaria had once done was deadly serious indeed.

I was recalling my summer in the sanitarium after reading about the latest developments in the reauthorization of ESEA. Both the House and the Senate have now passed bills that eliminate the Investing in Innovation (i3) program and cut funding for the Institute of Education Sciences. In their place, the bills have a lot of language about state and local control, and about identifying and publicizing individual schools that are doing a particularly good job so their good works can help inspire and influence other schools. None of this would bother me if the legislation contained a clear commitment to rigorous research, development, and dissemination, but this may or may not be the case.

The Senate bill, which passed with bipartisan support last week, does authorize an evidence-based innovation fund. Modeled on the successful Small Business Innovation Research (SBIR) program, which funds innovation and evaluation in 11 different government agencies, this initiative would provide flexible funding for a broad range of field-driven projects and allow schools, districts, non-profits, and small businesses to develop and grow innovative programs to improve student achievement. Grants would be awarded using a tiered evidence framework based on an applicant’s proven effectiveness. The provision was initially offered and accepted as a bipartisan amendment during the Senate HELP Committee markup of its bill. However, the House bill has no comparable provision, and I have to wonder if the Senate provision will survive the grueling conference process and make it into the final bill.

Try to imagine what would have happened if tuberculosis research had been treated the way education research has been treated in the House version of the ESEA reauthorization bill. Individual sanitaria with lower death rates might be recognized. States and localities might try out ideas to make the sanitaria more effective, but few if any states or localities would be large enough to do the necessary sustained R&D. “Best practices” would be constrained by the current system, so they might involve better ways for sanitarium staff to do exercises with patients, for example, rather than experimenting with medications or other treatments. The disease would never have been cured. The morgues would still be used for unfortunate patients, not for curriculum materials.

The U.S. spends hundreds of billions of dollars every year on education. What student, parent, teacher, administrator, or policy maker does not want those billions used to make as much of a difference as possible? The pursuit of knowledge about how to improve educational outcomes is obviously important, but it is rarely very high on anyone’s priority list.

Fortunately, medicine and other fields long ago decided that research was in the national interest, and that investments in research were the most reliable way forward in improving important outcomes. In medicine, the choice is stark: either the evidence prevails or the morgue does. Yet in education, anyone with eyes to see knows what happens when children fail to learn. Most of the children who cannot read end up unemployed. Many end up in prison, and all too many in the morgue. We know enough now to be able to say that the great majority of reading failure, for example, is preventable. Yet we choose not to prevent it. What does this say about us as a people, as a society, as a political system?

I hope our leaders in Congress approve the Senate language on evidence, or something similar, and reinstate and fund programs that have the greatest promise in identifying and disseminating effective approaches to key problems. The lives of a generation of vulnerable children depend on their wisdom and courage at this critical juncture.

Good Failure/Bad Failure

Evidence junkies (like me) are reacting to the disappointing news on the evaluation of the Adolescent Behavioral Learning Experience (ABLE), a program implemented at Rikers Island to reduce recidivism among adolescent prisoners. Bottom line: The rigorous independent evaluation of the program failed to find any benefits. What makes this experiment especially interesting is that it is the first U.S. application of social impact bonds. Goldman Sachs put up a $7.2 million loan, and Bloomberg Philanthropies committed to a $6 million loan guarantee. Since the program did not produce the expected outcomes, Goldman Sachs lost $1.2 million.

Ironically, New York City administrators are delighted about the outcome because they do not have to pay for the program. They think they learned a great deal from the experience, for free.

It’s unclear what this will do to the social impact bond movement, currently in its infancy. However, I wanted to extend from this fascinating case to a broader issue in evidence-based reform.

The developers and advocates for the ABLE program who expected positive outcomes turned out to be wrong, at least in this implementation. The investors were wrong in expecting to make a profit. But I’d argue that they are all better off because of this experience, just as the N.Y.C. administrators said.

The distinction I want to make is between wrong and wrong-headed. Wrong, as I’m defining it in this context, means that a given outcome was not achieved, but it was entirely reasonable to expect that it might have been achieved. In contrast, wrong-headed means that not only was the desired outcome not achieved, but it was extremely unlikely that it could have been achieved. In many cases, a key component of wrong-headed actions is that the actor does not even know whether the action was effective or ineffective, right or wrong, and therefore continues with the same or similar actions indefinitely.

Wrong, I’d argue, is an honorable and useful outcome. In a recent interview, former White House advisor Gene Sperling noted that when a few cancer drugs fail to cure cancer, you don’t close down NIH. Instead, you take that information and use it to continue the research and development process. “Wrong,” in this view, can be defined as “good failure,” because it is a step on the path to progress.

“Wrong-headed,” on the other hand, is “bad failure.” When you do something wrong-headed, you learn nothing, or you learn the wrong lessons. Wrong-headed decisions tend to lead to more wrong-headed decisions, as you have no systematic guide to what is working and what is not.

The issue of wrong vs. wrong-headed comes up in the current discussions in Congress about continuing the Investing in Innovation (i3) program. By now, committees in both the House and the Senate have recommended ending i3. But this would be the very essence of wrong-headed policy. Sure, it is probable that many i3 programs funded so far will fail to make a difference in achievement, or will fail to go to scale. This just means that these programs have not yet found success. Some of these may still have evidence of promise, and some will not. However, all i3 programs are rigorously evaluated, so we will know a lot about which worked, which did not, and which still seem promising even if they did not work this time. That’s huge progress. The programs that are already showing success can have immediate impact in hundreds or thousands of schools while others greatly enrich understanding of what needs to be done.

Abandoning i3, in contrast, would be wrong-headed, a sure path to bad failure. For a tiny slice of education funding, i3 tells us what works and what does not, so we can continually move toward effective strategies and policies. Without i3 and other research and development investments, education policy is just guesswork, and it gets no smarter over time.

No one can honestly argue that American education is as successful as it should be. Our kids, our economy, and our society deserve much better. Policies that seek a mixture of proven success and “good failure” will get us to solid advances in educational practice and policy. Abandoning or cutting programs like i3 is not just wrong. It’s wrong-headed.

Evidence at Risk in House Spending Bill

The House Appropriations Committee marked up its spending bill yesterday for fiscal year 2016 for the Departments of Labor, Health and Human Services, and Education. The spending levels in the bill put forward by the majority reduce Department of Education funding by $2.8 billion, mostly by eliminating approximately two dozen programs and severely cutting back several others, so it is no surprise that the bill passed through the Committee along party lines.

I can’t speak for all of the affected programs, but I do want to address what some of these proposed cuts could do. In a word, they would devastate the movement toward evidence as a basis for policy and practice in education.

First, the House bill would eliminate Investing in Innovation (i3). i3 has been the flagship for “tiered evidence” initiatives, providing large scale-up grants for programs that already have substantial evidence of effectiveness, smaller “validation” grants for programs with some evidence to build up their evidence base, and much smaller “development” grants for programs worth developing, piloting, and evaluating. At $120 million per year, i3 costs about 50¢ per taxpayer. What we get for 50¢ per year is a wide variety of promising programs at all grade levels and in all subjects, serving thousands of mostly high-poverty schools nationwide. We get evidence on the effectiveness of these programs, which tells us which are ready for broader use in our schools. The evidence from i3 informs the whole $630-billion public education enterprise, especially the $15-billion Title I program. That is, i3 costs 2¢ for every $100 spent on public education. Congresswoman Chellie Pingree of Maine offered an amendment to restore i3 and increase its funding level to $300 million, which is what the president had proposed. The process of offering the amendment gave members the opportunity to discuss the importance of i3, but in the end it was withdrawn (a not-uncommon procedural move when the amendment does not have an offset and/or is not expected to pass).

Second, the House proposal would significantly reduce funding for the Institute of Education Sciences (IES). IES commissions a wide variety of educational research, data collection, communications about evidence, and standard-setting for evidence, at a very modest cost. In this case, Congressman Mike Honda of California offered and withdrew an amendment to restore IES funding to its FY15 level of $574 million.

Finally, the main target of the proposed cuts was discretionary programs, which provide direct services to students. Districts, states, and other entities have to apply for these pots of money (as distinct from funds such as Title I or IDEA that are distributed by formula). Examples include Striving Readers (for struggling secondary readers); School Improvement Grants, or SIG (for low-performing schools); Preschool Development Grants; Mathematics and Science Partnerships; Ready to Learn (educational television); and several others.

These discretionary funding sources are the programs that could most easily be focused on evidence. One practical example is SIG, which recently added a category of approved expenditures consisting of whole-school reform programs with at least moderate evidence of effectiveness, which includes having been tested against a control group in at least one rigorous experiment. As another example, Title II SEED grants for professional development now require that programs adopted under SEED funding have at least moderate evidence of effectiveness. Congresswoman Rosa DeLauro offered an amendment to reinstate many of these programs, and it failed along party lines.

Adding evidence as a requirement or encouraging use of proven programs is much easier with discretionary programs than with formula grants. Yet if the House bill were to become law, there would be very few discretionary programs left.

The House proposal would greatly reduce national capacity to find out what works and what does not, and to scale up proven programs and practices. I very much hope our leaders in Congress will rethink this strategy and retain funding for the government programs most likely to help all of us learn — policy makers, educators, and kids alike.