How Much Difference Does an Education Program Make?

When you use Consumer Reports car repair ratings to choose a reliable car, you are doing something a lot like what evidence-based reform in education is proposing. You look at the evidence and take it into account, but it does not drive you to a particular choice. There are other factors you’d also consider. For example, Consumer Reports might point you to reliable cars you can’t afford, or ones that are too large or too small or too ugly for your purposes and tastes, or ones with dealerships that are too far away. In the same way, there are many factors that school staffs or educational leaders might consider beyond effect size.

An effect size (or a statistical significance test) is only a starting point for estimating the impact a program or set of programs might have. I’d propose the term “potential impact” to subsume the following factors that a principal or staff might consider beyond effect size or statistical significance in adopting a program to improve education outcomes:

  • Cost-effectiveness
  • Evidence from similar schools
  • Immediate and long-term payoffs
  • Sustainability
  • Breadth of impact
  • Low-hanging fruit
  • Comprehensiveness

Cost-Effectiveness
Economists’ favorite criterion of effectiveness is cost-effectiveness. Cost-effectiveness is simple in concept (how much gain did the program cause, at what cost?), but in fact there are two big elements of cost-effectiveness that are very difficult to determine:

1. Cost
2. Effectiveness

Cost should be easy, right? A school buys some service or technology and pays something for it. Well, it’s almost never so clear. When a school uses a given innovation, there are usually costs beyond the purchase price. For example, imagine that a school purchases digital devices for all students, loaded with all the software they will need. Easy, right? Wrong. Should you count the cost of the time the teachers spend in professional development? The cost of tech support? Insurance? Security costs? The additional electricity required? Space for storage? Additional loaner units to replace lost or broken units? The opportunity costs for whatever else the school might have chosen to do?

Here is an even more difficult example. Imagine a school starts a tutoring program for struggling readers using paraprofessionals as tutors. Easy, right? Wrong. There is the cost of the paraprofessionals’ time, of course, but what if the paraprofessionals were already on the school’s staff? If so, then a tutoring program may be very inexpensive, but if additional people must be hired as tutors, then tutoring is a far more expensive proposition. Also, if paraprofessionals already in the school are no longer doing what they used to do, might this diminish student outcomes?

Then there is the problem with outcomes. As I explained in a recent blog, the meaning of effect sizes depends on the nature of the studies that produced them, so comparing apples to apples may be difficult. A principal might look at effect sizes for two programs and decide they look very similar. Yet one effect size might come from large-scale randomized experiments, which tend to produce smaller (but more meaningful) effect sizes, while the other might come from less rigorous studies.

Nevertheless, issues of cost and effectiveness do need to be considered. Somehow.
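As a rough illustration of the arithmetic involved (all program names, effect sizes, and dollar figures below are invented, and real comparisons would also have to weigh study rigor), a cost-effectiveness comparison might be sketched like this:

```python
# Hypothetical cost-effectiveness comparison of two school programs.
# All names, effect sizes, and costs are illustrative only; the key point
# is that hidden costs must be added to the purchase price.

def total_cost_per_student(purchase, hidden_costs):
    """Purchase price plus often-overlooked per-student costs
    (professional development, tech support, replacements, etc.)."""
    return purchase + sum(hidden_costs.values())

programs = {
    "Tutoring (newly hired paraprofessionals)": {
        "effect_size": 0.30,
        "purchase": 0,
        "hidden": {"tutor_salaries": 1200, "training": 100},
    },
    "Digital devices": {
        "effect_size": 0.10,
        "purchase": 400,
        "hidden": {"prof_development": 80, "tech_support": 60,
                   "insurance": 20, "replacements": 40},
    },
}

for name, p in programs.items():
    cost = total_cost_per_student(p["purchase"], p["hidden"])
    # Effect size gained per $100 spent per student
    ratio = p["effect_size"] / cost * 100
    print(f"{name}: ${cost}/student, {ratio:.3f} effect size units per $100")
```

Even this toy version shows why the comparison is treacherous: the ranking flips depending on which hidden costs one chooses to count, and on whether the two effect sizes came from comparably rigorous studies.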

Evidence from Similar Schools
Clearly, a school staff would want to know that a given program has been successful in schools like theirs. For example, schools serving many English learners, or schools in rural areas, or schools in inner-city locations, might be particularly interested in data from similar schools. At a minimum, they should want to know that the developers have worked in schools like theirs, even if evidence exists only from less similar schools.

Immediate and Long-Term Payoffs
Another factor in program impacts is the likelihood that a program will solve a very serious problem that may ultimately have a big effect on individual students and perhaps save a lot of money over time. For example, a very expensive parent training program may make a big difference for students with serious behavior problems. If this program produces lasting effects (documented in the research), its high cost might be justified, especially if it might reduce the need for even more expensive interventions, such as special education placement, expulsion, or incarceration.

Sustainability
Programs that either produce lasting impacts or can be readily maintained over time are clearly preferable to those that have short-term impacts only. In education, long-term impacts are not typically measured, but sustainability can be judged from the cost, effort, and other elements required to maintain an intervention. Most programs get a lot cheaper after the first year, which makes sustaining them more feasible. This means that even programs with modest effect sizes could bring about major changes over time.

Breadth of Impact
Some educational interventions with modest effect sizes might be justified because they apply across entire schools and for many years. For example, effective coaching for principals might have a small effect overall, but if that effect is seen across thousands of students over a period of years, it might be more than worthwhile. Similarly, training teachers in methods that become part of their permanent repertoire, such as cooperative learning, teaching metacognitive skills, or classroom management, might affect hundreds of students per teacher over time.
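The back-of-the-envelope arithmetic here is simple (all figures below are hypothetical): even a modest effect compounds when it reaches new students year after year.

```python
# Hypothetical breadth-of-impact arithmetic for teacher training.
# All figures are invented for illustration only.

def students_affected(teachers, students_per_teacher_per_year, years):
    """Total students exposed to a trained teacher's improved practice,
    assuming each teacher sees a new group of students each year."""
    return teachers * students_per_teacher_per_year * years

# e.g., 30 trained teachers, 25 students each, over 10 years
reach = students_affected(30, 25, 10)
print(f"{reach} students reached")  # prints: 7500 students reached
```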

Low-Hanging Fruit
Some interventions may have either modest impacts on students in general, or strong outcomes for only a subset of students, but be so inexpensive or easy to adopt and implement that it would be foolish not to do so. One example might be making sure that disadvantaged students who need eyeglasses are assessed and given glasses. Not everyone needs glasses, but for those who do, this makes a big difference at low cost. Another example might be implementing a whole-school behavior management approach like Positive Behavioral Interventions and Supports (PBIS), a low-cost, proven approach any school can implement.

Comprehensiveness
Schools have to solve many quite different problems, and they usually do this by pulling various solutions off of various shelves. The problem is that this approach can be uncoordinated and inefficient. The different elements may not link up well with each other, may compete for the time and attention of the staff, and may cost a lot more than a unified, comprehensive solution that addresses many objectives in a planful way. A comprehensive approach is likely to have a coherent plan for professional development, materials, software, and assessment across all program elements. It is likely to have a plan for sustaining its effects over time and extending into additional parts of the school or additional schools.

Potential Impact
Potential impact is the sum of all the factors that make a given program or a coordinated set of programs effective in the short and long term, broad in its impact, focused on preventing serious problems, and cost-effective. There is no numerical standard for potential impact, but the concept is just intended to give educators making important choices for their kids a set of things to consider, beyond effect size and statistical significance alone.

Sorry. I wish this were simple. But kids are complex, organizations are complex, and systems are complex. It’s always a good idea for education leaders to start with the evidence but then think through how programs can be used as tools to transform their particular schools.


Brown v. Board of Education at 62

On Tuesday, Brown v. Board of Education turned 62. In 1979, when the Brown decision was celebrating its 25th anniversary, I wrote an article about the Social Scientists’ Statement submitted as part of Brown v. Board of Education. Brown v. Board of Education, of course, ordered the desegregation of America’s schools “with all deliberate speed.” Deliberate indeed. As reported in a recent Government Accountability Office (GAO) study, segregation of African-American and Hispanic students has increased, not decreased, over the past 15 years. Worse, schools with concentrations of minority students suffer from inadequate funding and resources, and they have difficulty attracting and retaining qualified staff.

The problem is not new, but it has gone underground. After the wars over busing in the 1970s and ‘80s, concern for school desegregation has been replaced with vague commitments to improve the schools attended by minority students.

The Social Scientists’ Statement was evidence submitted to the Supreme Court noting that desegregation was going to work a lot better at building positive intergroup relations and respect if schools adopted teaching strategies that emphasized cooperative learning, which would give students opportunities to get to know each other as individuals. I wrote my article on this topic in the Minneapolis Public Library, where I happened to have time on my hands. I wrote at a table near a window. Outside the window was a playground in which little African-American and White children were gleefully playing. It was impossible to imagine that 37 years in the future, when those little children would have children of their own, the problems I was writing about would still exist, and would be getting worse.

To be fair, race relations are far better now than they were in 1979, and by many measures minority groups have advanced economically, educationally, and socially. Yet segregation continues to rise, and inequalities continue to grow.

The solution is straightforward, and attainable: Dramatically improve schools and expand economic opportunity to the point where there is no stigma to minority status. We have a lot of evidence about how to improve the school performance of all students. If we invested in these strategies, and in equally proven policies for expanding job opportunities, poverty and inequality would diminish, and segregation would soon diminish as well. It would take a generation or two, but there is no question that it could be done.

Could someone explain to me why we don’t get started now? What problem for the social stability and basic fairness of our nation could be more important?

What Education Policy Can Learn from Composting

As anyone who reads this blog knows, I’ve been very excited about the potential of the evidence definitions in the recently passed Every Student Succeeds Act (ESSA). But I have also been anxious about whether they will really work. Will there be enough clarity and information on proven programs and practices provided to help educational leaders come to see evidence as helpful, rather than just another mandate from Washington?

Not long ago, I went to the American Educational Research Association (AERA) meetings in Washington. While there, I saw something that exactly illustrates how the ESSA standards could succeed or fail. It had nothing to do with educational research, however. It had to do with composting.

Here’s what I saw. On the second floor of the Washington Convention Center is a food court with tables. When you buy food, you put it on trays apparently made of recycled cardboard, and then when you throw away your trash, you are faced with receptacles that offer three choices: “compost,” “bottles/cans,” or “general waste.” I watched as person after person looked at the receptacles. No one had any problem with bottles/cans, but just about everyone looked back and forth several times between “compost” and “general waste,” and finally gave up and put everything in “general waste.” The trash piled up in that bin, while the “compost” bin was mostly empty.

However, there were a handful of receptacles that were different. They had the same three choices, but under “compost,” there was a list. The list included the recycled tray, a big part of the potential compost. At these receptacles, people’s behavior was clearly different. They quickly read the list, and put almost all of their trash other than bottles and cans in the “compost” bin, piling up that bin while “general waste” was nearly empty.

What the trash receptacles illustrated to me was the importance of specificity. Most of the people in the food court wanted to do the right thing, and to put as much as possible into compost, but they did not think they knew enough to decide what was appropriate for composting and what was not. They took the time to think about it, but all too often ended up regretfully using the “general waste” bin in frustration.

In the ESSA standards, educational leaders are asked to use programs that meet the highest standards of evidence. Programs with “strong” evidence are those with at least one randomized study showing significant positive effects. Those with “moderate” evidence are those with at least one quasi-experiment with significant positive effects, and “promising” means that a program has at least one correlational study with a significant positive effect. Yet there is no list of specific programs that meet (or do not meet) these standards. Joy Lesnick, the Acting Commissioner of the National Center for Education Evaluation and Regional Assistance (NCEE) within the Institute of Education Sciences (IES), recently wrote that IES has no intention of aligning the What Works Clearinghouse (WWC) with the ESSA standards or otherwise creating a list of proven programs that meet them.
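The tier logic described above can be sketched as a simple classification. This is a sketch only: the actual statutory definitions add requirements (well-designed and well-implemented studies, statistical controls for “promising” evidence, and so on) that this simplification omits.

```python
# Minimal sketch of the ESSA evidence-tier logic described above.
# The statute's full definitions include additional requirements
# that are deliberately omitted here.

def essa_tier(study_design: str, significant_positive_effect: bool) -> str:
    """Map a program's strongest supporting study to an ESSA evidence tier."""
    if not significant_positive_effect:
        return "does not meet ESSA evidence standards"
    tiers = {
        "randomized": "strong",
        "quasi-experimental": "moderate",
        "correlational": "promising",
    }
    return tiers.get(study_design, "does not meet ESSA evidence standards")

print(essa_tier("randomized", True))     # prints: strong
print(essa_tier("correlational", True))  # prints: promising
```

The point of the sketch is that the classification rule itself is easy; what is missing, as the composting bins illustrate, is a trusted list telling educators which specific programs fall into which bin.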

I have no doubt that principals, superintendents, and other educational leaders want to use programs with the strongest possible evidence of effectiveness, and the ESSA standards give them new reasons to want to find out what the evidence says. Yet without specificity about particular programs and practices meeting ESSA standards, educational leaders will be like the dedicated recyclers at the Washington Convention Center. They will not want to spend days in the library figuring out what works, and most will just go back to making decisions based on what publishers’ sales reps tell them the evidence says. If this happens, nothing will change.

The ESSA standards are not self-activating. They will make a difference only if educators embrace them, and then select programs and practices that truly have the highest levels of evidence as indicated by independent, trusted, and capable intermediaries. Otherwise, these standards, like “scientifically based research” in No Child Left Behind, will be honored more in the breach than in the observance, and children will be no better off. It will take quick and intensive action, likely outside of government, to identify and publicize guides to programs meeting ESSA standards, but the alternative is ESSA standards in the “general waste” bin.