Do School Districts Really Have Difficulty Meeting ESSA Evidence Standards?

The Center on Education Policy (CEP) recently released a report on how school districts are responding to the Every Student Succeeds Act (ESSA) requirement that schools seeking school improvement grants select programs that meet ESSA’s strong, moderate, or promising standards of evidence. Education Week ran a story on the CEP report.

The report noted that many states, districts, and schools are taking the evidence requirements seriously, and are looking at websites and consulting with researchers to help them identify programs that meet the standards. This is all to the good.

However, the report also notes continuing problems districts and schools are having finding out “what works.” Two particular problems were cited. One was that districts and schools were not equipped to review research to find out what works. The other was that rural districts and schools found few programs proven effective in rural schools.

I find these concerns astounding. The same concerns were expressed when ESSA was first passed, in 2015. But that was almost four years ago. Since 2015, the What Works Clearinghouse has added information to help schools identify programs that meet the top two ESSA evidence categories, strong and moderate. Our own Evidence for ESSA, launched in February, 2017, has up-to-date information on virtually all PK-12 reading and math programs currently in dissemination. Among hundreds of programs examined, 113 meet ESSA standards for strong, moderate, or promising evidence of effectiveness. WWC, Evidence for ESSA, and other sources are available online at no cost. The contents of the entire Evidence for ESSA website were imported into Ohio’s own website on this topic, and dozens of states, perhaps all of them, have informed their districts and schools about these sources.

The idea that districts and schools could not find information on proven programs if they wanted to do so is difficult to believe, especially among schools eligible for school improvement grants. Such schools, and the districts in which they are located, write a lot of grant proposals for federal and state funding. The application forms for school improvement grants always explain the evidence requirements, because that is the law. Someone in every state involved with federal funding knows about the WWC and Evidence for ESSA websites. More than 90,000 unique users have used Evidence for ESSA, and more than 800 more sign on each week.

As to rural schools, it is true that many studies of educational programs have taken place in urban areas. However, 47 of the 113 programs qualified by Evidence for ESSA were validated in at least one rural study, or a study including a large enough rural sample to enable researchers to separately report program impacts for rural students. Also, almost all widely disseminated programs have been used in many rural schools. So rural districts and schools that care about evidence can find programs that have been evaluated in rural locations, or at least that were evaluated in urban or suburban schools but widely disseminated in rural schools.

Also, it is important to note that if a program was successfully evaluated only in urban or suburban schools, the program still meets the ESSA evidence standards. If no studies of a given outcome were done in rural locations, a rural school in need of better outcomes would, in effect, be choosing between a program proven to work somewhere, and probably already in use in rural schools, and a program not proven to work anywhere. Every school and district has to make the best choices for its kids, but if I were a rural superintendent or principal, I’d read up on proven programs, and then go visit some nearby rural schools using those programs. Wouldn’t you?

I have no reason to suspect that the CEP survey is incorrect. There are many indications that district and school leaders often do feel that the ESSA evidence rules are too difficult to meet. So what is really going on?

My guess is that there are many district and school leaders who do not want to know about evidence on proven programs. For example, they may have longstanding, positive relationships with representatives of publishers or software developers, or they may be comfortable and happy with the materials and services they are already using, evidence-proven or not. Publishers and software developers that lack evidence of effectiveness that would pass muster with WWC or Evidence for ESSA may push hard on state and district officials, put forward dubious claims for evidence (such as studies with no control groups), and do their best to get by in a system that increasingly demands evidence that they lack. In my experience, district and state officials often complain about having inadequate staff to review evidence of effectiveness, but their concern may be less about finding out what works than about defending themselves against publishers, software developers, or current district or school users of programs, who maintain that they have been unfairly rated by WWC, Evidence for ESSA, or other reviews. State and district leaders who stand up to this pressure may have to spend a lot of time reviewing evidence or hearing arguments.

On the plus side, at the same time that publishers and software producers may be seeking recognition for their current products, many are also sponsoring evaluations of the products they feel are most likely to perform well in rigorous evaluations. Some may be creating new programs that resemble programs that have met evidence standards. If the federal ESSA law continues to demand evidence for certain federal funding purposes, or even to expand this requirement to additional parts of federal grant-making, then over time the ESSA law will have its desired effect, rewarding the creation and evaluation of programs that do meet standards by making it easier to disseminate such programs. The difficulties the evidence movement is experiencing are likely to diminish over time as more proven programs appear, and as federal, state, district, and school leaders get comfortable with evidence.

Evidence-based reform was always going to be difficult, because of the amount of change it entails and the stakes involved. But sooner or later, it is the right thing to do, and leaders who insist on evidence will see increasing levels of learning among their students, at minimal cost beyond what they already spend on untested or ineffective approaches. Medicine went through a similar transition in 1962, when the U.S. Congress first required that medicines be rigorously evaluated for effectiveness and safety. At first, many leaders in the medical profession resisted the changes, but after a while, they came to insist on them. The key is political leadership willing to support the evidence requirement strongly and permanently, so that educators and vendors alike will see that the best way forward is to embrace evidence and make it work for kids.

Photo courtesy of Allison Shelley/The Verbatim Agency for American Education: Images of Teachers and Students in Action

This blog was developed with support from the Laura and John Arnold Foundation. The views expressed here do not necessarily reflect those of the Foundation.

The Farmer and the Moon Rocks: What Did the Moon Landing Do For Him?

Many, many years ago, during the summer after my freshman year in college, I hitchhiked from London to Iran.  This was the summer of 1969, so Apollo 11 was also traveling.   I saw television footage of the moon landing in Heraklion, Crete, where a television store switched on all of its sets and turned them toward the sidewalk.  A large crowd watched the whole thing.  This was one of the few times I recall when it was really cool to be an American abroad.

After leaving Greece, I went on to Turkey, and then Iran.  In Teheran, I got hold of an English-language newspaper.  It told an interesting story.  In rural Iran, many people believed that the moon was a goddess.  Obviously, a spaceship cannot land on a goddess, so many people concluded that the moon landing must be a hoax.

A reporter from the newspaper interviewed a number of people about the moon landing.  Some were adamant that the landing could not have happened.  However, one farmer was more pragmatic.  He asked the reporter, “I hear the astronauts brought back moon rocks.  Is that right?”

“That’s what they say!” replied the reporter.

“I am fixing my roof, and I could sure use a few of those moon rocks.  Do you think they might give me some?”

The moon rock story illustrates a daunting problem in the dissemination of educational research. Researchers do high-quality research on topics of great importance to the practice of education. They publish this research in top journals, and get promotions and awards for it, but in most cases, their research does not arouse even the slightest bit of interest among the educators for whom it was intended.

The problem relates to the farmer repairing his roof.  He had a real problem to solve, and he needed help with it.  A reporter comes and tells him about the moon landing. The farmer does not think, “How wonderful!  What a great day for science and discovery and the future of mankind!”  Instead, he thinks, “What does this have to do with me?”  Thinking back on the event, I sometimes wonder if he really expected any moon rocks, or if he was just sarcastically saying, “I don’t care.”

Educators care deeply about their students, and they will do anything they can to help them succeed.  But if they hear about research that does not relate to their children, or at least to children like theirs, they are unlikely to care very much.  Even if the research is directly applicable to their students, they are likely to reason, perhaps from long experience, that they will never get access to this research, because it costs money or takes time or upsets established routines or is opposed by powerful groups or whatever.  The result is status quo as far as the eye can see, or implementation of small changes that are currently popular but unsupported by evidence of effectiveness.  Ultimately, the result is cynicism about all research.

Part of the problem is that education is effectively a government monopoly, so entrepreneurship and responsible innovation are difficult to start or maintain.  However, the fact that education is a government monopoly can also be made into a positive, if government leaders are willing to encourage and support evidence-based reform.

Imagine that government decided to provide incentive funding to schools to help them adopt programs that meet a high standard of evidence.  This has actually happened under the ESSA law, but only in a very narrow slice of schools, the very low-achieving schools that qualify for school improvement.  Imagine that the government provided a lot more support to schools to help them learn about, adopt, and effectively implement proven programs, and then gradually expanded the categories of schools that could qualify for this funding.

Going back to the farmer and the moon rocks, such a policy would forge a link between exciting research on promising innovations and the real world of practice.  It could cause educators to pay much closer attention to research on practical programs of relevance to them, and to learn how to tell the difference between valid and biased research.  It could help educators become sophisticated and knowledgeable consumers of evidence and of programs themselves.

One of the best examples of the transformation such policies could bring about is agriculture.  Research has a long history in agriculture, and from colonial times, government has encouraged and incentivized farmers to pay attention to evidence about new practices, new seeds, new breeds of animals, and so on.  By the late 19th century, the U.S. Department of Agriculture was sponsoring research, distributing information designed to help farmers be more productive, and much more.  Today, research in agriculture is a huge enterprise, constantly making important discoveries that improve productivity and reduce costs.  As a result, world agriculture, especially American agriculture, is able to support far larger populations at far lower costs than anyone ever thought possible.  The Iranian farmer talking about the moon rocks could not see how advances in science could possibly benefit him personally.  Today, however, in every developed economy, farmers have a clear understanding of the connection between advances in science and their own success.  Everyone knows that agriculture can have bad as well as good effects, as when new practices lead to pollution, but when governments decide to solve those problems, they turn to science. Science is not inherently good or bad, but if it is powerful, then democracies can direct it to do what is best for people.

Agriculture has made dramatic advances over the past hundred years, and continues to make rapid progress by linking science to practice.  In education, we are just starting to make the link between evidence and practice.  Isn’t it time to learn from the experiences of medicine, technology, and agriculture, among many other evidence-based fields, to achieve more rapid progress in educational practice and outcomes?

Send Us Your Evaluations!

In last week’s blog, I wrote about reasons that many educational leaders are wary of the ESSA evidence standards, and the evidence-based reform movement more broadly. Chief among these concerns was a complaint that few educational leaders had the training in education research methods to evaluate the validity of educational evaluations. My response to this was to note that it should not be necessary for educational leaders to read and assess individual evaluations of educational programs, because free, easy-to-interpret review websites, such as the What Works Clearinghouse and Evidence for ESSA, already do such reviews. Our Evidence for ESSA website (www.evidenceforessa.org) lists reading and math programs available for use anywhere in the U.S., and we are constantly on the lookout for any we might have missed. If we have done our job well, you should be able to evaluate the evidence base for any program, in perhaps five minutes.

Other evidence-based fields rely on evidence reviews. Why not education? Your physician may or may not know about medical research, but most rely on websites that summarize the evidence. Farmers may be outstanding in their fields, but they rely on evidence summaries. When you want to know about the safety and reliability of cars you might buy, you consult Consumer Reports. Do you understand exactly how they get their ratings? Neither do I, but I trust their expertise. Why should this not be the same for educational programs?

At Evidence for ESSA, we are aiming to provide information on every program available to you, if you are a school or district leader. At the moment, we cover reading and mathematics, grades pre-k to 12. We want to be sure that if a sales rep or other disseminator offers you a program, you can look it up on Evidence for ESSA and it will be there. If there are no studies of the program that meet our standards, we will say so. If there are qualifying studies that either do or do not have evidence of positive outcomes that meet ESSA evidence standards, we will say so. On our website, there is a white box on the homepage. If you type in the name of any reading or math program, the website should show you what we have been able to find out.

What we do not want to happen is that you type in a program title and find nothing. On our website, “nothing” has no useful meaning. We have worked hard to find every program anyone has heard of, and we have found hundreds. But if you know of any reading or math program that does not appear when you type in its name, please tell us. If you have studies of that program that might meet our inclusion criteria, please send us the studies or citations to them. We know that there are always additional programs entering use, and additional research on existing programs.

Why is this so important to us? The answer is simple: Evidence for ESSA exists because we believe it is essential for the progress of evidence-based reform for educators and policy makers to be confident that they can easily find the evidence on any program, not just the most widely used. Our vision is that someday, it will be routine for educators thinking of adopting educational programs to quickly consult Evidence for ESSA (or other reviews) to find out what has been proven to work, and what has not. I heard about a superintendent who, before meeting with any sales rep, asked them to show her the evidence for the effectiveness of their program on Evidence for ESSA or the What Works Clearinghouse. If they had it, “Come on in,” she’d say. If not, “Maybe later.”

Only when most superintendents and other school officials do this will program publishers and other providers know that it is worth their while to have high-quality evaluations done of each of their programs. Further, they will find it worthwhile to invest in the development of programs likely to work in rigorous evaluations, to provide enough quality professional development to give their programs a chance to succeed, and to insist that schools that adopt their proven programs incorporate the methods, materials, and professional development that their own research has told them are needed for success. Insisting on high-quality PD, for example, adds cost to a program, and providers may worry that demanding sufficient PD will price them out of the market. But if all programs are judged on their proven outcomes, they all will require adequate PD, to be sure that the programs will work when evaluated. That is how evidence will transform educational practice and outcomes.

So our attempt to find and fairly evaluate every program in existence is not due to our being nerds or obsessive-compulsive neurotics (though these may be true, too). Rather, thorough, rigorous review of the whole body of evidence in every subject and grade level, and for attendance, social-emotional learning, and other non-academic outcomes, is part of a plan.

You can help us on this part of our plan. Tell us about anything we have missed, or any mistakes we have made. You will be making an important contribution to the progress of our profession, and to the success of all children.

Send us your evaluations!
Photo credit: George Grantham Bain Collection, Library of Congress [Public domain]

Why Do Some Educators Push Back Against Evidence?

In December 2015, the U.S. Congress passed the Every Student Succeeds Act, or ESSA. Among many other provisions, ESSA defined levels of evidence supporting educational programs: strong (at least one randomized experiment with positive outcomes), moderate (at least one quasi-experimental study with positive outcomes), and promising (at least one correlational study with positive outcomes). For various forms of federal funding, schools are required (in school improvement) or encouraged (in seven other funding streams) to use programs falling into one of these top three categories. There is also a fourth category, “demonstrates a rationale,” but this one has few practical consequences.

Three and a half years later, the ESSA evidence standards are increasing interest in evidence of effectiveness for educational programs, especially among schools applying for school improvement funding and in state departments of education, which are responsible for managing the school improvement grant process. All of this is to the good, in my view.

On the other hand, evidence is not yet transforming educational practice. Even in portions of ESSA that encourage or require use of proven programs among schools seeking federal funding, schools and districts often try to find ways around the evidence requirements rather than truly embracing them. Even when schools do say they used evidence in their proposals, they may have just accepted assurances from publishers or developers stating that their programs meet ESSA standards, even when this is clearly not so.

Why are these children in India pushing back on a car?  And why do many educators in our country push back on evidence?

Educators care a great deal about their children’s achievement, and they work hard to ensure their success. Implementing proven, effective programs does not guarantee success, but it greatly increases the chances. So why has evidence of effectiveness played such a limited role in program selection and implementation, even when ESSA, the national education law, defines evidence and requires use of proven programs under certain circumstances?

The Center on Education Policy Report

Not long ago, the Center on Education Policy (CEP) at George Washington University published a report on telephone interviews with state leaders in seven states. The interviews focused on problems states and districts were having with implementation of the ESSA evidence standards. Six themes emerged:

  1. Educational leaders are not comfortable with educational research methods.
  2. State leaders feel overwhelmed serving large numbers of schools qualifying for school improvement.
  3. Districts have to seriously re-evaluate longstanding relationships with vendors of education products.
  4. State and district staff are confused about the prohibition on using Title I school improvement funds on “Tier 4” programs (ones that demonstrate a rationale, but have not been successfully evaluated in a rigorous study).
  5. Some state officials complained that the U.S. Department of Education had not been sufficiently helpful with implementation of ESSA evidence standards.
  6. State leaders had suggestions to make education research more accessible to educators.

What is the Reality?

I’m sure that the concerns expressed by the state and district leaders in the CEP report are sincerely felt. But most of them raise issues that have already been solved at the federal, state, and/or district levels. If these concerns are as widespread as they appear to be, then we have serious problems of communication.

  1. The first theme in the CEP report is one I hear all the time. I find it astonishing, in light of the reality.

No educator needs to be a research expert to find evidence of effectiveness for educational programs. The federal What Works Clearinghouse (https://ies.ed.gov/ncee/wwc/) and our Evidence for ESSA (www.evidenceforessa.org) provide free information on the outcomes of programs, at least in reading and mathematics, that is easy to understand and interpret. Evidence for ESSA provides information on programs that do meet ESSA standards as well as those that do not. We are constantly scouring the literature for studies of replicable programs, and when asked, we review entire state and district lists of adopted programs and textbooks, at no cost. The What Works Clearinghouse is not as up-to-date and has little information on programs lacking positive findings, but it also provides easily interpreted information on what works in education.

In fact, few educational leaders anywhere are evaluating the effectiveness of individual programs by reading research reports one at a time. The What Works Clearinghouse and Evidence for ESSA employ experts who know how to find and evaluate outcomes of valid research and to describe the findings clearly. Why would every state and district re-do this job for themselves? It would be like having every state do its own version of Consumer Reports, or its own reviews of medical treatments. It just makes no sense. In fact, at least in the case of Evidence for ESSA, we know that more than 80,000 unique readers have used Evidence for ESSA since it launched in 2017. I’m sure even larger numbers have used the What Works Clearinghouse and other reviews. The State of Ohio took our entire Evidence for ESSA website and put it on its own state servers with some other information. Several other states have strongly promoted the site. The bottom line is that educational leaders do not have to be research mavens to know what works, and tens of thousands of them know where to find fair and useful information.

  2. State leaders are overwhelmed. I’m sure this is true, but most state departments of education have long been understaffed. This problem is not unique to ESSA.
  3. Districts have to seriously re-evaluate longstanding relationships with vendors. I suspect that this concern is at the core of the problem on evidence. The fact is that most commercial programs do not have adequate evidence of effectiveness. Either they have no qualifying studies (by far the largest number), or they do have qualifying evidence that is not significantly positive. A vendor with programs that do not meet ESSA standards is not going to be a big fan of evidence, or ESSA. These are often powerful organizations with deep personal relationships with state and district leaders. When state officials adhere to a strict definition of evidence, defined in ESSA, local vendors push back hard. Understaffed state departments are poorly placed to fight with vendors and their friends in district offices, so they may be forced to accept weak or no evidence.
  4. Confusions about Tier 4 evidence. ESSA is clear that to receive certain federal funds schools must use programs with evidence in Tiers 1, 2, or 3, but not 4. The reality is that definitions of Tier 4 are so weak that any program on Earth can meet this standard. What program anywhere does not have a rationale? The problem is that districts, states, and vendors have used confusion about Tier 4 to justify any program they wish. Some states are more sophisticated than others and do not allow this, but the very existence of Tier 4 in ESSA language creates a loophole that any clever sales rep or educator can use, or at least try to get away with.
  5. The U.S. Department of Education is not helpful enough. In reality, USDoE is understaffed and overwhelmed on many fronts. In any case, ESSA puts a lot of emphasis on state autonomy, so the feds feel unwelcome in performing oversight.

The Future of Evidence in Education

Despite the serious problems in implementation of ESSA, I still think it is a giant step forward. Every successful field, such as medicine, agriculture, and technology, has started its own evidence revolution fighting entrenched interests and anxious stakeholders. As late as the 1920s, surgeons refused to wash their hands before operations, despite substantial evidence going back to the 1800s that handwashing was essential. Evidence eventually triumphs, though it often takes many years. Education is just at the beginning of its evidence revolution, and it will take many years to prevail. But I am unaware of any field that embraced evidence, only to retreat in the face of opposition. Evidence eventually prevails because it is focused on improving outcomes for people, and people vote. Sooner or later, evidence will transform the practice of education, as it has in so many other fields.

Photo credit: Roger Price from Hong Kong, Hong Kong [CC BY 2.0 (https://creativecommons.org/licenses/by/2.0)]

Moneyball for Education

When I was a kid, growing up in the Maryland suburbs of Washington, DC, everyone I knew rooted for the hapless Washington Senators, one of the worst baseball teams ever. At that time, however, the Baltimore Orioles were one of the best teams in baseball, and every once in a while a classmate would snap. He (always “he”) would decide to become an Orioles fan. This would cause him to be shamed and ostracized for the rest of his life by all true Senators fans.

I’ve now lived in Baltimore for most of my life. I wonder if I came here in part because of my youthful impression of Baltimore as a winning franchise?

Skipping forward in time to now, I recently saw in the New York Times an article about the collapse of the Baltimore Orioles. In 2018, they had the worst record of any team in history. Worse than even the Washington Senators ever were. Why did this happen? According to the NYT, the Orioles are one of the last teams to embrace analytics, which means using evidence to decide which players to recruit or drop, to put on the field or on the bench. Some teams have analytics departments of 15. The Orioles? Zero, although they have just started one.

It’s not as though the benefits of analytics are a secret. A 2003 book by Michael Lewis, Moneyball, explained how the underfunded Oakland A’s used analytics to turn themselves around. A hugely popular 2011 movie told the same story.

In case anyone missed the obvious linkage of analytics in baseball to analytics in education, Results for America (RfA), a group that promotes the use of evidence in government social programs, issued a 2015 book called, you guessed it, Moneyball for Government (Nussle & Orszag, 2015). This Moneyball focused on success stories and ideas from key thinkers and practitioners in government and education. RfA was instrumental in encouraging the U.S. Congress to include in ESSA definitions of strong, moderate, and promising evidence of effectiveness, and to specify a few areas of federal funding that require or incentivize use of proven programs.

The ESSA evidence standards are a giant leap forward in supporting the use of evidence in education. Yet, like the Baltimore Orioles, the once-admired U.S. education system has been less than swept away by the idea that using proven programs and practices could improve outcomes for children. Yes, the situation is better than it was, but things are going very slowly. I’m worried that because of this, the whole evidence movement in education will someday be dismissed: “Evidence? Yeah, we tried that. Didn’t work.”

There are still good reasons for hope. The amount of high-quality evidence continues to grow at an unprecedented pace. The ESSA evidence standards have at least encouraged federal, state, and local leaders to pay some attention to evidence, though moving to action based on this evidence is a big lift.

Perhaps I’m just impatient. It took the Baltimore Orioles a book, a movie, and 16 years to arrive at the conclusion that maybe, just maybe, it was time to use evidence, as winning teams have been doing for a long time. Education is much bigger, and its survival does not depend on its success (as a baseball team’s does). Education will require visionary leadership to embrace the use of evidence. But I am confident that when it does, we will be overwhelmed by visits from educators from Finland, Singapore, China, and other countries that currently clobber us in international comparisons. They’ll want to know how the U.S. education system became the best in the world. Perhaps we’ll have to write a book and make a movie to explain it all.  I’d suggest we call it . . . “Learnball.”

References

Nussle, J., & Orszag, P. (2015). Moneyball for Government (2nd Ed.). Washington, DC: Disruption Books.

Photo credit: Keith Allison [CC BY-SA 2.0 (https://creativecommons.org/licenses/by-sa/2.0)]

Replication

The holy grail of science is replication. If a finding cannot be repeated, then it did not happen in the first place. There is a reason that the humor journal in the hard sciences is called the Journal of Irreproducible Results. For scientists, results that are irreproducible are inherently laughable, therefore funny. In many hard science experiments, replication is pretty much guaranteed. If you heat an iron bar, it gets longer. If you cross two parents who each carry one copy of the same recessive gene, about one quarter of their progeny will express the recessive trait (think blue eyes).

In educational research, we care about replication just as much as our colleagues in the lab coats across campus. However, when we’re talking about evaluating instructional programs and practices, replication is a lot harder, because students and schools differ. Positive outcomes obtained in one experiment may or may not replicate in a second trial. Sometimes this is because the first experiment had features known to contribute to bias. Small sample sizes, brief study durations, extraordinary amounts of resources or expert time to help the experimental schools or classes, use of measures made by the developers or researchers or otherwise overaligned with the experimental group (but not the control group), and use of matched rather than randomized assignment to conditions can all contribute to successful-appearing outcomes in a first experiment. Second or third experiments are more likely to be larger, longer, and more stringent than the first study, and therefore may not replicate its findings. Even when the first study has none of these problems, its findings may not replicate because of differences in the samples of schools, teachers, or students, or for other, perhaps unknowable, reasons.

A change in the conditions of education may also cause a failure to replicate. Our Success for All whole-school reform model has been found to be effective many times, mostly by third party evaluators. However, Success for All has always specified a full-time facilitator and at least one tutor for each school. An MDRC i3 evaluation happened to fall in the middle of the recession, and schools, which were struggling to afford classroom teachers, could not afford facilitators or tutors. The results were still positive on some measures, especially for low achievers, but the effect sizes were less than half of what others had found in many studies. Stuff happens.

Replication has taken on more importance recently because the ESSA evidence standards only require a single positive study. To meet the strong, moderate, or promising standards, programs must have at least one “well-designed and well-implemented” study using randomized (strong), matched (moderate), or correlational (promising) designs and finding significantly positive outcomes. Based on the “well-designed and well-implemented” language, our Evidence for ESSA website requires features of experiments similar to those also required by the What Works Clearinghouse (WWC). These requirements make it difficult to be approved, but they remove many of the experimental design features that typically cause first studies to greatly overstate program impacts: small size, brief durations, overinvolved experimenters, and developer-made measures. They put (less rigorous) matched and correlational studies in lower categories. So one study that meets ESSA or Evidence for ESSA requirements is at least likely to be a very good study. But many researchers have expressed discomfort with the idea that a single study could qualify a program for one of the top ESSA categories, especially if (as sometimes happens) there is one study with positive outcomes and many with zero or at least nonsignificant outcomes.
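For readers who like to see a rule written out, here is a minimal sketch of the single-study logic described above. It is only an illustration, not the procedure used by ESSA, the WWC, or Evidence for ESSA. The function name and the dictionary keys (`design`, `well_designed`, `significant_positive`) are hypothetical, invented for this example.

```python
from typing import Optional

# Mapping described in the text: the design of a qualifying study
# determines the highest ESSA category it can support.
TIER_BY_DESIGN = {
    "randomized": "strong",
    "matched": "moderate",
    "correlational": "promising",
}
TIER_ORDER = ["strong", "moderate", "promising"]


def essa_tier(studies: list) -> Optional[str]:
    """Return the best tier supported by at least one qualifying study.

    Each study is a dict with hypothetical keys:
      'design'               -> 'randomized', 'matched', or 'correlational'
      'well_designed'        -> True if well-designed and well-implemented
      'significant_positive' -> True if outcomes were significantly positive
    """
    supported = {
        TIER_BY_DESIGN[s["design"]]
        for s in studies
        if s["design"] in TIER_BY_DESIGN
        and s["well_designed"]
        and s["significant_positive"]
    }
    # A single qualifying study is enough; null studies are ignored.
    for tier in TIER_ORDER:
        if tier in supported:
            return tier
    return None


# One positive randomized study among several null results still qualifies.
program = [
    {"design": "randomized", "well_designed": True, "significant_positive": True},
    {"design": "randomized", "well_designed": True, "significant_positive": False},
    {"design": "randomized", "well_designed": True, "significant_positive": False},
]
print(essa_tier(program))
```

Note that the sketch never looks at the studies with nonsignificant outcomes, which is exactly the feature of the single-study rule that makes many researchers uncomfortable.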

The pragmatic problem is that if ESSA had required even two studies showing positive outcomes, this would have wiped out a very large proportion of current programs. If research continues to identify effective programs, it should only be a matter of time before ESSA (or its successors) requires more than one study with positive outcomes.

However, in the current circumstance, there is a way researchers and educators might at least estimate the replicability of given programs when they have only a single study with significant positive outcomes. This would involve looking at the findings for entire genres of programs. The logic here is that if a program has only one ESSA-qualifying study, but it closely resembles other programs that also have positive outcomes, that program should be taken a lot more seriously than a program that obtained a positive outcome that differs considerably from outcomes of very similar programs.

As one example, there is much evidence from many studies by many researchers indicating positive effects of one-to-one and one-to-small group tutoring, in reading and mathematics. If a tutoring program has only one study, but this one study has significant positive findings, I’d say thumbs up. I’d say the same about cooperative learning approaches, classroom management strategies using behavioral principles, and many others, where a whole category of programs has had positive outcomes.

In contrast, if a program has a single positive outcome and there are few if any similar approaches that obtained positive outcomes, I’d be much more cautious. An example might be textbooks in mathematics, which rarely make any difference because control groups are also likely to be using textbooks, and textbooks considerably resemble each other. In our recent elementary mathematics review (Pellegrini, Lake, Inns, & Slavin, 2018), only one textbook program available in the U.S. had positive outcomes (out of 16 studies). As another example, there have been several large randomized evaluations of the use of interim assessments. Only one of them found positive outcomes. I’d be very cautious about putting much faith in benchmark assessments based on this single anomalous finding.

Finding results from similar studies is made easier by the reviews we make available at www.bestevidence.org. These are reviews of research organized by categories of programs. Looking for findings from similar programs won’t help with the ESSA law, which often determines its ratings based on the findings of a single study, regardless of other findings on the same program or similar programs. However, for educators and researchers who really want to find out what works, I think checking similar programs is not quite as good as finding direct replication of positive findings on the same programs, but it is perhaps, as we like to say, close enough for social science.

Government Plays an Essential Role in Diffusion of Innovations

Lately I’ve been hearing a lot of concern in reform circles about how externally derived evidence can truly change school practices and improve outcomes. Surveys of principals, for example, routinely find that principals rarely consult research in making key decisions, including decisions about adopting materials, software, or professional development intended to improve student outcomes. Instead, principals rely on their friends in similar schools serving similar students. In the whole process, research rarely comes up, and if it does, it is often generic research on how children learn rather than high-quality evaluations of specific programs they might adopt.

Principals and other educational leaders have long been used to making decisions without consulting research. It would be difficult to expect otherwise, because of three conditions that have prevailed roughly from the beginning of time to very recently: a) There was little research of practical value on practical programs; b) The research that did exist was of uncertain quality, and school leaders did not have the time or training to determine studies’ validity; c) There were no resources provided to schools to help them adopt proven programs, so doing so required that they spend their own scarce resources.

Under these conditions, it made sense for principals to ask around among their friends before selecting programs or practices. When no one knows anything about a program’s effectiveness, why not ask your friends, who at least (presumably) have your best interests at heart and know your context? Since conditions a, b, and c have defined the context for evidence use nearly up to the present, it is not surprising that school leaders have built a culture of distrust for anyone outside of their own circle when it comes to choosing programs.

However, all three of conditions a, b, and c have changed substantially in recent years, and they are continuing to change in a positive direction at a rapid rate:

a) High-quality research on practical programs for elementary and secondary schools is growing at an extraordinary rate. As shown in Figure 1, the number of rigorous randomized or quasi-experimental studies in elementary and secondary reading and in elementary math has skyrocketed since about 2003, due mostly to investments by the Institute of Education Sciences (IES) and Investing in Innovation (i3). There has been a similar explosion of evidence in England, due to funding from the Education Endowment Foundation (EEF). Clearly, we know a lot more about which programs work and which do not than we once did.

[Figure 1: graph of the number of rigorous studies in reading and math over time]

b) Principals, teachers, and the public can now easily find reliable and accessible information on practical programs on the What Works Clearinghouse (WWC), Evidence for ESSA, and other sites. No one can complain any more that information is inaccessible or incomprehensible.

c) Encouragement and funding are becoming available for schools eager to use proven programs. Most importantly, the federal ESSA law is providing school improvement funding for low-achieving schools that agree to implement programs that meet the top three ESSA evidence standards (strong, moderate, or promising). ESSA also provides preference points for applications for certain sources of federal funding if they promise to use the money to implement proven programs. Some states have extended the same requirement to apply to eligibility for state funding for schools serving students who are disadvantaged or are ethnic or linguistic minorities. Even schools that do not meet any of these demographic criteria are, in many states, being encouraged to use proven programs.

[Image: U.S. Capitol]

Photo credit: Jorge Gallo [Public domain], from Wikimedia Commons

I think the current situation is like that which must have existed in, say, 1910, with cars and airplanes. Anyone could see that cars and airplanes were the future. But I’m sure many horse-owners pooh-poohed the whole thing. “Sure there are cars,” they’d say, “but who will build all those paved roads? Sure there are airplanes, but who will build airports?” The answer was government, which could see the benefits to the entire economy of systems of roads and airports to meet the needs of cars and airplanes.

Government cannot solve all problems, but it can create conditions to promote adoption and use of proven innovations. And in education, federal, state, and local governments are moving rapidly to do this. Principals may still prefer to talk to other principals, and that’s fine. But with ever more evidence on ever more programs and with modest restructuring of funds governments are already awarding, conditions are coming together to utterly transform the role of evidence in educational practice.
