Lessons for Educational Research from the COVID-19 Vaccines

Since the beginning of the COVID-19 pandemic, more than 130 biotech companies have launched major efforts to develop and test vaccines. Only four have been approved so far (Pfizer, Moderna, Johnson & Johnson, and AstraZeneca). Among the others, many have outright failed, and others are considered highly unlikely to succeed. Some of the failed efforts came from small, fringe companies, but they also include some of the largest and most successful drug companies in the world: Merck (U.S.), GlaxoSmithKline (U.K.), and Sanofi (France).

Kamala Harris gets her vaccine.

Photo courtesy of NIH

If no further companies succeed, the score is something like 4 successes and 126 failures.  Based on this, is the COVID vaccine a triumph of science, or a failure? Obviously, if you believe that even one of the successful programs is truly effective, you would have to agree that this is one of the most extraordinary successes in the history of medicine. In less than one year, companies were able to create, evaluate, and roll out successful vaccines, already saving hundreds of thousands of lives worldwide.

Meanwhile, Back in Education . . .

The example of COVID vaccines contrasts sharply with the way research findings are treated in education. As one example, Borman et al. (2003) reviewed research on 33 comprehensive school reform programs. Only three of these had solid evidence of effectiveness, according to the authors (one of these was our program, Success for All; see Cheung et al., in press). Actually, few of the programs failed; most had just not been evaluated adequately. Yet the response from government and educational leaders was “comprehensive school reform doesn’t work” rather than, “How wonderful! Let’s use the programs proven to work.” As a result, a federal program supporting comprehensive school reform was canceled, use of comprehensive school reform plummeted, and most CSR programs went out of operation (we survived, just barely, but the other two successful programs soon disappeared).

Similarly, the What Works Clearinghouse and our Evidence for ESSA website (www.evidenceforessa.org) are often criticized because so few of the programs we review turn out to have significant positive outcomes in rigorous studies.

The reality is that in any field in which rigorous experiments are used to evaluate innovations, most of the innovations fail. Mature science-focused fields, like medicine and agriculture, expect this and honor it, because the only way to prevent failures is to do no experiments at all, or only flawed experiments. Without rigorous experiments, we would have no reliable successes. Also, we learn from failures, as scientists are learning from the evaluations of all 130 of the COVID vaccine efforts.

Unfortunately, education is not a mature science-focused field, and in our field, failure to show positive effects in rigorous experiments leads to cover-ups, despair, abandonment of proven and promising approaches, or abandonment of rigorous research itself. About 20 years ago, a popular federally-funded education program was found to be ineffective in a large, randomized experiment. Supporters of this program actually got Congress to enact legislation that forbade the use of randomized experiments to evaluate this program!

Research has improved in the past two decades, and acceptance of research has improved as well. Yet we are a long way from medicine, for example, which accepts both success and failure as part of a process of using science to improve health. In our field, we need to commit to broad scale, rigorous evaluations of promising approaches, wide dissemination of programs that work, and learning from experiments that do not (yet) show positive outcomes. In this way, we could achieve the astonishing gains that take place in medicine, and learn how to produce these gains even faster using all the knowledge acquired in experiments, successful or not.

References

Borman, G. D., Hewes, G. M., Overman, L. T., & Brown, S. (2003). Comprehensive school reform and achievement: A meta-analysis. Review of Educational Research, 73(2), 125-230.

Cheung, A., Xie, C., Zhang, T., & Slavin, R. E. (in press). Success for All: A quantitative synthesis of evaluations. Journal of Research on Educational Effectiveness.

This blog was developed with support from Arnold Ventures. The views expressed here do not necessarily reflect those of Arnold Ventures.

Note: If you would like to subscribe to Robert Slavin’s weekly blogs, just send your email address to thebee@bestevidence.org

Another Way to Understand Effect Sizes

Whenever I talk to educators and mention effect sizes, someone inevitably complains. “We don’t understand effect sizes,” they say. I always explain that you don’t have to understand exactly how effect sizes are computed. If you know that larger effect sizes are good and smaller ones are bad, assuming that the research from which they came is of equal quality, why do you need to know precisely what they are? Sometimes I mention the car reliability rating system Consumer Reports uses, with full red circles at the top and full black circles at the bottom. Does anyone understand how they arrived at those ratings? I don’t even know, but I don’t care, because like everyone else, what I do know is that I don’t want a car with a reliability rating in the black.

People always tell me that they would like it better if I’d use “additional months of gain.” I do this when I have to, but I really do not like it, because these “months of gain” do not really mean very much, and they work very differently in the early elementary grades than they do in high school.

So here is an idea that some people might find useful. The National Assessment of Educational Progress (NAEP) uses reading and math scales that have a theoretical standard deviation of 50. So an effect size of, say, +0.20 can be expressed as a gain equivalent to a NAEP score gain of +10 (0.20 x 50 = 10) points.  That’s not really interesting yet, because most people also don’t know what NAEP scores mean.

But here’s another way to use such data that might be more fun and easier to understand. I think people could understand and care about their state’s rank on NAEP scores. For example, the highest-scoring state on 4th grade reading is Massachusetts, with a NAEP reading score of 231 in 2019. What if the 13th-ranked state, Nebraska (222), adopted a great reading program statewide and gained an average effect size of +0.20? That’s equivalent to 10 NAEP points. Such a gain would put Nebraska one point ahead of Massachusetts (if Massachusetts didn’t change). Number 1!

If we learned to speak in terms of how many ranks states would gain if they gained a given effect size, I wonder if that would give educators more understanding and respect for the findings of experiments. Even fairly small effect sizes, if replicated across a whole state, could propel a state past its traditional rivals. For example, 26th-ranked Wisconsin (220) could equal neighboring 12th-ranked Minnesota (222) with a statewide reading effect size gain of only +0.04. As a practical matter, Wisconsin could increase its fourth-grade test scores by an effect size of +0.04, perhaps by using a program with an effect size of +0.20 with (say) the lowest-achieving fifth of its fourth graders.
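For readers who like to see the arithmetic laid out, here is a small illustrative sketch of the calculations above. The 2019 grade 4 NAEP reading scores and the assumed standard deviation of 50 are the figures quoted in the text; the function name is my own, purely for illustration.

```python
# Convert an effect size to NAEP points, assuming the scale's
# theoretical standard deviation of 50, as described above.
NAEP_SD = 50

def effect_size_to_naep_points(effect_size, sd=NAEP_SD):
    return effect_size * sd

# 2019 grade 4 NAEP reading scores cited in the text (illustrative subset).
scores = {"Massachusetts": 231, "Minnesota": 222, "Nebraska": 222, "Wisconsin": 220}

# Nebraska gains an effect size of +0.20 statewide: +10 NAEP points.
nebraska_new = scores["Nebraska"] + effect_size_to_naep_points(0.20)
print(nebraska_new)  # 232.0, one point ahead of Massachusetts (231)

# Wisconsin needs only +0.04 to equal Minnesota: +2 NAEP points.
wisconsin_new = scores["Wisconsin"] + effect_size_to_naep_points(0.04)
print(wisconsin_new)  # 222.0

# A +0.20 program used with only the lowest-achieving fifth of students
# averages out to a statewide effect of 0.20 * (1/5) = +0.04.
statewide_effect = round(0.20 * (1 / 5), 2)
print(statewide_effect)  # 0.04
```

The same two-step conversion (effect size times scale SD, then compare to the rank table) works for any state and any program, which is the point: ranks make the abstract effect size concrete.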

If only one could get states thinking this way, the meaning and importance of effect sizes would soon become clear. And as a side benefit, perhaps if Wisconsin invested its enthusiasm and money in a “Beat Minnesota” reading campaign, as it does to try to beat the University of Minnesota’s football team, Wisconsin’s students might actually benefit. I can hear it now:

            On Wisconsin, On Wisconsin,

            Raise effect size high!

            We are not such lazy loafers

            We can beat the Golden Gophers

            Point-oh-four or point-oh-eight

            We’ll surpass them, just you wait!


Well, a nerd can dream, can’t he?

_______

            Note:  No states were harmed in the writing of this blog.


The Role of Research and Development in Post-Covid Education

Everyone knows that during World War II, the pace of innovation greatly accelerated. Computers, rocketry, jets, sonar, radar, microwaves, aerosol cans, penicillin, and morphine were among the many wartime developments. What unites these innovations, of course, is that each was developed to solve an urgent problem important to the war effort, and all of them later turned out to have revolutionary benefits for civilian use. Yet these advances could not have taken place so quickly if not for the urgent need for innovations and the massive resources devoted to them.

Crisis can be the catalyst for innovation.

Today, we face Covid, a dire medical crisis, and investments of massive resources have produced vaccines in record time. However, the Covid pandemic has also created an emergency in education, as millions of children are experiencing educational losses due to school closures. The Institute for Education Sciences (IES) has announced a grants program to respond to the Covid crisis, but at the usual pace, the grants will not lead to practical solutions for many years, when (we fervently hope) the crisis will be over.

I would argue that in this perilous time, research in education should focus on urgent practical problems, seeking solutions that could have a significant impact within, say, the next year or two for students who are far below grade level in essential skills because of Covid school closures, or for other reasons:

1. Tutoring. Yes, of course I was going to start with tutoring. The biggest problem in tutoring is that while we have many proven programs for elementary reading, especially for grades K-3, we have far fewer proven programs ready for prime time in the upper elementary grades, and none at all in middle or high school reading. Studies in England have found positive effects of tutoring in their equivalent of middle school, but no such proven programs exist in the U.S. In mathematics, there are few proven tutoring programs in elementary school, just one I know of for middle school, and one for high school.

How could research funding produce new tutoring programs for middle and high school reading, and for math at all grade levels, in such a short time?  Simple. First, there are already tutoring programs for reading and math at all grade levels, but few have been successfully evaluated, or (in most cases) ever evaluated at all in rigorous experiments. So it would be important to fund evaluations of particularly promising programs that are already working at significant scale.

Another means of rapidly discovering effective tutoring programs would be to fund programs that have been successful at certain grade levels to quickly create programs for adjacent grades. For example, a program proven effective in grades 2-3 could probably be modified to work in grades 4-5. One that works in grades 4-5 could be modified for use in middle school. Programs proven effective in reading might be modified for use in mathematics at the same grade level, or vice versa. Many providers with successful programs at some grade levels have the staff and experience to quickly create programs for adjacent grade levels.

Also, it might be possible for developers of successful classwide technology programs to create and pilot tutoring models using similar software, but adding the assistance of a tutor for groups of one to four students, perhaps in collaboration with experts on tutoring.

2. Approaches other than tutoring.  There are many reading and math programs of all kinds, not just tutoring, that have proven their effectiveness (see www.evidenceforessa.org). Such programs might be ready to go as they are, and others could be evaluated in a form appropriate to the current emergency. Very few programs other than tutoring obtain effect sizes like those typical of the best tutoring programs, but classwide programs with modest effect sizes serve many more students than tutoring programs do. Also, classroom programs might be evaluated for their capacity to maintain gains made due to tutoring.

Tutoring or non-tutoring programs that already exist at scale, or that could be quickly adapted from proven programs, might be ready for rigorous, third-party evaluations as soon as fall, 2021. These programs should be evaluated using rigorous, third-party evaluations, with all programs at a given grade level using identical procedures and measures. In this way, it should be possible to have many new, proven programs by the end of the 2021-2022 school year, ready for dissemination in fall, 2022. This would be in time to greatly add capacity to serve the millions of students who need proven programs to help them make rapid progress toward grade level.

A research program of this kind could be expensive, and it may not provide theoretical breakthroughs. However, given the substantial and obvious need, and the apparent willingness of government to provide major resources to combat Covid learning losses, such a research effort might be feasible. If it were to take place, it might build excitement about R & D as a practical means of enhancing student achievement. And if even a quarter of the experiments found sizable positive impacts, this would add substantially to our armamentarium of proven strategies for struggling students.

There is an old saying in social work: “Never let a good crisis go to waste.” As in World War II, the educational impacts of the Covid pandemic present educational research with a crisis that we must solve, but if we can solve any portion of this problem, this will create benefits for generations of children long after Covid has faded into a distant memory.

Photo credit: User Messybeast on en.wikipedia, CC BY-SA 3.0 <http://creativecommons.org/licenses/by-sa/3.0/>, via Wikimedia Commons


A “Called Shot” for Educational Research and Impact

In the 1932 World Series, Babe Ruth stepped up to the plate and pointed to the center field fence. Everyone there understood: He was promising to hit the next pitch over the fence.

And then he did.

That one home run established Babe Ruth as the greatest baseball player ever. Even though several others have long since beaten his record of 60 home runs, no one else ever promised to hit a home run and then did it.

Educational research needs to execute a “called shot” of its own. We need to identify a clear problem, one that must be solved with some urgency, one that every citizen understands and cares about, one that government is willing and able to spend serious money to solve. And then we need to solve it, in a way that is obvious to all. I think the clear need for intensive services for students whose educations have suffered due to Covid-19 school closures provides an opportunity for our own “called shot.”

In my recent Open Letter to President-Elect Biden, I described a plan to provide up to 300,000 well-trained college-graduate tutors to work with up to 12 million students whose learning has been devastated by the Covid-19 school closures, or who are far below grade level for any reason. There are excellent reasons to do this, including making a rapid difference in the reading and mathematics achievement of vulnerable children, providing jobs to hundreds of thousands of college graduates who may otherwise be unemployed, and starting the best of these non-certified tutors on a path to teacher certification. These reasons more than justify the effort. But in today’s blog, I wanted to explain a fourth rationale, one that in the long run may be the most important of all.

A major tutoring enterprise, entirely focusing on high-quality implementation of proven programs, could be the “called shot” evidence-based education needs to establish its value to the American public.

Of course, the response to the Covid-19 pandemic is already supporting a “called shot” in medicine, the rush to produce a vaccine. At this time we do not know what the outcome will be, but throughout the world, people are closely following the progress of dozens of prominent attempts to create a safe and effective vaccine to prevent Covid-19. If this works as hoped, it will provide enormous benefits for entire populations and economies worldwide. But it could also raise the possibility that we can solve many crucial medical problems much faster than we have in the past, without compromising on strict research standards. The funding of many promising alternatives, and rigorous testing of each before they are disseminated, is very similar to what I and my colleagues have proposed for various approaches to tutoring. In both the medical case and the educational case, the size of the problem justifies this intensive, all-in approach. If all goes well with the vaccines, that will be a “called shot” for medicine, but medicine has long since proven its capability to use science to solve big problems. Curing polio, eliminating smallpox, and preventing measles come to mind as examples. In education, we need to earn this confidence, with a “called shot” of our own.

Think of it. Education researchers and leaders who support them would describe a detailed and plausible plan to solve a pressing problem of education. Then we announce that given X amount of money and Y amount of time, we will demonstrate that struggling students can perform substantially better than they would have without tutoring.

We’d know this would work, because part of the process would be identifying a) programs already proven to be effective, b) programs that already exist at some scale that would be successfully evaluated, and c) newly-designed programs that would successfully be evaluated. In each case, programs would have to meet rigorous evaluation standards before qualifying for substantial scale-up. In addition, in order to obtain funding to hire tutors, schools would have to agree to ensure that tutors use the programs with an amount and quality of training, coaching, and support at least as good as what was provided in the successful studies.

Researchers and policy makers who believe in evidence-based reform could confidently predict substantial gains, and then make good on their promises. No intervention in all of education is as effective as tutoring. Tutoring can be expensive, but it does not require a lengthy, uncertain transformation of the entire school. No sensible researcher or reformer would think that tutoring is all schools should do to improve student outcomes, but tutoring should be one element of any comprehensive plan to improve schools, and it happens to respond to the needs of post-Covid education for something that can have a dramatic, relatively quick, and relatively reliable impact.

If all went well in a large-scale tutoring intervention, the entire field of research could gain new respect, a belief among educators and the public that outcomes could be made much better than they are now by systematic applications of research, development, evaluation, and dissemination.

It is important to note that in order to be perceived to work, the tutoring “called shot” need not be proven effective across the board. By my count, there are 18 elementary reading tutoring programs with positive outcomes in randomized evaluations (see below). Let’s say 12 of them are ready for prime time and are put to the test, and 5 of those work very well at scale. That would be a tremendous success, because if we know which five approaches worked, we could make substantial progress on the problem of elementary reading failure. Just as with Covid-19 vaccines, we shouldn’t care how many vaccines failed. All that matters is that one or more of them succeeds, and can then be widely replicated.

I think it is time to do something bold to capture people’s imaginations. Let’s (figuratively) point to the center field fence, and (figuratively) hit the next pitch over it. The conditions today for such an effort are as good as they will ever be, because of universal understanding that the Covid-19 school closures deserve extraordinary investments in proven strategies. Researchers working closely with educators and political leaders can make a huge difference. We just have to make our case and insist on nothing less than whatever it takes. If a “called shot” works for tutoring, perhaps we could use similar approaches to solve other enduring problems of education.

It worked for the Babe. It should work for us, too, with much greater consequences for our children and our society than a mere home run.

*  *  *

Note: A reader of my previous blog asked what specific tutoring programs are proven effective, according to our standards. I’ve listed below reading and math tutoring programs that meet our standards of evidence. I cannot guarantee that all of these programs would be able to go to scale. We are communicating with program providers to try to assess each program’s capacity and interest in going to scale. But these programs are a good place to start in understanding where things stand today.


An Open Letter To President-Elect Biden: A Tutoring Marshall Plan To Heal Our Students

Dear President-Elect Biden:

            Congratulations on your victory in the recent election. Your task is daunting; so much needs to be set right. I am writing to you about what I believe needs to be done in education to heal the damage done to so many children who missed school due to Covid-19 closures.

            I am aware that there are many basic things that must be done to improve schools, which have to continue to make their facilities safe for students and cope with the physical and emotional trauma that so many have experienced. Schools will be opening into a recession, so just providing ordinary services will be a challenge. Funding to enable schools to fulfill their core functions is essential, but it is not sufficient.

            Returning schools to the way they were when they closed last spring will not heal the damage students have sustained to their educational progress. This damage will be greatest to disadvantaged students in high-poverty schools, most of whom were unable to take advantage of the remote learning most schools provided. Some of these students were struggling even before schools closed, but when they re-open, millions of students will be far behind.

            Our research center at Johns Hopkins University studies the evidence on programs of all kinds for students who are at risk, especially in reading (Neitzel et al., 2020) and mathematics (Pellegrini et al., 2020). What we and many other researchers have found is that the most effective strategy for struggling students, especially in elementary schools, is one-to-one or one-to-small group tutoring. Structured tutoring programs can make a large difference in a short time, exactly what is needed to help students quickly catch up with grade level expectations.

A Tutoring Marshall Plan

            My colleagues and I have proposed a massive effort designed to provide proven tutoring services to the millions of students who desperately need it. Our proposal, based on a similar idea by Senator Coons (D-Del), would ultimately provide funding to enable as many as 300,000 tutors to be recruited, trained in proven tutoring models, and coached to ensure their effectiveness. These tutors would be required to have a college degree, but not necessarily a teaching certificate. Research has found that such tutors, using proven tutoring models with excellent professional development, can improve the achievement of students struggling in reading or mathematics as much as can teachers serving as tutors.

            The plan we are proposing is a bit like the Marshall Plan after World War II, which provided substantial funding to Western European nations devastated by the war. The idea was to put these countries on their feet quickly and effectively so that within a brief period of years, they could support themselves. In a similar fashion, a Tutoring Marshall Plan would provide intensive funding to enable Title I schools nationwide to substantially advance the achievement of their students who suffered mightily from Covid-19 school closures and related trauma. Effective tutoring is likely to enable these children to advance to the point where they can profit from ordinary grade-level instruction. We fear that without this assistance, millions of children will never catch up, and will show the negative effects of the school closures throughout their time in school and beyond.

            The Tutoring Marshall Plan will also provide employment to 300,000 college graduates, who will otherwise have difficulty entering the job market in a time of recession. These people are eager to contribute to society and to establish professional careers, but will need a first step on that ladder. Ideally, the best of the tutors will experience the joys of teaching, and might be offered accelerated certification, opening a new source of teacher candidates who will have had an opportunity to build and demonstrate their skills in school settings. Like the CCC and WPA programs in the Great Depression, these tutors will not only be helped to survive the financial crisis, but will perform essential services to the nation while building skills and confidence.

            The Tutoring Marshall Plan needs to start as soon as possible. The need is obvious, both to provide essential jobs to college graduates and to provide proven assistance to struggling students.

            Our proposal, in brief, is to ask the U.S. Congress to fund the following activities:

Spring, 2021

  • Fund existing tutoring programs to build capacity to scale up their programs to serve thousands of struggling students. This would include funds for installing proven tutoring programs in about 2000 schools nationwide.
  • Fund rigorous evaluations of programs that show promise, but have not been evaluated in rigorous, randomized experiments.
  • Fund the development of new programs, especially in areas in which there are few proven models, such as programs for struggling students in secondary schools.

Fall, 2021 to Spring, 2022

  • Provide restricted funds to Title I schools throughout the United States to enable them to hire up to 150,000 tutors to implement proven programs, across all grade levels, 1-9, and in reading and mathematics. This many tutors, mostly using small-group methods, should be able to provide tutoring services to about 6 million students each year. Schools should be asked to agree to select from among proven, effective programs. Schools would implement their chosen programs using tutors who have college degrees and experience with tutoring, teaching, or mentoring children (such as AmeriCorps graduates who were tutors, camp counselors, or Sunday school teachers).
  • As new programs are completed and piloted, third-party evaluators should be funded to evaluate them in randomized experiments, adding to capacity to serve students in grades 1-9. Those programs that produce positive outcomes would then be added to the list of programs available for tutor funding, and their organizations would need to be funded to facilitate preparation for scale-up.
  • Teacher training institutions and school districts should be funded to work together to design accelerated certification programs for outstanding tutors.

Fall, 2022-Spring, 2023

  • Title I schools should be funded to enable them to hire a total of 300,000 tutors. Again, schools will select among proven tutoring programs, which will train, coach, and evaluate tutors across the U.S. We expect these tutors to be able to work with about 12 million struggling students each year.
  • Development, evaluation, and scale-up of proven programs should continue to enrich the number and quality of proven programs adapted to the needs of all kinds of Title I schools.

            The Tutoring Marshall Plan would provide direct benefits to millions of struggling students harmed by Covid-19 school closures, in all parts of the U.S. It would provide meaningful work with a future to college graduates who might otherwise be unemployed. At the same time, it could establish a model of dramatic educational improvement based on rigorous research, contributing to knowledge and use of effective practice. If all goes well, the Tutoring Marshall Plan could demonstrate the power of scaling up proven programs and using research and development to improve the lives of children.

References

Neitzel, A., Lake, C., Pellegrini, M., & Slavin, R. (2020). A synthesis of quantitative research on programs for struggling readers in elementary schools. Available at www.bestevidence.org. Manuscript submitted for publication.

Pellegrini, M., Inns, A., Lake, C., & Slavin, R. (2020). Effective programs in elementary mathematics: A best-evidence synthesis. Available at www.bestevidence.org. Manuscript submitted for publication.


Florence Nightingale, Statistician

Everyone knows about Florence Nightingale, whose 200th birthday is this year. You probably know of her courageous reform of hospitals and aid stations in the Crimean War, and her insistence on sanitary conditions for wounded soldiers that saved thousands of lives. You may know that she founded the world’s first school for nurses, and of her lifelong fight for the professionalization of nursing, formerly a refuge for uneducated, often alcoholic young women who had no other way to support themselves. You may know her as a bold feminist, who taught by example what women could accomplish.

But did you know that she was also a statistician? In fact, she was the first woman ever to be admitted to Britain’s Royal Statistical Society, in 1858.

Nightingale was not only a statistician; she was an innovator among statisticians. Her life’s goal was to improve medical care, public health, and nursing for all, but especially for people in poverty. In her time, landless people were pouring into large, filthy industrial cities. Death rates from unclean water and air, and unsafe working conditions, were appalling. Women suffered most, and deaths from childbirth in unsanitary hospitals were all too common. This was the sentimental Victorian age, and there were people who wanted to help. But how could they link particular conditions to particular outcomes? Opponents of investments in prevention and health care argued that the poor brought the problems on themselves, through alcoholism or slovenly behavior, or that these problems had always existed, or even that they were God’s will. The numbers of people and variables involved were enormous. How could these numbers be summarized in a way that would stand up to scrutiny, but also communicate the essence of the process leading from cause to effect?

As children, Nightingale and her sister were taught by their brilliant and liberal father. He gave his daughters a mathematics education that few (male) students in the very finest schools could match. She put these skills to work in hospital reform, demonstrating, for example, that when her hospital in the Crimean War ordered reforms such as cleaning out latrines and cesspools, the mortality rate dropped from 42.7 percent to 2.2 percent in a few months. She invented a circular graph that showed changes month by month, as the reforms were implemented. It also made it immediately clear to anyone that deaths due to disease far outnumbered those due to war wounds. No numbers, just colors and patterns, made the situation obvious to the least mathematical of readers.
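Nightingale's circular graph (now called a "coxcomb" or polar area diagram) worked because each month's deaths were encoded as the *area* of a wedge, not its radius, so a month with four times the deaths looks four times as large, not sixteen. A minimal sketch of that encoding, using invented monthly counts rather than her actual Crimean figures:

```python
import math

# Hypothetical monthly death counts (illustrative only -- not
# Nightingale's actual Crimean data).
monthly_deaths = {"Jan": 120, "Feb": 300, "Mar": 75}

def wedge_radius(count, angle=2 * math.pi / 12):
    """Radius of a polar wedge whose *area* is proportional to `count`.

    Sector area = 0.5 * r**2 * angle, so to make area track the count,
    the radius must grow with the square root of the count:
    r = sqrt(2 * count / angle).
    """
    return math.sqrt(2 * count / angle)

radii = {month: wedge_radius(c) for month, c in monthly_deaths.items()}
# A month with 4x the deaths gets only 2x the radius -- areas stay honest.
```

Plotting the wedges is then a routine charting task; the substance of her design is in the square-root step above.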

When she returned from Crimea, Nightingale had a disease, probably spondylitis, that forced her to be bedridden much of the time for the rest of her life. Yet this did not dim her commitment to health reform. In fact, it gave her a lot of time to focus on her statistical work, often published in the top newspapers of the day. From her bedroom, she had a profound effect on the reform of Britain’s Poor Laws, and the repeal of the Contagious Diseases Act, which her statistics showed to be counterproductive.

Note that so far, I haven’t said a word about education. In many ways, the analogy is obvious. But I’d like to emphasize one contribution of Nightingale’s work that has particular importance to our field.

Everyone who works in education cares deeply for all children, and especially for disadvantaged, underserved children. As a consequence of our profound concern, we advocate fiercely for policies and solutions that we believe to be good for children. Each of us comes down on one side or another of controversial policies, and then advocates for our positions, certain that our favored position would be hugely beneficial if it prevails, and disastrous if it does not. The same was true in Victorian Britain, where people had heated, interminable arguments about all sorts of public policy.

What Florence Nightingale did, more than a century ago, was to subject various policies affecting the health and welfare of poor people to statistical analysis. She worked hard to be sure that her findings were correct and that they communicated to readers. Then she advocated in the public arena for the policies that were beneficial, and against those that were counterproductive.

In education, we have loads of statistics that bear on various policies, but we do not often commit ourselves to advocate for the ones that actually work. As one example, there have been arguments for decades about charter schools. Yet a national CREDO (2013) study found that, on average, charter schools made no difference at all in reading or math performance. A later CREDO (2015) study found that effects were slightly more positive in urban settings, but these effects were tiny. Other studies have found similar results, although outcomes are more positive for "no-excuses" charters such as KIPP, a small percentage of all charter schools.

If charters make no major differences in student learning, I suppose one might conclude that they might be maintained or not maintained based on other factors. Yet neither side can plausibly argue, based on evidence of achievement outcomes, that charters should be an important policy focus in the quest for higher achievement. In contrast, there are many programs that have impacts on achievement far greater than those of charters. Yet use of such programs is not particularly controversial, and is not part of anyone’s political agenda.

The principle that Florence Nightingale established in public health was simple: Follow the data. This principle now dominates policy and practice in medicine. Yet more than a hundred years after Nightingale’s death, have we arrived at that common-sense conclusion in educational policy and practice? We’re moving in that direction, but at the current rate, I’m afraid it will be a very long time before this becomes the core of educational policy or practice.

Photo credit: Florence Nightingale, Illustrated London News (February 24, 1855)

References

CREDO (2013). National charter school study. At http://credo.stanford.edu

CREDO (2015). Urban charter school study. At http://credo.stanford.edu

This blog was developed with support from the Laura and John Arnold Foundation. The views expressed here do not necessarily reflect those of the Foundation.

Why Can’t Education Progress Like Medicine Does?

I recently saw an end-of-year article in The Washington Post called “19 Good Things That Happened in 2019.” Four of them were medical or public health breakthroughs. Scientists announced a new therapy for cystic fibrosis likely to benefit 90% of people with this terrible disease, incurable for most patients before now. The World Health Organization announced a new vaccine to prevent Ebola. The Bill and Melinda Gates Foundation announced that deaths of children before their fifth birthday have now dropped from 82 per thousand births in 1990 to 37 in 2019. The Centers for Disease Control reported a decline of 5.1 percent in deaths from drug overdoses in just one year, from 2017 to 2018.

Needless to say, breakthroughs in education did not make the list. In fact, I’ll bet there has never been an education breakthrough mentioned on such lists.

I get a lot of criticism from all sides for comparing education to medicine and public health. Most commonly, I'm told that it's ever so much easier to give someone a pill than to change complex systems of education. That's true enough, but not one of the 2019 medical or public health breakthroughs was anything like "taking a pill." The cystic fibrosis cure involves a series of three treatments personalized to the genetic background of patients. It took decades to find and test this treatment. A vaccine for Ebola may be simple in concept, but it also took decades to develop. Also, Ebola occurs in very poor countries, where ensuring universal coverage with a vaccine is very complex. Reducing deaths of infants and toddlers took massive coordinated efforts of national governments, international organizations, and ongoing research and development. There is still much to do, of course, but the progress made so far is astonishing. Similarly, the drop in deaths due to overdoses required, and still requires, huge investments, cooperation between government agencies of all sorts, and constant research, development, and dissemination.

In fact, I would argue that reducing infant deaths and overdose deaths strongly resemble what education would have to do to, for example, eliminate reading failure or enable all students to succeed at middle school mathematics. No one distinct intervention, no one miracle pill has by itself improved infant mortality or overdose mortality, and solutions for reading and math failure will similarly involve many elements and coordinated efforts among many government agencies, private foundations, and educators, as well as researchers and developers.

The difference between evidence-based reform in medicine/public health and education is, I believe, a difference in societal commitment to solving the problems. The general public, especially political leaders, tend to be rather complacent about educational failures. One of our past presidents said he wanted to help, but claimed that we have "more will than wallet" to solve educational problems. Another focused his education plans on recruiting volunteers to help with reading. These policies hardly communicate seriousness. In contrast, if medicine or public health can significantly reduce death or disease, it's hard to be complacent.

Perhaps part of the motivational difference is due to the situations of powerful people. Anyone can get a disease, so powerful individuals are likely to have children or other relatives or friends who suffer from a given disease. In contrast, they may assume that children failing in school have inadequate parents, or parents who need improved job opportunities, economic security, or decent housing, which will take decades and massive investments to address. As a result, governments allocate little money for research, development, or dissemination of proven programs.

There is no doubt in my mind that we could, for example, eliminate early reading failure, using the same techniques used to eliminate diseases: research, development, practical experiments, and planful, rapid scale-up. It’s all a question of resources, political leadership, collaboration among many critical agencies and individuals, and a total commitment to getting the job done. The year reading failure drops to near zero nationwide, perhaps education will make the Washington Post list of “50 Good Things That Happened in 2050.”


On Replicability: Why We Don’t Celebrate Viking Day

I was recently in Oslo, Norway’s capital, and visited a wonderful museum displaying three Viking ships that had been buried with important people. The museum had all sorts of displays focused on the amazing exploits of Viking ships, always including the Viking landings in Newfoundland, about 500 years before Columbus. Since the 1960s, most people have known that Vikings, not Columbus, were the first Europeans to land in America. So why do we celebrate Columbus Day, not Viking Day?

Given the bloodthirsty actions of Columbus, easily rivaling those of the Vikings, we surely don’t prefer one to the other based on their charming personalities. Instead, we celebrate Columbus Day because what Columbus did was far more important. The Vikings knew how to get back to Newfoundland, but they were secretive about it. Columbus was eager to publicize and repeat his discovery. It was this focus on replication that opened the door to regular exchanges. The Vikings brought back salted cod. Columbus brought back a new world.

In educational research, academics often imagine that if they establish new theories or demonstrate new methods on a small scale, and then publish their results in reputable journals, their job is done. Call this the Viking model: they got what they wanted (promotions or salt cod), and who cares if ordinary people found out about it? Even if the Vikings had published their findings in the Viking Journal of Exploration, this would have had roughly the same effect as educational researchers publishing in their own research journals.

Columbus, in contrast, told everyone about his voyages, and very publicly repeated and extended them. His brutal leadership ended with him being sent back to Spain in chains, but his discoveries had resounding impacts that long outlived him.

Educational researchers only want to do good, but they are unlikely to have any impact at all unless they can make their ideas useful to educators. Many educational researchers would love to make their ideas into replicable programs, evaluate these programs in schools, and if they are found to be effective, disseminate them broadly. However, resources for the early stages of development and research are scarce. Yes, the Institute of Education Sciences (IES) and Education Innovation Research (EIR) fund a lot of development projects, and Small Business Innovation Research (SBIR) provides small grants for this purpose to for-profit companies. Yet these funders support only a tiny proportion of the proposals they receive. In England, the Education Endowment Foundation (EEF) spends a lot on randomized evaluations of promising programs, but very little on development or early-stage research.

Innovations that are funded by government or other funding very rarely end up being evaluated in large experiments, fewer still are found to be effective, and vanishingly few eventually enter widespread use. The exceptions are generally programs created by large for-profit companies, large and entrepreneurial non-profits, or other entities with proven capacity to develop, evaluate, support, and disseminate programs at scale. Even the most brilliant developers and researchers rarely have the interest, time, capital, business expertise, or infrastructure to nurture effective programs through all the steps necessary to bring a practical and effective program to market. As a result, most educational products introduced at scale to schools come from commercial publishers or software companies, who have the capital and expertise to create and disseminate educational programs, but serve a market that primarily wants attractive, inexpensive, easy-to-use materials, software, and professional development, and is not (yet) willing to pay for programs proven to be effective.

I discussed this problem in a recent blog on technology, but the same dynamics apply to all innovations, tech and non-tech alike.

How Government Can Promote Proven, Replicable Programs

There is an old saying that Columbus personified the spirit of research. He didn’t know where he was going, he didn’t know where he was when he got there, and he did it all on government funding. The relevant part of this is the government funding. In Columbus’ time, only royalty could afford to support his voyage, and his grant from Queen Isabella was essential to his success. Yet Isabella was not interested in pure research. She was hoping that Columbus might open rich trade routes to the (east) Indies or China, or might find gold or silver, or might acquire valuable new lands for the crown (all of these things did eventually happen). Educational research, development, and dissemination face a similar situation. Because education is virtually a government monopoly, only government is capable of sustained, sizable funding of research, development, and dissemination, and only the U.S. government has the acknowledged responsibility to improve outcomes for the 50 million American children ages 4-18 in its care. So what can government do to accelerate the research-development-dissemination process?

  1. Contract with “seed bed” organizations capable of identifying and supporting innovators with ideas likely to make a difference in student learning. These organizations might be rewarded, in part, based on the number of proven programs they are able to help create, support, and (if effective) ultimately disseminate.
  2. Contract with independent third-party evaluators capable of doing rigorous evaluations of promising programs. These organizations would evaluate promising programs from any source, not just from seed bed companies, as they do now in IES, EIR, and EEF grants.
  3. Provide funding to innovators with demonstrated capacity to create programs likely to be effective, and funding to disseminate those programs if they are proven effective. Developers might also contract with "seed bed" organizations for help with development and dissemination.
  4. Provide information and incentive funding to schools to encourage them to adopt proven programs, as described in a recent blog on technology.  Incentives should be available on a competitive basis to a broad set of schools, such as all Title I schools, to engage many schools in adoption of proven programs.

Evidence-based reform in education has made considerable progress in the past 15 years, both in finding positive examples that are in use today and in finding out what is not likely to make substantial differences. It is time for this movement to go beyond its early achievements to enter a new phase of professionalism, in which collaborations among developers, researchers, and disseminators can sustain a much faster and more reliable process of research, development, and dissemination. It’s time to move beyond the Viking stage of exploration to embrace the good parts of the collaboration between Columbus and Queen Isabella that made a substantial and lasting change in the whole world.


Nobel Experiments

The world of evidence-based policy just got some terrific news. Abhijit Banerjee and Esther Duflo, of MIT, and Michael Kremer of Harvard, were recently awarded the Nobel Prize in economics.

This award honors extraordinary people doing extraordinary work to alleviate poverty in developing countries. I heard Esther Duflo speak at the Society for Research on Effective Education, and saw her amazing TED Talk on the research that won the Nobel (delivered before they knew this was going to happen). I strongly suggest you view her speech at https://www.ted.com/talks/esther_duflo_social_experiments_to_fight_poverty?language=en

But the importance of this award goes far beyond its recognition of the scholars who received it. It celebrates the same movement toward evidence-based policy represented by the Institute of Education Sciences, Education Innovation Research, the Arnold Foundation, and others in the U.S., the Education Endowment Foundation in the U.K., and this blog. It also celebrates the work of researchers in education, psychology, and sociology, as well as economics, who are committed to using rigorous research to advance human progress. The Nobel awardees represent the international development wing of this movement, largely funded by the World Bank, the Inter-American Development Bank, and other international aid organizations.

In her TED Talk, Esther Duflo explains the grand strategy she and her colleagues pursue. They take major societal problems in developing countries, break them down into solvable parts, and then use randomized experiments to test solutions to those parts. Along with Dr. Banerjee (her husband) and Michael Kremer, she first did a study that found that ensuring that students in India had textbooks made no difference in learning. They then successfully tested a plan to provide inexpensive tutors and, later, computers to help struggling readers in India (Banerjee, Cole, Duflo, & Linden, 2007). One fascinating series of studies tested the cost-effectiveness of various educational treatments in developing countries. The winner? Curing children of intestinal worms. Based on this and other research, the Carter Center embarked on a campaign that has virtually eradicated Guinea worm worldwide.

Dr. Duflo and her colleagues later tested variations in programs to provide malaria-inhibiting bed nets in developing countries in which malaria is the number one killer of children, especially those less than five years old. Were outcomes best if bed nets (retail cost: $3) were free, or only discounted to varying degrees? Many economists and policy makers worried that people who paid nothing for bed nets would not value them, or might use them for other purposes. But the randomized study found that without question, free bed nets were more often obtained and used than were discounted ones, potentially saving thousands of children's lives.
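The logic of such a randomized comparison can be sketched as a two-proportion z-test on uptake rates in the two arms. The counts below are invented for illustration; they are not the actual bed-net study data:

```python
import math

# Hypothetical uptake counts from a two-arm randomized trial
# (illustrative numbers only -- not the real study's data).
free_users, free_n = 450, 500   # households using nets in the free-net arm
paid_users, paid_n = 200, 500   # households using nets in the discounted arm

def two_proportion_z(x1, n1, x2, n2):
    """z statistic for the difference between two independent proportions,
    using the pooled proportion to estimate the standard error."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

z = two_proportion_z(free_users, free_n, paid_users, paid_n)
# |z| > 1.96 indicates a difference unlikely to arise by chance at p < .05.
```

Randomization is what licenses the causal reading: because assignment to the free or discounted arm is random, a large z reflects the price difference itself, not pre-existing differences between households.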

For those of us who work in evidence-based education, the types of experiments being done by the Nobel laureates are entirely familiar, even though they have practical aspects quite different from the ones we encounter when we work in the U.S. or the U.K., for example. However, we are far from a majority among researchers in our own countries, and we face major struggles to continue to insist on randomized experiments as the criterion of effectiveness. I'm sure people working in international development face equal challenges. This is why this Nobel Prize in economics means a lot to all of us. People pay a lot of attention to Nobel Prizes, and there isn't one in educational research. Having a Nobel shared by economists whose main contribution is the use of randomized experiments to solve questions of great practical and policy importance, including studies in education itself, may be the closest we will ever get to Nobel recognition for a principle embraced by many applied researchers in psychology, sociology, and education, as well as by many economists.

Nobel Prizes are often used to send a message, to support important new developments in research as well as to recognize deserving researchers who are leaders in this area. This was clearly the case with this award. The Nobel announcement makes it clear how the work of the Nobel laureates has transformed their field, to the point that "their experimental research methodologies entirely dominate developmental economics." I hope this event will add further credibility and awareness to the idea that rigorous evidence is a key lever for change that matters in the lives of people.


Reference

Banerjee, A., Cole, S., Duflo, E., & Linden, L. (2007). Remedying education: Evidence from two randomized experiments in India. The Quarterly Journal of Economics, 122 (3), 1235-1264.


On Reviews of Research in Education

Not so long ago, every middle class home had at least one encyclopedia. Encyclopedias were prominently displayed, a statement to all that this was a house that valued learning. People consulted the encyclopedia to find out about things of interest to them. Those who did not own encyclopedias found them in the local library, where they were heavily used. As a kid, I loved everything about encyclopedias. I loved to read them, but I also loved their musty smell, their weight, and their beautiful maps and photos.

There were two important advantages of an encyclopedia. First, it was encyclopedic, so users could be reasonably certain that whatever information they wanted was in there somewhere. Second, they were authoritative. Whatever it said in the encyclopedia was likely to be true, or at least carefully vetted by experts.

In educational research, as in all scientific fields, we have our own kinds of encyclopedias. One consists of articles in journals that publish reviews of research. In our field, the Review of Educational Research (RER) plays a pre-eminent role in this, but there are many others. Reviews are hugely popular. Invariably, review journals have a much higher citation count than even the most esteemed journals focusing on empirical research. In addition to journals, reviews appear in edited volumes, in online compendia, in technical reports, and in other sources. At Johns Hopkins, we produce a bi-weekly newsletter, Best Evidence in Brief (BEiB; https://beibindex.wordpress.com/), that summarizes recent research in education. Two years ago we looked at analytics to find out the favorite articles from BEiB. Although BEiB mostly summarizes individual studies, almost all of its favorite articles were summaries of the findings of recent reviews.

Over time, RER and other review journals become “encyclopedias” of a sort.  However, they are not encyclopedic. No journal tries to ensure that key topics will all be covered over time. Instead, journal reviewers and editors evaluate each review sent to them on its own merits. I’m not criticizing this, but it is the way the system works.

Are reviews in journals authoritative? They are in one sense, because reviews accepted for publication have been carefully evaluated by distinguished experts on the topic at hand. However, review methods vary widely, and reviews are written for many purposes. Some are written primarily for theory development, and some are really just essays with citations. In contrast, reviews in one category, meta-analyses, go to great lengths to locate and systematically include all relevant studies. These are not pure types, and most meta-analyses have at least some focus on theory building and discussion of current policy or research issues, even if their main purpose is to systematically review a well-defined set of studies.
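The pooling step that distinguishes a meta-analysis operationally can be sketched in a few lines. This is a fixed-effect, inverse-variance model chosen only for simplicity (many published meta-analyses use random-effects models instead), and the effect sizes below are invented:

```python
import math

# Toy fixed-effect meta-analysis: pool effect sizes (e.g., Cohen's d)
# from several studies, weighting each by the inverse of its variance.
# The (effect size, variance) pairs below are invented for illustration.
studies = [
    (0.20, 0.010),
    (0.35, 0.020),
    (0.10, 0.005),
]

def fixed_effect_pool(studies):
    """Inverse-variance weighted mean effect size and its standard error.

    Precise studies (small variance) get large weights; the standard
    error of the pooled estimate shrinks as studies accumulate.
    """
    weights = [1 / v for _, v in studies]
    pooled = sum(w * d for (d, _), w in zip(studies, weights)) / sum(weights)
    se = math.sqrt(1 / sum(weights))
    return pooled, se

pooled, se = fixed_effect_pool(studies)
```

The point of all the search and inclusion rules described above is to make sure the list fed into this pooling step is complete and unbiased; the arithmetic itself is the easy part.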

Given the state of the art of research reviews in education, how could we create an “encyclopedia” of evidence from all sources on the effectiveness of programs and practices designed to improve student outcomes? The goal of such an activity would be to provide readers with something both encyclopedic and authoritative.

My colleagues and I created two websites that are intended to serve as a sort of encyclopedia of PK-12 instructional programs. The Best Evidence Encyclopedia (BEE; www.bestevidence.org) consists of meta-analyses written by our staff and students, all of which use similar inclusion criteria and review methods. These are used by a wide variety of readers, especially but not only researchers. The BEE has meta-analyses on elementary and secondary reading, reading for struggling readers, writing programs, programs for English learners, elementary and secondary mathematics, elementary and secondary science, early childhood programs, and other topics, so at least as far as achievement outcomes are concerned, it is reasonably encyclopedic. Our second website is Evidence for ESSA, designed more for educators. It seeks to include every program currently in existence, and therefore is truly encyclopedic in reading and mathematics. Sections on social emotional learning, attendance, and science are in progress.

Are the BEE and Evidence for ESSA authoritative as well as encyclopedic? You’ll have to judge for yourself. One important indicator of authoritativeness for the BEE is that all of the meta-analyses are eventually published, so the reviewers for those journals could be considered to be lending authority.

The What Works Clearinghouse (https://ies.ed.gov/ncee/wwc/) could be considered authoritative, as it is a carefully monitored online publication of the U.S. Department of Education. But is it encyclopedic? Probably not, for two reasons. First, the WWC has difficulty keeping up with new research. Second, the WWC does not list programs that do not have any studies that meet its standards. As a result of both of these, a reader who types in the name of a current program may find nothing at all on it. Is this because the program did not meet WWC standards, or because the WWC has not yet reviewed it? There is no way to tell. Still, the WWC makes important contributions in the areas it has reviewed.

Beyond the websites focused on achievement, the most encyclopedic and authoritative source is Blueprints (www.blueprintsprograms.org). Blueprints focuses on drug and alcohol abuse, violence, bullying, social emotional learning, and other topics not extensively covered in other review sources.

In order to provide readers with easy access to all of the reviews meeting a specified level of quality on a given topic, it would be useful to have a source that briefly describes various reviews, regardless of where they appear. For example, a reader might want to know about all of the meta-analyses that focus on elementary mathematics, or dropout prevention, or attendance. These would include review articles published in scientific journals, technical reports, websites, edited volumes, and so on. To be cited in detail, reviews would have to meet agreed-upon criteria, including a restriction to experimental-control comparisons, a broad and well-documented search for eligible studies, and documented efforts to include all studies (published or unpublished) that fall within well-specified parameters (e.g., subjects, grade levels, and start and end dates of studies included). Reviews that meet these standards might be highlighted, though others, including less systematic reviews, should be listed as well, as supplementary resources.

Creating such a virtual encyclopedia would be a laborious but straightforward task. At the end, the collection of rigorous reviews would offer readers encyclopedic, authoritative information on the topics of their interest, as well as providing something more important that no paper encyclopedia ever included: contrasting viewpoints from well-informed experts on each topic.

My imagined encyclopedia wouldn’t have the hypnotic musty smell, the impressive heft, or the beautiful maps and photos of the old paper encyclopedias. However, it would give readers access to up-to-date, curated, authoritative, quantitative reviews of key topics in education, with readable and appealing summaries of what was concluded in qualifying reviews.

Also, did I mention that unlike the encyclopedias of old, it would have to be free?
