Florence Nightingale, Statistician

Everyone knows about Florence Nightingale, whose 200th birthday is this year. You probably know of her courageous reform of hospitals and aid stations in the Crimean War, and her insistence on sanitary conditions for wounded soldiers that saved thousands of lives. You may know that she founded the world’s first school for nurses, and of her lifelong fight for the professionalization of nursing, formerly a refuge for uneducated, often alcoholic young women who had no other way to support themselves. You may know her as a bold feminist, who taught by example what women could accomplish.

But did you know that she was also a statistician? In fact, she was the first woman ever to be admitted to Britain’s Royal Statistical Society, in 1858.

Nightingale was not only a statistician, she was an innovator among statisticians. Her life’s goal was to improve medical care, public health, and nursing for all, but especially for people in poverty. In her time, landless people were pouring into large, filthy industrial cities. Death rates from unclean water and air, and unsafe working conditions, were appalling. Women suffered most, and deaths from childbirth in unsanitary hospitals were all too common. This was the sentimental Victorian age, and there were people who wanted to help. But how could they link particular conditions to particular outcomes? Opponents of investments in prevention and health care argued that the poor brought the problems on themselves, through alcoholism or slovenly behavior, or that these problems had always existed, or even that they were God’s will. The numbers of people and variables involved were enormous. How could these numbers be summarized in a way that would stand up to scrutiny, but also communicate the essence of the process leading from cause to effect?

As children, Nightingale and her sister were taught by their brilliant and liberal father. He gave his daughters a mathematics education that few (male) students in the very finest schools could match. Nightingale put these skills to use in hospital reform, demonstrating, for example, that when her hospital in the Crimean War carried out reforms such as cleaning out latrines and cesspools, the mortality rate dropped from 42.7 percent to 2.2 percent in a few months. She invented a circular graph that showed changes month by month, as the reforms were implemented. The graph also made it immediately clear to anyone that deaths due to disease far outnumbered those due to war wounds. With no numbers, just colors and patterns, it made the situation obvious to the least mathematical of readers.

When she returned from Crimea, Nightingale had a disease, probably spondylitis, that forced her to be bedridden much of the time for the rest of her life. Yet this did not dim her commitment to health reform. In fact, it gave her a lot of time to focus on her statistical work, often published in the top newspapers of the day. From her bedroom, she had a profound effect on the reform of Britain’s Poor Laws, and the repeal of the Contagious Diseases Act, which her statistics showed to be counterproductive.

Note that so far, I haven’t said a word about education. In many ways, the analogy is obvious. But I’d like to emphasize one contribution of Nightingale’s work that has particular importance to our field.

Everyone who works in education cares deeply for all children, and especially for disadvantaged, underserved children. As a consequence of our profound concern, we advocate fiercely for policies and solutions that we believe to be good for children. Each of us comes down on one side or another of controversial policies, and then advocates for our positions, certain that our favored position would be hugely beneficial if it prevails, and disastrous if it does not. The same was true in Victorian Britain, where people had heated, interminable arguments about all sorts of public policy.

What Florence Nightingale did, more than a century ago, was to subject various policies affecting the health and welfare of poor people to statistical analysis. She worked hard to be sure that her findings were correct and that they communicated to readers. Then she advocated in the public arena for the policies that were beneficial, and against those that were counterproductive.

In education, we have loads of statistics that bear on various policies, but we do not often commit ourselves to advocate for the ones that actually work. As one example, there have been arguments for decades about charter schools. Yet a national CREDO (2013) study found that, on average, charter schools made no difference at all on reading or math performance. A later CREDO (2015) study found that effects were slightly more positive in urban settings, but these effects were tiny. Other studies have had similar outcomes, although there are more positive outcomes for “no-excuses” charters such as KIPP, a small percentage of all charter schools.

If charters make no major differences in student learning, I suppose one might conclude that they might be maintained or not maintained based on other factors. Yet neither side can plausibly argue, based on evidence of achievement outcomes, that charters should be an important policy focus in the quest for higher achievement. In contrast, there are many programs that have impacts on achievement far greater than those of charters. Yet use of such programs is not particularly controversial, and is not part of anyone’s political agenda.

The principle that Florence Nightingale established in public health was simple: Follow the data. This principle now dominates policy and practice in medicine. Yet more than a hundred years after Nightingale’s death, have we arrived at that common-sense conclusion in educational policy and practice? We’re moving in that direction, but at the current rate, I’m afraid it will be a very long time before this becomes the core of educational policy or practice.

Photo credit: Florence Nightingale, Illustrated London News (February 24, 1855)

References

CREDO (2013). National charter school study. At http://credo.stanford.edu

CREDO (2015). Urban charter school study. At http://credo.stanford.edu

 This blog was developed with support from the Laura and John Arnold Foundation. The views expressed here do not necessarily reflect those of the Foundation.

Note: If you would like to subscribe to Robert Slavin’s weekly blogs, just send your email address to thebee@bestevidence.org

Why Can’t Education Progress Like Medicine Does?

I recently saw an end-of-year article in The Washington Post called “19 Good Things That Happened in 2019.” Four of them were medical or public health breakthroughs. Scientists announced a new therapy for cystic fibrosis likely to benefit 90% of people with this terrible disease, incurable for most patients before now. The World Health Organization announced a new vaccine to prevent Ebola. The Bill and Melinda Gates Foundation announced that deaths of children before their fifth birthday have now dropped from 82 per thousand births in 1990 to 37 in 2019. The Centers for Disease Control reported a decline of 5.1 percent in deaths from drug overdoses in just one year, from 2017 to 2018.

Needless to say, breakthroughs in education did not make the list. In fact, I’ll bet there has never been an education breakthrough mentioned on such lists.

I get a lot of criticism from all sides for comparing education to medicine and public health. Most commonly, I’m told that it’s ever so much easier to give someone a pill than to change complex systems of education. That’s true enough, but not one of the 2019 medical or public health breakthroughs was anything like “taking a pill.” The cystic fibrosis therapy involves a series of three treatments personalized to the genetic background of patients. It took decades to find and test this treatment. A vaccine for Ebola may be simple in concept, but it also took decades to develop. Also, Ebola occurs in very poor countries, where ensuring universal coverage with a vaccine is very complex. Reducing deaths of infants and toddlers took massive coordinated efforts of national governments, international organizations, and ongoing research and development. There is still much to do, of course, but the progress made so far is astonishing. Similarly, the drop in deaths due to overdoses required, and still requires, huge investments, cooperation between government agencies of all sorts, and constant research, development, and dissemination. In fact, I would argue that reducing infant deaths and overdose deaths strongly resembles what education would have to do to, for example, eliminate reading failure or enable all students to succeed at middle school mathematics. No one distinct intervention, no one miracle pill has by itself improved infant mortality or overdose mortality, and solutions for reading and math failure will similarly involve many elements and coordinated efforts among many government agencies, private foundations, and educators, as well as researchers and developers.

The difference between evidence-based reform in medicine/public health and education is, I believe, a difference in societal commitment to solving the problems. The general public, especially political leaders, tend to be rather complacent about educational failures. One of our past presidents said he wanted to help, but said, “We have more will than wallet” to solve educational problems. Another focused his education plans on recruiting volunteers to help with reading. These policies hardly communicate seriousness. In contrast, if medicine or public health can significantly reduce death or disease, it’s hard to be complacent.

Perhaps part of the motivational difference is due to the situations of powerful people. Anyone can get a disease, so powerful individuals are likely to have children or other relatives or friends who suffer from a given disease. In contrast, they may assume that children failing in school have inadequate parents or parents who need improved job opportunities or economic security or decent housing, problems that would take decades and massive investments to solve. As a result, governments allocate little money for research, development, or dissemination of proven programs.

There is no doubt in my mind that we could, for example, eliminate early reading failure, using the same techniques used to eliminate diseases: research, development, practical experiments, and planful, rapid scale-up. It’s all a question of resources, political leadership, collaboration among many critical agencies and individuals, and a total commitment to getting the job done. The year reading failure drops to near zero nationwide, perhaps education will make the Washington Post list of “50 Good Things That Happened in 2050.”

On Replicability: Why We Don’t Celebrate Viking Day

I was recently in Oslo, Norway’s capital, and visited a wonderful museum displaying three Viking ships that had been buried with important people. The museum had all sorts of displays focused on the amazing exploits of Viking ships, always including the Viking landings in Newfoundland, about 500 years before Columbus. Since the 1960s, most people have known that Vikings, not Columbus, were the first Europeans to land in America. So why do we celebrate Columbus Day, not Viking Day?

Given the bloodthirsty actions of Columbus, easily rivaling those of the Vikings, we surely don’t prefer one to the other based on their charming personalities. Instead, we celebrate Columbus Day because what Columbus did was far more important. The Vikings knew how to get back to Newfoundland, but they were secretive about it. Columbus was eager to publicize and repeat his discovery. It was this focus on replication that opened the door to regular exchanges. The Vikings brought back salted cod. Columbus brought back a new world.

In educational research, academics often imagine that if they establish new theories or demonstrate new methods on a small scale, and then publish their results in reputable journals, their job is done. Call this the Viking model: they got what they wanted (promotions or salt cod), and who cares if ordinary people found out about it? Even if the Vikings had published their findings in the Viking Journal of Exploration, this would have had roughly the same effect as educational researchers publishing in their own research journals.

Columbus, in contrast, told everyone about his voyages, and very publicly repeated and extended them. His brutal leadership ended with him being sent back to Spain in chains, but his discoveries had resounding impacts that long outlived him.

Educational researchers only want to do good, but they are unlikely to have any impact at all unless they can make their ideas useful to educators. Many educational researchers would love to make their ideas into replicable programs, evaluate these programs in schools, and if they are found to be effective, disseminate them broadly. However, resources for the early stages of development and research are scarce. Yes, the Institute of Education Sciences (IES) and Education Innovation Research (EIR) fund a lot of development projects, and Small Business Innovation Research (SBIR) provides small grants for this purpose to for-profit companies. Yet these funders support only a tiny proportion of the proposals they receive. In England, the Education Endowment Foundation (EEF) spends a lot on randomized evaluations of promising programs, but very little on development or early-stage research. Innovations that are funded by government or other funding very rarely end up being evaluated in large experiments, fewer still are found to be effective, and vanishingly few eventually enter widespread use. The exceptions are generally programs created by large for-profit companies, large and entrepreneurial non-profits, or other entities with proven capacity to develop, evaluate, support, and disseminate programs at scale. Even the most brilliant developers and researchers rarely have the interest, time, capital, business expertise, or infrastructure to nurture effective programs through all the steps necessary to bring a practical and effective program to market. As a result, most educational products introduced at scale to schools come from commercial publishers or software companies, who have the capital and expertise to create and disseminate educational programs, but serve a market that primarily wants attractive, inexpensive, easy-to-use materials, software, and professional development, and is not (yet) willing to pay for programs proven to be effective. I discussed this problem in a recent blog on technology, but the same dynamics apply to all innovations, tech and non-tech alike.

How Government Can Promote Proven, Replicable Programs

There is an old saying that Columbus personified the spirit of research. He didn’t know where he was going, he didn’t know where he was when he got there, and he did it all on government funding. The relevant part of this is the government funding. In Columbus’ time, only royalty could afford to support his voyage, and his grant from Queen Isabella was essential to his success. Yet Isabella was not interested in pure research. She was hoping that Columbus might open rich trade routes to the (east) Indies or China, or might find gold or silver, or might acquire valuable new lands for the crown (all of these things did eventually happen). Educational research, development, and dissemination face a similar situation. Because education is virtually a government monopoly, only government is capable of sustained, sizable funding of research, development, and dissemination, and only the U.S. government has the acknowledged responsibility to improve outcomes for the 50 million American children ages 4-18 in its care. So what can government do to accelerate the research-development-dissemination process?

  1. Contract with “seed bed” organizations capable of identifying and supporting innovators with ideas likely to make a difference in student learning. These organizations might be rewarded, in part, based on the number of proven programs they are able to help create, support, and (if effective) ultimately disseminate.
  2. Contract with independent third-party evaluators capable of doing rigorous evaluations of promising programs. These organizations would evaluate promising programs from any source, not just from seed bed companies, as they do now in IES, EIR, and EEF grants.
  3. Provide funding for innovators with demonstrated capacity to create programs likely to be effective and funding to disseminate them if they are proven effective. Developers may also contract with “seed bed” organizations for help with development and dissemination.
  4. Provide information and incentive funding to schools to encourage them to adopt proven programs, as described in a recent blog on technology.  Incentives should be available on a competitive basis to a broad set of schools, such as all Title I schools, to engage many schools in adoption of proven programs.

Evidence-based reform in education has made considerable progress in the past 15 years, both in finding positive examples that are in use today and in finding out what is not likely to make substantial differences. It is time for this movement to go beyond its early achievements to enter a new phase of professionalism, in which collaborations among developers, researchers, and disseminators can sustain a much faster and more reliable process of research, development, and dissemination. It’s time to move beyond the Viking stage of exploration to embrace the good parts of the collaboration between Columbus and Queen Isabella that made a substantial and lasting change in the whole world.

A Powerful Hunger for Evidence-Proven Technology

I recently saw a 1954 video of B. F. Skinner showing off a classroom full of eager students using teaching machines. In it, Skinner gave all the usual reasons that teaching machines were soon going to be far superior to ordinary teaching: They were scientifically made to enable students to experience constant success in small steps. They were adapted to students’ needs, so fast students did not need to wait for their slower classmates, and the slower classmates could have the time to solidify their understanding, rather than being whisked from one half-learned topic to the next, never getting a chance to master anything and therefore sinking into greater and greater failure.

Here it is 65 years later and “teaching machines,” now called computer-assisted instruction, are ubiquitous. But are they effective? Computers are certainly effective at teaching students to use technology, but can they teach the core curriculum of elementary or secondary schools? In a series of reviews in the Best Evidence Encyclopedia (BEE; www.bestevidence.org), my colleagues and I have reviewed research on the impacts of technology-infused methods on reading, mathematics, and science, in elementary and secondary schools. Here is a quick summary of our findings:

Mean Effect Sizes for Technology-Based Programs in Recent Reviews

| Review | Topic | No. of Studies | Mean Effect Size |
| --- | --- | --- | --- |
| Inns et al., in preparation | Elementary Reading | 23 | +0.09 |
| Inns et al., 2019 | Struggling Readers | 6 | +0.06 |
| Baye et al., 2019 | Secondary Reading | 23 | -0.01 |
| Pellegrini et al., 2019 | Elementary Mathematics | 14 | +0.06 |

If you prefer “months of learning,” these are all about one month, except for secondary reading, which is zero. A study-weighted average across these reviews is an effect size of +0.05. That’s not nothing, but it’s not much. Nothing at all like what Skinner and countless other theorists and advocates have been promising for the past 65 years. I think that even the most enthusiastic fans of technology use in education are beginning to recognize that while technology may be useful in improving achievement on traditional learning outcomes, it has not yet had a revolutionary impact on learning of reading or mathematics.
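For readers who want to check the arithmetic, here is a minimal sketch of how a study-weighted average can be computed from the table above. It assumes that “study-weighted” simply means weighting each review’s mean effect size by its number of studies; the reviews themselves may have used a different weighting scheme. The names `reviews` and `study_weighted_mean` are illustrative only.

```python
# Study counts and mean effect sizes copied from the table above.
# Assumption: each review is weighted by its number of studies.
reviews = [
    ("Elementary Reading", 23, 0.09),
    ("Struggling Readers", 6, 0.06),
    ("Secondary Reading", 23, -0.01),
    ("Elementary Mathematics", 14, 0.06),
]


def study_weighted_mean(rows):
    """Return the mean effect size, weighting each row by its study count."""
    total_studies = sum(n for _, n, _ in rows)
    weighted_sum = sum(n * effect_size for _, n, effect_size in rows)
    return weighted_sum / total_studies


mean_es = study_weighted_mean(reviews)
print(f"{mean_es:+.2f}")  # prints +0.05
```

Under this assumption, the 66 studies give a weighted sum of 3.04, or about +0.046 per study, which rounds to the +0.05 reported above.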

How can we boost the impact of technology in education?

Whatever you think the effects of technology-based education might be for typical school outcomes, no one could deny that it would be a good thing if that impact were larger than it is today. How could government, the educational technology industry, researchers in and out of ed tech, and practicing educators work together to make technology applications more effective than they are now?

In order to understand how to proceed, it is important to acknowledge a serious problem in the world of ed tech today. Educational technology is usually developed by commercial companies. Like all commercial companies, they must serve their market. Unfortunately, the market for ed tech products is not terribly interested in the evidence supporting technology-based programs. Instead, buyers tend to pay attention to sales reps or marketing, or they seek opinions from their friends and colleagues, rather than looking at evidence. Technology decision makers often value attractiveness, ease of use, low cost, and current trends or fads, over evidence (see Morrison, Ross & Cheung, 2019, for documentation of these choice strategies).

Technology providers are not uncaring people, and they want their products to truly improve outcomes for children. However, they know that if they put a lot of money into developing and researching an innovative approach to education that happens to use technology, and their method requires a lot of professional development to produce substantially positive effects, their programs might be considered too expensive, and less expensive products that ask less of teachers and other educators would dominate the sector. These problems resemble those faced by textbook publishers, who similarly may have great ideas to increase the effectiveness of their textbooks or to add components that require professional development. Textbook designers are prisoners of their markets just as technology developers are.

The solution, I would propose, requires interventions by government designed to nudge education markets toward use of evidence. Government (federal, state, and local) has a real interest in improving outcomes of education. So how could government facilitate the use of technology-based approaches that are known to enhance student achievement more than those that exist today?

How government could promote use of proven technology approaches

Government could lead the revolution in educational technology that market-driven technology developers cannot lead on their own. It could do this by emphasizing two main strategies: providing encouragement and incentives to motivate schools, districts, and states to use programs proven effective in rigorous research, and providing funding to assist technology developers of all kinds (e.g., for-profit companies, non-profits, or universities) in the development, evaluation, and dissemination of proven technology-based programs.

Encouraging and incentivizing use of proven technology-based programs

The most important thing government must do to expand the use of proven technology-based approaches (as well as non-technology approaches) is to build a powerful hunger for them among educators, parents, and the public at large. Yes, I realize that this sounds backward; shouldn’t government sponsor development, research, and dissemination of proven programs first? Yes it should, and I’ll address this topic in a moment. Of course we need proven programs. No one will clamor for an empty box. But today, many proven programs already exist, and the bigger problem is getting them (and many others to come) enthusiastically adopted by schools. In fact, we must eventually get to the point where educational leaders value not only individual programs supported by research, but value research itself. That is, when they start looking for technology-based programs, their first step would be to find out what programs are proven to work, rather than selecting programs in the usual way and only then trying to find evidence to support the choice they have already made.

Government at any level could support such a process, but the most likely leader in this would be the federal government. It could provide incentives to schools that select and implement proven programs, and build on these incentives with multifaceted outreach efforts to build hype around proven approaches and around the idea that approaches should be proven.

A good example of what I have in mind was the Comprehensive School Reform (CSR) grants of the late 1990s. Schools that adopted whole-school reform models that met certain requirements could receive grants of up to $50,000 per year for three years. By the end of CSR, about 1000 schools got grants in a competitive process, but CSR programs were used in an estimated 6000 schools nationwide. In other words, the hype generated by the CSR grants process led many schools that never got a grant to find other resources to adopt these whole school programs. I should note that only a few of the adopted programs had evidence of effectiveness; in CSR, the core idea was whole-school reform, not evidence (though some had good evidence of effectiveness). But a process like CSR, with highly visible grants and active support from government, illustrates a process that built a powerful hunger for whole-school reform, which could work just as well, I think, if applied to building a powerful hunger for proven technology-based programs and other proven approaches.

“Wait a minute,” I can hear you saying. “Didn’t the ESSA evidence standards already do this?”

This was indeed the intention of ESSA, which established “strong,” “moderate,” and “promising” levels of evidence (as well as lower categories). ESSA has been a great first step in building interest in evidence. However, the only schools that could obtain additional funding for selecting proven programs were among the lowest-achieving schools in the country, so ordinary Title I schools, not to mention non-Title I schools, were not much affected. CSR gave extra points to high-poverty schools, but a much wider variety of schools could get into that game. There is a big difference between creating interest in evidence, which ESSA has definitely done, and creating a powerful hunger for proven programs. ESSA was passed four years ago, and it is only now beginning to build knowledge and enthusiasm among schools.

Building many more proven technology-based programs

Clearly, we need many more proven technology-based programs. In our Evidence for ESSA website (www.evidenceforessa.org), we list 113 reading and mathematics programs that meet any of the three top ESSA standards. Only 28 of these (18 reading, 10 math) have a major technology component. This is a good start, but we need a lot more proven technology-based programs. To get them, government needs to continue its productive Institute of Education Sciences (IES) and Education Innovation Research (EIR) initiatives. For for-profit companies, Small Business Innovation Research (SBIR) plays an important role in early development of technology solutions. However, the pace of development and research focused on practical programs for schools needs to accelerate, and government needs to learn from its own successes and failures to increase the success rate of its investments.

Communicating “what works”

There remains an important need to provide school leaders with easy-to-interpret information on the evidence base for all existing programs schools might select. The What Works Clearinghouse and our Evidence for ESSA website do this most comprehensively, but these and other resources need help to keep up with the rapid expansion of evidence that has appeared in the past 10 years.

Technology-based education can still produce the outcomes Skinner promised in his 1954 video, the ones we have all been eagerly awaiting ever since. However, technology developers and researchers need more help from government to build an eager market not just for technology, but for proven achievement outcomes produced by technology.

References

Baye, A., Lake, C., Inns, A., & Slavin, R. (2019). Effective reading programs for secondary students. Reading Research Quarterly, 54 (2), 133-166.

Inns, A., Lake, C., Pellegrini, M., & Slavin, R. (2019). A synthesis of quantitative research on programs for struggling readers in elementary schools. Available at www.bestevidence.org. Manuscript submitted for publication.

Inns, A., Lake, C., Pellegrini, M., & Slavin, R. (in preparation). A synthesis of quantitative research on elementary reading. Baltimore, MD: Center for Research and Reform in Education, Johns Hopkins University.

Morrison, J. R., Ross, S.M., & Cheung, A.C.K. (2019). From the market to the classroom: How ed-tech products are procured by school districts interacting with vendors. Educational Technology Research and Development, 67 (2), 389-421.

Pellegrini, M., Inns, A., Lake, C., & Slavin, R. (2019). Effective programs in elementary mathematics: A best-evidence synthesis. Available at www.bestevidence.org. Manuscript submitted for publication.

Do School Districts Really Have Difficulty Meeting ESSA Evidence Standards?

The Center on Education Policy (CEP) recently released a report on how school districts are responding to the Every Student Succeeds Act (ESSA) requirement that schools seeking school improvement grants select programs that meet ESSA’s strong, moderate, or promising standards of evidence. Education Week ran a story on the CEP report.

The report noted that many states, districts, and schools are taking the evidence requirements seriously, and are looking at websites and consulting with researchers to help them identify programs that meet the standards. This is all to the good.

However, the report also notes continuing problems districts and schools are having finding out “what works.” Two particular problems were cited. One was that districts and schools were not equipped to review research to find out what works. The other was that rural districts and schools found few programs proven effective in rural schools.

I find these concerns astounding. The same concerns were expressed when ESSA was first passed, in 2015. But that was almost four years ago. Since 2015, the What Works Clearinghouse has added information to help schools identify programs that meet the top two ESSA evidence categories, strong and moderate. Our own Evidence for ESSA, launched in February, 2017, has up-to-date information on virtually all PK-12 reading and math programs currently in dissemination. Among hundreds of programs examined, 113 meet ESSA standards for strong, moderate, or promising evidence of effectiveness. WWC, Evidence for ESSA, and other sources are available online at no cost. The contents of the entire Evidence for ESSA website were imported into Ohio’s own website on this topic, and dozens of states, perhaps all of them, have informed their districts and schools about these sources.

The idea that districts and schools could not find information on proven programs if they wanted to do so is difficult to believe, especially among schools eligible for school improvement grants. Such schools, and the districts in which they are located, write a lot of grant proposals for federal and state funding. The application forms for school improvement grants always explain the evidence requirements, because that is the law. Someone in every state involved with federal funding knows about the WWC and Evidence for ESSA websites. More than 90,000 unique users have used Evidence for ESSA, and more than 800 more sign on each week.

As to rural schools, it is true that many studies of educational programs have taken place in urban areas. However, 47 of the 113 programs qualified by Evidence for ESSA were validated in at least one rural study, or a study including a large enough rural sample to enable researchers to separately report program impacts for rural students. Also, almost all widely disseminated programs have been used in many rural schools. So rural districts and schools that care about evidence can find programs that have been evaluated in rural locations, or at least that were evaluated in urban or suburban schools but widely disseminated in rural schools.

Also, it is important to note that if a program was successfully evaluated only in urban or suburban schools, the program still meets the ESSA evidence standards. If no studies of a given outcome were done in rural locations, a rural school in need of better outcomes would, in effect, be choosing between a program proven to work somewhere and probably used in dissemination in rural schools, and a program not proven to work anywhere. Every school and district has to make the best choices for its kids, but if I were a rural superintendent or principal, I’d read up on proven programs, and then go visit some nearby rural schools using one of them. Wouldn’t you?

I have no reason to suspect that the CEP survey is incorrect. There are many indications that district and school leaders often do feel that the ESSA evidence rules are too difficult to meet. So what is really going on?

My guess is that there are many district and school leaders who do not want to know about evidence on proven programs. For example, they may have longstanding, positive relationships with representatives of publishers or software developers, or they may be comfortable and happy with the materials and services they are already using, evidence-proven or not. Publishers and software developers whose products lack evidence of effectiveness that would pass muster with WWC or Evidence for ESSA may push hard on state and district officials, put forward dubious claims for evidence (such as studies with no control groups), and do their best to get by in a system that increasingly demands evidence that they lack. In my experience, district and state officials often complain about having inadequate staff to review evidence of effectiveness, but their concern may be less about finding out what works than about defending themselves from publishers, software developers, or current district or school users of programs, who maintain that they have been unfairly rated by WWC, Evidence for ESSA, or other reviews. State and district leaders who stand up to this pressure may have to spend a lot of time reviewing evidence or hearing arguments.

On the plus side, at the same time that publishers and software producers may be seeking recognition for their current products, many are also sponsoring evaluations of the products they feel are most likely to perform well in rigorous evaluations. Some may be creating new programs that resemble programs that have met evidence standards. If the federal ESSA law continues to demand evidence for certain federal funding purposes, or even expands this requirement to additional parts of federal grant-making, then over time the ESSA law will have its desired effect, rewarding the creation and evaluation of programs that do meet standards by making it easier to disseminate such programs. The difficulties the evidence movement is experiencing are likely to diminish over time as more proven programs appear, and as federal, state, district, and school leaders get comfortable with evidence.

Evidence-based reform was always going to be difficult, because of the amount of change it entails and the stakes involved. But sooner or later, it is the right thing to do, and leaders who insist on evidence will see increasing levels of learning among their students, at minimal cost beyond what they already spend on untested or ineffective approaches. Medicine went through a similar transition in 1962, when the U.S. Congress first required that medicines be rigorously evaluated for effectiveness and safety. At first, many leaders in the medical profession resisted the changes, but after a while, they came to insist on them. The key is political leadership willing to support the evidence requirement strongly and permanently, so that educators and vendors alike will see that the best way forward is to embrace evidence and make it work for kids.

Photo courtesy of Allison Shelley/The Verbatim Agency for American Education: Images of Teachers and Students in Action

This blog was developed with support from the Laura and John Arnold Foundation. The views expressed here do not necessarily reflect those of the Foundation.

Evidence and Policy: If You Want to Make a Silk Purse, Why Not Start With…Silk?

Everyone knows that you can’t make a silk purse out of a sow’s ear. This proverb goes back to the 1500s. Yet in education policy, we are constantly trying to achieve stellar results using school and classroom programs of unknown effectiveness, or even those known to be ineffective, even though proven effective programs are readily available.

Note that I am not criticizing teachers. They do the best they can with the tools they have. What I am concerned about is the quality of those tools, the programs, and professional development teachers receive to help them succeed with their children.

An excellent case in point was School Improvement Grants (SIG), a major provision of No Child Left Behind (NCLB). SIG provided major grants to schools scoring in the lowest 5% of their states. For most of its existence, SIG required schools seeking funding to choose among four models. Two of these, school closure and charterization, were rarely selected. Instead, most SIG schools selected either “turnaround” (replacing the principal and at least 50% of the staff), or the most popular, “transformation” (replacing the principal, using data to inform instruction, lengthening the school day or year, and evaluating teachers based on the achievement growth of their students). However, a major, large-scale evaluation of SIG by Mathematica showed no achievement benefits for schools that received SIG grants, compared to similar schools that did not. Ultimately, SIG spent more than $7 billion, an amount that we in Baltimore, at least, consider to be a lot of money. The tragedy, however, is not just the waste of so much money, but the dashing of so many hopes for meaningful improvement.

This is where the silk purse/sow’s ear analogy comes in. Each of the options among which SIG schools had to choose was composed of components that either lacked evidence of effectiveness or actually had evidence of ineffectiveness. If the components of each option are not known to be effective, then why would anyone expect a combination of them to be effective?

Evidence on school closure has found that this strategy diminishes student achievement for a few years, after which student performance returns to where it was before. Research on charter schools by CREDO (2013) has found an average effect size of zero for charters. The exception is “no-excuses” charters, such as KIPP and Success Academies, but these charters only accept students whose parents volunteer, not whole failing schools. Turnaround and transformation schools both require a change of principal, which introduces chaos and, as far as I know, has never been found to improve achievement. The same is true of replacing at least 50% of the teachers. Lots of chaos, no evidence of effectiveness. The other required elements of the popular “transformation” model have been found to have either no impact (e.g., benchmark assessments to inform teachers about progress; Inns et al., 2019), or small effects (e.g., lengthening the school day or year; Figlio et al., 2018). Most importantly, to my knowledge, no one ever did a randomized evaluation of the entire transformation model, with all components included. We did not find out what the joint effect was until the Mathematica study. Guess what? Sewing together swatches of sows’ ears did not produce a silk purse. With a tiny proportion of $7 billion, the Department of Education could have identified and tested out numerous well-researched, replicable programs and then offered SIG schools a choice among the ones that worked best. A selection of silk purses, all made from 100% pure silk. Doesn’t that sound like a better idea?

In later blogs I’ll say more about how the federal government could ensure the success of educational initiatives by ensuring that schools have access to federal resources to adopt and implement proven programs designed to accomplish the goals of the legislation.

References

Figlio, D., Holden, K. L., & Ozek, U. (2018). Do students benefit from longer school days? Regression discontinuity evidence from Florida’s additional hour of literacy instruction. Economics of Education Review, 67, 171-183.

Inns, A., Lake, C., Pellegrini, M., & Slavin, R. (2019). A synthesis of quantitative research on programs for struggling readers in elementary schools. Available at www.bestevidence.org. Manuscript submitted for publication.

This blog was developed with support from the Laura and John Arnold Foundation. The views expressed here do not necessarily reflect those of the Foundation.

Superman and Statistics

In the 1978 movie “Superman,” Lois Lane, star journalist, crash-lands in a helicopter on top of a 50-story skyscraper.   The helicopter is hanging by a strut to the edge of the roof, and Lois is hanging on to a microphone cord.  Finally, the cord breaks, and Lois falls 45 floors before (of course) she is swooped up by Superman, who flies her back to the roof and sets her down gently. Then he says to her:

“I hope this doesn’t put you off of flying. Statistically speaking, it is the safest form of travel.”

She faints.

Don’t let the superhero thing fool you: The “S” is for “statistics.”

I’ve often had the very same problem whenever I do public speaking.  As soon as I mention statistics, some of the audience faints dead away. Or perhaps they are falling asleep. But either way, saying the word “statistics” is not usually a good way to make friends and influence people.

The fact is, most people don’t like statistics.  Or more accurately, people don’t like statistics except when the statistical findings agree with their prejudices.  At an IES meeting several years ago, a well-respected superintendent was invited to speak to what is perhaps the nerdiest, most statistically-minded group in all of education, except for an SREE conference.  He actually said, without the slightest indication of humor or irony, that “GOOD research is that which confirms what I have always believed.  BAD research is that which disagrees with what I have always believed.”  I’d guess that the great majority of superintendents and other educational leaders would agree, even if few would say so out loud to an IES meeting.

If educational leaders only attend to statistics that confirm their prior beliefs, one might argue that, well, at least they do attend to SOME research.  But research in an applied field like education is of value only if it leads to positive changes in practice.  If influential educators only respect research that confirms their previous beliefs, then they never change their practices or policies because of research, and policies and practices stay the same forever, or change only due to politics, marketing, and fads. Which is exactly how most change does in fact happen in education.  If you wonder why educational outcomes change so slowly, if at all, you need look no further than this.

Why is it that educators pay so little attention to research, whatever its outcomes, much in contrast to the situation in many other fields?  Some people argue that, unlike medicine, where doctors are well trained in research, educators lack such training.  Yet agriculture makes far more practical use of evidence than education does, and most farmers, while outstanding in their fields, are not known for their research savvy.

Farmers are, however, very savvy business owners, and they can clearly see that their financial success depends on using seeds, stock, methods, fertilizers, and insecticides proven to be effective, cost-effective, and sustainable.  Similarly, research plays a crucial role in technology, engineering, materials science, and every applied field in which better methods, with proven outcomes, lead to increased profits.

So one major reason for limited use of research in education is that adopting proven methods in education rarely leads to enhanced profit.  Even in parts of the educational enterprise where profit is involved, economic success still depends far more on politics, marketing, and fads than on evidence. Outcomes of adopting proven programs or practices may not have an obvious impact on overall school outcomes because achievement is invariably tangled up with factors such as social class of children and schools’ abilities to attract skilled teachers and principals.  Ask parents whether they would rather have their child go to a school in which all students have educated, upper-middle class parents, or to a school that uses proven instructional strategies in every subject and grade level.  The problem is that there are only so many educated, upper-middle class parents to go around, so schools and parents often focus on getting the best possible demographics in their school rather than on adopting proven teaching methods.

How can education begin to make the rapid, irreversible improvements characteristic of agriculture, technology, and medicine?  The answer has to take into account the fundamental fact that education is a government monopoly.  I’m not arguing whether or not this is a good thing, but it is certain to be true for many years, perhaps forever.  The parts of education that are not part of government are private schools, and these are very few in number (charter schools are funded by government, of course).

Because government funds nearly all schools, it has both the responsibility and the financial capacity to do whatever is feasible to make schools as effective as it possibly can.  This is true of all levels of government, federal, state, and local.  Because it is in charge of all federal research funding, the federal government is the most logical organization to lead any efforts to increase use of proven programs and practices in education, but forward-looking state and local government could also play a major role if they chose to do so.

Government can and must take on the role that profit plays in other research-focused fields, such as agriculture, medicine, and engineering.   As I’ve argued many times, government should use national funding to incentivize schools to adopt proven programs.  For example, the federal government could provide funding to schools to enable them to pay the costs of adopting programs found to be effective in rigorous research.  Under ESSA, it is already doing this, but right now the main focus is only on Title I school improvement grants.   These go to schools that are among the lowest performers in their states.  School improvement is a good place to start, but it affects a modest number of extremely disadvantaged schools.  Such schools do need substantial funding and expertise to make the substantial gains they are asked to make, but they are so unlike the majority of Title I schools that they are not sufficient examples of what evidence-based reform could achieve.  Making all Title I schools eligible for incentive funding to implement proven programs, or at least working toward this goal over time, would arouse the interest and enthusiasm of a much greater set of schools, virtually all of which need major changes in practices to reach national standards.

To make this policy work, the federal government would need to add considerably to the funding it provides for educational research and development, and it would need to rigorously evaluate programs that show the greatest promise to make large, pragmatically important differences in schools’ outcomes in key areas, such as reading, mathematics, science, and English for English learners.  One way to do this cost-effectively would be to allow districts (or consortia of districts) to put forward pairs of matched schools for potential funding.   Districts or consortia awarded grants might then be evaluated by federal contractors, who would randomly assign one school in each pair to receive the program, while the pair members not selected would serve as a control group.  In this way, programs that had been found effective in initial research might have their evaluations replicated many times, at a very low evaluation cost.  This pair evaluation design could greatly increase the number of schools using proven programs, and could add substantially to the set of programs known to be effective.  This design could also give many more districts experience with top-quality experimental research, building support for the idea that research is of value to educators and students.
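For readers who like to see the mechanics, here is a minimal sketch in Python of how the pair design described above might work. It is only an illustration under stated assumptions: the function names, the school names, and the outcome scores are all hypothetical, and a real evaluation would need a proper statistical analysis rather than a simple average of differences.

```python
import random


def assign_pairs(pairs, seed=None):
    """Randomly pick one school in each matched pair to receive the program.

    `pairs` is a list of (school, school) tuples put forward by a district.
    The school not picked in each pair serves as the control.
    """
    rng = random.Random(seed)  # a fixed seed makes the assignment reproducible
    assignments = []
    for school_a, school_b in pairs:
        treated = rng.choice((school_a, school_b))
        control = school_b if treated == school_a else school_a
        assignments.append({"program": treated, "control": control})
    return assignments


def mean_pair_difference(outcomes, assignments):
    """Average of (program school score - control school score) across pairs.

    `outcomes` maps each school name to a later achievement score.
    This is a descriptive summary only, not a significance test.
    """
    diffs = [outcomes[a["program"]] - outcomes[a["control"]] for a in assignments]
    return sum(diffs) / len(diffs)


if __name__ == "__main__":
    # Hypothetical matched pairs of schools (names are made up).
    matched = [("Elm", "Oak"), ("Pine", "Cedar"), ("Maple", "Birch")]
    for assignment in assign_pairs(matched, seed=2019):
        print(assignment)
```

The point of the coin flip within each pair is that neither the district nor the developer chooses which school gets the program, so later differences between pair members cannot be explained by who volunteered.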

Getting back to Superman and Lois Lane, it is only natural to expect that Lois might be reluctant to get on another helicopter anytime soon, no matter what the evidence says.  However, when we are making decisions on behalf of children, it’s not enough to just pay attention to our own personal experience.  Listen to Superman.  The evidence matters.

This blog was developed with support from the Laura and John Arnold Foundation. The views expressed here do not necessarily reflect those of the Foundation.