Evidence-Based Reform and the Multi-Academy Trust

Recently, I was in England to visit Success for All (SFA) schools there. I saw two of the best SFA schools I’ve ever seen anywhere, Applegarth Primary School in Croydon, south of London, and Houldsworth Primary School in Sussex, southeast of London. Both are very high-poverty schools with histories of poor achievement, violence, and high staff turnover. Applegarth mostly serves the children of African immigrants, and Houldsworth mostly serves White students from very poor homes. Yet I saw every class in each school and in each one, children were highly engaged, excited, and learning like crazy. Both schools were once in the lowest one percent of achievement in England, yet both are now performing at or above national norms.

In my travels, I often see outstanding Success for All schools. However, in this case I learned about an important set of policies that goes beyond Success for All, but could have implications for evidence-based reform more broadly.

Both Applegarth and Houldsworth are in multi-academy trusts (MATs), the STEP Trust and the Unity Trust, respectively. Academies are much like charter schools in the U.S., and multi-academy trusts are organizations that run more than one academy. Academies are far more common in the U.K. than the U.S., constituting 22% of primary (i.e., elementary) schools and 68% of secondary schools. There are 1,170 multi-academy trusts, managing more than 5,000 of Britain’s 32,000 schools, or 16%. Multi-academy trusts can operate within a single local authority (school district) (like Success Academies in New York City) or may operate in many local authorities. Quite commonly, poorly-performing schools in a local authority, or stand-alone academies, may be offered to a successful and capable multi-academy trust, and these hand-overs explain much of the growth in multi-academy trusts in recent years.

What I saw in the STEP and Unity Trusts was something extraordinary. In each case, the exceptional schools I saw were serving as lead schools for the dissemination of Success for All. Staff in these schools had an explicit responsibility to train and mentor future principals, facilitators, and teachers, who spend a year at the lead school learning about SFA and their role in it, and then take on their roles in a new SFA school elsewhere in the multi-academy trust. Over time, there are multiple lead schools, each of which takes responsibility for mentoring new SFA schools other than its own. This cascading dissemination strategy, carried out in close partnership with the national SFA-UK non-profit organization, is likely to produce exceptional implementations.

I’m sure there must be problems with multi-academy trusts that I don’t know about, and in the absence of data on MATs throughout Britain, I would not take a position on them in general. But based on my limited experience with the STEP and Unity Trusts, this policy has particular potential as a means of disseminating very effective forms of programs proven effective in rigorous research.

First, multi-academy trusts have the opportunity and motivation to establish themselves as effective. Ordinary U.S. districts want to do well, of course, but they do not grow (or shrink) because of their success (or lack of it). In contrast, a multi-academy trust in the U.K. is more likely to seek out proven programs and implement them with care and competence, both to increase student success and to establish a “brand” based on their effective use of proven programs. Both STEP and Unity Trusts are building a reputation for succeeding with difficult schools using methods known to be effective. Using cascading professional development and mentoring from established schools to new ones, a multi-academy trust can build effectiveness and reputation.

Although the schools I saw were using Success for All, any multi-academy trust could use any proven program or programs to create positive outcomes and expand its reach and influence. As other multi-academy trusts see what the pioneers are accomplishing, they may decide to emulate them. One major advantage possessed by multi-academy trusts is that much in contrast to U.S. school districts, especially large, urban ones, multi-academy trusts are likely to remain under consistent leadership for many years. Leaders of multi-academy trusts, and their staff and supporters, are likely to have time to transform practices gradually over time, knowing that they have the stable leadership needed for long-term change.

There is no magic in school governance arrangements, and no guarantee that many multi-academy trusts will use the available opportunities to implement and perfect proven strategies. Yet by their nature, multi-academy trusts have the opportunity to make a substantial difference in the education provided to all students, especially those serving disadvantaged students. I look forward to watching plans unfold in the STEP and Unity Trusts, and to learning more about how the academy movement in the U.K. might provide a path toward widespread and thoughtful use of proven programs, benefiting very large numbers of students. And I’d love to see more U.S. charter networks and traditional school districts use cascading replication to scale up proven, whole-school approaches likely to improve outcomes in disadvantaged schools.

Photo credit: Kindermel [CC BY-SA 4.0 (https://creativecommons.org/licenses/by-sa/4.0)]

This blog was developed with support from the Laura and John Arnold Foundation. The views expressed here do not necessarily reflect those of the Foundation.

Achieving Audacious Goals in Education: Amundsen and the Fram

On a recent trip to Norway, I visited the Fram Museum in Oslo. The Fram was Roald Amundsen’s ship, used to transport a small crew to the South Pole in 1911. The museum is built around the Fram itself, and visitors can go aboard this amazing ship, surrounded by information and displays about polar exploration. What was most impressive about the Fram is the meticulous attention to detail in every aspect of the expedition. Amundsen had undertaken other trips to the polar seas to prepare for his trip, and had carefully studied the experiences of other polar explorers. The ship’s hull was specially built to withstand crushing from the shifting of polar ice. He carried many huskies to pull sleds over the ice, and trained them to work in teams. Every possible problem was carefully anticipated in light of experience, and exact amounts of food for men and dogs were allocated and stored. Amundsen said that forgetting “a single trouser button” could doom the effort. As it unfolded, everything worked as anticipated, and all the men and dogs returned safely after reaching the South Pole.

From At the South Pole by Roald Amundsen, 1913 [Public domain]

The story of Amundsen and the Fram is an illustration of how to overcome major obstacles to achieve audacious goals. I’d like to build on it to return to a topic I’ve touched on in two previous blogs. The audacious goal: Overcoming the substantial gap in elementary reading achievement between students who qualify for free lunch and those who do not, between African American and White students, and between Hispanic and non-Hispanic students. According to the National Assessment of Educational Progress (NAEP), each of these gaps is about one half of a standard deviation, also known as an effect size of +0.50. This is a very large gap, but it has been overcome in a very small number of intensive programs. These programs were able to increase the achievement of disadvantaged students by an effect size of more than +0.50, but few were able to reproduce these gains under normal circumstances. Our goal is to enable thousands of ordinary schools serving disadvantaged students to achieve such outcomes, at a cost of no more than 5% beyond ordinary per-pupil costs.
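For readers who want to see what "one half of a standard deviation" means in practice, here is a minimal sketch of an effect size computed as a standardized mean difference. The pooled-standard-deviation formula is one common convention and is an assumption here, not necessarily the exact method behind the NAEP figures. The function name `effect_size` and the score lists are hypothetical, invented only for illustration.

```python
from math import sqrt
from statistics import mean, variance


def effect_size(group_a, group_b):
    """Standardized mean difference: (mean_a - mean_b) / pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Hypothetical test scores (not NAEP data): each group has a standard
# deviation of 10, and the group means differ by 5 points.
higher_scoring = [50, 62, 64, 66, 78]
lower_scoring = [45, 57, 59, 61, 73]

gap = effect_size(higher_scoring, lower_scoring)
print(f"Effect size of the gap: {gap:+.2f}")
```

In this made-up example a 5-point difference on a test with a standard deviation of 10 corresponds to an effect size of +0.50, the size of the gaps described above.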

Educational Reform and Audacious Goals

Researchers have long been creating and evaluating many different approaches to improving reading achievement. This is necessary in the research and development process to find “what works” and build up from there. However, each individual program or practice has a modest effect on key outcomes, and we rarely combine proven programs to achieve an effect large enough to, for example, overcome the achievement gap. This is not what Amundsen, or the Wright Brothers, or the worldwide team that achieved eradication of smallpox did. Instead, they set audacious goals and kept at them systematically, using what works, until they were achieved.

I would argue that we should and could do the same in education. The reading achievement gap is the largest problem of educational practice and policy in the U.S. We need to use everything we know how to do to solve it. This means stating in advance that our goal is to find strategies capable of eliminating reading gaps at scale, and refusing to declare victory until this goal is achieved. We need to establish that the goal can be achieved, by ordinary teachers and principals in ordinary schools serving disadvantaged students.

Tutoring Our Way to the Goal

In a previous blog I proposed that the goal of +0.50 could be reached by providing disadvantaged, low-achieving students tutoring in small groups or, when necessary, one-to-one. As I argued there and elsewhere, there is no reading intervention as effective as tutoring. Recent reviews of research have found that well-qualified teaching assistants using proven methods can achieve outcomes as good as those achieved by certified teachers working as tutors, thereby making tutoring much less expensive and more replicable (Inns et al., 2019). Providing schools with significant numbers of well-trained tutors is one likely means of reaching ES=+0.50 for disadvantaged students. Inns et al. (2019) found an average effect size of +0.38 for tutoring by teaching assistants, but several programs had effect sizes of +0.40 to +0.47. This is not +0.50, but it is within striking distance of the goal. However, each school would need multiple tutors in order to provide high-quality tutoring to most students, to extend the known positive effects of tutoring to the whole school.

Combining Intensive Tutoring With Success for All

Tutoring may be sufficient by itself, but research on tutoring has rarely used tutoring schoolwide, to benefit all students in high-poverty schools. It may be more effective to combine widespread tutoring for students who most need it with other proven strategies designed for the whole school, rather than simply extending a program designed for individuals and small groups. One logical strategy to reach the goal of +0.50 in reading might be to combine intensive tutoring with our Success for All whole-school reform model.

Success for All adds to intensive tutoring in several ways. It provides teachers with professional development on proven reading strategies, as well as cooperative learning and classroom management strategies at all levels. Strengthening core reading instruction reduces the number of children at great risk, and even for students who are receiving tutoring, it provides a setting in which students can apply and extend their skills. For students who do not need tutoring, Success for All provides acceleration. In high-poverty schools, students who are meeting reading standards are likely to still be performing below their potential, and improving instruction for all is likely to help these students excel.

Success for All was created in the late 1980s in an attempt to achieve a goal similar to the +0.50 challenge. In its first major evaluation, a matched study in six high-poverty Baltimore elementary schools, Success for All achieved a schoolwide reading effect size of at least +0.50 in grades 1-5 on individually administered reading measures. For students in the lowest 25% of the sample at pretest, the effect size averaged +0.75 (Madden et al., 1993). That experiment provided two to six certified teacher tutors per school, who worked one to one with the lowest-achieving first and second graders. The tutors supplemented a detailed reading program, which used cooperative learning, phonics, proven classroom management methods, parent involvement, frequent assessment, distributed leadership, and other elements (as Success for All still does).

An independent follow-up assessment found that the effect was maintained to the eighth grade, and also showed a halving of retentions in grade and a halving of assignments to special education, compared to the control group (Borman & Hewes, 2002). Schools using Success for All since that time have rarely been able to afford so many tutors, instead averaging one or two tutors. Many schools using SFA have not been able to afford even one tutor. Still, across 28 qualifying studies, mostly by third parties, the Success for All effect size has averaged +0.27 (Cheung et al., in press). This is impressive, but it is not +0.50. For the lowest achievers, the mean effect size was +0.62, but again, our goal is +0.50 for all disadvantaged students, not just the lowest achievers.

Over a period of years, could schools using Success for All with five or more teaching assistant tutors reach the +0.50 goal? I’m certain of it. Could we go even further, perhaps creating a similar approach for secondary schools or adding in an emphasis on mathematics? That would be the next frontier.

The Policy Importance of +0.50

If we can routinely achieve an effect size of +0.50 in reading in most Title I schools, this would provide a real challenge for policy makers. Many policy makers argue that money does not make much difference in education, or that housing, employment, and other basic economic improvements are needed before major improvements in the education of disadvantaged students will be possible. But what if it became widely known that outcomes in high-poverty schools could be reliably and substantially improved at a modest cost, compared to the outcomes? Policy makers would hopefully focus on finding ways to provide the resources needed if they could be confident in the outcomes.

As Amundsen knew, difficult goals can be attained with meticulous planning and high-quality implementation. Every element of his expedition had been tested extensively in real arctic conditions, and had been found to be effective and practical. We would propose taking a similar path to universal success in reading. Each component of a practical plan to reach an effect size of +0.50 or more must be proven to be effective in schools serving many disadvantaged students. Combining proven approaches, we can add sufficiently to the reading achievement of students in disadvantaged schools to enable them to perform as well as middle-class students do. It just takes an audacious goal and the commitment and resources to accomplish it.

References

Borman, G., & Hewes, G. (2002). Long-term effects and cost effectiveness of Success for All. Educational Evaluation and Policy Analysis, 24(2), 243-266.

Cheung, A., Xie, C., Zhang, T., & Slavin, R. E. (in press). Success for All: A quantitative synthesis of evaluations. Education Research Review.

Inns, A., Lake, C., Pellegrini, M., & Slavin, R. (2019). A synthesis of quantitative research on programs for struggling readers in elementary schools. Available at www.bestevidence.org. Manuscript submitted for publication.

Madden, N. A., Slavin, R. E., Karweit, N. L., Dolan, L., & Wasik, B. (1993). Success for All: Longitudinal effects of a schoolwide elementary restructuring program. American Educational Research Journal, 30, 123-148.

Madden, N. A., & Slavin, R. E. (2017). Evaluations of technology-assisted small-group tutoring for struggling readers. Reading & Writing Quarterly, 1-8. http://dx.doi.org/10.1080/10573569.2016.1255577

On Replicability: Why We Don’t Celebrate Viking Day

I was recently in Oslo, Norway’s capital, and visited a wonderful museum displaying three Viking ships that had been buried with important people. The museum had all sorts of displays focused on the amazing exploits of Viking ships, always including the Viking landings in Newfoundland, about 500 years before Columbus. Since the 1960s, most people have known that Vikings, not Columbus, were the first Europeans to land in America. So why do we celebrate Columbus Day, not Viking Day?

Given the bloodthirsty actions of Columbus, easily rivaling those of the Vikings, we surely don’t prefer one to the other based on their charming personalities. Instead, we celebrate Columbus Day because what Columbus did was far more important. The Vikings knew how to get back to Newfoundland, but they were secretive about it. Columbus was eager to publicize and repeat his discovery. It was this focus on replication that opened the door to regular exchanges. The Vikings brought back salted cod. Columbus brought back a new world.

In educational research, academics often imagine that if they establish new theories or demonstrate new methods on a small scale, and then publish their results in reputable journals, their job is done. Call this the Viking model: they got what they wanted (promotions or salt cod), and who cares if ordinary people found out about it? Even if the Vikings had published their findings in the Viking Journal of Exploration, this would have had roughly the same effect as educational researchers publishing in their own research journals.

Columbus, in contrast, told everyone about his voyages, and very publicly repeated and extended them. His brutal leadership ended with him being sent back to Spain in chains, but his discoveries had resounding impacts that long outlived him.

Educational researchers only want to do good, but they are unlikely to have any impact at all unless they can make their ideas useful to educators. Many educational researchers would love to make their ideas into replicable programs, evaluate these programs in schools, and if they are found to be effective, disseminate them broadly. However, resources for the early stages of development and research are scarce. Yes, the Institute of Education Sciences (IES) and Education Innovation Research (EIR) fund a lot of development projects, and Small Business Innovation Research (SBIR) provides small grants for this purpose to for-profit companies. Yet these funders support only a tiny proportion of the proposals they receive. In England, the Education Endowment Foundation (EEF) spends a lot on randomized evaluations of promising programs, but very little on development or early-stage research. Innovations that are funded by government or other funders very rarely end up being evaluated in large experiments, fewer still are found to be effective, and vanishingly few eventually enter widespread use. The exceptions are generally programs created by large for-profit companies, large and entrepreneurial non-profits, or other entities with proven capacity to develop, evaluate, support, and disseminate programs at scale. Even the most brilliant developers and researchers rarely have the interest, time, capital, business expertise, or infrastructure to nurture effective programs through all the steps necessary to bring a practical and effective program to market. As a result, most educational products introduced at scale to schools come from commercial publishers or software companies, who have the capital and expertise to create and disseminate educational programs, but serve a market that primarily wants attractive, inexpensive, easy-to-use materials, software, and professional development, and is not (yet) willing to pay for programs proven to be effective.

I discussed this problem in a recent blog on technology, but the same dynamics apply to all innovations, tech and non-tech alike.

How Government Can Promote Proven, Replicable Programs

There is an old saying that Columbus personified the spirit of research. He didn’t know where he was going, he didn’t know where he was when he got there, and he did it all on government funding. The relevant part of this is the government funding. In Columbus’ time, only royalty could afford to support his voyage, and his grant from Queen Isabella was essential to his success. Yet Isabella was not interested in pure research. She was hoping that Columbus might open rich trade routes to the (east) Indies or China, or might find gold or silver, or might acquire valuable new lands for the crown (all of these things did eventually happen). Educational research, development, and dissemination face a similar situation. Because education is virtually a government monopoly, only government is capable of sustained, sizable funding of research, development, and dissemination, and only the U.S. government has the acknowledged responsibility to improve outcomes for the 50 million American children ages 4-18 in its care. So what can government do to accelerate the research-development-dissemination process?

  1. Contract with “seed bed” organizations capable of identifying and supporting innovators with ideas likely to make a difference in student learning. These organizations might be rewarded, in part, based on the number of proven programs they are able to help create, support, and (if effective) ultimately disseminate.
  2. Contract with independent third-party evaluators capable of doing rigorous evaluations of promising programs. These organizations would evaluate promising programs from any source, not just from seed bed companies, as they do now in IES, EIR, and EEF grants.
  3. Provide funding for innovators with demonstrated capacity to create programs likely to be effective and funding to disseminate them if they are proven effective. Developers may also contract with “seed bed” organizations to help program developers succeed with development and dissemination.
  4. Provide information and incentive funding to schools to encourage them to adopt proven programs, as described in a recent blog on technology.  Incentives should be available on a competitive basis to a broad set of schools, such as all Title I schools, to engage many schools in adoption of proven programs.

Evidence-based reform in education has made considerable progress in the past 15 years, both in finding positive examples that are in use today and in finding out what is not likely to make substantial differences. It is time for this movement to go beyond its early achievements to enter a new phase of professionalism, in which collaborations among developers, researchers, and disseminators can sustain a much faster and more reliable process of research, development, and dissemination. It’s time to move beyond the Viking stage of exploration to embrace the good parts of the collaboration between Columbus and Queen Isabella that made a substantial and lasting change in the whole world.

Ensuring the Physical Health of Students: How Schools Can Play an Essential Role

           Schools have a lot to do. They are responsible for ensuring that their students develop skills and confidence in all subjects, as well as social-emotional learning, citizenship, patriotism, and much more.

            Yet schools also have a unique capability and a strong need to ensure the physical health of their students, particularly in areas of health that affect success in the schools’ traditional goals. This additional goal is especially crucial in high-poverty urban and rural schools, where traditional health services may be lacking and families often struggle to ensure their children’s health. In high-poverty schools, there are many children who will unnecessarily suffer from asthma, lack of needed eyeglasses, hearing problems, and other common ailments that can have a substantial deleterious effect on student learning.

            In partnership with health providers and parents, schools are ideally situated to solve such chronic problems as uncontrolled asthma, uncorrected vision problems, and uncorrected hearing problems. One reason this is so is that every student attends school, especially in the elementary grades, where the staff is likely to know each child and parents are most likely to have good relationships with school staff.

            Every school should have a qualified nurse every day to deal with routine health problems. It is shocking that there are no nurses, or just part-time nurses, in many high-poverty schools. However, in this blog, I am proposing a strategy that could have a substantial impact on the health problems that need constant attention but could be managed by well-trained health aides, following up on more time-limited assistance from other health professionals. The idea is that each school would have a full- or part-time Preventive Health Aide (PHA) who would work with students in need of preventive care.

            Asthma. In big cities, such as Baltimore, 20% or more of all children suffer from uncontrolled asthma. For some, this is just an occasional problem, but for others it is a serious and sometimes life-threatening disease. In Baltimore and similar cities, asthma is responsible for the largest number of emergency department visits, the largest number of hospitalizations, and the largest number of deaths from all causes for school-aged students. Asthma can also cause serious problems with attendance, leading to negative effects on learning and motivation.

            There is a very simple solution to most asthma problems. Based on a doctor’s diagnosis, a student can use an inhaler: safe, effective, and reliable if used every day. However, in high-poverty schools, the great majority of students known to have asthma do not take their medicine regularly, and they are therefore at serious risk.

            Asthma cannot be cured, but it can be managed with daily inhaler use (plus, as necessary, access to rescue inhalers for acute situations). For the many children in high-poverty schools who are not regularly using their inhalers, there is a simple and effective backup: Directly Observed Therapy (DOT), in which a health aide or nurse, most often, gives students their full daily dose of inhalant. As one example, Baltimore’s KIPP school has a specially-funded health clinic, and they have a health aide work in a room near the cafeteria to give DOT to all students who need it. Research on DOT for asthma has found substantial reductions in emergency department visits and hospitalizations, possibly saving children’s lives. By the way, at a cost of about $7,500 per hospitalization and $820 per emergency room visit, it would not take much reduction in asthma to pay the salary of a health aide.
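The break-even arithmetic behind that last point can be sketched in a few lines. The two cost figures come from the paragraph above; the aide salary is a hypothetical round number assumed only for illustration, since no salary is given here, and the function name `break_even_events` is likewise made up.

```python
import math

COST_PER_HOSPITALIZATION = 7_500  # dollars, figure quoted in the text
COST_PER_ER_VISIT = 820           # dollars, figure quoted in the text

# Hypothetical annual salary, assumed only for illustration; the text gives no figure.
ASSUMED_AIDE_SALARY = 30_000


def break_even_events(salary, cost_per_event):
    """Smallest whole number of avoided events whose savings cover the salary."""
    return math.ceil(salary / cost_per_event)


hospitalizations_needed = break_even_events(ASSUMED_AIDE_SALARY, COST_PER_HOSPITALIZATION)
er_visits_needed = break_even_events(ASSUMED_AIDE_SALARY, COST_PER_ER_VISIT)

print(f"Avoided hospitalizations to cover the salary: {hospitalizations_needed}")
print(f"Avoided emergency room visits to cover the salary: {er_visits_needed}")
```

Under the assumed salary, avoiding 4 hospitalizations, or 37 emergency room visits, in a year would cover the cost. A different salary changes the numbers but not the logic.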

            Vision. Along with the Wilmer Eye Clinic at Johns Hopkins Hospital, the Baltimore Department of Health, the Baltimore City Public Schools, Vision to Learn (which has vans that do vision services at school sites) and Warby Parker (an eyeglass company that provides free eyeglasses for disadvantaged children), we have been working for years on a project to provide eyeglasses to all Baltimore City K-8 students who need them. We have provided almost 10,000 pairs of eyeglasses so far. It is crucial to give students eyeglasses if they need them, but we have discovered that giving out free eyeglasses does not fully solve the problem. Kids being kids, they often lose or break their glasses, or just fail to use them. We have developed strategies to observe classes at random to see how many students are wearing eyeglasses, with celebrations or awards for the classes in which the most students are wearing their eyeglasses, but this is difficult to do across the whole city. Preventive Health Aides could easily build into their schedules random opportunities to observe in teachers’ classes to note and celebrate the wearing of eyeglasses once students have them.

            Hearing. Many children cannot hear well enough to benefit from lessons. The Baltimore City Health Department screens students at school entry, first grade, and eighth grade. Few students need hearing aids, but many suffer from smaller problems, such as excessive earwax. Health aides might supplement infrequent hearing screenings with more frequent assessments, especially for children known to have had problems in the past. Preventive Health Aides could see that children with hearing problems are getting the most effective and cost-effective treatments able to ensure that their hearing is sufficient for school.

            Other Ailments. A trained Preventive Health Aide ensuring that treatments are being administered or monitored could make a big difference for many common ailments. For example, many students take medication for ADHD (attention deficit-hyperactivity disorder). Yet safe and effective forms of ADHD medication work best if the medication is taken routinely. An approach like DOT could easily ensure this. Managing other, rarer problems with regular medication and observation could also help many children. With greater knowledge and collaboration with experts on many diseases, it should be possible to provide cost-effective services on a broad scale.

            Health care for children in school is not a frill. As noted earlier, many common health care problems have serious impacts on attendance, and on vision, hearing, and other school-relevant skills. If school staff take up these responsibilities, there needs to be dedicated funding allocated for this purpose. It would be unfair and counter-productive to simply load another set of unfunded responsibilities on already overburdened schools. However, because they may reduce the need for very expensive hospital services, these school-based services may pay for themselves.

            You hear a lot these days about the “whole child.” I hope this emphasis can be extended to the health of children. It just stands to reason that children should be healthy if they are to be fully successful in school.

Why Not the Best?

In 1879, Thomas Edison invented the first practical lightbulb. The main problem he faced was in finding a filament that would glow, but not burn out too quickly. To find it, he tried more than 6000 different substances that had some promise as filaments. The one he found was carbonized cotton, which worked far better than all the others (tungsten, which we use now, came much later).

Of course, the incandescent light changed the world. It replaced far more expensive gas lighting systems, and was much more versatile. The lightbulb captured the evening and nighttime hours for every kind of human activity.

Yet if the lightbulb had been an educational innovation, it probably would have been proclaimed a dismal failure. Skeptics would have noted that only one out of six thousand filaments worked. Meta-analysts would have averaged the effect sizes for all 6000 experiments and concluded that the average effect size across the 6000 filaments was only +0.000000001. Hardly worthwhile. If Edison’s experiments were funded by government, politicians would have complained that 5,999 of Edison’s filaments were a total waste of taxpayers’ money. Economists would have computed benefit-cost ratios and concluded that even if Edison’s light worked, the cost of making the first one was astronomical, not to mention the untold cost of setting up electrical generation and wiring systems.

This is all ridiculous, you must be saying. But in the world of evidence-based education, comparable things happen all the time. In 2003, Borman et al. did a meta-analysis of 300 studies of 29 comprehensive (whole-school) reform designs. They identified three as having solid evidence of effectiveness. Rather than celebrating and disseminating those three (and continuing research and development to identify more of them), the U.S. Congress ended its funding for dissemination of comprehensive school reform programs. Turn out the light before you leave, Mr. Edison!

Another common practice in education is to do meta-analyses averaging outcomes across an entire category of programs or policies, and ignoring the fact that some distinctively different and far more effective programs are swallowed up in the averages. A good example is charter schools. Large-scale meta-analyses by Stanford’s CREDO (2013) found that the average effect sizes for charter schools are effectively zero. A 2015 analysis found better, but still very small effect sizes in urban districts (ES = +0.04 in reading, +0.05 in math). The What Works Clearinghouse published a 2010 review that found slight negative effects of middle school charters. These findings are useful in disabusing us of the idea that charter schools are magic, and get positive outcomes just because they are charter schools. However, they do nothing to tell us about extraordinary charter schools using methods that other schools (perhaps including non-charters) could also use. There is more positive evidence relating to “no-excuses” schools, such as KIPP and Success Academies, but among the thousands of charters that now exist, is this the only type of charter worth replicating? There must be some bright lights among all these bulbs.

As a third example, there are now many tutoring programs used in elementary reading and math with struggling learners. Effect sizes across all forms of tutoring average about +0.30, in both reading and math. But there are reading tutoring approaches with effect sizes of +0.50 or more. If these programs are readily available, why would schools adopt programs less effective than the best? The average is useful for research purposes, and there are always considerations of costs and availability, but I would think any school would want to ignore the average for all types of programs and look into the ones that can do the most for their kids, at a reasonable cost.
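To make the point about averages concrete, here is a minimal Python sketch. The program names and effect sizes are invented for illustration and are not results from any study; they were chosen only so that the mean comes out at about +0.30 while the best program exceeds +0.50.

```python
from statistics import mean

# Hypothetical effect sizes for six tutoring programs (illustrative only,
# not taken from any study or review).
effect_sizes = {
    "Program A": 0.10,
    "Program B": 0.20,
    "Program C": 0.25,
    "Program D": 0.30,
    "Program E": 0.40,
    "Program F": 0.55,
}

# The category average, which is what a meta-analysis would typically report.
average = mean(list(effect_sizes.values()))

# The single best program, which the average hides.
best_name = max(effect_sizes, key=effect_sizes.get)
best = effect_sizes[best_name]

print(f"Average effect size: {average:+.2f}")
print(f"Best program: {best_name} ({best:+.2f})")
```

A school choosing a program would care about the second line of output, not the first.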

I’ve often heard teachers and principals point out that “parents send us the best kids they have.” Yes, they do, and for this reason it is our responsibility as educators to give those kids the best programs we can. We often describe educating students as enlightening them, or lifting the lamp of learning, or fiat lux. Perhaps the best way to fiat a little more lux is to take a page from Edison, the great luxmeister: Experiment tirelessly until we find what works. Then use the best we have.

Reference

Borman, G. D., Hewes, G. M., Overman, L. T., & Brown, S. (2003). Comprehensive school reform and achievement: A meta-analysis. Review of Educational Research, 73(2), 125-230.

This blog was developed with support from the Laura and John Arnold Foundation. The views expressed here do not necessarily reflect those of the Foundation.

 

Cost-Effectiveness of Small Solutions

Imagine that you were shopping for a reliable new car, one that is proven to last an average of at least 100,000 miles with routine maintenance and repairs. You are looking at a number of options that fit your needs for around $24,000.

You happen to be talking to your neighbor, an economist, about your plans. “$24,000?” she says. “That’s crazy. You can get a motorcycle that would go at least 100,000 miles for only $12,000, and save a lot on gas as well!”

You point out to your neighbor that motorcycles might be nice for some purposes, but you need a car to go to the grocery store, transport the kids, and commute to work, even in rain or snow. “Sure,” says your neighbor, “but you posed a question of cost-effectiveness, and on that basis a motorcycle is the right choice. Or maybe a bicycle.”

In education, school leaders and policy makers are often faced with choices like this. They want to improve their students’ achievement, and they have limited resources. But the available solutions vary in cost, effectiveness, and many other factors.

To help leaders make good choices, economists have devised measures of cost-effectiveness, which means (when educational achievement is the goal) the amount of achievement gain you might expect from purchasing a given product or service divided by all costs of making that choice. Cost-effectiveness can be very useful in educational policy and practice in helping decision makers weigh the potential benefits of each of a set of choices available to them. The widespread availability of effect sizes and cost information for various programs and practices, easily located in sources such as the What Works Clearinghouse and Evidence for ESSA, makes it a lot easier to compare outcomes and costs of available programs. For example, a district might seek to improve high school math performance by adopting software and professional development for a proven technology program, or by adopting a proven professional development approach. All costs need to be considered as well as all benefits, and the school leaders might make the choice that produces the largest gains at the most affordable cost. Cost-effectiveness might not entirely determine which choice is made, but, one might argue, it should always be a key part of the decision-making process. Quantitative researchers in education and economics would agree. So far, so good.
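As a rough illustration of the ratio described above, here is a minimal Python sketch. The two options, their effect sizes, and their per-student costs are hypothetical numbers made up for this example, and the function name cost_effectiveness is my own label rather than a standard one.

```python
def cost_effectiveness(effect_size, cost_per_student, per_dollars=100):
    """Return the effect size gained per `per_dollars` spent on each student."""
    if cost_per_student <= 0:
        raise ValueError("cost_per_student must be positive")
    return effect_size / cost_per_student * per_dollars


# Hypothetical options: (effect size, total cost per student in dollars).
# These numbers are invented for illustration only.
options = {
    "technology program": (0.10, 50.0),
    "professional development": (0.15, 100.0),
}

for name, (effect_size, cost) in options.items():
    ratio = cost_effectiveness(effect_size, cost)
    print(f"{name}: ES {effect_size:+.2f} at ${cost:.0f} per student "
          f"-> {ratio:.2f} per $100")
```

The ratio is only one input to the decision. The size of the gain itself still matters, whatever the ratio says.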

But here is where things get a little dodgy. In recent years, a lot of interest has arisen in super-cheap interventions that have super-small impacts, but the ratio between the benefits and the costs makes the super-cheap interventions look cost-effective. Such interventions are sometimes called “nudge strategies,” meaning that simple reminders or minimal actions activate a set of psychological processes that can lead to important impacts. A very popular example right now is Carol Dweck’s Growth Mindset strategy, in which students are asked to write a brief essay stating a belief that intelligence is not a fixed attribute of people, but that learning comes from effort. Her work has found small impacts of this essentially cost-free treatment in several studies, although others have failed to find this effect.

Other examples include sending messages to students or parents on cell phones, or sending postcards to parents on the importance of regular attendance. These strategies can cost next to nothing, yet large-scale experiments often show positive effects in the range of +0.03 to +0.05, averaging across multiple studies.

Approaches of this kind, including Growth Mindset, are notoriously difficult to replicate by others. However, assume for the sake of argument that at least some of them do have reliably positive effects that are very small, but because of their extremely small cost, they appear very cost-effective. Should schools use them?

One might take the view that interventions like Growth Mindset are so inexpensive and so sensible that what the heck, go ahead. However, other nudge strategies take some time and effort on the part of staff.

Schools are charged with a very important responsibility, ensuring the academic success, psychological adjustment, and pro-social character of young people. Their financial resources are always limited, but even more limited is their schoolwide capacity to focus on a small number of essential goals and stick with those goals until they are achieved. The problem is that spending a lot of time on small solutions with small impacts may exhaust a school’s capacity to focus on what truly matters. If a school could achieve an effect size of +0.30 on important achievement measures with one comprehensive program, or (for half the price) could adopt ten small interventions with effect sizes averaging +0.03, which should it do? Any thoughtful educator would say, “Invest in the one program with the big effect.” The little programs are not likely to add up to a big effect, and any collection of unrelated, uncoordinated mini-reforms is likely to deplete the staff’s energy and enthusiasm over a period of time.

This is where the car-motorcycle analogy comes in. A motorcycle may appear more cost-effective than a car, but it just does not do what a car does. Motorcycles are fine for touring in nice weather, but for most people they do not solve essential problems. In school reform, large programs with large effects may be composed of smaller effective components, but because these components are an integrated part of a well-thought-out plan, they add up to something more likely to work and to keep working over time.

Cost-effectiveness is a useful concept for schools seeking to make big differences in achievement, using serious resources. For small interventions with small impacts, don’t bother to calculate cost-effectiveness, or if you do, don’t compare the results to those of big interventions with big impacts. To do so is like bragging about the gas mileage you get on your motorcycle driving Aunt Sally and the triplets to the grocery store. It just doesn’t make sense.

This blog was developed with support from the Laura and John Arnold Foundation. The views expressed here do not necessarily reflect those of the Foundation.

Achieving Breakthroughs in Education By Transforming Effective But Expensive Approaches to be Affordable at Scale

It’s summer in Baltimore. The temperatures are beastly, the humidity worse. I grew up in Washington, DC, which has the same weather. We had no air conditioning, so summers could be torture. No one could sleep, so we all walked around like zombies, yearning for fall.

Today, however, summers in Baltimore are completely bearable. The reason, of course, is air conditioning. Air conditioning existed when I was a kid, but hardly anyone could afford it.  I think the technology has gradually improved, but there was no scientific or technical breakthrough, as far as I know.  Yet somehow, all but the poorest families can afford air conditioning, so summer in Baltimore can be survived. Families that cannot afford air conditioning need assistance, especially for health reasons, but this number is small.

The story of air conditioning resembles that of much other technology. What happens is that a solution is devised for a very important problem.  The solution is too expensive for ordinary people to use, so initially, it is used in circumstances that justify the cost.  For example, early automobiles were far too expensive for the general public, but they were used for important applications in which the benefits were particularly obvious, such as delivery trucks and cars for doctors and veterinarians.  Also, wealthy individuals and race car drivers could afford the early autos.  These applications provided experience with the manufacture, use, and repair of automobiles and encouraged investments in infrastructure, paving the way (so to speak) for mass production of cars (such as the Model T) that could be afforded by a much larger portion of the population and economy.  Modest improvements are constantly being made, but the focus is on making the technology less expensive, so it can be more widely used.  In medicine, penicillin was invented in the 1920s, but not until the advent of World War II was it made inexpensive enough for practical use.  It saved millions of lives not because it had been invented, but because the Merck Company was commissioned to find a way to make it practicable (the solution involved growing penicillin on rotting squash).

Innovations in education can work in a similar way.  One obvious example is instructional technology, which existed before the 1970s but is only now becoming universally available, mostly because it is falling in price.  However, what education has rarely done is to create expensive but hugely effective interventions and then figure out how to do them cheaply, without reducing their impact.

Until now.

If you are a regular reader of my blog, you can guess where I am going: Tutoring.  As everyone knows, one-to-one tutoring by certified teachers is extremely effective.  No surprise there. As you regulars will also know, rigorous research over the past 20 years has established that tutoring by well-trained, well-supervised teaching assistants using proven methods routinely produces outcomes just as good as tutoring by certified teachers, at half the cost.  Further, one-to-small group tutoring, up to one to four, can be almost as effective as one-to-one tutoring in reading, and equally effective in mathematics (see www.bestevidence.org).

One-to-four tutoring by teaching assistants requires about one-eighth of the cost of one-to-one tutoring by teachers.  The mean outcomes for both types of tutoring are about an effect size of +0.30, but several programs are able to produce effect sizes in excess of +0.50, the national mean difference on NAEP between disadvantaged and middle-class students.  (As a point of comparison, effects of technology applications average +0.05 in reading for elementary struggling readers, and +0.07 in math for all elementary students.  Urban charter schools average +0.04 in reading, +0.05 in math.)
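The one-eighth figure follows from two facts stated above: teaching assistants cost about half as much as certified teachers, and one tutor serves four students at a time. Here is a minimal Python sketch of that arithmetic. It assumes, for simplicity, that tutor time is the only cost and that it divides evenly across the group; the function name is my own.

```python
from fractions import Fraction


def relative_cost_per_student(tutor_cost_ratio, group_size):
    """Cost per student relative to one-to-one tutoring by a certified teacher.

    tutor_cost_ratio: tutor's cost as a fraction of a certified teacher's cost.
    group_size: number of students tutored at once.
    Simplifying assumption: tutor time is the only cost, split evenly.
    """
    return Fraction(tutor_cost_ratio) / group_size


# Teaching assistants at half the cost of teachers, four students per group.
ratio = relative_cost_per_student(Fraction(1, 2), 4)

# With the same budget, how many students can be served for the cost of
# one student in one-to-one tutoring by a teacher?
students_per_budget = 1 / ratio

print(f"Relative cost per student: {ratio}")
print(f"Students served for the cost of one: {students_per_budget}")
```

Read the other way around, the same budget that tutors one student one-to-one with a teacher could tutor about eight students in small groups with teaching assistants.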

Reducing the cost of tutoring should not be seen as a way for schools to save money.  Instead, it should be seen as a way to provide the benefits of tutoring to much larger numbers of students.  Because of its cost, tutoring has been largely restricted to the primary grades (especially first), to perhaps a semester of service, and to reading, but not math.  If tutoring is much less expensive but equally effective, then tutoring can be extended to older students and to math.  Students who need more than a semester of tutoring, or need “booster shots” to maintain their gains into later grades, should be able to receive the tutoring they need, for as long as they need it.

Tutoring has been how rich and powerful people educated their children since the beginning of time.  Ancient Romans, Greeks, and Egyptians had their children tutored if they could afford it.  The great Russian educational theorist, Lev Vygotsky, never saw the inside of a classroom as a child, because his parents could afford to have him tutored.  As a slave, Frederick Douglass received one-to-one tutoring (secretly and illegally) from his owner’s wife, right here in Baltimore.  When his master found out and forbade his wife to continue, Douglass sought further tutoring from immigrant boys on the docks where he worked, in exchange for his master’s wife’s fresh-cooked bread.  Helen Keller received tutoring from Anne Sullivan.  Tutoring has long been known to be effective.  The only question is, or should be, how do we maximize tutoring’s effectiveness while minimizing its cost, so that all students who need it can receive it?

If air conditioning had been like education, we might have celebrated its invention, but sadly concluded that it would never be affordable by ordinary people.  If penicillin had been like education, it would have remained a scientific curiosity until today, and millions would have died due to the lack of it.  If cars had been like education, only the rich would have them.

Air conditioning for all?  What a cool idea.  Cost-effective tutoring for all who need it?  Wouldn’t that be smart?

Photo credit: U.S. Navy photo by Pat Halton [Public domain]

This blog was developed with support from the Laura and John Arnold Foundation. The views expressed here do not necessarily reflect those of the Foundation.