Science of Reading: Can We Get Beyond Our 30-Year Pillar Fight?

How is it possible that the “reading wars” are back on? The reading wars primarily revolve around what are often called the five pillars of early reading: phonemic awareness, phonics, comprehension, vocabulary, and fluency. Actually, there is little debate about the importance of comprehension, vocabulary, or fluency, so the reading wars are mainly about phonemic awareness and phonics. Diehard anti-phonics advocates exist, but in all of educational research, there are few issues that have been more convincingly settled by high-quality evidence. The National Reading Panel (2000), the source of the five pillars, has been widely cited as conclusive evidence that success in the early stages of reading depends on ensuring that students are all successful in phonemic awareness, phonics, and the other pillars. I was invited to serve on that panel, but declined, because I thought it was redundant. Just a short time earlier, the National Research Council’s Committee on the Prevention of Reading Difficulties (Snow, Burns, & Griffin, 1998) had covered essentially the same ground and come to essentially the same conclusion, as had Marilyn Adams’ (1990) Beginning to Read, and many individual studies. To my knowledge, there is little credible evidence to the contrary. Certainly, then and now there have been many students who learn to read successfully with or without a focus on phonemic awareness and phonics. However, I do not think there are many students who could succeed with non-phonetic approaches but cannot learn to read with phonics-emphasis methods. In other words, there is little if any evidence that phonemic awareness or phonics cause harm, but a great deal of evidence that for perhaps more than half of students, effective instruction emphasizing phonemic awareness and phonics is essential.
Since it is impossible to know in advance which students will need phonics and which will not, it just makes sense to teach using methods likely to maximize the chances that all children (those who need phonics and those who would succeed with or without them) will succeed in reading.

However…

The importance of the five pillars of the National Reading Panel (NRP) catechism is not in doubt among people who believe in rigorous evidence, as far as I know. The reading wars ended in the 2000s and the five pillars won. However, this does not mean that knowing all about these pillars and the evidence behind them is sufficient to solve America’s reading problems. The NRP pillars describe essential elements of curriculum, but not of instruction.

Improving reading outcomes for all children requires the five pillars, but they are not enough. The five pillars could be extensively and accurately taught in every school of education, and this would surely help, but it would not solve the problem. State and district standards could emphasize the five pillars and this would help, but would not solve the problem. Reading textbooks, software, and professional development could emphasize the five pillars and this would help, but it would not solve the problem.

The reason that such necessary policies would still not be sufficient is that teaching effectiveness does not just depend on getting curriculum right. It also depends on the nature of instruction, classroom management, grouping, and other factors. Teaching reading without teaching phonics is surely harmful to large numbers of students, but teaching phonics does not guarantee success.

As one example, consider grouping. For a very long time, most reading teachers have used homogeneous reading groups. For example, the “Stars” might contain the highest-performing readers, the “Rockets” the middle readers, and the “Planets” the lowest readers. The teacher calls up groups one at a time. No problem there, but what are the students doing back at their desks? Mostly worksheets, on paper or computers. The problem is that if there are three groups, each student spends two thirds of reading class time doing, well, not much of value. Worse, the students are sitting for long periods of time, with not much to do, and the teacher is fully occupied elsewhere. Does anyone see the potential for idle hands to become the devil’s playground? The kids do.
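
The two-thirds figure is simple arithmetic: if the teacher takes groups one at a time, a student in one of n groups is with the teacher for 1/n of the period and on their own for the rest. A minimal Python sketch, assuming every group gets an equal turn (the function name is illustrative and does not come from any cited study):

```python
from fractions import Fraction

def share_of_time_without_teacher(num_groups: int) -> Fraction:
    """Fraction of the reading period a student spends away from the teacher,
    assuming the teacher gives each group an equal turn (an assumption)."""
    if num_groups < 1:
        raise ValueError("num_groups must be at least 1")
    return Fraction(num_groups - 1, num_groups)

print(share_of_time_without_teacher(3))  # 2/3
```

Under the same assumption, adding a fourth group would raise the share of unsupervised time to three quarters.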

There are alternatives to reading groups, such as the Joplin Plan (cross-grade grouping by reading level), forms of whole-class instruction, or forms of cooperative learning. These provide active teaching to all students all period. There is good evidence for these alternatives (Slavin, 1994, 2017). My main point is that a reading strategy that follows NRP guidelines 100% may still succeed or fail based on its grouping strategy. The same could be true of the use of proven classroom management strategies or motivational strategies during reading periods.

To make the point most strongly, imagine that a district’s teachers have all thoroughly mastered all five pillars of science of reading, which (we’ll assume) are strongly supported by their district and state. In an experiment, 40 teachers of grades 1 to 3 are selected, and 20 of these are chosen at random to receive sufficient tutors to work with their lowest-achieving 33% of students in groups of four, using a proven model based on science of reading principles. The other 20 teachers just use their usual materials and methods, also emphasizing science of reading curricula and methods.
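
The random assignment in this thought experiment can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the teacher IDs, the function name, and the fixed seed are all hypothetical, and a real study would follow a pre-registered assignment procedure.

```python
import random

def assign_teachers(teacher_ids, num_treated, seed=0):
    """Randomly split teachers into a tutoring group and a control group."""
    rng = random.Random(seed)  # fixed seed so the assignment is reproducible
    treated = set(rng.sample(teacher_ids, num_treated))
    control = [t for t in teacher_ids if t not in treated]
    return sorted(treated), control

# Hypothetical IDs for the 40 teachers in the thought experiment.
teachers = [f"teacher_{i:02d}" for i in range(1, 41)]
tutoring_group, control_group = assign_teachers(teachers, num_treated=20)
print(len(tutoring_group), len(control_group))  # 20 20
```

Because chance alone decides who gets tutors, any later difference between the two groups can be attributed to the tutoring rather than to differences among the teachers.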

The evidence from many studies of tutoring (Inns et al., 2020), as well as common sense, tells us what would happen. The teachers supported by tutors would produce far greater achievement among their lowest readers than would the other equally science-of-reading-oriented teachers in the control group.

None of these examples diminish the importance of science of reading. But they illustrate that knowing science of reading is not enough.

At www.evidenceforessa.org, you can find 65 elementary reading programs of all kinds that meet high standards of effectiveness. Almost all of these use approaches that emphasize the five pillars. Yet Evidence for ESSA also lists many programs that equally emphasize the five pillars and yet have not been found to have positive impacts. Rather than re-starting our thirty-year-old pillar fight, don’t you think we might move on to advocating programs that not only use the right curricula, but are also proven to get excellent results for kids?

References

Adams, M.J. (1990).  Beginning to read:  Thinking and learning about print.  Cambridge, MA:  MIT Press.

Inns, A., Lake, C., Pellegrini, M., & Slavin, R. (2020). A synthesis of quantitative research on programs for struggling readers in elementary schools. Available at www.bestevidence.org. Manuscript submitted for publication.

National Reading Panel (2000).  Teaching children to read: An evidence-based assessment of the scientific research literature on reading and its implications for reading instruction.  Rockville, MD: National Institute of Child Health and Human Development.

Slavin, R. E. (1994). School and classroom organization in beginning reading:  Class size, aides, and instructional grouping. In R. E. Slavin, N. L. Karweit, and B. A. Wasik (Eds.), Preventing early school failure. Boston:  Allyn and Bacon.

Slavin, R. E. (2017). Instruction based on cooperative learning. In R. Mayer & P. Alexander (Eds.), Handbook of research on learning and instruction. New York: Routledge.

Snow, C.E., Burns, S.M., & Griffin, P. (Eds.) (1998).  Preventing reading difficulties in young children.  Washington, DC: National Academy Press.

This blog was developed with support from the Laura and John Arnold Foundation. The views expressed here do not necessarily reflect those of the Foundation.

Note: If you would like to subscribe to Robert Slavin’s weekly blogs, just send your email address to thebee@bestevidence.org

 

How Tutoring Works (Cooking With The Grandkids)

My wife, Nancy, and I have three grandkids: Adaya (4 ½), Leo (3 ½), and Ava (8 months). They all live in Baltimore, so we see quite a lot of them, which is wonderful.

As with most grandparents and grandkids, one of our favorite activities with Adaya and Leo is cooking. We have two folding stepladders in the kitchen, which the kids work from. They help make pancakes, scrambled eggs, spaghetti, and other family classics. We start off giving the kids easy and safe tasks, like measuring and pouring ingredients into bowls and mixing, and as they become proficient, we let them pour ingredients into hot pans, scramble eggs on the stove, and so on. They love every bit of this, and are so proud of their accomplishments.

So here is my question. What are we making when we cook with the grandkids? If you say pancakes and eggs, that’s not wrong, but perhaps these are the least important things we are doing.

What we are really doing is building the thrill of mastery in a loving and supportive context. All children are born into a confusing world. They want to understand their world and to learn to operate effectively in it. They want to do what the big people do. They also want to be loved and valued.

Now consider children who need tutoring because they are behind in reading. These kids are in very big trouble, and they know it. All of them understand what the purpose of school is. It is to learn to read. Yet they know they are not succeeding.

The solution, I believe, is a lot like cooking with people who love you. In other words, it’s tutoring, in small groups or one-to-one.

The effectiveness of tutoring is very well established in rigorous research, as I’ve noted more than once in this series of blogs. No surprise there. But what is surprising is that well-trained, caring tutors without teaching certificates using well-structured materials get outcomes just as good as those obtained by certified teachers. How can this be? If tutoring works primarily because it enables teachers to adapt instruction to meet the learning needs of individual students, then you’d expect that students who receive tutoring from certified, experienced teachers would get much better outcomes than those tutored by teaching assistants. But they don’t, on average. Further, a U.K. study of one-to-one tutoring over the internet found an effect size of zero. These and other unexpected findings support a conclusion that while the ability to individualize instruction is important in tutoring, it is not enough. The additional factor that explains much of the powerful impact of tutoring, I believe, is love. Most tutors, with or without teaching certificates, love the children they tutor in a way that a teacher with 25 or 30 students usually cannot. A tutor with one or just a few children at a time is certain to get to know those children, and to care about them deeply. From the perspective of struggling children, their tutor is not just a teacher. She or he is a lifeline, a new chance to achieve the mastery they crave. Someone who knows and cares about them and will stick with them until they can read.

This is why individual or small-group tutoring is a bit like cooking with your grandparents. In both settings, children receive the two things they need and value the most: love and mastery.

My point here is not sentimental or idealistic. It is deadly practical. We already know a lot about how to use tutoring effectively and cost-effectively. Yet there is a great deal more we need to learn to maximize the benefits and minimize the costs of effective tutoring. We need to find out how to extend positive effects to larger numbers of students, to learn how to maintain and build on initial successes in the early grades, how to successfully tutor upper-elementary and secondary students, and how to reach students who still do not succeed despite small-group tutoring. We need to experiment with adaptations of tutoring for English learners.

We know that tutoring is powerful, but we need to make it more cost-effective without reducing its impact, so that many more children can experience the thrill of mastery. To do that, we have a lot of work to do. Let’s get cooking!

What Works in Professional Development

I recently read an IES-funded study called “The Effects of a Principal Professional Development Program Focused on Instructional Leadership.” The study, reported by a research team at Mathematica (Hermann et al., 2019), was a two-year evaluation of a Center for Educational Leadership (CEL) program in which elementary principals received 188 hours of PD, including a 28-hour summer institute at the beginning of the program, quarterly virtual professional learning community sessions in which principals met other principals and CEL coaches, and 50 hours per year of individual coaching in which principals worked with their CEL coaches to set goals, implement strategies, and analyze effects of strategies. Principals helped teachers improve instruction by observing teachers, giving feedback, and selecting curricula; sought to improve their recruitment, management, and retention strategies; held PD sessions for teachers; and focused on setting a school mission, improving school climate, and deploying resources effectively.

A total of 100 low-achieving schools were recruited. Half received the CEL program, and half served as controls. After one, two, and three years, there were no differences between experimental and control schools on standardized measures of student reading or mathematics achievement, no differences on school climate, and no differences on principal or teacher retention.

So what happened? First, it is important to note that previous studies of principal professional development have also found zero (e.g., Jacob et al., 2014) or very small and inconsistent effects (e.g., Nunnery et al., 2011, 2016). Second, numerous studies of certain types of professional development for teachers have also found very small or zero impacts. For example, a review of research on elementary mathematics programs by Pellegrini et al. (2019) identified 12 qualifying studies of professional development for mathematics content and pedagogy. The average effect size was essentially zero (ES=+0.04).

What does work in professional development?

In sharp contrast to these dismal findings, there are many forms of professional development that work very well. For example, in the Pellegrini et al. (2019) mathematics review, professional development designed to teach teachers to use specific instructional processes was very effective, averaging ES=+0.25. These included studies of cooperative learning, classroom management strategies, and individualized instruction. In fact, other than one-to-one and one-to-small group tutoring, no other type of approach was as effective. In a review of research on programs for elementary struggling readers by Inns et al. (2019), programs incorporating cooperative learning had an effect size of +0.29, more effective than any other programs except tutoring. A review of research on secondary reading programs by Baye et al. (2018) found that cooperative learning programs and whole-school models incorporating cooperative learning, along with writing-focused models also incorporating cooperative learning, had larger impacts than anything other than tutoring.

How can it be that professional development on cooperative learning and classroom management is so much more effective than professional development on content, pedagogy, and general teaching strategies?

One reason, I would submit, is that it is very difficult to teach someone to improve practices that they already know how to do. For example, if as an adult you took a course in tennis or golf or sailing or bridge, you probably noticed that you learned very rapidly, retained what you learned, and quickly improved your performance in that new skill. Contrast this with a course on dieting or parenting. The problem with improving your eating or parenting is that you already know very well how to eat, and if you already have kids, you know how to parent. You could probably stand some improvement in these areas, which is why you took the course, but no matter how motivated you are to improve, over time you are likely to fall back on well-established routines, or even bad habits. The same is true of teaching. Early in their careers teachers develop routine ways of performing each of the tasks of teaching: lecturing, planning, praising, dealing with misbehavior, and so on. Teachers know their content and settle into patterns of communicating that content to students. Then one day a professional developer shows up, who watches teachers teaching and gives them advice. The advice might take, but quite often teachers give it a try, run into difficulties, and then settle back into comfortable routines.

Now consider a more specific, concrete set of strategies that are distinctly different from what teachers typically do: cooperative learning. Teachers can readily learn the key components. They put their students in mixed groups of four or five. After an initial lesson, they give students opportunities to work together to make sure that everyone can succeed at the task. Teachers observe and assist students during team practice. They assess student learning, and celebrate student success. Every one of these components is a well-defined, easily learned, and easily observed step. Teachers need training and coaching to succeed at first, but after a while, cooperative learning itself becomes second nature. It helps that almost all kids love to be noisy and engaged, and love to work with each other, so they are rooting for the teacher to succeed. But for most teachers, structured cooperative learning is distinctly different from ordinary teaching, so it is easy to learn and maintain.

As another example, consider classroom management strategies used in many programs. Trainers show teachers how to use Popsicle sticks with kids’ names on them to call on students, so all kids have to pay attention in case they are called. To get students’ immediate attention, teachers may learn to raise their hands and have students raise theirs, or to ring a bell, or to say a phrase like “one, two, three, look at me.” Teachers may learn to give points to groups or individuals who are meeting class expectations. They may learn to give students or groups privileges, such as lining up first to go outside or having the privilege of selecting and leading their favorite team or class cheer. These and many other teacher behaviors are clear, distinct, easily learned, and immediately solve persistent problems of low-level disturbances.

The point is not that these cooperative learning or classroom management strategies are more important than content knowledge or pedagogy. However, they are easily learned, retained, and institutionalized ways of solving critical daily problems of teaching, and they are so well-defined and clear that when they have started working, teachers are likely to hold on to them indefinitely and are unlikely to fall back on other strategies that may be less effective but are already deeply ingrained.

I am not suggesting that only observable, structural classroom reforms such as cooperative learning or classroom management strategies are good uses of professional development resources. All aspects of teaching need successive improvement, of course. But I am using these examples to illustrate why certain types of professional development are very difficult to make effective. It may be that improving the content and pedagogy teachers use day in and day out requires more concrete, specific strategies. I hope developers and researchers will create and successfully evaluate such new approaches, so that teachers can continually improve their effectiveness in all areas. But there are whole categories of professional development that research repeatedly finds are just not working. Researchers and educators need to focus on why this is true, and then design new PD strategies that are less subtle, more observable, and deal more with actual teacher and student behavior.

References

Hermann, M., Clark, M., James-Burdumy, S., Tuttle, C., Kautz, T., Knechtel, V., Dotter, D., Wulsin, C.S., & Deke, J. (2019). The effects of a principal professional development program focused on instructional leadership (NCEE 2020-0002). Washington, DC: National Center for Education Evaluation and Regional Assistance, Institute of Education Sciences, U.S. Department of Education.

Inns, A., Lake, C., Pellegrini, M., & Slavin, R. (2019). A synthesis of quantitative research on programs for struggling readers in elementary schools. Available at www.bestevidence.org. Manuscript submitted for publication.

Jacob, R., Goddard, K., Miller, R., & Goddard, Y. (2014). Exploring the causal impact of the McREL Balanced Leadership Program on leadership, principal efficacy, instructional climate, educator turnover, and student achievement. Educational Evaluation and Policy Analysis, 52 187-220.

Nunnery, J., Ross, S., Chappel, S., Pribesh, S., & Hoag-Carhart, E. (2011). The impact of the National Institute for School Leadership’s Executive Development Program on school performance trends in Massachusetts: Cohort 2 Results. Norfolk, VA: Center for Educational Partnerships, Old Dominion University.

Nunnery, J., Ross, S., & Reilly, J. (2016). An evaluation of the National Institute for School Leadership: Executive Development Program in Milwaukee Public Schools. Norfolk, VA: Center for Educational Partnerships, Old Dominion University.

Pellegrini, M., Inns, A., Lake, C., & Slavin, R. (2019). Effective programs in elementary mathematics: A best-evidence synthesis. Available at www.bestevidence.org. Manuscript submitted for publication.

Achieving Audacious Goals in Education: Amundsen and the Fram

On a recent trip to Norway, I visited the Fram Museum in Oslo. The Fram was Roald Amundsen’s ship, used to transport a small crew to the South Pole in 1911. The museum is built around the Fram itself, and visitors can go aboard this amazing ship, surrounded by information and displays about polar exploration. What was most impressive about the Fram is the meticulous attention to detail in every aspect of the expedition. Amundsen had undertaken other trips to the polar seas to prepare for his trip, and had carefully studied the experiences of other polar explorers. The ship’s hull was specially built to withstand crushing from the shifting of polar ice. He carried many huskies to pull sleds over the ice, and trained them to work in teams. Every possible problem was carefully anticipated in light of experience, and exact amounts of food for men and dogs were allocated and stored. Amundsen said that forgetting “a single trouser button” could doom the effort. As it unfolded, everything worked as anticipated, and all the men and dogs returned safely after reaching the South Pole.

Image: From At the South Pole by Roald Amundsen, 1913 [Public domain]

The story of Amundsen and the Fram is an illustration of how to overcome major obstacles to achieve audacious goals. I’d like to build on it to return to a topic I’ve touched on in two previous blogs. The audacious goal: Overcoming the substantial gap in elementary reading achievement between students who qualify for free lunch and those who do not, between African American and White students, and between Hispanic and non-Hispanic students. According to the National Assessment of Educational Progress (NAEP), each of these gaps is about one half of a standard deviation, also known as an effect size of +0.50. This is a very large gap, but it has been overcome in a very small number of intensive programs. These programs were able to increase the achievement of disadvantaged students by an effect size of more than +0.50, but few were able to reproduce these gains under normal circumstances. Our goal is to enable thousands of ordinary schools serving disadvantaged students to achieve such outcomes, at a cost of no more than 5% beyond ordinary per-pupil costs.
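
For readers less familiar with the term, an effect size is a difference between two group means expressed in standard deviation units. A minimal Python sketch of that calculation follows. The scores are hypothetical, chosen only to produce a half-standard-deviation gap; they are not NAEP data, and the function name is illustrative.

```python
def effect_size(mean_a: float, mean_b: float, pooled_sd: float) -> float:
    """Standardized mean difference: (mean_a - mean_b) / pooled_sd."""
    if pooled_sd <= 0:
        raise ValueError("pooled_sd must be positive")
    return (mean_a - mean_b) / pooled_sd

# Hypothetical scale scores, chosen only to illustrate a half-SD gap.
gap = effect_size(mean_a=230.0, mean_b=215.0, pooled_sd=30.0)
print(gap)  # 0.5
```

On this reading, closing the gap means raising the lower group's mean by half a standard deviation, which is what an effect size of +0.50 for an intervention would do.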

Educational Reform and Audacious Goals

Researchers have long been creating and evaluating many different approaches to improving reading achievement. This is necessary in the research and development process to find “what works” and build up from there. However, each individual program or practice has a modest effect on key outcomes, and we rarely combine proven programs to achieve an effect large enough to, for example, overcome the achievement gap. This is not what Amundsen, or the Wright Brothers, or the worldwide team that achieved eradication of smallpox did. Instead, they set audacious goals and kept at them systematically, using what works, until they were achieved.

I would argue that we should and could do the same in education. The reading achievement gap is the largest problem of educational practice and policy in the U.S. We need to use everything we know how to do to solve it. This means stating in advance that our goal is to find strategies capable of eliminating reading gaps at scale, and refusing to declare victory until this goal is achieved. We need to establish that the goal can be achieved, by ordinary teachers and principals in ordinary schools serving disadvantaged students.

Tutoring Our Way to the Goal

In a previous blog I proposed that the goal of +0.50 could be reached by providing disadvantaged, low-achieving students tutoring in small groups or, when necessary, one-to-one. As I argued there and elsewhere, there is no reading intervention as effective as tutoring. Recent reviews of research have found that well-qualified teaching assistants using proven methods can achieve outcomes as good as those achieved by certified teachers working as tutors, thereby making tutoring much less expensive and more replicable (Inns et al., 2019). Providing schools with significant numbers of well-trained tutors is one likely means of reaching ES=+0.50 for disadvantaged students. Inns et al. (2019) found an average effect size of +0.38 for tutoring by teaching assistants, but several programs had effect sizes of +0.40 to +0.47. This is not +0.50, but it is within striking distance of the goal. However, each school would need multiple tutors in order to provide high-quality tutoring to most students, to extend the known positive effects of tutoring to the whole school.

Combining Intensive Tutoring With Success for All

Tutoring may be sufficient by itself, but research on tutoring has rarely used tutoring schoolwide, to benefit all students in high-poverty schools. It may be more effective to combine widespread tutoring for students who most need it with other proven strategies designed for the whole school, rather than simply extending a program designed for individuals and small groups. One logical strategy to reach the goal of +0.50 in reading might be to combine intensive tutoring with our Success for All whole-school reform model.

Success for All adds to intensive tutoring in several ways. It provides teachers with professional development on proven reading strategies, as well as cooperative learning and classroom management strategies at all levels. Strengthening core reading instruction reduces the number of children at great risk, and even for students who are receiving tutoring, it provides a setting in which students can apply and extend their skills. For students who do not need tutoring, Success for All provides acceleration. In high-poverty schools, students who are meeting reading standards are likely to still be performing below their potential, and improving instruction for all is likely to help these students excel.

Success for All was created in the late 1980s in an attempt to achieve a goal similar to the +0.50 challenge. In its first major evaluation, a matched study in six high-poverty Baltimore elementary schools, Success for All achieved a schoolwide reading effect size of at least +0.50 in grades 1-5 on individually administered reading measures. For students in the lowest 25% of the sample at pretest, the effect size averaged +0.75 (Madden et al., 1993). That experiment provided two to six certified teacher tutors per school, who worked one to one with the lowest-achieving first and second graders. The tutors supplemented a detailed reading program, which used cooperative learning, phonics, proven classroom management methods, parent involvement, frequent assessment, distributed leadership, and other elements (as Success for All still does).

An independent follow-up assessment found that the effect was maintained to the eighth grade, and also showed a halving of retentions in grade and a halving of assignments to special education, compared to the control group (Borman & Hewes, 2002). Schools using Success for All since that time have rarely been able to afford so many tutors, instead averaging one or two tutors. Many schools using SFA have not been able to afford even one tutor. Still, across 28 qualifying studies, mostly by third parties, the Success for All effect size has averaged +0.27 (Cheung et al., in press). This is impressive, but it is not +0.50. For the lowest achievers, the mean effect size was +0.62, but again, our goal is +0.50 for all disadvantaged students, not just the lowest achievers.

Over a period of years, could schools using Success for All with five or more teaching assistant tutors reach the +0.50 goal? I’m certain of it. Could we go even further, perhaps creating a similar approach for secondary schools or adding in an emphasis on mathematics? That would be the next frontier.

The Policy Importance of +0.50

If we can routinely achieve an effect size of +0.50 in reading in most Title I schools, this would provide a real challenge for policy makers. Many policy makers argue that money does not make much difference in education, or that housing, employment, and other basic economic improvements are needed before major improvements in the education of disadvantaged students will be possible. But what if it became widely known that outcomes in high-poverty schools could be reliably and substantially improved at a modest cost, compared to the outcomes? Policy makers would hopefully focus on finding ways to provide the resources needed if they could be confident in the outcomes.

As Amundsen knew, difficult goals can be attained with meticulous planning and high-quality implementation. Every element of his expedition had been tested extensively in real arctic conditions, and had been found to be effective and practical. We would propose taking a similar path to universal success in reading. Each component of a practical plan to reach an effect size of +0.50 or more must be proven to be effective in schools serving many disadvantaged students. Combining proven approaches, we can add sufficiently to the reading achievement of disadvantaged students to enable them to perform as well as middle-class students do. It just takes an audacious goal and the commitment and resources to accomplish it.

References

Borman, G., & Hewes, G. (2002).  Long-term effects and cost effectiveness of Success for All.  Educational Evaluation and Policy Analysis, 24 (2), 243-266.

Cheung, A., Xie, C., Zhang, T., & Slavin, R. E. (in press). Success for All: A quantitative synthesis of evaluations. Education Research Review.

Inns, A., Lake, C., Pellegrini, M., & Slavin, R. (2019). A synthesis of quantitative research on programs for struggling readers in elementary schools. Available at www.bestevidence.org. Manuscript submitted for publication.

Madden, N. A., Slavin, R. E., Karweit, N. L., Dolan, L., & Wasik, B. (1993). Success for All: Longitudinal effects of a schoolwide elementary restructuring program. American Educational Research Journal, 30, 123-148.

Madden, N. A., & Slavin, R. E. (2017). Evaluations of technology-assisted small-group tutoring for struggling readers. Reading & Writing Quarterly, 1-8. http://dx.doi.org/10.1080/10573569.2016.1255577

This blog was developed with support from the Laura and John Arnold Foundation. The views expressed here do not necessarily reflect those of the Foundation.

Why Not the Best?

In 1879, Thomas Edison invented the first practical lightbulb. The main problem he faced was in finding a filament that would glow, but not burn out too quickly. To find it, he tried more than 6000 different substances that had some promise as filaments. The one he found was carbonized cotton, which worked far better than all the others (tungsten, which we use now, came much later).

Of course, the incandescent light changed the world. It replaced far more expensive gas lighting systems, and was much more versatile. The lightbulb captured the evening and nighttime hours for every kind of human activity.

Yet if the lightbulb had been an educational innovation, it probably would have been proclaimed a dismal failure. Skeptics would have noted that only one out of six thousand filaments worked. Meta-analysts would have averaged the effect sizes for all 6000 experiments and concluded that the average effect size across the 6000 filaments was only +0.000000001. Hardly worthwhile. If Edison’s experiments were funded by government, politicians would have complained that 5,999 of Edison’s filaments were a total waste of taxpayers’ money. Economists would have computed benefit-cost ratios and concluded that even if Edison’s light worked, the cost of making the first one was astronomical, not to mention the untold cost of setting up electrical generation and wiring systems.

This is all ridiculous, you must be saying. But in the world of evidence-based education, comparable things happen all the time. In 2003, Borman et al. did a meta-analysis of 300 studies of 29 comprehensive (whole-school) reform designs. They identified three as having solid evidence of effectiveness. Rather than celebrating and disseminating those three (and continuing research and development to identify more of them), the U.S. Congress ended its funding for dissemination of comprehensive school reform programs. Turn out the light before you leave, Mr. Edison!

Another common practice in education is to do meta-analyses averaging outcomes across an entire category of programs or policies, and ignoring the fact that some distinctively different and far more effective programs are swallowed up in the averages. A good example is charter schools. Large-scale meta-analyses by Stanford’s CREDO (2013) found that the average effect sizes for charter schools are effectively zero. A 2015 analysis found better, but still very small effect sizes in urban districts (ES = +0.04 in reading, +0.05 in math). The What Works Clearinghouse published a 2010 review that found slight negative effects of middle school charters. These findings are useful in disabusing us of the idea that charter schools are magic, and get positive outcomes just because they are charter schools. However, they do nothing to tell us about extraordinary charter schools using methods that other schools (perhaps including non-charters) could also use. There is more positive evidence relating to “no-excuses” schools, such as KIPP and Success Academies, but among the thousands of charters that now exist, is this the only type of charter worth replicating? There must be some bright lights among all these bulbs.

As a third example, there are now many tutoring programs used in elementary reading and math with struggling learners. Effect sizes across all forms of tutoring average about +0.30, in both reading and math. But there are reading tutoring approaches with effect sizes of +0.50 or more. If these programs are readily available, why would schools adopt programs less effective than the best? The average is useful for research purposes, and there are always considerations of costs and availability, but I would think any school would want to ignore the average for all types of programs and look into the ones that can do the most for their kids, at a reasonable cost.

I’ve often heard teachers and principals point out that “parents send us the best kids they have.” Yes they do, and for this reason it is our responsibility as educators to give those kids the best programs we can. We often describe educating students as enlightening them, or lifting the lamp of learning, or fiat lux. Perhaps the best way to fiat a little more lux is to take a page from Edison, the great luxmeister: Experiment tirelessly until we find what works. Then use the best we have.

Reference

Borman, G. D., Hewes, G. M., Overman, L. T., & Brown, S. (2003). Comprehensive school reform and achievement: A meta-analysis. Review of Educational Research, 73(2), 125-230.

The Gap

Recently, Maryland released its 2019 state PARCC scores.  I read an article about the scores in the Baltimore Sun.  The pattern of scores was the same as usual, some up, some down. Baltimore City was in last place, as usual.  The Sun helpfully noted that this was probably due to high levels of poverty in Baltimore.  Then the article noted that there was a serious statewide gap between African American and White students, followed by the usual shocked but resolute statements about closing the gap from local superintendents.

Some of the superintendents said that in order to combat the gap, they were going to take a careful look at the curriculum.  There is nothing wrong with looking at curriculum.  All students should receive the best curriculum we can provide them.  However, as a means of reducing the gap, changing the curriculum is not likely to make much difference.

First, there is plentiful evidence from rigorous studies showing that changing from one curriculum to another, or one textbook to another, or one set of standards to another, makes little difference in student achievement.  Some curricula have more interesting or up to date content than others. Some meet currently popular standards better than others. But actual meaningful increases in achievement compared to a control group using the old curriculum?  This hardly ever happens. We once examined all of the textbooks rated “green” (the top ranking on EdReports, which reviews textbooks for alignment with college- and career-ready standards). Out of dozens of reading and math texts with this top rating,  two had small positive impacts on learning, compared to control groups.  In contrast, we have found more than 100 reading and math programs that are not textbooks or curricula that have been found to significantly increase student achievement more than control groups using current methods (see www.evidenceforessa.org).

But remember that at the moment, I am talking about reducing gaps, not increasing achievement overall.  I am unaware of any curriculum, textbook, or set of standards that is proven to reduce gaps. Why should they?  By definition, a curriculum or set of standards is for all students.  In the rare cases when a curriculum does improve achievement overall, there is little reason to expect it to increase performance for one  specific group or another.

The way to actually reduce gaps is to provide something extremely effective for struggling students. For example, the Sun article on the PARCC scores highlighted Lakeland Elementary/Middle, a Baltimore City school that gained 20 points on PARCC since 2015. How did they do it? The University of Maryland, Baltimore County (UMBC) sent groups of undergraduate education majors to Lakeland to provide tutoring and mentoring.  The Lakeland kids were very excited, and apparently learned a lot. I can’t provide rigorous evidence for the UMBC program, but there is quite a lot of evidence for similar programs, in which capable and motivated tutors without teaching certificates work with small groups of students in reading or math.

Tutoring programs and other initiatives that focus on the specific kids who are struggling have an obvious link to reducing gaps, because they go straight to where the problem is rather than doing something less targeted and less intensive.

Serious gap-reduction approaches can be used with any curriculum or set of standards. Districts focused on standards-based reform may also provide tutoring or other proven gap-reduction approaches along with new textbooks to students who need them.  The combination can be powerful. But the tutoring would most likely have worked with the old curriculum, too.

If all struggling students received programs effective enough to bring all of them to current national averages, the U.S. would be the highest-performing national school system in the world.  Social problems due to inequality, frustration, and inadequate skills would disappear. Schools would be happier places for kids and teachers alike.

The gap is a problem we can solve, if we decide to do so.  Given the stakes involved for our economy, society, and future, how could we not?

Achieving Breakthroughs in Education By Transforming Effective But Expensive Approaches to be Affordable at Scale

It’s summer in Baltimore. The temperatures are beastly, the humidity worse. I grew up in Washington, DC, which has the same weather. We had no air conditioning, so summers could be torture. No one could sleep, so we all walked around like zombies, yearning for fall.

Today, however, summers in Baltimore are completely bearable. The reason, of course, is air conditioning. Air conditioning existed when I was a kid, but hardly anyone could afford it.  I think the technology has gradually improved, but there was no scientific or technical breakthrough, as far as I know.  Yet somehow, all but the poorest families can afford air conditioning, so summer in Baltimore can be survived. Families that cannot afford air conditioning need assistance, especially for health reasons, but this number is small.

The story of air conditioning resembles that of many other technologies. What happens is that a solution is devised for a very important problem. The solution is too expensive for ordinary people to use, so initially, it is used in circumstances that justify the cost. For example, early automobiles were far too expensive for the general public, but they were used for important applications in which the benefits were particularly obvious, such as delivery trucks and cars for doctors and veterinarians. Also, wealthy individuals and race car drivers could afford the early autos. These applications provided experience with the manufacture, use, and repair of automobiles and encouraged investments in infrastructure, paving the way (so to speak) for mass production of cars (such as the Model T) that could be afforded by a much larger portion of the population and economy. Modest improvements are constantly being made, but the focus is on making the technology less expensive, so it can be more widely used. In medicine, penicillin was discovered in the 1920s, but not until the advent of World War II was it made inexpensive enough for practical use. It saved millions of lives not because it had been discovered, but because the Merck Company was commissioned to find a way to make it practicable (the solution involved growing penicillin on rotting squash).

Innovations in education can work in a similar way.  One obvious example is instructional technology, which existed before the 1970s but is only now becoming universally available, mostly because it is falling in price.  However, what education has rarely done is to create expensive but hugely effective interventions and then figure out how to do them cheaply, without reducing their impact.

Until now.

If you are a regular reader of my blog, you can guess where I am going: Tutoring. As everyone knows, one-to-one tutoring by certified teachers is extremely effective. No surprise there. As you regulars will also know, rigorous research over the past 20 years has established that tutoring by well-trained, well-supervised teaching assistants using proven methods routinely produces outcomes just as good as tutoring by certified teachers, at half the cost. Further, one-to-small group tutoring, up to one to four, can be almost as effective as one-to-one tutoring in reading, and equally effective in mathematics (see www.bestevidence.org).

One-to-four tutoring by teaching assistants requires about one-eighth of the cost of one-to-one tutoring by teachers. The mean effect size for both types of tutoring is about +0.30, but several programs are able to produce effect sizes in excess of +0.50, the national mean difference on NAEP between disadvantaged and middle-class students. (As a point of comparison, effects of technology applications average +0.05 in reading for elementary struggling readers and +0.07 in math for all elementary students. Urban charter schools average +0.04 in reading and +0.05 in math.)
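
For readers who want to see where the one-eighth figure comes from, here is a rough sketch of the arithmetic. It assumes that a teaching assistant costs half as much as a certified teacher (the figure given above) and that a tutor's time is shared equally among the students in a group. It ignores training, supervision, and materials, so treat it as an illustration rather than a budget. The function and variable names are my own.

```python
from fractions import Fraction


def relative_cost_per_student(tutor_cost_ratio, group_size):
    """Per-student cost relative to one-to-one tutoring by a certified teacher.

    Assumption: the tutor's cost is split evenly across the students
    tutored at the same time.
    """
    return Fraction(tutor_cost_ratio) / group_size


ta_cost_ratio = Fraction(1, 2)  # teaching assistant at half the cost of a teacher
group_size = 4                  # one-to-four tutoring

relative_cost = relative_cost_per_student(ta_cost_ratio, group_size)
students_served_per_budget = 1 / relative_cost

print(relative_cost)               # 1/8
print(students_served_per_budget)  # 8
```

Put another way, under these assumptions the money that pays for one student to be tutored one-to-one by a teacher would pay for eight students to be tutored in groups of four by teaching assistants.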

Reducing the cost of tutoring should not be seen as a way for schools to save money.  Instead, it should be seen as a way to provide the benefits of tutoring to much larger numbers of students.  Because of its cost, tutoring has been largely restricted to the primary grades (especially first), to perhaps a semester of service, and to reading, but not math.  If tutoring is much less expensive but equally effective, then tutoring can be extended to older students and to math.  Students who need more than a semester of tutoring, or need “booster shots” to maintain their gains into later grades, should be able to receive the tutoring they need, for as long as they need it.

Tutoring has been how rich and powerful people educated their children since the beginning of time.  Ancient Romans, Greeks, and Egyptians had their children tutored if they could afford it.  The great Russian educational theorist, Lev Vygotsky, never saw the inside of a classroom as a child, because his parents could afford to have him tutored.  As a slave, Frederick Douglass received one-to-one tutoring (secretly and illegally) from his owner’s wife, right here in Baltimore.  When his master found out and forbade his wife to continue, Douglass sought further tutoring from immigrant boys on the docks where he worked, in exchange for his master’s wife’s fresh-cooked bread.  Helen Keller received tutoring from Anne Sullivan.  Tutoring has long been known to be effective.  The only question is, or should be, how do we maximize tutoring’s effectiveness while minimizing its cost, so that all students who need it can receive it?

If air conditioning had been like education, we might have celebrated its invention, but sadly concluded that it would never be affordable by ordinary people.  If penicillin had been like education, it would have remained a scientific curiosity until today, and millions would have died due to the lack of it.  If cars had been like education, only the rich would have them.

Air conditioning for all?  What a cool idea.  Cost-effective tutoring for all who need it?  Wouldn’t that be smart?

Photo credit: U.S. Navy photo by Pat Halton [Public domain]
