After the Pandemic: Can We Welcome Students Back to Better Schools?

I am writing in March 2020, at what may be the scariest point in the COVID-19 pandemic in the U.S. We are just now beginning to understand the potential catastrophe, and to take the actions most likely to reduce the incidence of the disease.

One of the most important preventive measures is school closure. At this writing, thirty entire states have closed their schools, as have many individual districts, including Los Angeles. It is clear that school closures will go far beyond this, both in the U.S. and elsewhere.

I am not an expert on epidemiology, but I did want to make some observations about how widespread school closure could affect education, and (ever the optimist) how this disaster could provide a basis for major improvements in the long run.

Right now, schools are closing for a few weeks, with an expectation that after spring break, all will be well again, and schools might re-open. From what I read, this is unlikely. The virus will continue to spread until it runs out of vulnerable people. The purpose of school closures is to reduce the rate of transmission. Children themselves tend not to get the disease, for some reason, but they do transmit the disease, mostly at school (and then to adults). Only when there are few new cases to transmit can schools be responsibly re-opened. No one knows for sure, but a recent article in Education Week predicted that schools will probably not re-open this school year (Will, 2020). Kansas is the first state to announce that schools will be closed for the rest of the school year, but others will surely follow.

Will students suffer from school closure? There will be lasting damage if students lose parents, grandparents, and other relatives, of course. Their achievement may take a dip, but a remarkable study reported by Ceci (1991) examined the impact of two or more years of school closures in the Netherlands in World War II, and found an initial loss in IQ scores that quickly rebounded after schools re-opened after the war. From an educational perspective, the long-term impact of closure itself may not be so bad. A colleague, Nancy Karweit (1989), studied achievement in districts with long teacher strikes, and did not find much of a lasting impact.

In fact, there is a way in which wise state and local governments might use an opportunity presented by school closures. If schools closing now stay closed through the end of the school year, that could leave large numbers of teachers and administrators with not much to do (assuming they are not furloughed, which could happen). Imagine that, where feasible, this time were used for school leaders to consider how they could welcome students back to much improved schools, and to provide teachers with (electronic) professional development to implement proven programs. This might involve local, regional, or national conversations focused on what strategies are known to be effective for each of the key objectives of schooling. For example, a national series of conversations could take place on proven strategies for beginning reading, for middle school mathematics, for high school science, and so on. By design, the conversations would be focused not just on opinions, but on rigorous evidence of what works. A focus on improving health and disease prevention would be particularly relevant to the current crisis, along with implementing proven academic solutions.

Particular districts might decide to implement proven programs, and then use school closure to provide time for high-quality professional development on instructional strategies that meet the ESSA evidence standards.

Of course, all of the discussion and professional development would have to be done using electronic communications, for obvious reasons of public health. But might it be possible to make wise use of school closure to improve the outcomes of schooling using professional development in proven strategies? With rapid rollout of existing proven programs and dedicated funding, it certainly seems possible.

States and districts are making a wide variety of decisions about what to do during the time that schools are closed. Many are moving to e-learning, but this may be of little help in areas where many students lack computers or access to the internet at home. In some places, a focus on professional development for next school year may be the best way to make the best of a difficult situation.

There have been many times in the past when disasters have led to lasting improvements in health and education. This could be one of these opportunities, if we seize the moment.

Photo credit: Liam Griesacker

References

Ceci, S. J. (1991). How much does schooling influence general intelligence and its cognitive components? A reassessment of the evidence. Developmental Psychology, 27(5), 703–722. https://doi.org/10.1037/0012-1649.27.5.703

Karweit, N. (1989). Time and learning: A review. In R. E. Slavin (Ed.), School and Classroom Organization. Hillsdale, NJ: Erlbaum.

Will, M. (2020, March 15). School closure for the coronavirus could extend to the end of the school year, some say. Education Week.

 This blog was developed with support from the Laura and John Arnold Foundation. The views expressed here do not necessarily reflect those of the Foundation.

Note: If you would like to subscribe to Robert Slavin’s weekly blogs, just send your email address to thebee@bestevidence.org

What Works in Professional Development

I recently read an IES-funded study, called “The Effects of a Principal Professional Development Program Focused on Instructional Leadership.” The study, reported by a research team at Mathematica (Hermann et al., 2019), was a two-year evaluation of a Center for Educational Leadership (CEL) program in which elementary principals received 188 hours of PD, including a 28-hour summer institute at the beginning of the program, quarterly virtual professional learning community sessions in which principals met other principals and CEL coaches, and 50 hours per year of individual coaching in which principals worked with their CEL coaches to set goals, implement strategies, and analyze effects of strategies. Principals helped teachers improve instruction by observing teachers, giving feedback, and selecting curricula; sought to improve their recruitment, management, and retention strategies; held PD sessions for teachers; and focused on setting a school mission, improving school climate, and deploying resources effectively.

A total of 100 low-achieving schools were recruited. Half received the CEL program, and half served as controls. After one, two, and three years, there were no differences between experimental and control schools on standardized measures of student reading or mathematics achievement, no differences on school climate, and no differences on principal or teacher retention.

So what happened? First, it is important to note that previous studies of principal professional development have also found zero (e.g., Jacob et al., 2014) or very small and inconsistent effects (e.g., Nunnery et al., 2011, 2016). Second, numerous studies of certain types of professional development for teachers have also found very small or zero impacts. For example, a review of research on elementary mathematics programs by Pellegrini et al. (2019) identified 12 qualifying studies of professional development for mathematics content and pedagogy. The average effect size was essentially zero (ES=+0.04).

What does work in professional development?

In sharp contrast to these dismal findings, there are many forms of professional development that work very well. For example, in the Pellegrini et al. (2019) mathematics review, professional development designed to teach teachers to use specific instructional processes was very effective, averaging ES=+0.25. These included studies of cooperative learning, classroom management strategies, and individualized instruction. In fact, other than one-to-one and one-to-small group tutoring, no other type of approach was as effective. In a review of research on programs for elementary struggling readers by Inns et al. (2019), programs incorporating cooperative learning had an effect size of +0.29, more effective than any other programs except tutoring. A review of research on secondary reading programs by Baye et al. (2018) found that cooperative learning programs and whole-school models incorporating cooperative learning, along with writing-focused models also incorporating cooperative learning, had larger impacts than anything other than tutoring.

How can it be that professional development on cooperative learning and classroom management is so much more effective than professional development on content, pedagogy, and general teaching strategies?

One reason, I would submit, is that it is very difficult to teach someone to improve practices that they already know how to do. For example, if as an adult you took a course in tennis or golf or sailing or bridge, you probably noticed that you learned very rapidly, retained what you learned, and quickly improved your performance in that new skill. Contrast this with a course on dieting or parenting. The problem with improving your eating or parenting is that you already know very well how to eat, and if you already have kids, you know how to parent. You could probably stand some improvement in these areas, which is why you took the course, but no matter how motivated you are to improve, over time you are likely to fall back on well-established routines, or even bad habits. The same is true of teaching. Early in their careers teachers develop routine ways of performing each of the tasks of teaching: lecturing, planning, praising, dealing with misbehavior, and so on. Teachers know their content and settle into patterns of communicating that content to students. Then one day a professional developer shows up, who watches teachers teaching and gives them advice. The advice might take, but quite often teachers give it a try, run into difficulties, and then settle back into comfortable routines.

Now consider a more specific, concrete set of strategies that are distinctly different from what teachers typically do: cooperative learning. Teachers can readily learn the key components. They put their students in mixed groups of four or five. After an initial lesson, they give students opportunities to work together to make sure that everyone can succeed at the task. Teachers observe and assist students during team practice. They assess student learning, and celebrate student success. Every one of these components is a well-defined, easily learned, and easily observed step. Teachers need training and coaching to succeed at first, but after a while, cooperative learning itself becomes second nature. It helps that almost all kids love to be noisy and engaged, and love to work with each other, so they are rooting for the teacher to succeed. But for most teachers, structured cooperative learning is distinctly different from ordinary teaching, so it is easy to learn and maintain.


As another example, consider classroom management strategies used in many programs. Trainers show teachers how to use Popsicle sticks with kids’ names on them to call on students, so all kids have to pay attention in case they are called. To get students’ immediate attention, teachers may learn to raise their hands and have students raise theirs, or to ring a bell, or to say a phrase like “one, two, three, look at me.” Teachers may learn to give points to groups or individuals who are meeting class expectations. They may learn to give students or groups privileges, such as lining up first to go outside or having the privilege of selecting and leading their favorite team or class cheer. These and many other teacher behaviors are clear, distinct, easily learned, and immediately solve persistent problems of low-level disturbances.

The point is not that these cooperative learning or classroom management strategies are more important than content knowledge or pedagogy. However, they are easily learned, retained, and institutionalized ways of solving critical daily problems of teaching, and they are so well-defined and clear that when they have started working, teachers are likely to hold on to them indefinitely and are unlikely to fall back on other strategies that may be less effective but are already deeply ingrained.

I am not suggesting that only observable, structural classroom reforms such as cooperative learning or classroom management strategies are good uses of professional development resources. All aspects of teaching need successive improvement, of course. But I am using these examples to illustrate why certain types of professional development are very difficult to make effective. It may be that improving the content and pedagogy teachers use day in and day out may require more concrete, specific strategies. I hope developers and researchers will create and successfully evaluate such new approaches, so that teachers can continually improve their effectiveness in all areas. But there are whole categories of professional development that research repeatedly finds are just not working. Researchers and educators need to focus on why this is true, and then design new PD strategies that are less subtle, more observable, and deal more with actual teacher and student behavior.

References

Hermann, M., Clark, M., James-Burdumy, S., Tuttle, C., Kautz, T., Knechtel, V., Dotter, D., Wulsin, C.S., & Deke, J. (2019). The effects of a principal professional development program focused on instructional leadership (NCEE 2020-0002). Washington, DC: National Center for Education Evaluation and Regional Assistance, Institute of Education Sciences, U.S. Department of Education.

Inns, A., Lake, C., Pellegrini, M., & Slavin, R. (2019). A synthesis of quantitative research on programs for struggling readers in elementary schools. Available at www.bestevidence.org. Manuscript submitted for publication.

Jacob, R., Goddard, K., Miller, R., & Goddard, Y. (2014). Exploring the causal impact of the McREL Balanced Leadership Program on leadership, principal efficacy, instructional climate, educator turnover, and student achievement. Educational Evaluation and Policy Analysis, 52, 187-220.

Nunnery, J., Ross, S., Chappel, S., Pribesh, S., & Hoag-Carhart, E. (2011). The impact of the National Institute for School Leadership’s Executive Development Program on school performance trends in Massachusetts: Cohort 2 Results. Norfolk, VA: Center for Educational Partnerships, Old Dominion University.

Nunnery, J., Ross, S., & Reilly, J. (2016). An evaluation of the National Institute for School Leadership: Executive Development Program in Milwaukee Public Schools. Norfolk, VA: Center for Educational Partnerships, Old Dominion University.

Pellegrini, M., Inns, A., Lake, C., & Slavin, R. (2019). Effective programs in elementary mathematics: A best-evidence synthesis. Available at www.bestevidence.org. Manuscript submitted for publication.


What Works in Teaching Writing?

“I’ve learned that people will forget what you said, people will forget what you did, but people will never forget how you made them feel. The idea is to write it so that people hear it and it slides through the brain and goes straight to the heart.”   -Maya Angelou

It’s not hard to make an argument that creative writing is the noblest of all school subjects. To test this, try replacing the word “write” in this beautiful quotation from Maya Angelou with “read” or “compute.” Students must be proficient in reading and mathematics and other subjects, of course, but in what other subject must learners study how to reach the emotions of their readers?


Good writing is the mark of an educated person. Perhaps especially in the age of electronic communications, we know most of the people we know largely through their writing. Job applications depend on the ability of the applicant to make themselves interesting to someone they’ve never seen. Every subject–science, history, reading, and many more–requires its own exacting types of writing.

Given the obvious importance of writing in people’s lives, you’d naturally expect that writing would occupy a central place in instruction. But you’d be wrong. Before secondary school, writing plays third fiddle to the other two of the 3Rs, reading and ‘rithmetic, and in secondary school, writing is just one among many components of English. College professors, employers, and ordinary people complain incessantly about the poor writing skills of today’s youth. The fact is that writing is not attended to as much as it should be, and the results are apparent to all.

Not surprisingly, the inadequate focus on writing in U.S. schools extends to an inadequate focus on research on this topic as well. My colleagues and I recently carried out a review of research on secondary reading programs. We found 69 studies that met rigorous inclusion criteria (Baye, Lake, Inns, & Slavin, in press). Recently, our group completed a review of secondary writing using similar inclusion standards, under funding from the Education Endowment Foundation in England (Slavin, Lake, Inns, Baye, Dachet, & Haslam, 2019). Yet we found only 14 qualifying studies, of which 11 were in secondary schools (we searched down to third grade).

To be fair, our inclusion standards were pretty tough. We required that studies compare experimental groups to randomized or matched control groups on measures independent of the experimental treatment. Tests could not have been made up by teachers or researchers, and they could not be scored by the teachers who taught the classes. Experimental and control groups had to be well-matched at pretest and have nearly equal attrition (loss of subjects over time). Studies had to have a duration of at least 12 weeks. Studies could include students with IEPs, but they could not be in self-contained, special education settings.

We divided the studies into three categories. One was studies of writing process models, in which students worked together to plan, draft, revise, and edit compositions in many genres. A very similar category was cooperative learning models, most of which also used a plan-draft-revise-edit cycle, but placed a strong emphasis on use of cooperative learning teams. A third category was programs that balanced writing with reading instruction.

Remarkably, the average effect sizes of the three categories were virtually identical, with a mean effect size of +0.18. There was significant variation within categories, however. In the writing process category, the interesting story concerned a widely used U.S. program, Self-Regulated Strategy Development (SRSD), evaluated in two qualifying studies in England. In one, the program was implemented in rural West Yorkshire and had huge impacts on struggling writers, the students for whom SRSD was designed. The effect size was +0.74. However, in a much larger study in urban Leeds and Lancashire, outcomes were not so positive (ES = +0.01), although effects were largest for struggling writers. There were many studies of SRSD in the U.S., but none of them qualified, because they lacked a control group, were too brief, used measures made up by researchers, or took place in all-special education classrooms.

Three programs that emphasize cooperative learning had notably positive impacts. These were Writing Wings (ES = +0.13), Student Team Writing (ES = +0.38), and Expert 21 (ES = +0.58).

Among programs emphasizing reading and writing, two had a strong focus on English learners: Pathway (ES = +0.32) and ALIAS (ES = +0.18). Another two approaches had an explicit focus on preparing students for freshman English: College Ready Writers Program (ES = +0.18) and Expository Reading and Writing Course (ES = +0.13).

Looking across all categories, there were several factors common to successful programs that stood out:

  • Cooperative Learning. Cooperative learning usually aids learning in all subjects, but it makes particular sense in writing, as a writing team gives students opportunities to give and receive feedback on their compositions, facilitating their efforts to gain insight into how their peers think about writing, and giving them a sympathetic and ready audience for their writing.
  • Writing Process. Teaching students step-by-step procedures to work with others to plan, draft, revise, and edit compositions in various genres appears to be very beneficial. The first steps focus on helping students get their ideas down on paper without worrying about mechanics, while the later stages help students progressively improve the structure, organization, grammar, and punctuation of their compositions. These steps help students reluctant to write at all to take risks at the outset, confident that they will have help from peers and teachers to progressively improve their writing.
  • Motivation and Joy in Self-Expression. In the above quote, Maya Angelou talks about the importance in writing of “sliding through the brain to get to the heart.” But to the writer, this process must work the other way, too. Good writing starts in the heart, with an urge to say something of importance. The brain shapes writing to make it readable, but writing must start with a message that the writer cares about. This principle is demonstrated most obviously in writing process and cooperative learning models, where every effort is made to motivate students to find exciting and interesting topics to share with their peers. In programs balancing reading and writing, reading is used to help students have something important to write.
  • Extensive Professional Development. Learning to teach writing well is not easy. Teachers need opportunities to learn new strategies and to apply them in their own writing. All of the successful writing programs we identified in our review provided extensive, motivating, and cooperative professional development, often designed as much to help teachers catch the spirit of writing as to follow a set of procedures.

Our review of writing research found that there is considerable consensus in how to teach writing. There were more commonalities than differences across the categories. Effects were generally positive, however, because control teachers were not using these consensus strategies, or were not doing so with the skills imparted by the professional development characteristic of all of the successful approaches.

We cannot expect writing instruction to routinely produce Maya Angelous or Mark Twains. Great writers add genius to technique. However, we can create legions of good writers, and our students will surely benefit.

References

Baye, A., Lake, C., Inns, A., & Slavin, R. (in press). Effective reading programs for secondary students. Reading Research Quarterly.

Slavin, R. E., Lake, C., Inns, A., Baye, A., Dachet, D., & Haslam, J. (2019). A quantitative synthesis of research on writing approaches in Key Stage 2 and secondary schools. London: Education Endowment Foundation.

Photo credit: Kyle Tsui from Washington, DC, USA [CC BY 2.0 (https://creativecommons.org/licenses/by/2.0)]


A Mathematical Mystery

My colleagues and I wrote a review of research on elementary mathematics (Pellegrini, Inns, Lake, & Slavin, 2018). I’ve written about it before, but I wanted to home in on one extraordinary set of findings.

In the review, there were 12 studies that evaluated programs that focused on providing professional development for elementary teachers of mathematics content and mathematics-specific pedagogy. I was sure that this category would find positive effects on student achievement, but it did not. The most remarkable (and depressing) finding involved the huge year-long Intel study in which 80 teachers received 90 hours of very high-quality in-service during the summer, followed by an additional 13 hours of group discussions of videos of the participants’ class lessons. Teachers using this program were compared to 85 control teachers. After all this, students in the Intel classes scored slightly worse than controls on standardized measures (Garet et al., 2016).

If the Intel study were the only disappointment, one might look for flaws in their approach or their evaluation design or other things specific to that study. But as I noted earlier, all 12 of the studies of this kind failed to find positive effects, and the mean effect size was only +0.04 (n.s.).

Lest anyone jump to the conclusion that nothing works in elementary mathematics, I would point out that this is not the case. The most impactful category was tutoring programs, so that’s a special case. But the second most impactful category had many features in common with professional development focused on mathematics content and pedagogy, but had an average effect size of +0.25. This category consisted of programs focused on classroom management and motivation: Cooperative learning, classroom management strategies using group contingencies, and programs focusing on social emotional learning.

So there are successful strategies in elementary mathematics, and they all provided a lot of professional development. Yet programs for mathematics content and pedagogy, all of which also provided a lot of professional development, did not show positive effects in high-quality evaluations.

I have some ideas about what may be going on here, but I advance them cautiously, as I am not certain about them.

The theory of action behind professional development focused on mathematics content and pedagogy assumes that elementary teachers have gaps in their understanding of mathematics content and mathematics-specific pedagogy. But perhaps whatever gaps they have are not so important. Here is one example. Leading mathematics educators today take a very strong view that fractions should never be taught using pizza slices, but only using number lines. The idea is that pizza slices are limited to certain fractional concepts, while number lines are more inclusive of all uses of fractions. I can understand and, in concept, support this distinction. But how much difference does it make? Students who are learning fractions can probably be divided into three pizza slices. One slice represents students who understand fractions very well, however they are presented, and another slice consists of students who have no earthly idea about fractions. The third slice consists of students who could have learned fractions if they were taught with number lines but not pizzas. The relative sizes of these slices vary, but I’d guess the third slice is the smallest. Whatever it is, the number of students whose success depends on pizzas vs. number lines is unlikely to be large enough to shift the whole group mean very much, and that is what is reported in evaluations of mathematics approaches. For example, if the “already got it” slice is one third of all students, and the “probably won’t get it” slice is also one third, the slice consisting of students who might get the concept one way but not the other is also one third. If the effect size for that third slice were as high as an improbable +0.20, and zero for the other two, the average for all students would be less than +0.07, averaging across the whole pizza.
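
For readers who want to check the slice arithmetic, here is a small Python sketch. It rests on stated assumptions: the slice shares and effect sizes are the illustrative guesses from the paragraph above, not data from any study; the names `slices` and `overall_effect_size` are made up for this example; and treating the whole-group effect as a share-weighted average of the slice effects is a simplification.

```python
# Illustrative only: shares and effect sizes are hypothetical guesses taken
# from the essay's example, not results from any study.
# Each slice maps to (share of all students, assumed effect size for that slice).
slices = {
    "already got it": (1 / 3, 0.00),
    "probably won't get it": (1 / 3, 0.00),
    "depends on how it is taught": (1 / 3, 0.20),
}


def overall_effect_size(slices):
    """Share-weighted average of per-slice effect sizes (a simplification)."""
    return sum(share * effect for share, effect in slices.values())


if __name__ == "__main__":
    print(f"Whole-pizza effect size: {overall_effect_size(slices):+.3f}")
```

Under these assumptions the whole-pizza average works out to 0.20 / 3, or roughly +0.067. Shrinking the third slice, which I suspect is closer to reality, pulls the average even nearer to zero.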


A related possibility relates to teachers’ knowledge. Assume that one slice of teachers already knows a lot of the content before the training. Another slice is not going to learn or use it. The third slice, those who did not know the content before but will use it effectively after training, is the only slice likely to show a benefit, but this benefit will be swamped by the zero effects for the teachers who already knew the content and those who will not learn or use it.

If teachers are standing at the front of the class explaining mathematical concepts, such as proportions, a certain proportion of students are learning the content very well and a certain proportion are bored, terrified, or just not getting it. It’s hard to imagine that the successful students are gaining much from a change of content or pedagogy, and only a small proportion of the unsuccessful students will all of a sudden understand what they did not understand before, just because it is explained better. But imagine that instead of only changing content, the teacher adopts cooperative learning. Now the students are having a lot of fun working with peers. Struggling students have an opportunity to ask for explanations and help in a less threatening environment, and they get a chance to see and ultimately absorb how their more capable teammates approach and solve difficult problems. The already high-achieving students may become even higher achieving, because as every teacher knows, explanation helps the explainer as much as the student receiving the explanation.

The point I am making is that the findings of our mathematics review may reinforce a general lesson we take away from all of our reviews: Subtle treatments produce subtle (i.e., small) impacts. Students quickly establish themselves as high or average or low achievers, after which time it is difficult to fundamentally change their motivations and approaches to learning. Making modest changes in content or pedagogy may not be enough to make much difference for most students. Instead, dramatically changing motivation, providing peer assistance, and making mathematics more fun and rewarding, seems more likely to make a significant change in learning than making subtle changes in content or pedagogy. That is certainly what we have found in systematic reviews of elementary mathematics and elementary and secondary reading.

Whatever the student outcomes are compared to controls, there may be good reason to improve mathematics content and pedagogy. But if we are trying to improve achievement for all students, the whole pizza, we need to use methods that make a more profound impact on all students. And that is true any way you slice it.

References

Garet, M. S., Heppen, J. B., Walters, K., Parkinson, J., Smith, T. M., Song, M., & Borman, G. D. (2016). Focusing on mathematical knowledge: The impact of content-intensive teacher professional development (NCEE 2016-4010). Washington, DC: U.S. Department of Education.

Pellegrini, M., Inns, A., Lake, C., & Slavin, R. E. (2018). Effective programs in elementary mathematics: A best-evidence synthesis. Paper presented at the Society for Research on Effective Education, Washington, DC.


Succeeding Faster in Education

“If you want to increase your success rate, double your failure rate.” So said Thomas Watson, the founder of IBM. What he meant, of course, is that people and organizations thrive when they try many experiments, even though most experiments fail. Failing twice as often means trying twice as many experiments, leading to twice as many failures—but also, he was saying, many more successes.

[Photo: Thomas Watson]

In education research and innovation circles, many people know this quote, and use it to console colleagues who have done an experiment that did not produce significant positive outcomes. A lot of consolation is necessary, because most high-quality experiments in education do not produce significant positive outcomes. In studies funded by the Institute of Education Sciences (IES), Investing in Innovation (i3), and England’s Education Endowment Foundation (EEF), all of which require very high standards of evidence, fewer than 20% of experiments show significant positive outcomes.

The high rate of failure in educational experiments is often shocking to non-researchers, especially the government agencies, foundations, publishers, and software developers who commission the studies. I was at a conference recently in which a Peruvian researcher presented the devastating results of an experiment in which high-poverty, mostly rural schools in Peru were randomly assigned to receive computers for all of their students, or to continue with usual instruction. The Peruvian Ministry of Education was so confident that the computers would be effective that they had built a huge model of the specific computers used in the experiment and attached it to the Ministry headquarters. When the results showed no positive outcomes (except for the ability to operate computers), the Ministry quietly removed the computer statue from the top of their building.

Improving Success Rates

Much as I believe Watson’s admonition (“fail more”), there is another principle that he was implying, or so I expect: We have to learn from failure, so we can increase the rate of success. It is not realistic to expect government to continue to invest substantial funding in high-quality educational experiments if the success rate remains below 20%. We have to get smarter, so we can succeed more often. Fortunately, qualitative measures, such as observations, interviews, and questionnaires, are becoming required elements of funded research, making it easier for researchers to find out what happened and, when outcomes disappoint, what went wrong. Was the experimental program faithfully implemented? Were there unexpected responses toward the program by teachers or students?

In the course of my work reviewing positive and disappointing outcomes of educational innovations, I’ve noticed some patterns that often predict that a given program is likely or unlikely to be effective in a well-designed evaluation. Some of these are as follows.

  1. Small changes lead to small (or zero) impacts. In every subject and grade level, researchers have evaluated new textbooks, in comparison to existing texts. These almost never show positive effects. The reason is that textbooks are just not that different from each other. Approaches that do show positive effects are usually markedly different from ordinary practices or texts.
  2. Successful programs almost always provide a lot of professional development. The programs that have significant positive effects on learning are ones that markedly improve pedagogy. Changing teachers’ daily instructional practices usually requires initial training followed by on-site coaching by well-trained and capable coaches. Lots of PD does not guarantee success, but minimal PD virtually guarantees failure. Sufficient professional development can be expensive, but education itself is expensive, and adding a modest amount to per-pupil cost for professional development and other requirements of effective implementation is often the best way to substantially enhance outcomes.
  3. Effective programs are usually well-specified, with clear procedures and materials. Rarely do programs work if they are unclear about what teachers are expected to do, or if teachers are not helped to do it. In the Peruvian study of one-to-one computers, for example, students were given tablet computers at a per-pupil cost of $438. Teachers were expected to figure out how best to use them. In fact, a qualitative study found that the computers were considered so valuable that many teachers locked them up except for specific times when they were to be used. Teachers lacked specific instructional software, and they lacked the professional development they would have needed to create such software. No wonder “it” didn’t work. Other than the physical computers, there was no “it.”
  4. Technology is not magic. Technology can create opportunities for improvement, but there is little understanding of how to use technology to greatest effect. My colleagues and I have done reviews of research on effects of modern technology on learning. We found near-zero effects of a variety of elementary and secondary reading software (Inns et al., 2018; Baye et al., in press), with a mean effect size of +0.05 in elementary reading and +0.00 in secondary. In math, effects were slightly more positive (ES=+0.09), but still quite small, on average (Pellegrini et al., 2018). Some technology approaches had more promise than others, but it is time that we learned from disappointing as well as promising applications. The widespread belief that technology is the future must eventually be right, but at present we have little reason to believe that technology is transformative, and we don’t know which form of technology is most likely to be transformative.
  5. Tutoring is the most solid approach we have. Reviews of elementary reading for struggling readers (Inns et al., 2018) and secondary struggling readers (Baye et al., in press), as well as elementary math (Pellegrini et al., 2018), find outcomes for various forms of tutoring that are far beyond effects seen for any other type of treatment. Everyone knows this, but thinking about tutoring falls into two camps. One, typified by advocates of Reading Recovery, takes the view that tutoring is so effective for struggling first graders that it should be used no matter what the cost. The other, also perhaps thinking about Reading Recovery, rejects this approach because of its cost. Yet recent research on tutoring methods is finding strategies that are cost-effective and feasible. First, studies in both reading (Inns et al., 2018) and math (Pellegrini et al., 2018) find no difference in outcomes between certified teachers and paraprofessionals using structured one-to-one or one-to-small group tutoring models. Second, although one-to-one tutoring is more effective than one-to-small group, one-to-small group is far more cost-effective, as one trained tutor can work with 4 to 6 students at a time. Also, recent studies have found that tutoring can be just as effective in the upper elementary and middle grades as in first grade, so this strategy may have broader applicability than it has had in the past. The real challenge for research on tutoring is to develop and evaluate models that increase cost-effectiveness of this clearly effective family of approaches.
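To make the cost side of the tutoring argument concrete, here is a minimal sketch of how the cost per student falls as group size rises. The tutor cost per session is a made-up figure for illustration, not a number from any of the reviews cited, and the sketch says nothing about how outcomes differ by group size.

```python
# Hypothetical cost of one tutor for one session (assumed, not from the reviews)
TUTOR_COST_PER_SESSION = 30.0

def cost_per_student(group_size, tutor_cost=TUTOR_COST_PER_SESSION):
    """Split one tutor's session cost evenly across the students served."""
    return tutor_cost / group_size

# One-to-one tutoring versus small groups of 4 and 6 students
costs = {size: cost_per_student(size) for size in (1, 4, 6)}

for size, cost in costs.items():
    print(f"group of {size}: {cost:.2f} per student per session")
```

Whether small groups are more cost-effective overall also depends on how much of the one-to-one effect they retain, which is exactly what the research challenge above is about.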

The extraordinary advances in the quality and quantity of research in education, led by investments from IES, i3, and the EEF, have raised expectations for research-based reform. However, the modest percentage of recent studies meeting current rigorous standards of evidence has caused disappointment in some quarters. Instead, all findings, whether immediately successful or not, should be seen as crucial information. Some studies identify programs ready for prime time right now, but the whole body of work can and must inform us about areas worthy of expanded investment, as well as areas in need of serious rethinking and redevelopment. The evidence movement, in the form it exists today, is completing its first decade. It’s still early days. There is much more we can learn and do to develop, evaluate, and disseminate effective strategies, especially for students in great need of proven approaches.

References

Baye, A., Lake, C., Inns, A., & Slavin, R. (in press). Effective reading programs for secondary students. Reading Research Quarterly.

Inns, A., Lake, C., Pellegrini, M., & Slavin, R. (2018). Effective programs for struggling readers: A best-evidence synthesis. Paper presented at the annual meeting of the Society for Research on Educational Effectiveness, Washington, DC.

Pellegrini, M., Inns, A., & Slavin, R. (2018). Effective programs in elementary mathematics: A best-evidence synthesis. Paper presented at the annual meeting of the Society for Research on Educational Effectiveness, Washington, DC.

Photo credit: IBM [CC BY-SA 3.0 (https://creativecommons.org/licenses/by-sa/3.0)], via Wikimedia Commons


 

Rethinking Technology in Education

Antoine de Saint-Exupéry, in his 1931 classic Night Flight, had a wonderful line about early airmail service in Patagonia, South America:

“When you are crossing the Andes and your engine falls out, well, there’s nothing to do but throw in your hand.”

[Photo: Antoine de Saint-Exupéry]

I had reason to think about this quote recently, as I was attending a conference in Santiago, Chile, the presumed destination of the doomed pilot. The conference focused on evidence-based reform in education.

Three of the papers described large-scale, randomized evaluations of technology applications in Latin America, funded by the Inter-American Development Bank (IDB). Two of them documented disappointing outcomes of large-scale, traditional uses of technology. One described a totally different application.

One of the studies, reported by Santiago Cueto (Cristia et al., 2017), randomly assigned 318 high-poverty, mostly rural primary schools in Peru to receive sturdy, low-cost, practical computers, or to serve as a control group. Teachers were given great latitude in how to use the computers, but limited professional development in how to use them as pedagogical resources. Worse, the computers had software with limited alignment to the curriculum, and teachers were expected to overcome this limitation. Few did. Outcomes were essentially zero in reading and math.

Another IDB-funded study (Berlinski & Busso, 2017) was a very well-designed experiment in 85 schools in Costa Rica. Schools were randomly assigned to receive one of five approaches. All used the same content on the same schedule to teach geometry to seventh graders. One group used traditional lectures and questions with no technology. The others used active learning, active learning plus interactive whiteboards, active learning plus a computer lab, or active learning plus one computer per student. “Active learning” emphasized discussions, projects, and practical exercises.

On a paper-and-pencil test covering the content studied by all classes, all four of the experimental groups scored significantly worse than the control group. The lowest performance was seen in the computer lab condition, and, worst of all, the one computer per child condition.

The third study, in Chile (Araya, Arias, Bottan, & Cristia, 2018), was funded by the IDB and the International Development Research Center of the Canadian government. It involved a much more innovative and unusual application of technology. Fourth grade classes within 24 schools were randomly assigned to experimental or control conditions. In the experimental group, classes in similar schools were assigned to serve as competitors to each other. Within the math classes, students studied with each other and individually for a bi-monthly “tournament,” in which students in each class were individually given questions to answer on the computers. Students were taught cheers and brought to fever pitch in their preparations. The participating classes were compared to the control classes, which studied the same content using ordinary methods. All classes, experimental and control, were studying the national curriculum on the same schedule, and all used computers, so all that differed was the tournaments and the cooperative studying to prepare for the tournaments.

The outcomes were frankly astonishing. The students in the experimental classes scored much higher on national tests than controls, with an effect size of +0.30.
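For readers less familiar with effect sizes, the sketch below shows one common way to compute one: the difference between the experimental and control means, divided by a pooled standard deviation. The scores are invented for illustration, and I am not claiming this is the exact formula the Chilean study used.

```python
from math import sqrt
from statistics import mean, stdev

def effect_size(treatment, control):
    """Standardized mean difference using a pooled standard deviation."""
    n_t, n_c = len(treatment), len(control)
    pooled_var = (
        (n_t - 1) * stdev(treatment) ** 2 + (n_c - 1) * stdev(control) ** 2
    ) / (n_t + n_c - 2)
    return (mean(treatment) - mean(control)) / sqrt(pooled_var)

# Hypothetical test scores, not data from any study discussed here
control = [48, 52, 55, 60, 65]
treatment = [50, 54, 57, 62, 67]

d = effect_size(treatment, control)
print(f"effect size: {d:+.2f}")
```

In this made-up example, a two-point gain against a spread of about 6.7 points works out to roughly three tenths of a standard deviation.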

The differences in the outcomes of these three approaches are clear. What might explain them, and what do they tell us about applications of technology in Latin America and anywhere?

In Peru, the computers were distributed as planned and generally functioned, but teachers received little professional development. In fact, teachers were not given specific strategies for using the computers, but were expected to come up with their own uses for them.

The Costa Rica study did provide computer users with specific approaches to math and gave teachers much associated professional development. Yet the computers may have been seen as replacements for teachers, and the computers may just not have been as effective as teachers. Alternatively, despite extensive PD, all four of the experimental approaches were very new to the teachers and may not have been well implemented.

In contrast, in the Chilean study, tournaments and cooperative study were greatly facilitated by the computers, but the computers were not central to program effectiveness. The theory of action emphasized enhanced motivation to engage in cooperative study of math. The computers were only a tool to achieve this goal. The tournament strategy resembles a method from the 1970s called Teams-Games-Tournaments (TGT) (DeVries & Slavin, 1978). TGT was very effective, but was complicated for teachers to use, which is why it was not widely adopted. In Chile, computers helped solve the problems of complexity.

It is important to note that in the United States, technology solutions are also not producing major gains in student achievement. Reviews of research on elementary reading (ES=+0.05; Inns et al., 2018) and secondary reading (ES=-0.01; Baye et al., in press) have reported near-zero effects of technology-assisted approaches. Outcomes in elementary math are only somewhat better, averaging an effect size of +0.09 (Pellegrini et al., 2018).

The findings of these rigorous studies of technology in the U.S. and Latin America lead to a conclusion that there is nothing magic about technology. Applications of technology can work if the underlying approach is sound. Perhaps it is best to consider which non-technology approaches are proven or likely to increase learning, and only then imagine how technology could make effective methods easier, less expensive, more motivating, or more instructionally effective. As an analogy, great audio technology can make a concert more pleasant or audible, but the whole experience still depends on great composition and great performances. Perhaps technology in education should be thought of in a similar enabling way, rather than as the core of innovation.

St. Exupéry’s Patagonian pilots crossing the Andes had no “Plan B” if their engines fell out. We do have many alternative ways to put technology to work or to use other methods, if the computer-assisted instruction strategies that have dominated technology since the 1970s keep showing such small or zero effects. The Chilean study and certain exceptions to the overall pattern of research findings in the U.S. suggest appealing “Plans B.”

The technology “engine” is not quite falling out of the education “airplane.” We need not throw in our hand. Instead, it is clear that we need to re-engineer both, to ask not what is the best way to use technology, but what is the best way to engage, excite, and instruct students, and then ask how technology can contribute.

Photo credit: Distributed by Agence France-Presse (NY Times online) [Public domain], via Wikimedia Commons


References

Araya, R., Arias, E., Bottan, N., & Cristia, J. (2018, August 23). Conecta Ideas: Matemáticas con motivación social. Paper presented at the conference “Educate with Evidence,” Santiago, Chile.

Baye, A., Lake, C., Inns, A., & Slavin, R. (in press). Effective reading programs for secondary students. Reading Research Quarterly.

Berlinski, S., & Busso, M. (2017). Challenges in educational reform: An experiment on active learning in mathematics. Economics Letters, 156, 172-175.

Cristia, J., Ibarraran, P., Cueto, S., Santiago, A., & Severín, E. (2017). Technology and child development: Evidence from the One Laptop per Child program. American Economic Journal: Applied Economics, 9 (3), 295-320.

DeVries, D. L., & Slavin, R. E. (1978). Teams-Games-Tournament:  Review of ten classroom experiments. Journal of Research and Development in Education, 12, 28-38.

Inns, A., Lake, C., Pellegrini, M., & Slavin, R. (2018, March 3). Effective programs for struggling readers: A best-evidence synthesis. Paper presented at the annual meeting of the Society for Research on Educational Effectiveness, Washington, DC.

Pellegrini, M., Inns, A., & Slavin, R. (2018, March 3). Effective programs in elementary mathematics: A best-evidence synthesis. Paper presented at the annual meeting of the Society for Research on Educational Effectiveness, Washington, DC.

What’s the Evidence that Evidence Works?

I recently gave a couple of speeches on evidence-based reform in education in Barcelona.  In preparing for them, one of the organizers asked me an interesting question: “What is your evidence that evidence works?”

At one level, this is a trivial question. If schools select proven programs and practices aligned with their needs and implement them with fidelity and intelligence, with levels of resources similar to those used in the original successful research, then of course they’ll work, right? And if a school district adopts proven programs, encourages and funds them, and monitors their implementation and outcomes, then of course the appropriate use of all these programs is sure to enhance achievement district-wide, right?

Although logic suggests that a policy of encouraging and funding proven programs is sure to increase achievement on a broad scale, I like to be held to a higher standard: Evidence. And, as it happens, I have some evidence on this very topic. This evidence came from a large-scale evaluation of an ambitious, national effort to increase use of proven and promising schoolwide programs in elementary and middle schools, in a research center funded by the Institute of Education Sciences (IES) called the Center for Data-Driven Reform in Education, or CDDRE (see Slavin, Cheung, Holmes, Madden, & Chamberlain, 2013). The name of the program the experimental schools used was Raising the Bar.

How Raising the Bar Raised the Bar

The idea behind Raising the Bar was to help schools analyze their own needs and strengths, and then select whole-school reform models likely to help them meet their achievement goals. CDDRE consultants provided about 30 days of on-site professional development to each district over a 2-year period. The PD focused on review of data, effective use of benchmark assessments, school walk-throughs by district leaders to see the degree to which schools were already using the programs they claimed to be using, and then exposing district and school leaders to information and data on schoolwide programs available to them, from several providers. If districts selected a program to implement, district and school leaders received PD on ensuring effective implementation, and principals and teachers received PD on the programs they chose.

[Photo: pole vault]

Evaluating Raising the Bar

In the study of Raising the Bar we recruited a total of 397 elementary and 225 middle schools in 59 districts in 7 states (AL, AZ, IN, MS, OH, TN). All schools were Title I schools in rural and mid-sized urban districts. Overall, 30% of students were African-American, 20% were Hispanic, and 47% were White. Across three cohorts, starting in 2005, 2006, or 2007, schools were randomly assigned to either use Raising the Bar, or to continue with what they were doing. The study ended in 2009, so schools could have been in the Raising the Bar group for two, three, or four years.

Did We Raise the Bar?

State test scores were obtained from all schools and transformed to z-scores so they could be combined across states. The analyses focused on grades 5 and 8, as these were the only grades tested in some states at the time. Hierarchical linear modeling, with schools nested within districts, was used for analysis.
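The z-score step is what makes scores from different state tests comparable. A minimal sketch of the idea follows, using invented scores on two invented scales and assuming each state's scores are standardized against that state's own mean and standard deviation. The actual reference distribution used in the study is not described here, and the hierarchical linear modeling itself is not shown.

```python
from statistics import mean, pstdev

def to_z_scores(scores):
    """Express each score as standard deviations from the group mean."""
    m = mean(scores)
    s = pstdev(scores)
    return [(x - m) / s for x in scores]

# Hypothetical school means on two different state test scales
state_scores = {
    "State A": [300, 320, 340],
    "State B": [40, 50, 60],
}

# Standardize within each state, then pool onto one common scale
z_by_state = {state: to_z_scores(v) for state, v in state_scores.items()}
pooled = [z for zs in z_by_state.values() for z in zs]

for state, zs in z_by_state.items():
    print(state, [round(z, 2) for z in zs])
```

After standardizing, a school one standard deviation above its state's mean looks the same regardless of which state test it took, which is what allows the results to be combined.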

For reading in fifth grade, outcomes were very good. Individual-level effect sizes were significant by Year 3, at +0.10 in Year 3 and +0.19 in Year 4. In middle school reading, the effect size reached +0.10 by Year 4.

Effects were also very good in fifth grade math, with significant effects of +0.10 in Year 3 and +0.13 in Year 4. Effect sizes in middle school math were also significant in Year 4 (ES=+0.12).

Note that these effects are for all schools, whether they adopted a program or not. Non-experimental analyses found that by Year 4, elementary schools that had chosen and implemented a reading program (33% of schools by Year 3, 42% by Year 4) scored better than matched controls in reading. Schools that chose any reading program usually chose our Success for All reading program, but some chose other models. Even in schools that did not adopt reading or math programs, scores were always higher, on average, (though not always significantly higher) than for schools that did not choose programs.

How Much Did We Raise the Bar?

The CDDRE project was exceptional because of its size and scope. The 622 schools, in 59 districts in 7 states, were collectively equivalent to a medium-sized state. So if anyone asks what evidence-based reform could do to help an entire state, this study provides one estimate. The student-level outcome in elementary reading, an effect size of +0.19, applied to NAEP scores, would be enough to move 43 states to the scores now only attained by the top 10. If applied successfully to schools serving mostly African American and Hispanic students or to students receiving free- or reduced-price lunches regardless of ethnicity, it would reduce the achievement gap between these and White or middle-class students by about 38%. All in four years, at very modest cost.
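The gap-reduction figure can be checked with simple arithmetic. The sketch below takes the two numbers given above (an effect size of +0.19 and a reduction of about 38%) and backs out the gap they imply, roughly half a standard deviation. That gap size is my inference from those two numbers, not a figure stated in the study, and the calculation assumes the effect simply subtracts from the gap.

```python
effect_size = 0.19      # Year 4 elementary reading effect, from the text
gap_reduction = 0.38    # share of the gap closed, from the text

# Gap size (in standard deviations) implied by the two figures above
implied_gap_sd = effect_size / gap_reduction

def share_of_gap_closed(effect, gap_sd):
    """Fraction of an achievement gap closed by a given effect size."""
    return effect / gap_sd

print(f"implied gap: {implied_gap_sd:.2f} SD")
print(f"share closed: {share_of_gap_closed(effect_size, implied_gap_sd):.0%}")
```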

Actually, implementing something like Raising the Bar could be done much more easily and effectively today than it could in 2005-2009. First, there are a lot more proven programs to choose from than there were then. Second, the U.S. Congress, in the Every Student Succeeds Act (ESSA), now has definitions of strong, moderate, and promising levels of evidence, and restricts school improvement grants to schools that choose such programs. The reason only 42% of Raising the Bar schools selected a program is that they had to pay for it, and many could not afford to do so. Today, there are resources to help with this.

The evidence is both logical and clear: Evidence works.

Reference

Slavin, R. E., Cheung, A., Holmes, G., Madden, N. A., & Chamberlain, A. (2013). Effects of a data-driven district reform model on state assessment outcomes. American Educational Research Journal, 50 (2), 371-396.

Photo by Sebastian Mary/Gio JL [CC BY-SA 2.0 (https://creativecommons.org/licenses/by-sa/2.0)], via Wikimedia Commons
