How Can You Tell When The Findings of a Meta-Analysis Are Likely to Be Valid?

In Baltimore, Faidley’s, founded in 1886, is a much loved seafood market inside Lexington Market. Faidley’s used to be a real old-fashioned market, with sawdust on the floor and an oyster bar in the center. People lined up behind their favorite oyster shucker. In a longstanding tradition, the oyster shuckers picked oysters out of crushed ice and tapped them with their oyster knives. If they sounded full, they opened them. But if they did not, the shuckers discarded them.

I always noticed that the line was longer behind the shucker who was discarding the most oysters. Why? Because everyone knew that the shucker who was pickier was more likely to come up with a dozen fat, delicious oysters, instead of, say, nine great ones and three…not so great.

I bring this up today to tell you how to pick full, fair meta-analyses on educational programs. No, you can’t tap them with an oyster knife, but otherwise, the process is similar. You want meta-analysts who are picky about what goes into their meta-analyses. Your goal is to make sure that a meta-analysis produces results that truly represent what teachers and schools are likely to see in practice when they thoughtfully implement an innovative program. If instead you pick the meta-analysis with the biggest effect sizes, you will always be disappointed.

As a special service to my readers, I’m going to let you in on a few trade secrets about how to quickly evaluate a meta-analysis in education.

One very easy way to evaluate a meta-analysis is to look at the overall effect size, probably shown in the abstract. If the overall mean effect size is more than about +0.40, you probably don’t have to read any further. Unless the treatment is tutoring or something else you would expect to make a massive difference in student achievement, it is rare to find a single legitimate study with an effect size that large, much less an average that large. A very large effect size is almost a guarantee that a meta-analysis is full of studies with design features that greatly inflate effect sizes, not studies with outstandingly effective treatments.

Next, go to the Methods section, which will have within it a section on inclusion (or selection) criteria. It should list the types of studies that were or were not accepted into the review. Some of the criteria will have to do with the focus of the meta-analysis, specifying, for example, “studies of science programs for students in grades 6 to 12.” But your focus is on the criteria that specify how picky the meta-analysis is. As one example of a picky set of criteria, here are the main ones we use in Evidence for ESSA and in every analysis we write:

  1. Studies had to use random assignment or matching to assign students to experimental or control groups, with schools and students in each specified in advance.
  2. Students assigned to the experimental group had to be compared to very similar students in a control group experiencing business-as-usual instruction. The experimental and control students must be well matched, within a quarter standard deviation at pretest (ES=+0.25), and attrition (loss of subjects) must be no more than 15% higher in one group than the other at the end of the study. Why? It is essential that experimental and control groups start and remain the same in all ways other than the treatment. Controls for initial differences do not work well when the differences are large.
  3. There must be at least 30 experimental and 30 control students. Analyses of combined effect sizes must control for sample sizes. Why? Evidence finds substantial inflation of effect sizes in very small studies.
  4. The treatments must be provided for at least 12 weeks. Why? Evidence finds major inflation of effect sizes in very brief studies, and brief studies do not represent the reality of the classroom.
  5. Outcome measures must be independent of the program developers and researchers. Usually, this means using national tests of achievement, though not necessarily standardized tests. Why? Research has found that tests made by researchers can inflate effect sizes by double or more, and researcher-made measures do not represent the reality of classroom assessment.

There may be other details, but these are the most important. Note that there is a double focus of these standards. Each is intended both to minimize bias and to maximize similarity to the conditions faced by schools. What principal or teacher who cares about evidence would be interested in adopting a program evaluated in comparison to a very different control group? Or in a study with few subjects, or a very brief duration? Or in a study that used measures made by the developers or researchers? This set is very similar to what the What Works Clearinghouse (WWC) requires, except #5 (the WWC requires exclusion of “overaligned” measures, but not developer-/researcher-made measures).
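
To make these screens concrete, here is a minimal sketch, in Python, of how rules like the five above might be applied to a coded database of studies. The field names, thresholds, and example records are hypothetical illustrations, not our actual coding system.

```python
MIN_GROUP_N = 30           # criterion 3: at least 30 experimental and 30 control students
MAX_PRETEST_ES = 0.25      # criterion 2: groups within a quarter standard deviation at pretest
MAX_DIFF_ATTRITION = 0.15  # criterion 2: no more than 15% differential attrition
MIN_WEEKS = 12             # criterion 4: treatment provided for at least 12 weeks

def meets_inclusion_criteria(study):
    """Return True if a coded study passes the main screens described above."""
    return (
        study["design"] in ("randomized", "matched")            # criterion 1
        and abs(study["pretest_es"]) <= MAX_PRETEST_ES
        and study["differential_attrition"] <= MAX_DIFF_ATTRITION
        and study["n_experimental"] >= MIN_GROUP_N
        and study["n_control"] >= MIN_GROUP_N
        and study["duration_weeks"] >= MIN_WEEKS
        and study["measure_independent"]                        # criterion 5
    )

# Two made-up studies: one that passes the screens and one that does not.
studies = [
    {"design": "randomized", "pretest_es": 0.05, "differential_attrition": 0.04,
     "n_experimental": 120, "n_control": 115, "duration_weeks": 28,
     "measure_independent": True},
    {"design": "pre-post", "pretest_es": 0.00, "differential_attrition": 0.00,
     "n_experimental": 6, "n_control": 0, "duration_weeks": 2,
     "measure_independent": False},
]

included = [s for s in studies if meets_inclusion_criteria(s)]
print(f"{len(included)} of {len(studies)} studies pass the screens")
```

In a real review, of course, each study is coded by hand from the full report; the point is simply that screens like these are mechanical and transparent, so readers can tell exactly what got in and what did not.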

If these criteria are all there in the “Inclusion Standards,” chances are you are looking at a top-quality meta-analysis. As a rule, it will have average effect sizes lower than those you’ll see in reviews without some or all of these standards, but the effect sizes you see will probably be close to what you will actually get in student achievement gains if your school implements a given program with fidelity and thoughtfulness.

What I find astonishing is how many meta-analyses do not have standards this high. Among experts, these criteria are not controversial, except for the last one, which shouldn’t be. Yet meta-analyses are often written, and accepted by journals, with much lower standards, thereby producing greatly inflated, unrealistic effect sizes.

As one example, there was a meta-analysis of Direct Instruction programs in reading, mathematics, and language, published in the Review of Educational Research (Stockard et al., 2018). I have great respect for Direct Instruction, which has been doing good work for many years. But this meta-analysis was very disturbing.

The inclusion and exclusion criteria in this meta-analysis did not require experimental-control comparisons, did not require well-matched samples, and did not require any minimum sample size or duration. It was not clear how many of the outcome measures were made by program developers or researchers, rather than independent of the program.

With these minimal inclusion standards, and a very long time span (back to 1966), it is not surprising that the review found a great many qualifying studies: 528, to be exact. The review also reported extraordinary effect sizes: +0.51 for reading, +0.55 for math, and +0.54 for language. If these effects were all true and meaningful, it would mean that DI is much more effective than one-to-one tutoring, for example.

But don’t get your hopes up. The article included an online appendix that showed the sample sizes, study designs, and outcomes of every study.

First, the authors identified eight experimental designs (plus single-subject designs, which were treated separately). Only two of these would meet anyone’s modern standards of meta-analysis: randomized and matched. The others included pre-post gains (no control group), comparisons to test norms, and other pre-scientific designs.

Sample sizes were often extremely small. Leaving aside single-case experiments, there were dozens of single-digit sample sizes (e.g., six students), often with very large effect sizes. Further, there was no indication of study duration.

What is truly astonishing is that RER accepted this study. RER is the top-rated journal in all of education, based on its citation count. Yet this review, and the Kulik & Fletcher (2016) review I cited in a recent blog, clearly did not meet minimal standards for meta-analyses.

My colleagues and I will be working in the coming months to better understand what has gone wrong with meta-analysis in education, and to propose solutions. Of course, our first step will be to spend a lot of time at oyster bars studying how they set such high standards. Oysters and beer will definitely be involved!

Photo credit: Annette White / CC BY-SA (https://creativecommons.org/licenses/by-sa/4.0)

References

Kulik, J. A., & Fletcher, J. D. (2016). Effectiveness of intelligent tutoring systems: a meta-analytic review. Review of Educational Research, 86(1), 42-78.

Stockard, J., Wood, T. W., Coughlin, C., & Rasplica Khoury, C. (2018). The effectiveness of Direct Instruction curricula: A meta-analysis of a half century of research. Review of Educational Research, 88(4), 479–507. https://doi.org/10.3102/0034654317751919

This blog was developed with support from Arnold Ventures. The views expressed here do not necessarily reflect those of Arnold Ventures.

Note: If you would like to subscribe to Robert Slavin’s weekly blogs, just send your email address to thebee@bestevidence.org

Meta-Analysis or Muddle-Analysis?

One of the best things about living in Baltimore is eating steamed hard-shell crabs every summer. They are cooked in a very spicy seasoning mix, and with Maryland corn and Maryland beer, these define the very peak of existence for Marylanders. (To be precise, the true culture of the crab also extends into Virginia, but does not really exist more than 20 miles inland from the bay).

As every crab eater knows, a steamed crab comes with a lot of inedible shell and other inner furniture.  So you get perhaps an ounce of delicious meat for every pound of whole crab. Here is a bit of crab math.  Let’s say you have ten pounds of whole crabs, and I have 20 ounces of delicious crabmeat.  Who gets more to eat?  Obviously I do, because your ten pounds of crabs will only yield 10 ounces of meat. 

How Baltimoreans learn about meta-analysis.

All Baltimoreans instinctively understand this from birth.  So why is this same principle not understood by so many meta-analysts?

I recently ran across a meta-analysis of research on intelligent tutoring programs by Kulik & Fletcher (2016),  published in the Review of Educational Research (RER). The meta-analysis reported an overall effect size of +0.66! Considering that the single largest effect size of one-to-one tutoring in mathematics was “only” +0.31 (Torgerson et al., 2013), it is just plain implausible that the average effect size for a computer-assisted instruction intervention is twice as large. Consider that a meta-analysis our group did on elementary mathematics programs found a mean effect size of +0.19 for all digital programs, across 38 rigorous studies (Slavin & Lake, 2008). So how did Kulik & Fletcher come up with +0.66?

The answer is clear. The authors excluded very few studies except for those of less than 30 minutes’ duration. Many of the studies they included used methods known to greatly inflate effect sizes, but the authors did not exclude such studies or control for these factors. To the authors’ credit, they then carefully documented the effects of some key methodological factors. For example, they found that “local” measures (presumably made by researchers) had a mean effect size of +0.73, while standardized measures had an effect size of +0.13, replicating findings of many other reviews (e.g., Cheung & Slavin, 2016). They found that studies with sample sizes less than 80 had an effect size of +0.78, while those with samples of more than 250 had an effect size of +0.30. Brief studies had higher effect sizes than longer ones, as found in many other reviews. All of this is nice to know, but even knowing it all, Kulik & Fletcher failed to control for any of it, not even to weight by sample size. So, for example, the implausible mean effect size of +0.66 includes a study with a sample size of 33, a duration of 80 minutes, and an effect size of +1.17, on a “local” test. Another had 48 students, a duration of 50 minutes, and an effect size of +0.95. Now, if you believe that 80 minutes on a computer is three times as effective for math achievement as months of one-to-one tutoring by a teacher, then I have a lovely bridge in Baltimore I’d like to sell you.
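
To see why weighting matters so much, here is a minimal sketch of a fixed-effect, inverse-variance-weighted mean compared to a simple unweighted average. The four studies are invented for illustration, not taken from Kulik & Fletcher; they just mimic the pattern of tiny, brief studies with huge effects alongside large studies with modest ones.

```python
# Invented example studies: (effect size, experimental n, control n).
studies = [
    (1.17, 17, 16),    # tiny, brief study on a "local" (researcher-made) test
    (0.95, 24, 24),    # another small, brief study
    (0.20, 600, 610),  # large study on a standardized test
    (0.13, 450, 440),  # another large study
]

def smd_variance(es, n1, n2):
    # Standard large-sample variance of a standardized mean difference
    return (n1 + n2) / (n1 * n2) + es ** 2 / (2 * (n1 + n2))

unweighted = sum(es for es, _, _ in studies) / len(studies)

weights = [1.0 / smd_variance(es, n1, n2) for es, n1, n2 in studies]
weighted = sum(w * es for w, (es, _, _) in zip(weights, studies)) / sum(weights)

print(f"Unweighted mean ES:           {unweighted:+.2f}")  # about +0.61
print(f"Inverse-variance weighted ES: {weighted:+.2f}")    # about +0.20
```

In this toy example the simple average is about +0.61, while the weighted mean is about +0.20: letting every study count equally allows a handful of tiny studies to swamp the large, well-measured ones.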

I’ve long been aware of these problems with meta-analyses that neither exclude nor control for characteristics of studies known to greatly inflate effect sizes. This was precisely the flaw for which I criticized John Hattie’s equally implausible reviews. But what I did not know until recently was just how widespread this is.

I was working on a proposal to do a meta-analysis of research on technology applications in mathematics. A colleague located every meta-analysis published on this topic since 2013. She found 20 of them. After looking at the remarkable outcomes on a few, I computed a median effect size across all twenty. It was +0.44. That is, to put it mildly, implausible. Looking further, I discovered that only one of the reviews adjusted for sample size (inverse variances). Its mean effect size was +0.05. None of the other 19 meta-analyses, all in respectable journals, controlled for methodological features or excluded studies based on them, and they reported mean effect sizes as high as +1.02 and +1.05.

Meta-analyses are important, because they are widely read and widely cited, in comparison to individual studies. Yet until meta-analyses start consistently excluding, or at least controlling for, studies with factors known to inflate mean effect sizes, they will have little if any meaning for practice. As things stand now, the overall mean impacts reported by meta-analyses in education depend on how stringent the inclusion standards were, not on how effective the interventions truly were.

This is a serious problem for evidence-based reform. Our field knows how to solve it, but all too many meta-analysts do not do so. This needs to change. We see meta-analyses claiming huge impacts, and then wonder why these effects do not transfer to practice. In fact, these big effect sizes do not transfer because they are due to methodological artifacts, not to actual impacts teachers are likely to obtain in real schools with real students.

Ten pounds (160 ounces) of crabs only appear to be more than 20 ounces of crabmeat, because the crabs contain a lot you need to discard. The same is true of meta-analyses. Using small samples, brief durations, and researcher-made measures in evaluations inflates effect sizes without adding anything to the actual impact of treatments for students. Our job as meta-analysts is to strip away the bias as best we can, and get to the actual impact. Then we can make comparisons and generalizations that make sense, and advance understanding of what really works in education.

In our research group, when we deal with thorny issues of meta-analysis, I often ask my colleagues to imagine that they have a sister who is a principal. “What would you say to her,” I ask, “if she asked what really works, all BS aside? Would you suggest a program that was very effective in a 30-minute study? One that has only been evaluated with 20 students? One that has only been shown to be effective if the researcher gets to make the measure? Principals are sharp, and appropriately skeptical. Your sister would never accept such evidence. Especially if she’s experienced with Baltimore crabs.”

References

Cheung, A., & Slavin, R. (2016). How methodological features affect effect sizes in education. Educational Researcher, 45 (5), 283-292.

Kulik, J. A., & Fletcher, J. D. (2016). Effectiveness of intelligent tutoring systems: a meta-analytic review. Review of Educational Research, 86(1), 42-78.

Slavin, R., & Lake, C. (2008). Effective programs in elementary mathematics: A best-evidence synthesis. Review of Educational Research, 78 (3), 427-515.

Torgerson, C. J., Wiggins, A., Torgerson, D., Ainsworth, H., & Hewitt, C. (2013). Every Child Counts: Testing policy effectiveness using a randomised controlled trial, designed, conducted and reported to CONSORT standards. Research In Mathematics Education, 15(2), 141–153. doi:10.1080/14794802.2013.797746.

Photo credit: Kathleen Tyler Conklin/(CC BY 2.0)

This blog was developed with support from Arnold Ventures. The views expressed here do not necessarily reflect those of Arnold Ventures.

Note: If you would like to subscribe to Robert Slavin’s weekly blogs, just send your email address to thebee@bestevidence.org

How Much Have Students Lost in The COVID-19 Shutdowns?

Everyone knows that school closures due to the COVID-19 pandemic are having a serious negative impact on student achievement, and that this impact is sure to be larger for disadvantaged students than for others. However, how large will the impact turn out to be? This is not a grim parlor game for statisticians, but could have real meaning for policy and practice. If the losses turn out to be modest, comparable to the “summer slide” we are used to (but which may not exist), then one might argue that when schools open, they might continue where they left off, and students might eventually make up their losses, as they do with summer slide. If, on the other hand, losses are very large, then we need to take emergency action.

Some researchers have used data from summer losses and from other existing data on, for example, teacher strikes, to estimate COVID losses (e.g., Kuhfeld et al., 2020). But now we have concrete evidence, from a country similar to the U.S. in most ways.

A colleague came across a study that has, I believe, the first actual data on this question. It is a recent study from Belgium (Maldonado & DeWitte, 2020) that assessed COVID-19 losses among Dutch-speaking students in that country.

The news is very bad.

The researchers obtained end-of-year test scores from all sixth graders who attended publicly funded Catholic schools, which serve most students in Dutch-speaking Belgium. Sixth grade is the final year of primary school, and while schools were mostly closed from March to June due to COVID, the sixth graders were brought back to their schools in late May to prepare for and take their end-of-primary tests. Before returning, the sixth graders had missed about 30% of the days in their school year. They were offered online teaching at home, as in the U.S.

The researchers compared the June test scores to those of students in the same schools in previous years, before COVID. After adjustments for other factors, students scored an effect size of -0.19 in mathematics, and -0.29 in Dutch (reading, writing, language). Schools serving many disadvantaged students had significantly larger losses in both subjects; inequality within the schools increased by 17% in mathematics and 20% in Dutch, and inequality between schools increased by 7% in math and 18% in Dutch.

There is every reason to expect that the situation in the U.S. will be much worse than that in Belgium. Most importantly, although Belgium had one of the worst COVID-19 death rates in the world, it has largely conquered the disease by now (fall), and its schools are all open. In contrast, most U.S. schools are closed or partially closed this fall. Students are usually offered remote instruction, but many disadvantaged students lack access to technology and supervision, and even students who do have equipment and supervision do not seem to be learning much, according to anecdotal reports.

In many U.S. schools that have opened fully or partially, outbreaks of the disease are disrupting schooling, and many parents are refusing to send their children to school. Although this varies greatly by region of the U.S., the average American student is likely to have missed several additional months of effective in-person schooling by the time schools return to normal operation.

But even if average losses turn out to be no worse than those seen in Belgium, the consequences are terrifying, for Belgium as well as for the U.S. and other COVID-inflicted countries.

Effect sizes of -0.19 and -0.29 are very large. From the Belgian data on inequality, we might estimate that for disadvantaged students (those in the lowest 25% of socioeconomic status), losses could have been -0.29 in mathematics and -0.39 in Dutch. What do we have in our armamentarium that is strong enough to overcome losses this large?

In a recent blog, I compared average effect sizes from studies of various solutions currently being proposed to remedy students’ losses from COVID shutdowns: Extended school days, after-school programs, summer school, and tutoring. Only tutoring, both one-to-one and one-to-small group, in reading and mathematics, had an effect size larger than +0.10. In fact, there are several one-to-one and one-to-small group tutoring models with effect sizes of +0.40 or more, and averages are around +0.30. Research in both reading and mathematics has shown that well-trained teaching assistants using structured tutoring materials or software can obtain outcomes as good as those obtained by certified teachers as tutors. On the basis of these data, I’ve been writing about a “Marshall Plan” to hire thousands of tutors in every state to provide tutoring to students scoring far below grade level in reading and math, beginning with elementary reading (where the evidence is strongest).

I’ve also written about national programs in the Netherlands and in England to provide tutoring to struggling students. Clearly, we need a program of this kind in the U.S. And if our scores are like the Belgian scores, we need it as quickly as possible. Students who have fallen far below grade level cannot be left to struggle without timely and effective assistance, powerful enough to bring them at least to where they would have been without the COVID school closures. Otherwise, these students are likely to lose motivation, and to suffer lasting damage. An entire generation of students, harmed through no fault of their own, cannot be allowed to sink into failure and despair.

References

Kuhfeld, M., Soland, J., Tarasawa, B., Johnson, A., Ruzek, E., & Liu, J. (2020). Projecting the potential impacts of COVID-19 school closures on academic achievement. (EdWorkingPaper: 20-226). Retrieved from Annenberg Institute at Brown University: https://doi.org/10.26300/cdrv-yw05

Maldonado, J. E., & DeWitte, K. (2020). The effect of school closures on standardized student test outcomes. Leuven, Belgium: University of Leuven.

This blog was developed with support from Arnold Ventures. The views expressed here do not necessarily reflect those of Arnold Ventures.

Note: If you would like to subscribe to Robert Slavin’s weekly blogs, just send your email address to thebee@bestevidence.org

Learning from International Schools Part II: Outbreaks after COVID-19 Re-openings: The Case of Israel

By guest blogger Nathan Storey*

The summer is over and the fall semester is underway across the United States. Schools are reopening and students are back in the classroom, either virtually or in the flesh. Up to now, the focus of discussion has been on whether and how to open schools: in person, using remote instruction, or some mix of the two. But as schools actually open, those with any element of in-person teaching are starting to worry about how they will handle outbreaks, should they occur. Many countries that opened their schools before the U.S. have already experienced outbreaks, and this blog focuses on learning from the tragic experience of Israel.

With in-person schooling, outbreaks are all but inevitable. “We have to be realistic…if we are reopening schools, there will be some Covid,” says Dr. Benjamin Linas, associate professor of medicine and epidemiology at Boston University (Nierenberg & Pasick, 2020). Even though U.S. schools have already reopened, it is not too late to put outbreak response plans into place in order to stem future outbreaks and allow schools to remain in session.

Israel

On Thursday, September 17, Israel’s school system was shut down due to rising positivity rates; 5,523 new cases were recorded in the single day before the decision, in a country about one fortieth the size of the U.S. The closures are due to last until October 11, though special education and youth-at-risk programs are continuing. The spike in COVID cases reported by health officials centered on children 10 years of age and up. “The government made the wrong decision, against professional recommendations,” COVID commissioner Professor Ronni Gamzu wrote in a letter to Health Minister Yuli Edelstein and Education Minister Yoav Gallant.

Israel has been a cautionary tale since reopening schools in May. By July, 977 students and teachers were diagnosed with COVID, 22,520 had been quarantined, and 393 schools and kindergartens had been closed by the Education Ministry (Kershner & Belluck, 2020; Tarnopolsky, 2020). At the beginning of September, 30 “red” cities and neighborhoods were placed under lockdown due to spikes. Almost 4,000 students and over 1,600 teachers are currently in quarantine, while more than 900 teachers and students have been diagnosed with the virus (Savir, 2020).

Schools initially reopened following a phased approach and using social distancing and mask protocols. Students with diagnosed family members were not allowed back, and older staff members and those at risk were told not to return to the classroom. It seemed as if they were doing everything right. But then, a heat wave wiped all the progress away.

Lifting the face mask requirement for four days and allowing schools to shut their windows (so they could run air conditioning) offered new opportunities for the virus to run rampant. An outbreak at Gymnasia Rehavia, a high school in Jerusalem, turned into the largest single-school outbreak seen so far, soon reaching students’ homes and communities. Outbreaks also appeared outside of the Jerusalem area, including in an elementary school in Jaffa. Reflecting on the nationwide spread of the virus, researchers have estimated that as much as 47% of the total new infections in the whole of Israel could be traced to Israeli schools (Tarnopolsky, 2020), introduced to schools by adult teachers and employees, and spread by students, particularly middle-school-aged children.

This crisis serves to illustrate just how important it is for education leaders, teachers, and students to remain vigilant in prevention efforts. The Israeli schools largely had the right ideas to ensure prevention. Some challenges existed, particularly related to fitting students into classrooms while maintaining six feet separation given large class sizes (in some cases, classrooms of 500 square feet have to hold as many as 38 students). But by relaxing their distancing regulations, the schools opened students, staff, and communities to a major outbreak.

Schools responded by quarantining individual students, classmates of infected students, teachers, and staff; and when a second unconnected case was detected, schools would close for two weeks. But Israel did not place a priority on contact tracing and testing. Students and staff were tested following outbreaks, but they experienced long wait times to take the test, increasing the opportunities for spread. In the case of one school outbreak, Professor Eli Waxman of the Weizmann Institute of Science reported that school officials could not identify which buses students took to reach school (Kershner & Belluck, 2020). Having this type of information is vital for tracing whom infected students may have come into contact with, especially for younger students who may not be able to list all those with whom they’ve been in close contact.

Before the fall semester began, it looked as if Israel had learned from its previous mistakes. The Education Ministry disseminated new regulations adapted to the local level based on infection rates, and once more planned a phased reopening approach starting with grades K-4, followed by middle- and high-school students, who were set to follow a hybrid of remote and in-person instruction. Schools planned to use plastic barriers to separate students in the classroom. Education leaders were to develop a guidebook to support the transition from in-person to distance learning and procedures to maintain distancing during celebrations or graduation ceremonies.

These precautions and adaptive plans suggested that Israel had learned from the mistakes made in the summer. Upon reopening, however, a new lesson was learned: schools cannot reopen in a sustainable, long-term manner if community positivity rates are not under control.

*Nathan Storey is a graduate student at the Johns Hopkins University School of Education

References

Couzin-Frankel, J., Vogel, G., & Weil, M. (2020, July 7). School openings across globe suggest ways to keep coronavirus at bay, despite outbreaks. Science | AAAS. https://www.sciencemag.org/news/2020/07/school-openings-across-globe-suggest-ways-keep-coronavirus-bay-despite-outbreaks

Jaffe-Hoffman, M. (2020, September 16). 5,500 new coronavirus cases, as gov’t rules to close schools Thursday. The Jerusalem Post. https://www.jpost.com/breaking-news/coronavirus-4973-new-cases-in-the-last-day-642338

Kauffman, J. (2020, July 29). Israel’s hurried school reopenings serve as a cautionary tale. The World from PRX. https://www.pri.org/stories/2020-07-29/israels-hurried-school-reopenings-serve-cautionary-tale

Kershner, I., & Belluck, P. (2020, August 4). When Covid subsided, Israel reopened its schools. It didn’t go well. The New York Times. https://www.nytimes.com/2020/08/04/world/middleeast/coronavirus-israel-schools-reopen.html

Nierenberg, A., & Pasick, A. (2020, September 16). For school outbreaks, it’s when, not if—The New York Times. The New York Times. https://www.nytimes.com/2020/09/16/us/for-school-outbreaks-its-when-not-if.html

Savir, A. (2020, September 1). 2.4 million Israeli students go back to school in shadow of COVID-19. J-Wire. https://www.jwire.com.au/2-4-million-israeli-students-go-back-to-school-in-shadow-of-covid-19/

Schwartz, F., & Lieber, D. (2020, July 14). Israelis fear schools reopened too soon as Covid-19 cases climb. Wall Street Journal. https://www.wsj.com/articles/israelis-fear-schools-reopened-too-soon-as-covid-19-cases-climb-11594760001

Tarnopolsky, N. (2020, July 14). Israeli data show school openings were a disaster that wiped out lockdown gains. The Daily Beast. https://www.thedailybeast.com/israeli-data-show-school-openings-were-a-disaster-that-wiped-out-lockdown-gains

Photo credit: Talmoryair / CC BY-SA (https://creativecommons.org/licenses/by-sa/4.0)

This blog was developed with support from Arnold Ventures. The views expressed here do not necessarily reflect those of Arnold Ventures.

Note: If you would like to subscribe to Robert Slavin’s weekly blogs, just send your email address to thebee@bestevidence.org

Learning from International Schools: Outbreaks after COVID-19 Re-openings: The Case of the United Kingdom

By guest blogger Nathan Storey, Johns Hopkins University*

For much of the summer, U.S. education leaders and media have questioned how to safely reopen schools to students and teachers. Districts have struggled to put together concrete plans for how to structure classes, how much of the instruction would be in person, how to maintain social distancing in the classroom, and how to minimize health risks.

Most school districts have focused on preventing outbreaks through masks and social distancing, among other measures. However, this has left a gap—what happens to these well-thought-out plans if and when there’s an outbreak? While many school districts (including 12 of the 15 largest in the United States) have opted to start schooling remotely, many others plan to or have already restarted in-person schooling, often without detailed prevention and response plans in place.

For those districts committed to in-person schooling, outbreaks in at least some schools are all but inevitable. Community positivity rates within the United States remain high, with some states experiencing positivity rates of up to 5.4% (CDC, 2020), compared to 2.3% in Scotland or 0.8% across the entire United Kingdom (JHU, 2020). The image of students without masks packed into the hallways of a Georgia school has already spread nationwide. It is clearly important to put these plans into place as soon as possible in order to stem any outbreaks and allow schools to remain in session.

In a series of case studies, I will examine the experiences of how other countries with similar education systems dealt with outbreaks in their schools and share lessons learned for the United States.

United Kingdom

Schools in England and Wales finally reopened last week for the fall semester, but Scottish schools reopened the week of August 10. Outbreaks in Scotland have been minimal, but a cluster of school outbreaks cropped up in the Glasgow region, most notably at Bannerman High School. Affected schools soon closed for one week following the positive tests, but students who tested positive remained at home in self-isolation for 14 days.

What makes this outbreak notable is that through testing of students and community members, researchers were able to trace the outbreak to a cluster of infections amongst senior managers at McVities biscuit factory, also in Glasgow. Having successfully traced the infections to this source, education leaders and researchers were able to determine that cases were not being transmitted within schools, and put into effect appropriate isolation procedures for potentially infected students and faculty.

Testing and contact tracing were also conducted during the spring and summer months, when schools first reopened in the UK following the national shutdown in March. Researchers (Ismail et al., 2020) were able to determine the sources of outbreaks and their prevalence among students and faculty, finding that transmission was less common within schools. This provided crucial information that improved understanding of COVID and informed quarantine and school lockdown protocols in the country.

Scotland has put into place a strong contact tracing protocol, coupled with self-isolation, social distancing, and more intensive hygiene protocols. Scientists from England have urged weekly testing of teachers, as well as “test and trace” protocols, but the schools minister, Nick Gibb, instead committed to testing of symptomatic individuals only. Researcher Michael Fischer recently launched the COVID-19 Volunteer Testing Network, hoping to create a network of laboratories across the UK using basic equipment common in most labs (specifically, a polymerase chain reaction or PCR machine) to provide rapid testing. Eventually, as many as 1,000 labs could each do 800 tests a day, providing rapid turnaround of COVID-19 test results, enabling more effective contact tracing, and allowing schools to isolate students and staff members without shutting down entire schools.

Another means of accelerating testing and contact tracing is through group or pooled testing. One scientist in England pointed to this form of testing—in which multiple individuals’ samples are pooled together and tested simultaneously, with subsequent individual tests in the event of a positive test result—as a means of providing quick testing even if testing materials are limited. This could be particularly useful for schools implementing clustered classrooms or educational pods, keeping students together throughout the day and limiting contact with other students and staff.

With careful and thorough testing and contact tracing, as exemplified by the United Kingdom’s efforts, coupled with social distancing and other preventative measures, United States school districts in areas with low positivity rates comparable to those in the United Kingdom could address outbreaks more systematically and avoid entire school shutdowns, which are disruptive to students’ education. Preventative measures alone are not likely to be enough to get students and staff through what promises to be a difficult school year. These outbreak-responsive systems are likely to be necessary as well.

References

Brazell, E. (2020, April 2). Scientist donates £1,000,000 to massively increase UK coronavirus testing. Metro. https://metro.co.uk/2020/04/02/scientist-donates-1000000-massively-increase-uk-coronavirus-testing-12499729/

CDC. (2020, September 4). COVIDView, Key Updates for Week 33. Centers for Disease Control and Prevention. https://www.cdc.gov/coronavirus/2019-ncov/covid-data/covidview/index.html

Davis, N. (2020, August 10). Scientists urge routine Covid testing when English schools reopen. The Guardian. https://www.theguardian.com/education/2020/aug/10/scientists-urge-routine-covid-testing-when-english-schools-reopen

Duffy, E. (2020, August 19). Scots school closes with immediate effect after multiple confirmed cases of Covid-19. The Herald. https://www.heraldscotland.com/news/18662461.kingspark-school-dundee-school-closes-multiple-cases-covid-19-confirmed/

Government of United Kingdom. (2020, September 8). Coronavirus (COVID-19) in the UK: UK Summary. https://coronavirus.data.gov.uk/

Ismail, S. A., Saliba, V., Bernal, J. L., Ramsay, M. E., & Ladhani, S. N. (2020). SARS-CoV-2 infection and transmission in educational settings: Cross-sectional analysis of clusters and outbreaks in England (pp. 1–28). Public Health England. https://doi.org/10.1101/2020.08.21.20178574

Johns Hopkins University. (2020, September 8). Daily Testing Trends in the US – Johns Hopkins. Johns Hopkins Coronavirus Resource Center. https://coronavirus.jhu.edu/testing/individual-states

Macpherson, R. (2020, August 16). Coronavirus Scotland: Another pupil at Bannerman High School in Glasgow tests positive as cluster hits 12 cases – The Scottish Sun. https://www.thescottishsun.co.uk/news/5937611/coronavirus-scotland-bannerman-high-school-covid19/

Palmer, M. (2020, April 1). Call for small UK labs to embrace Dunkirk spirit and produce Covid-19 tests. Sifted. https://sifted.eu/articles/uk-labs-coronavirus-testing/

*Nathan Storey is a graduate student at the Johns Hopkins University School of Education

This blog was developed with support from Arnold Ventures. The views expressed here do not necessarily reflect those of Arnold Ventures.

Note: If you would like to subscribe to Robert Slavin’s weekly blogs, just send your email address to thebee@bestevidence.org

Healing Covid-19’s Educational Losses: What is the Evidence?

I’ve written several blogs (here, here, here, here, here, and here) on what schools can do when they finally open permanently, to remedy what will surely be serious harm to the educational progress of millions of students. Without doubt, the students who are suffering the most from lengthy school closures are disadvantaged students, who are most likely to lack access to remote technology or regular support when their schools have been closed.

Recently, several articles have circulated in the education press and newsletters (e.g., Sawchuk, 2020) laying out options schools might consider to greatly improve the achievement of students who have lost the most and are performing far behind grade level.

The basic problem is that if schools simply start off with usual teaching for each grade level, this may be fine for students at or just below grade level, but for those who are far below level, this is likely to add catastrophe to catastrophe. Students who cannot read the material they are being taught, or who lack the prerequisite skills for their grade level, will experience failure and frustration. So the challenge is to provide students who are far behind with intensive, additional services likely to quickly accelerate their progress, so that they can then profit from ordinary, at-grade-level lessons.

In the publications I’ve seen, there have been several solutions frequently put forward. I thought this might be a good time to review the most common prescriptions in terms of their evidence basis in rigorous experimental or quasi-experimental research.

Extra Time

One proposal is to extend the school day or school year to provide additional time for instruction. This sounds logical; if the problem is time out of school, let’s add time in school.

The effects of extra time depend, of course, on what schools provide during that additional time. Simply providing more clock hours of typical instruction makes little difference. For example, in a large Florida study (Figlio, Holden, & Ozek, 2018), high-poverty schools were given an additional hour of reading instruction every day for a year. This had a small impact on reading achievement (ES=+0.09) at a cost of about $800 per student, or $300,000-$400,000 per school. Also, in a review of research on secondary reading programs by Baye, Lake, Inns & Slavin (2019), my colleagues and I examined whether remedial programs were more effective if they were provided during additional time (one class period a day more than what the control group received, for one or more years) or during regular class time (the same amount of time the control group also received). The difference was essentially zero. The extra time did not matter. What did matter was what the schools provided (here and here).

After-School Programs

Some sources suggest providing after-school programs for students experiencing difficulties. A review of research on this topic by Kidron & Lindsay (2014) examined effects of after-school programs on student achievement in reading and mathematics. The effects were essentially zero. One problem is that students often did not attend regularly, or were poorly motivated when they did attend.

Summer School

As noted in a recent blog, positive effects of summer school were found only when intensive phonics instruction was provided in grades K or 1, but even in these cases, positive effects did not last to the following spring. Summer school is also very expensive.

Tutoring

By far the most effective approach for students struggling in reading or mathematics is tutoring (see blogs here, here, and here). Outcomes for one-to-one or one-to-small group tutoring average +0.20 to +0.30 in both reading and mathematics, and there are several particular programs that routinely report outcomes of +0.40 or more. Using teaching assistants with college degrees as tutors can make tutoring very cost-effective, especially in small-group programs.

Whole-School Reforms

There are a few whole-school reforms that can have substantial impacts on reading and mathematics achievement. A recent review of our elementary school reform model, Success for All (Cheung et al., 2020), found an average effect size of +0.24 for all students across 17 studies, and an average of +0.54 for low achievers.

A secondary reform model called BARR has reported positive reading and mathematics outcomes for ninth graders (T. Borman et al., 2017).

Conclusion

Clearly, something needs to be done about students returning to in-person education who are behind grade level in reading and/or mathematics. But resources devoted to helping these students need to be focused on approaches proven to work. This is not the time to invest in plausible but unproven programs. Students need the best we have that has been repeatedly shown to work.

References

Baye, A., Lake, C., Inns, A., & Slavin, R. (2019). Effective reading programs for secondary students. Reading Research Quarterly, 54 (2), 133-166.

Borman, T., Bos, H., O’Brien, B. C., Park, S. J., & Liu, F. (2017). i3 BARR validation study impact findings: Cohorts 1 and 2. Washington, DC: American Institutes for Research.

Cheung, A., Xie, C., Zhang, T., Neitzel, A., & Slavin, R. E. (2020). Success for All: A quantitative synthesis of evaluations. Manuscript submitted for publication. (Contact us for a copy.)

Figlio, D. N., Holden, K. L., & Ozek, U. (2018). Do students benefit from longer school days? Regression discontinuity evidence from Florida’s additional hour of literacy instruction. Economics of Education Review, 67, 171-183. https://doi.org/10.1016/j.econedurev.2018.06.003

Kidron, Y., & Lindsay, J. (2014). The effects of increased learning time on student academic and nonacademic outcomes: Findings from a meta‑analytic review (REL 2014-015). Washington, DC: U.S. Department of Education, Institute of Education Sciences, National Center for Education Evaluation and Regional Assistance, Regional Educational Laboratory Appalachia.

Sawchuk, S. (2020, August 26). Overcoming Covid-19 learning loss. Education Week, 40 (2), 6.

This blog was developed with support from Arnold Ventures. The views expressed here do not necessarily reflect those of Arnold Ventures.

Note: If you would like to subscribe to Robert Slavin’s weekly blogs, just send your email address to thebee@bestevidence.org

Extraordinary Gains: Making Them Last

One of the great frustrations of evidence-based reform in education is that while we do have some interventions that have a strong impact on students’ learning, these outcomes usually fade over time. The classic example is intensive, high-quality preschool programs. There is no question about the short-term impacts of quality preschool, but after fifty years, the Perry Preschool study remains the only case in which a randomized experiment found long-term positive impacts of preschool. I think the belief in the Perry Preschool’s long-term impacts conditioned many of us to expect amazing long-term impacts of early interventions of all kinds, but the Perry Preschool evaluation was flawed in several ways, and later randomized studies such as the Tennessee Voluntary Prekindergarten Program do not find such lasting impacts. There have been similar difficulties documenting long-term impacts of the Reading Recovery tutoring program. I have been looking at research on summer school (Neitzel et al., 2020), and found a few summer programs for kindergarteners and first graders that had exceptional impacts on end-of-summer reading measures, but these gains had faded by the following spring.

A little coaching can go a long way.

Advocates for these and other intensive interventions frequently express an expectation that resource-intensive interventions at key developmental turning points can transform the achievement trajectories of students performing below grade level or otherwise at risk. Many educators and researchers believe that after successful early intervention, students can participate in regular classroom teaching and will continue to advance with their agemates. However, for many students, this is unlikely.  For example, imagine a struggling third grade girl reading at the first grade level. After sixteen weeks of daily 30-minute tutoring, she has advanced to grade level reading. However, after finishing her course of tutoring, the girl may experience slow progress. She will probably not forget what she has learned, but other students, who reached grade level reading without tutoring, may make more rapid progress than she does, because whatever factors caused her to be two years below grade level in the third grade may continue to slow her progress even after tutoring succeeds. By sixth grade, without continuing intervention, she might be well below grade level again, perhaps better off than she would have been without tutoring, but not at grade level.

But what if we knew, as the evidence clearly suggests, that one year of Perry Preschool or 60 lessons of Reading Recovery or seven weeks of intensive reading summer school was not sufficient to ensure long-lasting gains in achievement? What could we do to see that successful investments in intensive early interventions are built upon in subsequent years, so that formerly at-risk students not only maintain what they learned, but continue afterwards to make exceptional gains?

Clearly, we could build on early gains by continuing to provide intensive intervention every year, if that is what is needed, but that would be extremely expensive. Instead, imagine that each school had within it a small group of teachers and teacher assistants whose job would be to provide initial tutoring for students at risk, and then to monitor students’ progress and strategically intervene to keep students on track. For the moment, I’ll call them an Excellence in Learning Team (XLT). This team would keep close track of the achievement of all at-risk and formerly at-risk students on frequent assessments, at least in reading and math. These staff members would track students’ trajectories toward grade-level performance. If students fall off that trajectory, members of the XLT would provide tutoring for as long as necessary. My assumption is that a student who made brilliant progress with 60 tutoring sessions, for example, would not need another 60 sessions each year to stay on track toward grade level, but that perhaps 10 or 20 sessions would be sufficient.
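
To make the monitoring idea a bit more concrete, here is a minimal, purely illustrative sketch of the kind of check an XLT might run after each benchmark assessment. The XLT is a proposal, not an existing system, so the data fields, the grade-level target, and the tolerance below are all invented for this example.

```python
GRADE_LEVEL_TARGET = 100.0  # hypothetical benchmark score meaning "on grade level"
TOLERANCE = 2.0             # how far below the planned trajectory triggers help

def expected_score(start_score, assessments_total, assessments_done):
    """Planned straight-line trajectory from the starting score to grade level."""
    gain_per_assessment = (GRADE_LEVEL_TARGET - start_score) / assessments_total
    return start_score + gain_per_assessment * assessments_done

def needs_booster_tutoring(student):
    """Flag a formerly tutored student who has fallen off the planned trajectory."""
    expected = expected_score(student["start_score"],
                              student["assessments_total"],
                              student["assessments_done"])
    return student["current_score"] < expected - TOLERANCE

roster = [
    {"name": "A", "start_score": 70, "current_score": 86,
     "assessments_total": 9, "assessments_done": 4},  # on track
    {"name": "B", "start_score": 70, "current_score": 78,
     "assessments_total": 9, "assessments_done": 4},  # slipping off the trajectory
]

for student in roster:
    if needs_booster_tutoring(student):
        print(f"Student {student['name']}: schedule booster tutoring sessions")
```

Something this simple would of course need professional judgment around it, but it suggests how a small team could watch many students at once and target its 10 or 20 booster sessions precisely where they are needed.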

The XLT would need effective, targeted tools to quickly and efficiently help students whose progress is stumbling. For example, XLT tutors might have available computer-assisted tutoring modules to assist students who have mastered phonics but are having difficulty with fluency, multi-syllabic words, or comprehension of narrative or factual text. In mathematics, they might have specific computer-assisted tutoring modules on place value, fractions, or word problems. The idea is precision and personalization, so that the time of every XLT member is used to maximum effect. From the students’ perspective, assistance from the XLT is not a designation (like special or remedial education), but rather time-limited help to enable all students to achieve ambitious and challenging goals.

The XLT would be most effective, I believe, if students have started with intensive tutoring, intensive summer school, or other focused interventions that can bring about rapid progress. This is essential early in students’ progression. Rapid progress at the outset not only sets students up for success in an academic sense, but also convinces the student and his or her teachers that he or she is capable of extraordinary progress. Such confidence is crucial.

As an analogy to what I am describing here, consider how you cook a stew. You first bring the stew to a boil, and then simmer for a long time. If you only brought the stew to a boil and then turned off the stove, the stew would never cook. If you only set the stove on simmer, but did not first bring the stew to a boil, it might take hours to cook, if it ever did. It is the sequence of intense energy followed by less intense but lengthy support that does the job. Or consider a rocket to the moon, which needs enormous energy to reach escape velocity, followed by continued but less intense energy to complete the trip.  In education, high-quality preschool or tutoring or intensive summer school can play the part of the boil, but this needs to be followed by long-term, lower-intensity, precisely targeted support.

I would love to see a program of research designed to figure out how to implement long-term support to enable at-risk students to experience rapid success and then build on that success for many years. This is how we will finally leverage our demonstrated ability to make big differences in intensive early intervention, by linking it to multi-year, life-changing services that ensure students’ success in the long term, where it really matters.

References

Neitzel, A., Lake, C., Pellegrini, M., & Slavin, R. (2020). A synthesis of quantitative research on programs for struggling readers in elementary schools. Available at *www.bestevidence.org. Manuscript submitted for publication. *This new review of research on elementary programs for struggling readers had to be taken down because it is under review at a journal.  For a copy of the current draft, contact Amanda Neitzel (aneitzel@jhu.edu).

This blog was developed with support from Arnold Ventures. The views expressed here do not necessarily reflect those of Arnold Ventures.

The Summertime Blues

            A long-ago rock song said it first: “There ain’t no cure for the summertime blues.”

            In the 1970s, Barbara Heyns (1978) discovered that over the summer, disadvantaged students lost a lot more of what they had learned in school than did advantaged students. Ever since then, educators have been trying to figure out how they could use time during the summer to help disadvantaged students catch up academically. I got interested in this recently because I have been trying to learn what kinds of educational interventions might be most impactful for the millions of students who have missed many months of school due to Covid-19 school closures. Along with tutoring and after school programs, summer school is routinely mentioned as a likely solution.

            Along with colleagues Chen Xie, Alan Cheung, and Amanda Neitzel, I have been looking at the literature on summer programs for disadvantaged students.

            There are two basic approaches to summer programs intended to help at-risk students. One of these, summer book reading, gives students reading assignments over the summer (e.g., Kim & Guryan, 2010). These generally have very small impacts, but on the other hand, they are relatively inexpensive.

            Of greater interest to the quest for powerful interventions to overcome Covid-19 learning losses are summer school programs in reading and mathematics. Studies of most summer school programs found that they made little difference in outcomes. For example, an evaluation of a 5-week, six-hour-a-day remedial program for middle school students found no significant differences in reading or math (Somers et al., 2015). However, one category of summer school programs showed at least a glimmer of promise: three studies of intensive, phonics-focused programs for students in kindergarten or first grade. Schacter & Jo (2005) reported substantial impacts of such a program, with a mean effect size of +1.16 on fall reading measures. However, by the following spring, a follow-up test showed a non-significant difference of +0.18. Zvoch & Stevens (2013), using similar approaches, found effect sizes of +0.60 for kindergarten and +0.78 for first grade. However, no measure of maintenance was reported. Borman & Dowling (2006) provided first graders with a 7-week reading-focused summer school. There were substantial positive effects by fall, but these disappeared by spring. The same students qualified for a second summer school experience after second grade, and this once again showed positive effects that faded by the following spring. There was no cumulative effect.

Because these studies showed no lasting impact, one might consider them a failure. However, it is important to note the impressive initial impacts, which might suggest that intensive reading instruction could be a part of a comprehensive approach for struggling readers in the early grades, if these gains were followed up during the school year with effective interventions. What summertime offers is an opportunity to use time differently (i.e., intensive phonics for young students who need it). It would make more sense to build on the apparent potential of focused summer school, rather than abandoning it based on its lack of long-term impacts.

            All by themselves, summer programs, based on the evidence we have so far, “ain’t no cure for the summertime blues.” But in next week’s blog, I will discuss some ideas about how short-term interventions with powerful impacts, such as tutoring, pre-kindergarten, and intensive phonics for students in grades K-1 in summer school, might be followed up with school-year interventions to produce long-term positive impacts. Perhaps summer school could be part of a cure for the school year blues.

References

Borman, G. D., & Dowling, Ν. M. (2006). Longitudinal achievement effects of multiyear summer school: Evidence from the Teach Baltimore randomized field trial. Educational Evaluation and Policy Analysis, 28, 25-48. doi:10.3102/01623737028001025

Heyns, B. (1978). Summer learning and the effect of schooling. New York: Academic Press.

Kim, J. S., & Guryan, J. (2010). The efficacy of a voluntary summer book reading intervention for low-income Latino children from language minority families. Journal of Educational Psychology, 102, 20-31. doi:10.1037/a0017270

Somers, M. A., Welbeck, R., Grossman, J. B., & Gooden, S. (2015). An analysis of the effects of an academic summer program for middle school students. Retrieved from ERIC website: https://files.eric.ed.gov/fulltext/ED558507.pdf

Schacter, J., & Jo, B. (2005). Learning when school is not in session: A reading summer day-camp intervention to improve the achievement of exiting first-grade students who are economically disadvantaged. Journal of Research in Reading, 28, 158-169. doi:10.1111/j.1467-9817.2005.00260.x

Zvoch, K., & Stevens, J. J. (2013). Summer school effects in a randomized field trial. Early Childhood Research Quarterly, 28(1), 24-32. doi:10.1016/j.ecresq.2012.05.002

Photo credit: American Education: Images of Teachers and Students in Action (CC BY-NC 4.0)

This blog was developed with support from Arnold Ventures. The views expressed here do not necessarily reflect those of Arnold Ventures.

Note: If you would like to subscribe to Robert Slavin’s weekly blogs, just send your email address to thebee@bestevidence.org

The Summer Slide: Fact or Fiction?

One of the things that “everyone knows” from educational research is that while advantaged students gain in achievement over the summer, disadvantaged students decline. Meanwhile, the rate of gain during school time, from fall to spring, is about the same for advantaged and disadvantaged students. This pattern has led researchers such as Alexander, Entwisle, and Olson (2007) and Allington & McGill-Franzen (2018) to conclude that differential gain/loss over the summer completely explains the gap in achievement between advantaged and disadvantaged students. Middle-class students are reading, going to the zoo, and going to the library, while disadvantaged students are less likely to do these school-like things.

The “summer slide,” as it’s called, has come up a lot lately, because it is being used to predict the amount of loss disadvantaged students will experience as a result of Covid-19 school closures. If disadvantaged students lose so much ground over 2½ months of summer vacation, imagine how much they will lose after five or seven or nine months (to January, 2021)! Remarkably precise-looking estimates of how far behind students will be when school finally re-opens for all are circulating widely. These projections are based on estimates of the losses due to “summer slide,” so they are naturally called “Covid slide.”

I am certain that most students, and especially disadvantaged students, are in fact losing substantial ground due to the long school closures. The months of school not attended, coupled with the apparent ineffectiveness of remote teaching for most students, do not bode well for a whole generation of children. But this is abnormal. Ordinary summer vacation is normal. Does ordinary summer vacation lead to enough “summer slide” to explain substantial gaps in achievement between advantaged and disadvantaged students?

 I’m pretty sure it does not. In fact, let me put this in caps:

SUMMER SLIDE IS PROBABLY A MYTH.

Recent studies of summer slide, mostly using NWEA MAP data from millions of children, are finding results that either call summer slide into question (Kuhfeld, 2019; Quinn et al., 2016) or find that it happens but that summer losses are similar for advantaged and disadvantaged students (Atteberry & McEachin, 2020). However, hiding in plain sight is the most conclusive evidence of all: NWEA’s table of norms for the MAP, a benchmark assessment widely used to monitor student achievement. The MAP is usually given three times a year. In the table below, calculated from raw data on the NWEA website (teach.mapnwea.org), I compute the gains from fall to winter, winter to spring, and spring to fall (the last being “summer”). These are reading norms for grades 1 to 5.

Grade     Fall to winter     Winter to spring     Spring to fall (summer)
1              9.92                5.55                  0.95
2              8.85                4.37                  1.05
3              7.28                3.22                 -0.47
4              5.83                2.33                 -0.35
5              4.64                1.86                 -0.81
Mean           7.30                3.47                  0.07
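For readers who want to check this kind of calculation themselves, here is a minimal sketch (in Python) of how the three seasonal gains and the Mean row can be computed once you have each grade’s fall, winter, spring, and following-fall norm scores. The scores in the sketch are made-up placeholders, not NWEA’s actual norms; only the arithmetic mirrors the table above.

```python
# Minimal sketch: computing seasonal gains (as in the table above) from
# fall, winter, spring, and following-fall norm scores for each grade.
# The scores below are illustrative placeholders, NOT actual NWEA norms.
from statistics import mean

norms = {
    # grade: (fall, winter, spring, following_fall)
    1: (160.0, 170.0, 175.5, 176.5),
    2: (174.0, 183.0, 187.4, 188.4),
    3: (188.0, 195.3, 198.5, 198.0),
}

gains = {}
for grade, (fall, winter, spring, next_fall) in norms.items():
    gains[grade] = (
        winter - fall,       # fall-to-winter gain
        spring - winter,     # winter-to-spring gain
        next_fall - spring,  # spring-to-fall ("summer") gain; negative = loss
    )
    print(grade, [round(g, 2) for g in gains[grade]])

# The "Mean" row: average each column across grades.
print("Mean", [round(mean(col), 2) for col in zip(*gains.values())])
```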

NWEA’s chart is probably accurate. But it suggests something that cannot possibly be true. No, it’s not that students gain less in reading each year; that is true. It is that students gain more than twice as much from fall to winter as they do from winter to spring. That cannot be true. Why would students gain so much more in the first semester than in the second? One might argue that they are fresher in the fall, or something like that. But double the gain, in every elementary grade? That cannot be right.

 Here is my explanation. The fall score is depressed.

The only logical explanation for extraordinary fall-to-winter gain is that many students score poorly on the September test, but rapidly recover.

I think most elementary teachers already know this. Their experience is that students score very low when they return from summer vacation, but this is not their true reading level. For three decades, we have noticed this in our Success for All program, and we routinely recommend that teachers place students in our reading sequence not where they score in September, but no lower than they scored the previous spring. (If students score higher in September than they did on the spring test, we do use the September score.)

What is happening, I believe, is that students do not forget how to read; they just momentarily forget how to take tests. Or perhaps teachers do not invest time in preparing students to take a fall pretest, which has few if any consequences, but they do prepare them for the winter and spring tests. I do not know for sure how it happens, but I do know for sure, from experience, that fall scores tend to understate students’ capabilities, often by quite a lot. And if the fall score is artificially or temporarily low, then the whole summer loss story is wrong.

Another indicator that fall scores are, shall we say, a bit squirrely, is the finding by both Kuhfeld (2019) and Atteberry & McEachin (2020) that there is a consistent negative correlation between school-year gain and summer loss. That is, the students who gain the most from fall to spring lose the most from spring to fall. How can that be? What must be going on is that students whose fall scores fall far below their actual ability quickly recover, and then make what appear to be fabulous gains from fall to spring. But that same temporarily low fall score gives them an apparent summer loss. So of course there is a negative correlation, but it has no practical meaning.
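To see why a temporarily depressed fall score would produce exactly this pattern, here is a small simulation sketch in Python. It is not a model of the cited studies; all of the numbers (true levels, growth, size of the fall dip) are made-up assumptions. Each simulated student grows over the school year and neither gains nor loses anything real over the summer, but the fall test knocks the observed score down by a random amount. The same depressed fall score shows up once as the end point of the “summer” and once as the starting point of the school year, so the apparent summer loss and the apparent school-year gain come out negatively correlated.

```python
# Sketch: a temporary dip on the fall test creates a spurious negative
# correlation between summer "loss" and school-year gain.
# All parameters are illustrative assumptions, not estimates from any study.
import random

random.seed(1)
summer_changes, school_year_gains = [], []

for _ in range(10_000):
    true_level = random.gauss(200, 10)   # student's true level at the end of spring
    growth = random.gauss(10, 2)         # true growth over the next school year
    dip = abs(random.gauss(0, 5))        # temporary depression on the fall test

    spring_prev = true_level             # spring score, end of the previous year
    fall = true_level - dip              # fall score comes out artificially low
    spring = true_level + growth         # next spring reflects true growth

    summer_changes.append(fall - spring_prev)   # apparent summer change (= -dip)
    school_year_gains.append(spring - fall)     # apparent gain (= growth + dip)

def corr(x, y):
    """Pearson correlation, written out to keep the sketch dependency-free."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

# Strongly negative, although no simulated student really lost ground over the summer.
print(round(corr(school_year_gains, summer_changes), 2))
```

Under these assumptions the correlation comes out strongly negative, which is the point: a negative correlation between school-year gain and summer loss tells us nothing, by itself, about real summer learning.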

So far, I’ve only been talking about whether there is a summer slide at all, for all students taken together. It may still be true, as found in the Heyns (1978) and Alexander, Entwisle, and Olson (2007) studies, that disadvantaged students do not gain as much as advantaged students do over the summer. Recent studies by Atteberry & McEachin (2020) and Kuhfeld (2019) do not find much differential summer gain/loss according to social class. On the other hand, it could be that disadvantaged students are more susceptible to forgetting how to take tests, or that they are more likely to attend schools that put little emphasis on doing well on a September test that has no consequences for the students or the school. But it is unlikely they are truly forgetting how to read. The key point is that if fall tests are unreliable indicators of students’ actual skills, temporary dips that do not show what students can do, then it is not sensible to take them seriously in deciding whether or not “summer slide” exists.

By the way, before you conclude that summer slide may not happen in reading but must surely exist in math or other subjects, prepare to be disappointed again. The NWEA MAP scores for math, science, and language usage follow patterns very similar to those in reading.

Perhaps I’m wrong, but if I am, then we’d better start finding out about the amazing fall-to-winter surge, and see how we can make winter-to-spring gains that large! But if you don’t have a powerful substantive explanation for the fall-to-winter surge, you’re going to have to accept that summer slide isn’t a major factor in student achievement.

References

Alexander, K. L., Entwisle, D. R., & Olson, L. S. (2007). Lasting consequences of the summer learning gap. American Sociological Review, 72(2), 167-180.  doi:10.1177/000312240707200202

Allington, R. L., & McGill-Franzen, A. (Eds.). (2018). Summer reading: Closing the rich/poor reading achievement gap. New York, NY: Teachers College Press.

Atteberry, A., & McEachin, A. (2020). School’s out: The role of summers in understanding achievement disparities. American Educational Research Journal. https://doi.org/10.3102/0002831220937285

Heyns, B. (1978). Summer learning and the effect of schooling. New York: Academic Press.

Kuhfeld, M. (2019). Surprising new evidence on summer learning loss. Phi Delta Kappan, 101(11), 25-29.

Quinn, D., Cook, N., McIntyre, J., & Gomez, C. J. (2016). Seasonal dynamics of academic achievement inequality by socioeconomic status and race/ethnicity: Updating and extending past research with new national data. Educational Researcher, 45(8), 443-453.

This blog was developed with support from Arnold Ventures. The views expressed here do not necessarily reflect those of Arnold Ventures.

 Note: If you would like to subscribe to Robert Slavin’s weekly blogs, just send your email address to thebee@bestevidence.org

Could Intensive Education Rescue Struggling Readers?

Long, long ago, I heard about a really crazy idea. Apparently, a few private high schools were trying a scheduling plan in which instead of having students take all of their subjects every day, they would take one subject at a time for a month or six weeks. The idea was that with a total concentration on one subject, with no time lost in changing classes, students could make astonishing progress. At the end of each week, they could see the progress they’d made, and really feel learning happening.

Algebra? Solved!

French? Accompli!

Of course, I could not talk anyone into trying this. I almost got a Catholic school to try it, but when they realized that kids would have to take religion all day, that was that.

However, in these awful days, with schools nationwide closing for months due to Covid, I was thinking about a way to use a similar concept with students who have fallen far behind, or actually with any students who are far behind grade level for any reason.

What happens now with students who are far behind in, say, reading, is that they get a daily period of remedial instruction, or special education. For most of them, despite the very best efforts of dedicated teachers, this is not very effective. Day after day after day, they get instruction that at best moves them forward at a slow, steady pace. But after a while, students lose any hope of truly catching up, and when you lose hope, you lose motivation, and no one learns without motivation.

So here is my proposal. What if students who were far behind could enroll in a six-week intensive service designed to teach them to read, no matter what? They would attend an intensive class, perhaps all day, in which they receive a promise: this time, you’ll make it. No excuses. This is the best chance you’ll ever have. Students would be carefully assessed, including their vision and hearing as well as their reading levels. They would be assigned to one-to-small-group or, if necessary, one-to-one instruction for much of the day. There might be music or sports or other activities between sessions, but imagine that students got three 40-minute tutoring sessions a day, on content exactly appropriate to their needs. The idea, as in intensive education, would be to enable the students to feel the thrill of learning, to see unmistakable gains in a day and extraordinary gains in a week. The tutoring could be in groups of four for most students, but students with the most difficult, most unusual problems could receive one-to-one tutoring.

The ideal time to do this intensive tutoring would be summer school. Actually, this has been done in a few studies. Schacter & Jo (2005) provided intensive phonics instruction to students after first grade in three disadvantaged schools in Los Angeles. The seven-week experience increased their test scores by an effect size of +1.16, compared to similar students who did not have the opportunity to attend summer school. Zvoch & Stevens (2013) also provided intensive phonics instruction in small groups in a five-week reading summer school. The students were disadvantaged kindergartners and first graders in a medium-sized city in the Pacific Northwest. The effect sizes were +0.60 for kindergarten and +0.78 for first grade.

Summer is not the only good time for intensive reading instruction. Reading is so important that it would arguably be worthwhile to provide intensive six-week instruction (with time out for mathematics, and breaks for, say, sports and music) during the school year as well.

Intensive tutoring might also cost no more than ordinary 40-minute daily tutoring. A usual course of tutoring is 20 weeks of one session a day, so three tutoring sessions a day for six weeks would cost almost the same as 18 weeks of ordinary tutoring. In other words, if intensive tutoring is at least as effective as ordinary tutoring, the additional benefits might cost little or nothing.
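As a rough check on that arithmetic, here is a back-of-the-envelope sketch; the per-session cost and the five-day week are assumptions for illustration, not figures from any study.

```python
# Back-of-the-envelope cost comparison; unit cost and schedule are assumptions.
cost_per_session = 1.0                      # placeholder cost per 40-minute session
days_per_week = 5

ordinary_sessions = 1 * days_per_week * 20  # one session a day for 20 weeks = 100 sessions
intensive_sessions = 3 * days_per_week * 6  # three sessions a day for 6 weeks = 90 sessions

print(ordinary_sessions * cost_per_session)   # 100.0
print(intensive_sessions * cost_per_session)  # 90.0, the same as 18 ordinary weeks
```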

Intensive tutoring would make particular sense to try during summer, 2021, when millions of students will still be far behind in reading because of the lengthy school closures they will have experienced. I have no idea whether intensive tutoring will be more or less effective than ordinary one-to-small group tutoring (which is very, very effective; see here and here). Planfully concentrating tutoring during an intensive period of time certainly seems worth a try!

References

Schacter, J., & Jo, B. (2005). Learning when school is not in session: A reading summer day-camp intervention to improve the achievement of exiting first-grade students who are economically disadvantaged. Journal of Research in Reading, 28, 158-169. doi:10.1111/j.1467-9817.2005.00260.x

Zvoch, K., & Stevens, J. J. (2013). Summer school effects in a randomized field trial. Early Childhood Research Quarterly, 28(1), 24-32. doi:10.1016/j.ecresq.2012.05.002

Photo credit: American Education: Images of Teachers and Students in Action, (CC BY-NC 4.0)

 This blog was developed with support from Arnold Ventures. The views expressed here do not necessarily reflect those of Arnold Ventures.

Note: If you would like to subscribe to Robert Slavin’s weekly blogs, just send your email address to thebee@bestevidence.org