More Chinese Dragons: How the WWC Could Accelerate Its Pace

[Image: Chinese dragon]

A few months ago, I wrote a blog entitled “The Mystery of the Chinese Dragon: Why Isn’t the WWC Up to Date?” It really had nothing to do with dragons, but compared the timeliness of the What Works Clearinghouse (WWC) review of research on secondary reading programs with that of a Baye et al. (2017) review on the same topic. The graph depicting the difference looked a bit like a Chinese dragon with a long tail near the ground and huge jaws. The horizontal axis showed the dates on which accepted studies had appeared, and the vertical axis showed the number of studies. Here is the secondary reading graph.

[Graph: number of accepted U.S. studies of secondary reading programs by date of appearance, WWC vs. Baye et al. (2017)]

What the graph showed is that the WWC and the U.S. studies from the Baye et al. (2017) review were similar in coverage of studies appearing from 1987 to 2009, but after that diverged sharply, because the WWC is very slow to add new studies, in comparison to reviews using similar methods.

In the time since the Chinese Dragon for secondary reading studies appeared on my blog, my colleagues and I have completed two more reviews, one on programs for struggling readers by Inns et al. (2018) and one on programs for elementary math by Pellegrini et al. (2018). We made new Chinese Dragon graphs for each, which appear below.*

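[Graphs: number of accepted studies by date of appearance, WWC vs. BEE reviews, one for reading (Inns et al.) and one for elementary math (Pellegrini et al.)]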

*Note: In the reading graph, the line for “Inns et al.” combines the studies from the Inns et al. (2018) review of programs for struggling readers with additional studies of programs for all elementary students from an unfinished report.

The new dragons look remarkably like the first. Again, what matters is the similar pattern of accepted studies before 2009 (the “tail”) and the sharply diverging rates in more recent years (the “jaws”).

There are two phenomena that cause the dragons’ “jaws” to be so wide open. The upper jaw, especially in secondary reading and elementary math, indicates that many high-quality, rigorous evaluations are appearing in recent years. Both the WWC inclusion standards and those of the Best Evidence Encyclopedia (BEE; www.bestevidence.org) require control groups, clustered analysis for clustered designs, samples that are well-matched at pretest and have similar attrition by posttest, and other features indicating methodological rigor, of the kind expected by the ESSA evidence standards, for example.

The upper jaw of each dragon is increasing so rapidly because rigorous research is increasing rapidly in the U.S. (it is also increasing rapidly in the U.K., but the WWC does not include non-U.S. studies, and non-U.S. studies are removed from the graph for comparability). This increase is due to U.S. Department of Education funding of many rigorous studies in each topic area, through its Institute of Education Sciences (IES) and Investing in Innovation (i3) programs, and special purpose funding such as Striving Readers and Preschool Curriculum Evaluation Research. These recent studies are not only uniformly rigorous, they are also of great importance to educators, as they evaluate current programs being actively disseminated today. Many of the older programs whose evaluations appear on the dragons’ tails no longer exist, as a practical matter. If educators wanted to adopt them, the programs would have to be revised or reinvented. For example, Daisy Quest, still in the WWC, was evaluated on TRS-80 computers not manufactured since the 1980s. Yet exciting new programs with rigorous evaluations, highlighted in the BEE reviews, do not appear at all in the WWC.

I do not understand why the WWC is so slow to add new evaluations, but I suspect that the answer lies in the painstaking procedures any government has to follow to do . . ., well, anything. Perhaps there are very good reasons for this stately pace of progress. However, the result is clear. The graph below shows the publication dates of every study in every subject and grade level accepted by the WWC and entered on its database. This “half-dragon” graph shows that only 26 studies published or made available after 2013 appear on the entire WWC database. Of these, only two have appeared after 2015.

[Graph: publication dates of all studies accepted by the WWC, across all subjects and grade levels]

The slow pace of the WWC is of particular concern in light of the appearance of the ESSA evidence standards. More educators than ever before must be consulting the WWC, and many must be wondering why programs they know to exist are not listed there, or why recent studies do not appear.

Assuming that there are good reasons for the slow pace of the WWC, or that for whatever reason the pace cannot be greatly accelerated, what can be done to bring the WWC up to date? I have a suggestion.

Imagine that the WWC commissioned someone to do rapid updating of all topics reviewed on the WWC website. The reviews would follow WWC guidelines, but would appear very soon after studies were published or issued. It’s clear that this is possible, because we do it for Evidence for ESSA (www.evidenceforessa.org). Also, the WWC has a number of “quick reviews,” “single study reports,” and so on, scattered around on its site, but not integrated with its main “Find What Works” reviews of various programs. These could be readily integrated with “Find What Works.”

The recent studies identified in this accelerated process might be identified as “provisionally reviewed,” much as the U.S. Patent Office has “patent pending” before inventions are fully patented. Users would have an option to look only at program reports containing fully reviewed studies, or could decide to look at reviews containing both fully and provisionally reviewed studies. If a more time-consuming full review of a study found results different from those of the provisional review, the study report and the program report in which it was contained would be revised, of course.

A process of this kind could bring the WWC up to date and keep it up to date, providing useful, actionable evidence in a timely fashion, while maintaining the current slower process, if there is a rationale for it.

The Chinese dragons we are finding in every subject we have examined indicate the rapid growth and improving quality of evidence on programs for schools and students. The U.S. Department of Education and our whole field should be proud of this, and should make it a beacon on a hill, not hide our light under a bushel. The WWC has the capacity and the responsibility to highlight current, high-quality studies as soon as they appear. When this happens, the Chinese dragons will retire to their caves, and all of us, government, researchers, educators, and students, will benefit.

References

Baye, A., Lake, C., Inns, A., & Slavin, R. (2017). Effective reading programs for secondary students. Manuscript submitted for publication. Also see Baye, A., Lake, C., Inns, A., & Slavin, R. E. (2017, August). Effective reading programs for secondary students. Baltimore, MD: Johns Hopkins University, Center for Research and Reform in Education.

Inns, A., Lake, C., Pellegrini, M., & Slavin, R. (2018). Effective programs for struggling readers: A best-evidence synthesis. Paper presented at the annual meeting of the Society for Research on Educational Effectiveness, Washington, DC.

Pellegrini, M., Inns, A., & Slavin, R. (2018). Effective programs in elementary mathematics: A best-evidence synthesis. Paper presented at the annual meeting of the Society for Research on Educational Effectiveness, Washington, DC.

Photo credit: J Bar [GFDL (http://www.gnu.org/copyleft/fdl.html) or CC-BY-SA-3.0 (http://creativecommons.org/licenses/by-sa/3.0/)], via Wikimedia Commons

This blog was developed with support from the Laura and John Arnold Foundation. The views expressed here do not necessarily reflect those of the Foundation.

Love, Hope, and Evidence in Secondary Reading

I am pleased to announce that our article reviewing research on effective secondary reading programs has just been posted on the Best Evidence Encyclopedia, aka the BEE. Written with my colleagues Ariane Baye, Cynthia Lake, and Amanda Inns, our review found 64 studies of 49 reading programs for students in grades 6 to 12, which had to meet very high standards of quality. For example, 55 of the studies used random assignment to conditions.

But before I get all nerdy about the technical standards of the review, I want to reflect on what we learned. I’ve already written about one thing we learned, that simply providing more instructional time made little difference in outcomes. In 22 of the studies, students got an extra period for reading beyond what control students got for at least an entire year, yet programs (other than tutoring) that provided extra time did no better than those that did not.

If time doesn’t help struggling readers, what does? I think I can summarize our findings with three words: love, hope, and evidence.

Love and hope are exactly what students who are reading below grade level are lacking. They are no longer naive. They know exactly what it means to be a poor reader in a high-poverty secondary school (almost all of the schools in our review served disadvantaged adolescents). If you can’t read well, college is out of the question. Decent jobs without a degree are scarce. If you have no hope, you cannot be motivated, or you may be motivated in antisocial directions that give you at least a chance for money and recognition. Every child needs love, but poor readers in secondary schools are too often looking for love in all the wrong places.

The successful programs in our review were ones that give adolescents a chance to earn the hope and love they crave. One category, all studies done in England, involved one-to-one and small group tutoring. How better to build close relationships between students and caring adults than to have individual or very small group time with them? And the one-to-one or small group setting allows tutors to personalize instruction, giving students a sense of hope that this time, their efforts will pay off (as the evidence says it will).

But the largest impacts in our review came from two related programs, The Reading Edge and Talent Development High School (TDHS). These were both developed in our research center at Johns Hopkins University in the 1990s, so I have to be very modest here. But beyond these individual programs, I think there is a larger message.

Both The Reading Edge (for middle schools) and TDHS (for high schools) organize students into mixed-ability cooperative teams. The team members work on activities designed to build reading comprehension and related skills. Students are frequently assessed and on the basis of those assessments, they can earn recognition for their teams. Teachers introduce lessons, and then, as students work with each other on reading activities, teachers can cruise around the class looking in on students who need encouragement or help, solving problems, and building relationships. Students are on task, eager to learn, and seeing the progress they are making, but students and teachers are laughing together, sharing easy banter, and encouraging each other. Yes, this really happens. I’ve seen it hundreds of times in secondary schools throughout the U.S. and England.

Many of the most successful programs in our review also are based on principles of love and hope. BARR, a high school program, is an excellent example. It uses block scheduling to build positive relationships among a group of students and teachers, adding regular meetings between teachers and students to review their progress in all areas, social as well as academic. The program focuses on building positive social-emotional skills and behaviors, and helping students describe their desired futures, make plans to get there, and regularly review progress on their plans with their teachers and peers. Love and hope.

California’s Expository Reading and Writing Course helps 12th graders hoping to attend California State Universities prepare to pass the test used to determine whether students have to take remedial English (a key factor in college dropout). The students work in groups, helping each other to build reading, writing, and discussion skills, and helping students to visualize a future for themselves. Love and hope.

A few technology programs showed promising outcomes, especially Achieve3000 and Read 180. These do not replace teachers and peers with technology, but instead cycle students through small group, teacher-led, and computer-assisted activities. Pure technology programs did not work so well, but models taking advantage of relationships as well as personalization did best. Love and hope.

Of course, love and hope are not sufficient. We also need evidence that students are learning more than they might have been. To produce positive achievement effects requires outstanding teaching strategies, professional development, curricular approaches, assessments, and more. Love and hope may be necessary but they are not sufficient.

Our review applied the toughest evidence standards we have ever applied. Most of the studies we reviewed did not show positive impacts on reading achievement. But the ones that did so inspire that much more confidence. The very fact that we could apply these standards and still find plenty of studies that meet them shows how much our field is maturing. This in itself fills me with hope.

And love.

Apology

In a recent blog, I wrote about work we are doing to measure the impact on reading and math performance of a citywide campaign to provide assessments and eyeglasses to every child in Baltimore, from pre-k to grade 8. I forgot to mention the name of the project, Vision for Baltimore, and neglected to say that the project operates under the authority of the Baltimore City Health Department, which has been a strong supporter. I apologize for the omission.

Time Passes. Will You?

When I was in high school, one of my teachers posted a sign on her classroom wall under the clock:

Time passes. Will you?

Students spend a lot of time watching clocks, yearning for the period to be over. Yet educators and researchers often seem to believe that more time is of course beneficial to kids’ learning. Isn’t that obvious?

In a major review of secondary reading programs I am completing with my colleagues Ariane Baye, Cynthia Lake, and Amanda Inns, it turns out that the kids were right. More time, at least in remedial reading, may not be beneficial at all.

Our review identified 60 studies of extraordinary quality (mostly large-scale randomized experiments) evaluating reading programs for students in grades 6 to 12. In most of the studies, students reading 2 to 5 grade levels below expectations were randomly assigned to receive an extra class period of reading instruction every day all year, in some cases for two or three years. Students randomly assigned to the control group continued in classes such as art, music, or study hall. The strategies used in the remedial classes varied widely, including technology approaches, teaching focused on metacognitive skills (e.g., summarization, clarification, graphic organizers), teaching focused on phonics skills that should have been learned in elementary school, and other remedial approaches, all of which provided substantial additional time for reading instruction. It is also important to note that the extra-time classes were generally smaller than ordinary classes, in the range of 12 to 20 students.

In contrast, other studies provided whole class or whole school methods, many of which also focused on metacognitive skills, but none of which provided additional time.

Analyzing across all studies, setting aside five British tutoring studies, there was no effect of additional time in remedial reading. The effect size for the 22 extra-time studies was +0.08, while for the 34 whole-class/whole-school studies, it was slightly higher, ES = +0.10. That’s an awful lot of additional teaching time for no additional learning benefit.

So what did work? Not surprisingly, one-to-one and small-group tutoring (up to one to four) were very effective. These are remedial and do usually provide additional teaching time, but in a much more intensive and personalized way.

Other approaches that showed particular promise simply made better use of existing class time. A program called The Reading Edge involves students in small mixed-ability teams where they are responsible for the reading success of all team members. A technology approach called Achieve3000 showed substantial gains for low-achieving students. A whole-school model called BARR focuses on social-emotional learning, building relationships between teachers and students, and carefully monitoring students’ progress in reading and math. Another model called ERWC prepares 12th graders to succeed on the tests used to determine whether students have to take remedial English at California State Universities.

What characterized these successful approaches? None were presented as remedial. All were exciting and personalized, and not at all like traditional instruction. All gave students social supports from peers and teachers, and reasons to hope that this time, they were going to be successful.

There is no magic to these approaches, and not every study of them found positive outcomes. But there was clearly no advantage of remedial approaches providing extra time.

In fact, according to the data, students would have done just as well to stay in art or music. And if you’d asked the kids, they’d probably agree.

Time is important, but motivation, caring, and personalization are what count most in secondary reading, and surely in other subjects as well.

Time passes. Kids will pass, too, if we make such good use of our time with them that they won’t even notice the minutes going by.

Joy is a Basic Skill in Secondary Reading

I have a policy of not talking about studies I’m engaged in before they are done and available, but I have an observation to make that just won’t wait.

I’m working on a review of research on secondary reading programs with colleagues Ariane Baye (University of Liege in Belgium) and Cynthia Lake (Johns Hopkins University). We have found a large number of very high-quality studies evaluating a broad range of programs. Most are large, randomized experiments.

Mostly, our review is really depressing. The great majority of studies have found no effects on learning. In particular, programs that focus on teaching middle and high school students struggling in reading in classes of 12 to 20, emphasizing meta-cognitive strategies, phonics, fluency, and/or training for teachers in what they were already doing, show few impacts on learning. Most of the studies provided daily, extra reading classes to help struggling readers build their skills, while the control group got band or art. They should have stayed in band or art.

Yet all is not dismal. Two approaches did have markedly positive effects. One was tutoring students in groups of one to four, not every day but perhaps twice a week. The other was cooperative learning, where students worked in four-member teams to help each other learn and practice reading skills. How could these approaches be so much more effective than the others?

My answer begins with a consideration of the nature of struggling adolescent readers. They are bored out of their brains. They are likely to see school as demeaning, isolating, and unrewarding. All adolescents live for their friends. They crave mastery and respect. Remedial approaches have to be fantastic to overcome the negative aspects of having to be remediated in the first place.

Tutoring can make a big difference, because groups are small enough for students to make meaningful relationships with adults and with other kids, and instruction can be personalized to meet their unique needs, to give them a real shot at mastery.

Cooperative learning, however, had a larger average effect size than tutoring. Even though cooperative learning did not require smaller class sizes and extra daily instructional periods, it was much more effective than remedial instruction. Cooperative learning gives struggling adolescent readers opportunities to work with their peers, to teach each other, to tease each other, to laugh, to be active rather than passive. To them, it means joy. And joy is a basic skill.

Of course, joy is not enough. Kids must be learning joyfully, not just joyful. Yet in our national education system, so focused on testing and accountability, we have to keep remembering who we are teaching and what they need. More of the same, a little slower and a little louder, won’t do it. Adolescents need a reason to believe that things can be better, and that school need not cut them off from their peers. They need opportunities to teach and learn from each other. School must be joyful, or it is nothing at all, for so many adolescents.

Making Evidence Primary for Secondary Readers

In the wonderful movie Awakenings, Robin Williams plays a research neuroscientist who has run out of grants and therefore applies for a clinical job at a mental hospital. In the interview, the hospital’s director asks him about his research.

“I was trying to extract myelin from millions of earthworms,” he explains.

“But that’s impossible!” says the director.

“Yes, but now we know it’s impossible,” says Robin Williams’ character.

I recently had an opportunity to recall this scene. I was traveling back to Baltimore from Europe. Whenever I make this trip, I use the eight or so uninterrupted hours to do a lot of work. This time I was reading a giant stack of Striving Readers reports, because I am working with colleagues to update a review of research on secondary reading programs.

Striving Readers, part of Reading First, was a richly funded initiative of the George W. Bush administration that gave money to states to help them adopt intensive solutions for below-level readers in middle and high schools. The states implemented a range of programs, almost all of them commercial programs designed for secondary readers. To their credit, the framers of Striving Readers required rigorous third-party evaluations of whatever the states implemented, and those were the reports I was reading. Unfortunately, it apparently did not occur to anyone to suggest that the programs have their own evidence of effectiveness prior to being implemented and evaluated as part of Striving Readers.

As you might guess from the fact that I started off this blog post with the earthworm story, the outcomes are pretty dismal. A few of the studies found statistically significant impacts, but even those found very small effect sizes, and only on some but not other measures or subgroups.

I’m sure I and others will learn more as we get further into these reports, which are very high-quality evaluations with rich measures of implementation as well as outcomes. But I wanted to make one observation at this point.

Striving Readers was a serious, well-meaning attempt to solve a very important problem faced by far too many secondary students: difficulties with reading. I’m glad the Department of Education was willing to make such an investment. But next time anyone thinks of doing something on the scale of Striving Readers, I hope they will provide preference points in the application process for applicants who propose to use approaches with solid evidence of effectiveness. I also hope government will continue to fund development and evaluation of programs to address enduring problems of education, so that when they do start providing incentives for using proven programs, there will be many to choose from.

Just like the earthworm research in Awakenings, finding out conclusively what doesn’t work is a contribution to science. But in education, how many times do we have to learn what doesn’t work before we start supporting programs that we know do work? It’s time to recognize on a broad scale that programs proven to work in rigorous evaluations are more likely than other approaches to work again if implemented well in similar settings. Even earthworms learn from experience. Shouldn’t we do the same?

Lessons from Innovators: Collaborative Strategic Reading


The process of moving an educational innovation from a good idea to widespread effective implementation is far from straightforward, and no one has a magic formula for doing it. The William T. Grant and Spencer Foundations, with help from the Forum for Youth Investment, have created a community composed of grantees in the federal Investing in Innovation (i3) program to share ideas and best practices. Our Success for All program participates in this community. In this space, I, in partnership with the Forum for Youth Investment, highlight observations from the experiences of i3 grantees other than our own, in an attempt to share the thinking and experience of colleagues out on the front lines of evidence-based reform.

This blog is based on an interview that the Forum for Youth Investment conducted with Janette Klingner and Alison Boardman from the School of Education at the University of Colorado Boulder (CU). They are working closely with Denver Public Schools on an i3 validation project focused on scaling the Collaborative Strategic Reading (CSR) instructional model across middle schools. As lead partners in a district-led initiative, the two reflect on the dynamics of collaborating across research and practice, as well as the critical importance of embedding new practices within existing district infrastructure. Some of their lessons learned are summarized below.

Project “home” matters
Unlike many i3 projects, the CSR grant was submitted by and awarded to Denver Public Schools, with the university as a subcontractor. The project “home” has influenced dynamics in important ways and at multiple levels, beginning with a basic level of buy-in and ownership that is not always present in school improvement projects and studies that the CU team has been involved in. In university-led projects, as Klingner pointed out, a district can simply decide to back out for any number of reasons. Though there are obvious downsides to “outsiders” coming in with an intervention for people to adopt, being district-led as opposed to university-led hasn’t necessarily meant smooth sailing. Some teachers, Klingner noted, expressed resistance because they were being told by the district to implement yet another program. Despite occasional resistance, CSR is making good progress on ambitious expansion goals laid out by the district, and in fact the project is ahead of schedule in terms of middle school expansion. “We are moving faster than we envisioned. We have teachers and schools really wanting to learn CSR, and we are adding them sooner than planned,” said Klingner.

Rethink traditional research-practice relationships
The CU team brings a “design-based implementation research” perspective to this work, which is based on the idea of collaborative, iterative design and implementation, focused on a district-identified problem of practice. “We know that handing schools an innovation in a box and seeing how it works is not effective,” said Klingner. “We are trying to be intentional about scaffolding from the validation stage, where there is more support available for an intervention, to scale-up, where the new practices become integrated and can be scaled and sustained. Working closely with the district seems like the only way to do that successfully.” While there are clear advantages to this approach, there are instances where despite the close partnership, conflicting priorities of the partners emerge. For example, in an effort to implement consistently and in a coordinated fashion across a large group of schools, the district sometimes imposes strict guidelines, such as requiring all science teachers to implement CSR on a given day. While this helps with knowing where and when to deploy coaches, it doesn’t necessarily make sense if your goal is to better understand and support teachers’ authentic integration of a new instructional model into their classroom practice in the context of their curriculum. Despite occasional bumps in the road, the project is built upon a strong partnership, and that partnership is critical to how the team thinks about scale and sustainability.

Embed within existing structures
The CSR team has been intentional from the beginning about embedding the intervention within existing district and school infrastructure. “We are very aware that this needs to become part of the daily practice of what the district does,” said Klingner. From working to maximize teacher-leader roles, to housing a principal liaison within the district as opposed to at the university, the team is constantly re-evaluating to what extent practices are being embedded. “Sometimes it feels like this is becoming part of the ongoing infrastructure, and then there will be some change and we’re not so sure. There’s a tipping point and even though we have a lot of great buy-in, I’d say we’re not there yet.” Boardman noted that making sure that everyone working in support roles with teachers is trained in CSR would be ideal. “Ideally all of the different coaches in the district would be able to coach for CSR. So the literacy coaches that are in schools, the teacher effectiveness coaches that visit schools, those supporting classroom management or use of the curriculum – all these different existing mechanisms would be able to support CSR. We are trying to do this and have done a lot of training for people in different roles, but we are not there yet and the plan for how to get there is still evolving.”

Align with existing initiatives, tools, and processes
In addition to extensive training, linking and aligning CSR with other district initiatives has also been a priority. For example, it was clear early on that for teachers to engage in a meaningful way, any new instructional model needed to align with LEAP, Denver’s teacher effectiveness framework. This makes sense and has been a priority, though LEAP itself, in addition to its uses, is still evolving. As Boardman put it, “Maintaining a consistent research design when everything around you is changing is a challenge. That said, we are working hard to understand how our model aligns with LEAP and working with teachers to help them understand the connections and to ensure they feel what they’re doing is supported and will pay off for them.” Implementation of the Common Core standards has been another new effort with which the project has had to align. The team’s commitment to link CSR to existing or emerging work is consistent and laudable, though they are aware of potential trade-offs. “We are rolling with the new things as they come in,” said Boardman, “but there are pros and cons. Sometimes we become overly consumed by trying to connect with district initiatives. We have to be careful about where things fit and where they simply might not.”

Find the right communications channels and messengers
Just as important as trying to figure out where CSR fits is making sure it doesn’t become “just another add-on.” One thing the project team feels is important for sustainability is figuring out at what point information needs to be communicated and by whom. As Boardman said, “Things have to be communicated by the right players. We and our district colleagues are constantly trying to figure out where and by whom key information should be communicated in order for teachers and others to feel this is the real deal. Is it the district’s monthly principal meetings? Is the key that we need area superintendents to say this is a priority?” The team is thinking about communications and messaging at both the district and the school levels. “At the school level, there is also a great deal of integration that needs to happen, and CSR people can’t be at every meeting. So which meetings are critical to attend? Which planning sessions should we prioritize?” Keenly aware that change happens in the context of relationships, the CSR team is being as intentional about communications and messaging as they are about things like tools and trainings.