Could Proven Programs Eliminate Gaps in Elementary Reading Achievement?

What if every child in America could read at grade level or better? What if the number of students in special education for learning disabilities, or retained in grade, could be cut in half?

What if students who become behavior problems or give up on learning because of nothing more than reading difficulties could instead succeed in reading and no longer be frustrated by failure?

Today these kinds of outcomes are only pipe dreams. Despite decades of effort and billions of dollars directed toward remedial and special education, reading levels have barely increased. Gaps between middle-class and economically disadvantaged students remain wide, as do gaps between ethnic groups. We've done so much, you might think, and nothing has really worked at scale.

Yet today we have many solutions to the problems of struggling readers, solutions so effective that if widely and effectively implemented, they could substantially change not only the reading skills, but the life chances of students who are struggling in reading.


How do I know this is possible? The answer is that the evidence is there for all to see.

This week, my colleagues and I released a review of research on programs for struggling readers. The review, written by Amanda Inns, Cynthia Lake, Marta Pellegrini, and myself, uses academic language and rigorous review methods. But you don’t have to be a research expert to understand what we found out. In ten minutes, just reading this blog, you will know what needs to be done to have a powerful impact on struggling readers.

Everyone knows that there are substantial gaps in student reading performance according to social class and race. According to the National Assessment of Educational Progress, or NAEP, here are key gaps in terms of effect sizes at fourth grade:

Gap                                                        Effect Size
No free/reduced-price lunch vs. free/reduced-price lunch   0.56
White vs. African American                                 0.52
White vs. Hispanic                                         0.46
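If effect sizes are unfamiliar, the idea is simple: an effect size expresses the difference between two groups' average scores in standard deviation units:

$$ d = \frac{\bar{X}_{\text{group 1}} - \bar{X}_{\text{group 2}}}{SD} $$

So the White/African American gap of 0.52 means that the two groups' average fourth grade reading scores differ by about half a standard deviation.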

These are big differences. In order to eliminate these gaps, we’d have to provide schools serving disadvantaged and minority students with programs or services sufficient to increase their reading scores by about a half standard deviation. Is this really possible?

Can We Really Eliminate Such Big and Longstanding Gaps?

Yes, we can. And we can do it cost-effectively.

Our review examined thousands of studies of programs intended to improve the reading performance of struggling readers. We found 59 studies of 39 different programs that met very high standards of research quality. 73% of the qualifying studies used random assignment to experimental or control groups, just as the most rigorous medical studies do. We organized the programs into response to intervention (RTI) tiers:

Tier 1 means whole-class programs, not just for struggling readers.

Tier 2 means targeted services for students who are struggling to read.

Tier 3 means intensive services for students who have serious difficulties.

Our categories were as follows:

Multi-Tier (Tier 1 + tutoring for students who need it)

Tier 1:

  • Whole-class programs

Tier 2:

  • Technology programs
  • One-to-small group tutoring

Tier 3:

  • One-to-one tutoring

We are not advocating for RTI itself, because the data on RTI are unclear. But it is just common sense to use proven programs with all students, then proven remedial approaches with struggling readers, then intensive services for students for whom Tier 2 is not sufficient.
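To make that sequencing concrete, here is a minimal sketch of the common-sense logic in code. Everything in it (the function name, parameters, and return strings) is invented for illustration; real decisions would rest on local assessment data and professional judgment, not a script.

```python
from typing import Optional

def next_step(struggling: bool, tier2_was_sufficient: Optional[bool]) -> str:
    """Illustrative RTI-style sequencing, not a prescribed algorithm.

    struggling: student is having difficulty despite receiving a proven
        Tier 1 (whole-class) program, which every student gets.
    tier2_was_sufficient: None if Tier 2 has not been tried yet.
    """
    if not struggling:
        return "Continue Tier 1: proven whole-class program"
    if tier2_was_sufficient is None:
        return "Start Tier 2: proven one-to-small-group tutoring"
    if tier2_was_sufficient:
        return "Tier 2 is working: continue and monitor"
    return "Start Tier 3: intensive one-to-one tutoring"

# Example: a struggling reader for whom small-group tutoring was not enough
print(next_step(struggling=True, tier2_was_sufficient=False))
```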

Do We Have Proven Programs Able to Overcome the Gaps?

The table below shows average effect sizes for specific reading approaches. Wherever you see effect sizes that approach or exceed +0.50, you are looking at proven solutions to the gaps, or at least programs that could become a component in a schoolwide plan to ensure the success of all struggling readers.

Programs That Work for Struggling Elementary Readers

Program                                                 Grades Proven   No. of Studies   Mean Effect Size

Multi-Tier Approaches
   Success for All                                      K-5             3                +0.35
   Enhanced Core Reading Instruction                    1               1                +0.24

Tier 1 – Classroom Approaches
   Cooperative Integrated Reading & Composition (CIRC)  2-6             3                +0.11
   PALS                                                 1               1                +0.65

Tier 2 – One-to-Small Group Tutoring
   Read, Write, & Type (T 1-3)                          1               1                +0.42
   Lindamood (T 1-3)                                    1               1                +0.65
   SHIP (T 1-3)                                         K-3             1                +0.39
   Passport to Literacy (TA 1-4/7)                      4               4                +0.15
   Quick Reads (TA 1-2)                                 2-3             2                +0.22

Tier 3 – One-to-One Tutoring
   Reading Recovery (T)                                 1               3                +0.47
   Targeted Reading Intervention (T)                    K-1             2                +0.50
   Early Steps (T)                                      1               1                +0.86
   Lindamood (T)                                        K-2             1                +0.69
   Reading Rescue (T or TA)                             1               1                +0.40
   Sound Partners (TA)                                  K-1             2                +0.43
   SMART (PV)                                           K-1             1                +0.40
   SPARK (PV)                                           K-2             1                +0.51

Key:
T: Certified teacher tutors
TA: Teaching assistant tutors
PV: Paid volunteers (e.g., AmeriCorps members)
1-X: For small group tutoring, the usual group size (e.g., 1-2, 1-4)

(For more information on each program, see www.evidenceforessa.org)

The table is a road map to eliminating the achievement gaps that our schools have wrestled with for so long. It lists only programs that succeeded at a high level, relative to others at the same tier. See the full report or www.evidenceforessa.org for information on all programs.

It is important to note that there is little evidence of the effectiveness of tutoring in grades 3-5. Almost all of the evidence is from grades K-2. However, studies done in England in secondary schools have found positive effects of three reading tutoring programs in the English equivalent of U.S. grades 6-7. These findings suggest that when well-designed tutoring programs for grades 3-5 are evaluated, they will also show very positive impacts. See our review on secondary reading programs at www.bestevidence.org for information on these English middle school tutoring studies. On the same website, you can also see a review of research on elementary mathematics programs, which reports that most of the successful studies of tutoring in math took place in grades 2-5, another indicator that reading tutoring is also likely to be effective in these grades.

Some of the individual programs have shown effects large enough to overcome gaps all by themselves if they are well implemented (i.e., ES = +0.50 or more). Others have effect sizes lower than +0.50 but if combined with other programs elsewhere on the list, or if used over longer time periods, are likely to eliminate gaps. For example, one-to-one tutoring by certified teachers is very effective, but very expensive. A school might implement a Tier 1 or multi-tier approach to solve all the easy problems inexpensively, then use cost-effective one-to-small group methods for students with moderate reading problems, and only then use one-to-one tutoring with the small number of students with the greatest needs.
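To see why this staged approach matters financially, here is a back-of-the-envelope sketch. Every number in it (enrollment, costs, success rates) is a hypothetical placeholder, not a figure from the review; the point is the shape of the logic.

```python
# Hypothetical staging: solve the easy problems inexpensively first,
# reserve costly one-to-one tutoring for the students who still need it.
enrollment = 500
tier1_cost_each = 60      # hypothetical per-student cost, whole-class program
tier2_cost_each = 700     # hypothetical per-student cost, small-group tutoring
tier3_cost_each = 2800    # hypothetical per-student cost, 1-1 certified teacher

struggling_after_t1 = int(enrollment * 0.20)            # assumed 20% still struggle
struggling_after_t2 = int(struggling_after_t1 * 0.30)   # assumed 30% need more

staged_total = (enrollment * tier1_cost_each
                + struggling_after_t1 * tier2_cost_each
                + struggling_after_t2 * tier3_cost_each)
all_one_to_one = struggling_after_t1 * tier3_cost_each  # 1-1 for every struggler

print(f"Staged plan: ${staged_total:,} with {struggling_after_t2} one-to-one slots")
print(f"One-to-one for all strugglers: ${all_one_to_one:,}")
```

With these made-up numbers, the staged plan serves every student at Tier 1 and still costs far less than sending every struggling reader straight to one-to-one tutoring.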

Schools, districts, and states should consider the availability, practicality, and cost of these solutions to arrive at a workable plan. They then need to make sure that the programs are implemented well enough and long enough to obtain the outcomes seen in the research, or to improve on them.

But the inescapable conclusion from our review is that the gaps can be closed, using proven models that already exist. That’s big news, news that demands big changes.

Photo credit: Courtesy of Allison Shelley/The Verbatim Agency for American Education: Images of Teachers and Students in Action

This blog was developed with support from the Laura and John Arnold Foundation. The views expressed here do not necessarily reflect those of the Foundation.


The Fabulous 20%: Programs Proven Effective in Rigorous Research

Photo courtesy of Allison Shelley/The Verbatim Agency for American Education: Images of Teachers and Students in Action

Over the past 15 years, governments in the U.S. and U.K. have put quite a lot of money (by education standards) into rigorous research on promising programs in PK-12 instruction. Rigorous research usually means studies in which schools, teachers, or students are assigned at random to experimental or control conditions and then pre- and posttested on valid measures independent of the developers. In the U.S., the Institute of Education Sciences (IES) and Investing in Innovation (i3), now called Education Innovation Research (EIR), have led this strategy, and in the U.K., it's the Education Endowment Foundation (EEF). Enough research has now been done to enable us to begin to see important patterns in the findings.

One finding that is causing some distress is that the number of studies showing significant positive effects is modest. Across all funding programs, the proportion of studies reporting positive, significant findings averages around 20%. It is important to note that most funded projects evaluate programs that have been newly developed and not previously evaluated. The "early phase" or "development" category of i3/EIR is a good example; it provides small grants intended to fund creation or refinement of new programs, so it is not so surprising that these studies are less likely to find positive outcomes. However, even programs that have been successfully evaluated in the past often do not replicate their positive findings in the large, rigorous evaluations required at the higher levels of i3/EIR and IES, and in all full-scale EEF studies. The problem is that the earlier positive outcomes may have been found in smaller studies with hard-to-replicate levels of training or monitoring by program developers, or with measures made by developers or researchers, or with other study features that made it easier to find positive outcomes.

The modest percentage of positive findings has caused some observers to question the value of all these rigorous studies. They wonder if this is a worthwhile investment of tax dollars.

One answer to this concern is to point out that while the percentage of all studies finding positive outcomes is modest, so many have been funded that the number of proven programs is growing rapidly. On our Evidence for ESSA website (www.evidenceforessa.org), we have found 111 programs that meet ESSA's Strong, Moderate, or Promising standards in elementary and secondary reading or math. That's a lot of proven programs, especially in elementary reading, which accounts for 62 of them.

The situation is a bit like that in medicine. A very small percentage of rigorous studies of medicines or other treatments show positive effects. Yet so many are done that each year, new proven treatments for all sorts of diseases enter widespread use in medical practice. This dynamic is one explanation for the steady increases in life expectancy taking place throughout the world.

Further, high quality studies that fail to find positive outcomes also contribute to the science and practice of education. Some programs do not meet standards for statistical significance, but nevertheless they show promise overall or with particular subgroups. Programs that do not find clear positive outcomes but closely resemble other programs that do are another category worth further attention. Funders can take this into account in deciding whether to fund another study of programs that “just missed.”

On the other hand, there are programs that show profoundly zero impact, in categories that never or almost never find positive outcomes. I reported recently on benchmark assessments, with an overall effect size of -0.01 across 10 studies. This might be a good candidate for giving up, unless someone has a markedly different approach unlike those that have failed so often. Another unpromising category is textbooks. Textbooks may be necessary, but the idea that replacing one textbook with another will improve achievement has failed many, many times. This set of negative results can be helpful to schools, enabling them to focus their resources on programs that do work. And giving up on categories of studies that hardly ever work would significantly reduce the 80% failure rate, and save money better spent on evaluating more promising approaches.

The findings of many studies of replicable programs can also reveal patterns that should help current or future developers create programs that meet modern standards of evidence. There are a few patterns I’ve seen across many programs and studies:

  1. I think developers (and funders) vastly underestimate the amount and quality of professional development needed to bring about significant change in teacher behaviors and student outcomes. Strong professional development requires top-quality initial training, including simulations and/or videos to show teachers how a program works, not just tell them. Effective PD almost always includes coaching visits to classrooms to give teachers feedback and new ideas. If teachers fall back into their usual routines due to insufficient training and follow-up coaching, why would anyone expect their students’ learning to improve in comparison to the outcomes they’ve always gotten? Adequate professional development can be expensive, but this cost is highly worthwhile if it improves outcomes.
  2. In successful programs, professional development focuses on classroom practices, not solely on improving teachers’ knowledge of curriculum or curriculum-specific pedagogy. Teachers standing at the front of the class using the same forms of teaching they’ve always used but doing it with more up-to-date or better-aligned content are not likely to significantly improve student learning. In contrast, professional development focused on tutoring, cooperative learning, and classroom management has a far better track record.
  3. Programs that focus on motivation and relationships between teachers and students and among students are more likely to enhance achievement than programs that focus on cognitive growth alone. Successful teaching focuses on students’ hearts and spirits, not just their minds.
  4. You can't beat tutoring. Few approaches other than one-to-one or one-to-small group tutoring have consistent powerful impacts. There is much to learn about how to make tutoring maximally effective and cost-effective, but let's start with the most effective and cost-effective tutoring models we have now and build out from there.
  5. Many, perhaps most, failed program evaluations involve approaches with great potential (or great success) in commercial applications. This is one reason that so many evaluations fail: they assess textbooks or benchmark assessments or ordinary computer-assisted instruction approaches. These often involve little professional development or follow-up, and they may not make important changes in what teachers do. Real progress in evidence-based reform will begin when publishers and software developers come to believe that only proven programs will succeed in the marketplace. When that happens, vast non-governmental resources will be devoted to development, evaluation, and dissemination of well-implemented forms of proven programs. Medicine was once dominated by the equivalent of Dr. Good's Universal Elixir (mostly good-tasting alcohol and sugar): very cheap, widely marketed, and popular, but utterly useless. However, as government began to demand evidence for medical claims, Dr. Good gave way to Dr. Proven.

Because of long-established policies and practices that have transformed medicine, agriculture, technology, and other fields, we know exactly what has to be done. IES, i3/EIR, and EEF are doing it, and showing great progress. This is not the time to get cold feet over the 80% failure rate. Instead, it is time to celebrate the fabulous 20% – programs that have succeeded in rigorous evaluations. Then we need to increase investments in evaluations of the most promising approaches.

This blog was developed with support from the Laura and John Arnold Foundation. The views expressed here do not necessarily reflect those of the Foundation.

Benchmark Assessments: Weighing the Pig More Often?

There is an old saying about educational assessment: “If you want to fatten a pig, it doesn’t help to weigh it more often.”

To be fair, it may actually help to weigh pigs more often, so the farmer knows whether they are gaining weight at the expected rate. Then the farmer can do something about it in time if they are not.

It is surely correct that weighing pigs does no good in itself, but it may serve a diagnostic purpose. What matters is not the weighing, but rather what the farmer or veterinarian does based on the information provided by the weighing.


This blog is not, however, about porcine policy, but educational policy. In schools, districts, and even whole states, most American children take “benchmark assessments” roughly three to six times a year. These assessments are intended to tell teachers, principals, and other school leaders how students are doing, especially in reading and math. Ideally, benchmark assessments are closely aligned with state accountability tests, making it possible for school leaders to predict how whole grade levels are likely to do on the state tests early enough in the year to enable them to provide additional assistance in areas of need. The information might be as detailed as “fourth graders need help in fractions” or “English learners need help in vocabulary.”

Benchmark assessments are only useful if they improve scores on state accountability tests. Other types of intervention may be beneficial even if they do not make any difference in state test scores, but it is hard to see why benchmark assessments would be valuable if they do not in fact have any impact on state tests, or other standardized tests.

So here is the bad news: Research finds that benchmark assessments do not make any difference in achievement.

High-quality, large-scale randomized evaluations of benchmark assessments are relatively easy to do, and many have in fact been done. Use of benchmark assessments has been evaluated in elementary reading and math (see www.bestevidence.org). Here is a summary of the findings.

                       No. of Studies   Mean Effect Size
Elementary Reading      6               -0.02
Elementary Math         4                0.00
Study-weighted mean    10               -0.01
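(The study-weighted mean is just each effect size weighted by its number of studies; here is a quick check in Python, if you want to verify the table:)

```python
# Verify the study-weighted mean effect size from the table above.
rows = [("Elementary Reading", 6, -0.02), ("Elementary Math", 4, 0.00)]
total_studies = sum(n for _, n, _ in rows)
weighted_mean = sum(n * es for _, n, es in rows) / total_studies
print(total_studies, round(weighted_mean, 2))  # 10 -0.01
```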

In a rational world, these findings would put an end to benchmark assessments, at least as they are used now. The average outcomes are not just small, they are zero, and the assessments use up a lot of student time and district money.

In our accountability-obsessed educational culture, how could use of benchmark assessments make no difference at all on the only measure they are intended to improve? I would suggest several possibilities.

First, perhaps the most likely, is that teachers and schools do not do much with the information from benchmark assessments. If you are trying to lose weight, you likely weigh yourself every day. But if you then make no systematic effort to change your diet or increase your exercise, then all those weighings are of little value. In education, the situation is much worse than in weight reduction, because teachers are each responsible for 20-30 students. Results of benchmark assessments are different for each student, so a school staff that learns that its fourth graders need improvement in fractions finds it difficult to act on this information. Some fourth graders in every school are excelling in fractions, some just need a little help, and some are struggling in fractions because they missed the prerequisite skills. “Teach more fractions” is not a likely solution except for some of that middle group, yet differentiating instruction for all students is difficult to do well.

Another problem is that it takes time to score and return benchmark assessments, so by the time a team of teachers decides how to respond to benchmark information, the situation has moved on.

Third, benchmark assessments may add little because teachers and principals already know a lot more about their students than any test can tell them. Imagine a principal receiving the information that her English learners need help in vocabulary. I’m going to guess that she already knows that. But more than that, she and her teachers know which English learners need what kind of vocabulary, and they have other measures and means of finding out. Teachers already give a lot of brief, targeted curriculum-linked assessments, and they always have. Further, wise teachers stroll around and listen in on students working in cooperative groups, or look at their tests or seatwork or progress on computer curriculum, to get a sophisticated understanding of why some students are having trouble, and ideas for what to do about it. For example, it is possible that English learners are lacking school-specific vocabulary, such as that related to science or social studies, and this observation may suggest solutions (e.g., teach more science and social studies). But what if some English learners are afraid or unwilling to express themselves in class, but sit quietly and never volunteer answers? A completely different set of solutions might be appropriate in this case, such as using cooperative learning or tutoring strategies to give students safe spaces in which to use the vocabulary they have, and gain motivation and opportunities to learn and use more.

Benchmark assessments fall into the enormous category of educational solutions that are simple, compelling, and wrong. Yes, teachers need to know what students are learning and what is needed to improve it, but they have available many more tools that are far more sensitive, useful, timely, and tied to actions teachers can take.

Eliminating benchmark assessments would save schools a lot of money. Perhaps that money could be redirected to professional development to help teachers use approaches actually proven to work. I know, that’s crazy talk. But perhaps if we looked at what students are actually doing and learning in class, we could stop weighing pigs and start improving teaching for all children.

This blog was developed with support from the Laura and John Arnold Foundation. The views expressed here do not necessarily reflect those of the Foundation.

Mislabeled as Disabled

Kenny is a 10th grader in the Baltimore City Public Schools. He is an African American from a disadvantaged neighborhood, attending a high school that requires high grades and test scores. He has good attendance, and has never had any behavior problems. A good kid, by all accounts but one.

Kenny reads at the kindergarten level.

Kenny has spent most of his time in school in special education. He received extensive and expensive services, following an Individual Education Program (IEP) made and updated over time just for him, tailored to his needs.

Yet despite all of this, he is still reading at the kindergarten level in 10th grade.

Kenny’s story starts off a remarkable book, Mislabeled as Disabled, by my friend Kalman (Buzzy) Hettleman. A lawyer by training, Hettleman has spent many years volunteering in Baltimore City schools to help children being considered for special education obtain the targeted assistance they need to either avoid special education or succeed in it. What he has seen, and describes in detail in his book, is nothing short of heartbreaking. In fact, it makes you furious. Here is a system designed to improve the lives of vulnerable children, spending vast amounts of money to enable talented and hard-working teachers to work with children. Yet the outcomes are appalling. It’s not just Kenny. Thousands of students in Baltimore, and in every other city and state, are failing. These are mostly children with specific learning disabilities or other mild, “high-incidence” categories. Or they are struggling readers not in special education who are not doing much better. Many of the students who are categorized as having mild disabilities are not disabled, and would have done at least as well with appropriate services in the regular classroom. Instead, what they get is an IEP. Such children are “mislabeled as disabled,” and obtain little benefit from the experience.

Buzzy has worked at many levels of this system. He was on the Baltimore school board for many years. He taught social work at the University of Maryland. He has been an activist, fighting relentlessly for the rights of struggling students (and at 84 years of age still is). Most recently, he has served on the Kirwan Commission, appointed to advise the state legislature on reform policies and new funding formulas for the state's schools. Buzzy has seen it all, from every angle. His book is deeply perceptive and informed, and makes many recommendations for policy and practice. But his message is infuriating. What he documents is a misguided system that is obsessed with rules and policies but pays little attention to what actually works for struggling learners.

What most struggling readers need is proven, well-implemented programs in a Response to Intervention (RTI) framework. Mostly, this boils down to tutoring. Most struggling students can benefit enormously from one-to-small group tutoring by well-qualified teaching assistants (paraprofessionals), so tutoring need not be terribly expensive. Others may need certified teachers, or one-to-one tutoring. Some struggling readers can succeed with well-implemented, proven strategies in the regular classroom (Tier 1). Those who do not succeed in Tier 1 should receive proven one-to-small group tutoring approaches (Tier 2). If that is not sufficient, a small number of students may need one-to-one tutoring, although research tells us that one-to-small group is almost as effective as one-to-one, and is a lot less expensive.

Tutoring is the missing dynamic in the special education system for struggling readers, whether or not they have IEPs. Yes, some districts do provide tutoring to struggling readers, and if the tutoring model they implement is proven in rigorous research, it is generally effective. The problem is that few schools or districts provide enough tutoring to enough struggling readers to move the needle.

Buzzy described a policy he devised with Baltimore's then-superintendent, Andres Alonso. They called it "one year plus." It was designed to ensure that all students with high-incidence disabilities, such as those with specific learning disabilities, would receive instruction sufficient to enable them to make one year's progress or more every 12 months. If students could do this, they would, over time, close the gap between their reading level and their grade level. This was a radical idea, and its implementation fell far short. But the concept is exactly right. Students with mild disabilities, who are the majority of those with IEPs, can surely make such gains. In recent years, research has identified a variety of tutoring approaches that can ensure one year or more of progress in a year for most students with IEPs, at a cost a state like Maryland could surely afford.
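The arithmetic behind "one year plus" is worth spelling out. A student who is G grade levels behind, and who gains g > 1 years of reading growth per calendar year, closes the gap in about

$$ t = \frac{G}{g - 1} \text{ years} $$

so a student two years behind who gains 1.5 years per year catches up in roughly four years. Gaining exactly one year per year only keeps the gap from widening; the "plus" is what closes it.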

Mislabeled as Disabled is written about Buzzy's personal experience in Baltimore. However, what he describes is happening in districts and states throughout the U.S., rich as well as poor. This dismal cycle can stop anywhere we choose to stop it. Buzzy Hettleman describes in plain, powerful language how this could happen, and most importantly, why it must.

Reference

Hettleman, K. R. (2019). Mislabeled as disabled. New York: Radius.

 This blog was developed with support from the Laura and John Arnold Foundation. The views expressed here do not necessarily reflect those of the Foundation.

Do Different Textbooks Have Different Effects on Student Achievement?

The British comedy group Monty Python used to refer to “privileged glimpses into the perfectly obvious.”

And just last week, there they were. In a front-page article, the March 13 edition of Education Week reported that a six-state study of the achievement outcomes of different textbooks found... wait for it... near-zero relative effects on achievement measures (Sawchuk, 2019).

Really!

The study was led by Harvard's Thomas Kane, a major proponent of the Common Core, who was particularly upset to find that textbooks produced before and after the Common Core began to influence textbook content had few if any differential effects on achievement.

I doubt that I am the only person who is profoundly unsurprised by these findings. For the past 12 years, I've been reviewing rigorous research on programs' effects on achievement. Textbooks (or curricula) are usually one of the categories in my reviews. You can see the reviews at www.bestevidence.org. Here is a summary of the average effect sizes for textbooks or curricula:

Review                                      No. of Studies   Mean Effect Size
Elementary Reading (Inns et al., 2019)       9               +0.03
Elementary Math (Pellegrini et al., 2018)   16               +0.06
Secondary Math (Slavin et al., 2009)        40               +0.03
Secondary Science (Cheung et al., 2016)      8               +0.10
Weighted Average                            73               +0.04
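The bottom row is just the study-weighted mean of the four reviews, which you can verify from the table:

$$ \bar{d} = \frac{9(0.03) + 16(0.06) + 40(0.03) + 8(0.10)}{9 + 16 + 40 + 8} = \frac{3.23}{73} \approx +0.04 $$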

None of these outcomes suggest that textbooks make much difference, and the study-weighted average of +0.04 is downright depressing.


Beyond the data, it is easy to see why evaluations of the achievement outcomes of textbooks rarely find significant positive outcomes. Such studies compare one textbook to another textbook that is usually rather similar. The reason is that textbook publishers respond to the demands of the market, not to evidence of effectiveness. New and existing textbooks were shaped by similar market forces. When standards change, as in the case of the Common Core State Standards in recent years, all textbook companies generally are forced to make changes in the same direction. There may be a brief window of time when new textbooks designed to meet new standards have a temporary advantage, but large publishers are extremely sensitive to such changes, and if they are not up to date in terms of standards today, they soon will be. Still, as the Kane et al. study found, changes in standards do not in themselves improve achievement on a substantial scale. Changes in standards do change market demand, which changes the content of textbooks, but fundamentally, the changes are not enough to make a measurable difference in learning.

Kane was quoted by Education Week as drawing the lesson from the study that perhaps it isn’t the textbooks that matter, but rather how the textbooks are used:

“What levels of coaching or more-intensive professional development are required to help teachers use rigorous materials at higher levels of fidelity, and does that produce larger benefits?” (Sawchuk, 2019, p. 17).

This sounds logical, but recent research in elementary mathematics calls this approach into question. Pellegrini et al. (2018) examined a category of programs that provide teachers with extensive professional development focused on math content and pedagogy. The average effect size across 12 studies was only +0.04, or essentially zero. In contrast, what did work very well were one-to-one and one-to-small group tutoring (mean effect size = +0.29) and professional development focused on classroom management and motivation (mean effect size = +0.25). In other words, programs focusing on helping teachers use standards-based materials added little if anything to the learning impact of textbooks. What mattered, beyond tutoring, were approaches that change classroom routines and relationships, such as cooperative learning or classroom management methods.

Changing textbooks matters little, and adding extensive professional development focused on standards adds even less. Instead, strategies that engage, excite, and accommodate individual needs of students are what we find to matter a great deal, across many subjects and grade levels.

This should be a privileged glimpse into the perfectly obvious. Everyone knows that textbooks make little difference. Walk through classrooms in any school, teaching any subject at any grade level. Some classes are exciting, noisy, fully engaged places in which students are eager to learn. Others are, well, teaching the textbook. In which type of class did you learn best? In which type do you hope your own children will spend their time in school, or wish they had?

What is obvious from the experience of every teacher and everyone who has ever been a student is that changing textbooks and focusing on standards do not in themselves lead to classrooms that kindle the love of learning. Imagine that you, as an accomplished adult educator, took a class in tennis, or Italian, or underwater basket weaving. Would a teacher using better textbooks and more advanced standards make you love this activity and learn from it? Or would a teacher who expresses enthusiasm for the subject and for the students, who uses methods that engage students in active social activities in every lesson, obtain better outcomes of every kind? I hope this question answers itself.

I once saw a science teacher in Baltimore teaching anatomy by having students take apart steamed crabs (a major delicacy in Baltimore). The kids were working in groups, laughing at this absurd idea, but they were learning like crazy, and learning to love science. I would submit that this experience, these connections among students, this laughter are the standards our schools need to attain. It’s not about textbooks, nor professional development on textbooks.

Another Baltimore teacher I knew taught a terrific unit on ancient Egypt. The students made their own sarcophagi, taking into the afterlife the things most important to them. Then the class went on a field trip to a local museum with a mummy exhibit, and finally, students made sarcophagi representing what Egyptians would value in the afterlife.  That’s what effective teaching is about.

The great 18th-century Swedish botanist Carl Linnaeus took his students on walks into forests, fields, and lakes around Uppsala University. Whatever they found, they brought back held high, singing and playing conch shell trumpets in triumph. That's what effective teaching is about.

In England, I saw a teacher teaching graph coordinates. She gave each student’s desk a coordinate, from 1, 1 to 5, 5, and put up signs labeled North, South, East, and West on the walls. She then made herself into a robot, and the students gave her directions to get from one coordinate to another. The students were laughing, but learning. That’s what effective teaching is about.

No textbook can compete with these examples of inspired teaching. Try to remember your favorite textbook, or your least favorite. I can’t think of a single one. They were all the same. I love to read and love to learn, and I’m sure anyone reading this blog is the same. But textbooks? Did a textbook ever inspire you to want to learn more or give you enthusiasm for any subject?

This is a privileged glimpse into the perfectly obvious to which we should devote our efforts in innovation and professional development. A textbook or standard never ignited a student’s passion or curiosity. Textbooks and standards may be necessary, but they will not transform our schools. Let’s use what we already know about how learning really happens, and then make certain that every teacher knows how to do the things that make learning engage students’ hearts and emotions, not just their minds.

References

Cheung, A., Slavin, R. E., Kim, E., & Lake, C. (2016). Effective secondary science programs: A best-evidence synthesis. Journal of Research in Science Teaching, 54(1), 58-81. doi: 10.1002/tea.21338

Inns, A., Lake, C., Byun, S., Shi, C., & Slavin, R. E. (2019). Effective Tier 1 reading instruction for elementary schools: A systematic review. Paper presented at the annual meeting of the Society for Research on Educational Effectiveness, Washington, DC.

Pellegrini, M., Inns, A., Lake, C., & Slavin, R. E. (2018). Effective programs in elementary mathematics: A best-evidence synthesis. Manuscript submitted for publication.

Sawchuk, S. (2019, March 13). New texts failed to lift test scores in six-state study. Education Week, 38(25), 1, 17.

Slavin, R. E., Lake, C., & Groff, C. (2009). Effective programs in middle and high school mathematics: A best-evidence synthesis. Review of Educational Research, 79(2), 839-911.

This blog was developed with support from the Laura and John Arnold Foundation. The views expressed here do not necessarily reflect those of the Foundation.

What Works in Teaching Writing?

“I’ve learned that people will forget what you said, people will forget what you did, but people will never forget how you made them feel. The idea is to write it so that people hear it and it slides through the brain and goes straight to the heart.”   -Maya Angelou

It’s not hard to make an argument that creative writing is the noblest of all school subjects. To test this, try replacing the word “write” in this beautiful quotation from Maya Angelou with “read” or “compute.” Students must be proficient in reading and mathematics and other subjects, of course, but in what other subject must learners study how to reach the emotions of their readers?


Good writing is the mark of an educated person. Perhaps especially in the age of electronic communications, we know most of the people we know largely through their writing. Job applications depend on the ability of the applicant to make themselves interesting to someone they’ve never seen. Every subject–science, history, reading, and many more–requires its own exacting types of writing.

Given the obvious importance of writing in people’s lives, you’d naturally expect that writing would occupy a central place in instruction. But you’d be wrong. Before secondary school, writing plays third fiddle to the other two of the 3Rs, reading and ‘rithmetic, and in secondary school, writing is just one among many components of English. College professors, employers, and ordinary people complain incessantly about the poor writing skills of today’s youth. The fact is that writing is not attended to as much as it should be, and the results are apparent to all.

Not surprisingly, the inadequate focus on writing in U.S. schools extends to an inadequate focus on research on this topic as well. My colleagues and I recently carried out a review of research on secondary reading programs. We found 69 studies that met rigorous inclusion criteria (Baye, Lake, Inns, & Slavin, in press). Recently, our group completed a review of secondary writing using similar inclusion standards, under funding from the Education Endowment Foundation in England (Slavin, Lake, Inns, Baye, Dachet, & Haslam, 2019). Yet we found only 14 qualifying studies, of which 11 were in secondary schools (we searched down to third grade).

To be fair, our inclusion standards were pretty tough. We required that studies compare experimental groups to randomized or matched control groups on measures independent of the experimental treatment. Tests could not have been made up by teachers or researchers, and they could not be scored by the teachers who taught the classes. Experimental and control groups had to be well-matched at pretest and have nearly equal attrition (loss of subjects over time). Studies had to have a duration of at least 12 weeks. Studies could include students with IEPs, but they could not be in self-contained, special education settings.
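For readers who like to see criteria operationalized, here is a small sketch of the screening logic. The Study record and its field names are invented for illustration; the actual review applied these standards through careful reading of each report, not a script.

```python
from dataclasses import dataclass

@dataclass
class Study:
    has_control_group: bool       # randomized or matched control group
    independent_measures: bool    # tests not made or scored by the teachers
    well_matched_pretest: bool    # experimental/control matched at pretest
    similar_attrition: bool       # nearly equal loss of subjects over time
    duration_weeks: int
    self_contained_sped: bool     # self-contained special ed settings excluded

def qualifies(s: Study) -> bool:
    """Apply the review's inclusion standards, as described above."""
    return (s.has_control_group
            and s.independent_measures
            and s.well_matched_pretest
            and s.similar_attrition
            and s.duration_weeks >= 12
            and not s.self_contained_sped)
```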

We divided the studies into three categories. One was studies of writing process models, in which students worked together to plan, draft, revise, and edit compositions in many genres. A very similar category was cooperative learning models, most of which also used a plan-draft-revise-edit cycle, but placed a strong emphasis on use of cooperative learning teams. A third category was programs that balanced writing with reading instruction.

Remarkably, the average effect sizes of each of the three categories were virtually identical, with a mean effect size of +0.18. There was significant variation within categories, however. In the writing process category, the interesting story concerned a widely used U.S. program, Self-Regulated Strategy Development (SRSD), evaluated in two qualifying studies in England. In one, the program was implemented in rural West Yorkshire and had huge impacts on struggling writers, the students for whom SRSD was designed. The effect size was +0.74. However, in a much larger study in urban Leeds and Lancashire, outcomes were not so positive (ES = +0.01), although effects were largest for struggling writers. There were many studies of SRSD in the U.S., but none of them qualified, due to lack of control groups, brief durations, researcher-made measures, or settings in self-contained special education classrooms.

Three programs that emphasize cooperative learning had notably positive impacts. These were Writing Wings (ES = +0.13), Student Team Writing (ES = +0.38), and Expert 21 (ES = +0.58).

Among programs emphasizing reading and writing, two had a strong focus on English learners: Pathway (ES = +0.32) and ALIAS (ES = +0.18). Another two approaches had an explicit focus on preparing students for freshman English: College Ready Writers Program (ES = +0.18) and Expository Reading and Writing Course (ES = +0.13).

Looking across all categories, there were several factors common to successful programs that stood out:

  • Cooperative Learning. Cooperative learning usually aids learning in all subjects, but it makes particular sense in writing, as a writing team gives students opportunities to give and receive feedback on their compositions, facilitating their efforts to gain insight into how their peers think about writing, and giving them a sympathetic and ready audience for their writing.
  • Writing Process. Teaching students step-by-step procedures to work with others to plan, draft, revise, and edit compositions in various genres appears to be very beneficial. The first steps focus on helping students get their ideas down on paper without worrying about mechanics, while the later stages help students progressively improve the structure, organization, grammar, and punctuation of their compositions. These steps help students reluctant to write at all to take risks at the outset, confident that they will have help from peers and teachers to progressively improve their writing.
  • Motivation and Joy in Self-Expression. In the above quote, Maya Angelou talks about the importance in writing of “sliding through the brain to get to the heart.” But to the writer, this process must work the other way, too. Good writing starts in the heart, with an urge to say something of importance. The brain shapes writing to make it readable, but writing must start with a message that the writer cares about. This principle is demonstrated most obviously in writing process and cooperative learning models, where every effort is made to motivate students to find exciting and interesting topics to share with their peers. In programs balancing reading and writing, reading is used to help students have something important to write.
  • Extensive Professional Development. Learning to teach writing well is not easy. Teachers need opportunities to learn new strategies and to apply them in their own writing. All of the successful writing programs we identified in our review provided extensive, motivating, and cooperative professional development, often designed as much to help teachers catch the spirit of writing as to follow a set of procedures.

Our review of writing research found that there is considerable consensus in how to teach writing. There were more commonalities than differences across the categories. Effects were generally positive, however, because control teachers were not using these consensus strategies, or were not doing so with the skills imparted by the professional development characteristic of all of the successful approaches.

We cannot expect writing instruction to routinely produce Maya Angelous or Mark Twains. Great writers add genius to technique. However, we can create legions of good writers, and our students will surely benefit.

References

Baye, A., Lake, C., Inns, A., & Slavin, R. (in press). Effective reading programs for secondary students. Reading Research Quarterly.

Slavin, R. E., Lake, C., Inns, A., Baye, A., Dachet, D., & Haslam, J. (2019). A quantitative synthesis of research on writing approaches in Key Stage 2 and secondary schools. London: Education Endowment Foundation.

Photo credit: Kyle Tsui from Washington, DC, USA [CC BY 2.0 (https://creativecommons.org/licenses/by/2.0)]

This blog was developed with support from the Laura and John Arnold Foundation. The views expressed here do not necessarily reflect those of the Foundation.

Moneyball for Education

When I was a kid, growing up in the Maryland suburbs of Washington, DC, everyone I knew rooted for the hapless Washington Senators, one of the worst baseball teams ever. At that time, however, the Baltimore Orioles were one of the best teams in baseball, and every once in a while a classmate would snap. He (always “he”) would decide to become an Orioles fan. This would cause him to be shamed and ostracized for the rest of his life by all true Senators fans.

I’ve now lived in Baltimore for most of my life. I wonder if I came here in part because of my youthful impression of Baltimore as a winning franchise?


Skipping forward in time to now, I recently saw in the New York Times an article about the collapse of the Baltimore Orioles. In 2018, they had the worst record of any team in history. Worse than even the Washington Senators ever were. Why did this happen? According to the NYT, the Orioles are one of the last teams to embrace analytics, which means using evidence to decide which players to recruit or drop, to put on the field or on the bench. Some teams have analytics departments of 15. The Orioles? Zero, although they have just started one.

It's not as though the benefits of analytics are a secret. A 2003 book by Michael Lewis, Moneyball, explained how the underfunded Oakland A's used analytics to turn themselves around. A hugely popular 2011 movie told the same story.

In case anyone missed the obvious linkage of analytics in baseball to analytics in education, Results for America (RfA), a group that promotes the use of evidence in government social programs, issued a 2015 book called, you guessed it, Moneyball for Government (Nussle & Orszag, 2015). This Moneyball focused on success stories and ideas from key thinkers and practitioners in government and education. RfA was instrumental in encouraging the U.S. Congress to include in ESSA definitions of strong, moderate, and promising evidence of effectiveness, and to specify a few areas of federal funding that require or incentivize use of proven programs.

The ESSA evidence standards are a giant leap forward in supporting the use of evidence in education. Yet, like the Baltimore Orioles, the once-admired U.S. education system has been less than swept away by the idea that using proven programs and practices could improve outcomes for children. Yes, the situation is better than it was, but things are going very slowly. I’m worried that because of this, the whole evidence movement in education will someday be dismissed: “Evidence? Yeah, we tried that. Didn’t work.”

There are still good reasons for hope. The amount of high-quality evidence continues to grow at an unprecedented pace. The ESSA evidence standards have at least encouraged federal, state, and local leaders to pay some attention to evidence, though moving to action based on this evidence is a big lift.

Perhaps I'm just impatient. It took the Baltimore Orioles a book, a movie, and 16 years to arrive at the conclusion that maybe, just maybe, it was time to use evidence, as winning teams have been doing for a long time. Education is much bigger, and its survival does not depend on its success (as a baseball team's does). Education will require visionary leadership to embrace the use of evidence. But I am confident that when it does, we will be overwhelmed by visits from educators from Finland, Singapore, China, and other countries that currently clobber us in international comparisons. They'll want to know how the U.S. education system became the best in the world. Perhaps we'll have to write a book and a movie to explain it all. I'd suggest we call it... "Learnball."

References

Nussle, J., & Orszag, P. (2015). Moneyball for Government (2nd Ed.). Washington, DC: Disruption Books.

Photo credit: Keith Allison [CC BY-SA 2.0 (https://creativecommons.org/licenses/by-sa/2.0)]

This blog was developed with support from the Laura and John Arnold Foundation. The views expressed here do not necessarily reflect those of the Foundation.