Proven Tutoring Approaches: The Path to Universal Proficiency

There are lots of problems in education that are fundamentally difficult. Ensuring success in early reading, however, is an exception. We know what skills children need in order to succeed in reading. No area of teaching has a better basis in high-quality research. Yet the reading performance of America’s children is not improving at an adequate pace. Reading scores have hardly changed in the past decade, and gaps between white, African-American, and Hispanic students have been resistant to change.
In light of the rapid growth in the evidence base, and of the policy focus on early reading at the federal and state levels, this is shameful. We already know a great deal about how to improve early reading, and we know how to learn more. Yet our knowledge is not translating into improved practice and improved outcomes on a large enough scale.
There are lots of complex problems in education, and complex solutions. But here’s a really simple solution:


Over the past 30 years researchers have experimented with all sorts of approaches to improve students’ reading achievement. There are many proven and promising classroom approaches, and such programs should be used with all students in initial teaching as broadly as possible. Effective classroom instruction, universal access to eyeglasses, and other proven approaches could surely reduce the number of students who need tutors. But at the end of the day, every child must read well. And the only tool we have that can reliably make a substantial difference at scale with struggling readers is tutors, using proven one-to-one or small-group methods.

I realized again why tutors are so important in a proposal I’m making to the State of Maryland, which wants to bring all or nearly all students to “proficient” on its state test, the PARCC. “Proficient” on the PARCC is a score of 750, with a standard deviation of about 50. The state mean is currently around 740. I made a colorful chart (below) showing “bands” of scores below 750 to show how far students have to go to get to 750.


Each band covers an effect size of 0.20. There are several classroom reading programs with effect sizes this large, so if schools adopted them, they could move children scoring at 740 to 750. These programs can be found at But implementing these programs alone still leaves half of the state’s children not reaching “proficient.”

What about students at 720? They need 30 points, or +0.60. The best one-to-one tutoring can achieve outcomes like this, but these are the only solutions that can.
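The arithmetic behind these bands can be sketched in a few lines. This is only an illustration, assuming the cut score of 750 and the standard deviation of about 50 cited above; the function name and the extra starting scores of 730 and 710 are hypothetical, added to fill in the neighboring bands.

```python
# Values taken from the text: PARCC "proficient" cut score and approximate SD.
PROFICIENT = 750
SD = 50


def required_effect_size(score, cut=PROFICIENT, sd=SD):
    """Gain, in standard deviation units, needed to reach the cut score."""
    # Students already at or above the cut score need no gain.
    return max(cut - score, 0) / sd


# 740 and 720 are discussed in the text; 730 and 710 are illustrative only.
for score in (740, 730, 720, 710):
    print(f"{score}: needs +{required_effect_size(score):.2f}")
```

Each 10-point step below 750 adds 0.20 to the effect size a program would have to deliver, which is why a student at 720 needs roughly three times the impact that a student at 740 does.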

Here are mean effect sizes for various reading tutoring programs with strong evidence:



As this chart shows, one-to-one tutoring, by well-trained teachers or paraprofessionals using proven programs, can potentially have the impacts needed to bring most students scoring 720 (needing 30 points or an effect size of +0.60) to proficiency (750). Three programs have reported effect sizes of at least +0.60, and several others have approached this level. But what about students scoring below 720?

So far I’ve been sticking to established facts, studies of tutoring that are, in most cases, already being disseminated. Now I’m entering the region of well-justified supposition. Almost all studies of tutoring last one year or less. But what if the lowest achievers could receive multiple years of tutoring, if necessary?

One study, over 2½ years, did find an effect size of +0.68 for one-to-one tutoring. Could we do better than that? Most likely. In addition to providing multiple years of tutoring, it should be possible to design programs to achieve one-year effect sizes of +1.00 or more. These may incorporate technology or personalized approaches specific to the needs of individual children. Using the best programs for multiple years, if necessary, could increase outcomes further. Also, as noted earlier, using proven programs other than tutoring for all students may increase outcomes for students who also receive tutoring.

But isn’t tutoring expensive? Yes it is. But it is not as expensive as the costs of reading failure: Remediation, special education, disappointment, and delinquency. If we could greatly improve the reading performance of low achievers, this would of course reduce inequities across the board. Reducing inequities in educational outcomes could reduce inequities in our entire society, an outcome of enormous importance.

Even providing a substantial amount of teacher tutoring could, by my calculations, increase total state education expenditures (in Maryland) by only about 12%. These costs could be reduced greatly or even eliminated by reducing expenditures on ineffective programs, reducing special education placements, and other savings. Having some tutoring done by part-time teachers may reduce costs. Using small-group tutoring (fewer than 6 students at a time) for students with milder problems may save a great deal of money. Even at full cost, the necessary funding could be phased in over a period of 6 years at 2% a year.

The bottom line is that low achievement, and the wide gaps associated with economic and racial differences, could be improved a great deal using methods already proven to be effective and already widely available. Educators and policy makers are always promising policies that bring every child to proficiency: “No Child Left Behind” and “Every Student Succeeds” come to mind. Yet if these outcomes are truly possible, why shouldn’t we be pursuing them, with every resource at our disposal?


Time Passes. Will You?

When I was in high school, one of my teachers posted a sign on her classroom wall under the clock:

Time passes. Will you?

Students spend a lot of time watching clocks, yearning for the period to be over. Yet educators and researchers often seem to believe that more time is of course beneficial to kids’ learning. Isn’t that obvious?

In a major review of secondary reading programs I am completing with my colleagues Ariane Baye, Cynthia Lake, and Amanda Inns, it turns out that the kids were right. More time, at least in remedial reading, may not be beneficial at all.

Our review identified 60 studies of extraordinary quality, mostly large-scale randomized experiments, evaluating reading programs for students in grades 6 to 12. In most of the studies, students reading 2 to 5 grade levels below expectations were randomly assigned to receive an extra class period of reading instruction every day all year, in some cases for two or three years. Students randomly assigned to the control group continued in classes such as art, music, or study hall. The strategies used in the remedial classes varied widely, including technology approaches, teaching focused on metacognitive skills (e.g., summarization, clarification, graphic organizers), teaching focused on phonics skills that should have been learned in elementary school, and other remedial approaches, all of which provided substantial additional time for reading instruction. It is also important to note that the extra-time classes were generally smaller than ordinary classes, in the range of 12 to 20 students.

In contrast, other studies provided whole class or whole school methods, many of which also focused on metacognitive skills, but none of which provided additional time.

Analyzing across all studies, setting aside five British tutoring studies, there was no effect of additional time in remedial reading. The effect size for the 22 extra-time studies was +0.08, while for 34 whole class/whole school studies, it was slightly higher, ES = +0.10. That’s an awful lot of additional teaching time for no additional learning benefit.

So what did work? Not surprisingly, one-to-one and small-group tutoring (up to one to four) were very effective. These are remedial and do usually provide additional teaching time, but in a much more intensive and personalized way.

Other approaches that showed particular promise simply made better use of existing class time. A program called The Reading Edge involves students in small mixed-ability teams where they are responsible for the reading success of all team members. A technology approach called Achieve3000 showed substantial gains for low-achieving students. A whole-school model called BARR focuses on social-emotional learning, building relationships between teachers and students, and carefully monitoring students’ progress in reading and math. Another model called ERWC prepares 12th graders to succeed on the tests used to determine whether students have to take remedial English at California State Universities.

What characterized these successful approaches? None were presented as remedial. All were exciting and personalized, and not at all like traditional instruction. All gave students social supports from peers and teachers, and reasons to hope that this time, they were going to be successful.

There is no magic to these approaches, and not every study of them found positive outcomes. But there was clearly no advantage of remedial approaches providing extra time.

In fact, according to the data, students would have done just as well to stay in art or music. And if you’d asked the kids, they’d probably agree.

Time is important, but motivation, caring, and personalization are what count most in secondary reading, and surely in other subjects as well.

Time passes. Kids will pass, too, if we make such good use of our time with them that they won’t even notice the minutes going by.

Those Flying Finns: Is it Saunas or Reading That Make the Difference?

I recently attended a conference in Stockholm, at which there were several Finns and a lot of discussion about the “Finnish Miracle,” in which Finland was found to score at the top on PISA (Programme for International Student Assessment). PISA periodically tests representative samples of fifteen-year-olds in math, science, and reading.

The Finnish Miracle became apparent in 2001, and has been talked to death since then. Much of what I heard at the conference was familiar. Finland is a small, homogeneous country in which teaching is an honored profession. I heard that Helsinki, the capital, hires 80 teachers a year, and gets thousands of applicants. Maybe these factors are all we need to know.

However, I heard something else that I knew but had forgotten.

Ten years before the Finnish PISA Miracle, there was an international test of reading, called the IEA Reading Literacy Study, which tested the reading skills of students ages 9 to 10 in 30 countries. The U.S. scored second on this test, behind (you guessed it) Finland. I looked it up, and discovered that the difference was huge. Finnish children scored 31% of a standard deviation ahead of the U.S.

A (Swedish) speaker at my conference, Jan-Eric Gustafsson, brought up the earlier IEA Reading Literacy study. He explained that throughout the 1980s, Finland had a relentless policy of ensuring that every child learned to read in the early grades. If they needed it, struggling students were given one-to-one tutoring focused on phonics as long as necessary to ensure success in this crucial subject. In light of their focus on early reading success, the outcomes on the IEA reading tests are more comprehensible.

Now fast forward to the PISA tests reported in 2001. The fifteen-year-olds who took the test were, of course, subject to the Finnish reading policy throughout their elementary years. Not only reading, but also math and science, are surely influenced by success in elementary reading.

It’s possible that Finland’s success in reading in the 1990s was equally a product of outstanding and honored teachers, a homogeneous society, and other factors (though these were also true in other Nordic countries that did not score nearly as well). Perhaps Finns eat a lot more smoked fish or spend a lot more time in saunas than other people, and these explain academic success. But it must be at least a partial explanation of Finland’s reading success that they focused substantial resources over a long time period on reading for all. In turn, their students’ success on PISA must be at least partially a result of their earlier success in reading.

One reason this all matters to the U.S. and other non-Finnish countries is that while we cannot all become Finns, we can ensure that virtually every child learns to read confidently and capably by third grade.

Many states have “Reading by Third Grade” laws that threaten to hold back third graders if they are not reading at grade level, and usually provide a last-chance summer school course to avert retention. Neither of these strategies (retention and last-chance summer school) has evidence of effectiveness. In contrast, I noted in a recent blog that there were 24 elementary programs for struggling readers that have strong, moderate, or promising evidence of effectiveness according to ESSA evidence standards.

The 24 programs were proven in our own country, and most have been widely and successfully applied. There is plenty of rationale for using these programs no matter what the Finns are doing or have done in the past. But if one of our goals is to keep up with or surpass our economic competitors in terms of education, to produce a capable workforce able to deal with complex problems of all kinds, then we need to provide our children with top-quality reading programs in the first place and effective support for struggling readers. It would be expensive to do this, perhaps, but certainly much cheaper than providing smoked fish and saunas to every U.S. family!

Joy is a Basic Skill in Secondary Reading

I have a policy of not talking about studies I’m engaged in before they are done and available, but I have an observation to make that just won’t wait.

I’m working on a review of research on secondary reading programs with colleagues Ariane Baye (University of Liege in Belgium) and Cynthia Lake (Johns Hopkins University). We have found a large number of very high-quality studies evaluating a broad range of programs. Most are large, randomized experiments.

Mostly, our review is really depressing. The great majority of studies have found no effects on learning. In particular, programs that focus on teaching middle and high school students struggling in reading in classes of 12 to 20, emphasizing meta-cognitive strategies, phonics, fluency, and/or training for teachers in what they were already doing, show few impacts on learning. Most of the studies provided daily, extra reading classes to help struggling readers build their skills, while the control group got band or art. They should have stayed in band or art.

Yet all is not dismal. Two approaches did have markedly positive effects. One was tutoring students in groups of one to four, not every day but perhaps twice a week. The other was cooperative learning, where students worked in four-member teams to help each other learn and practice reading skills. How could these approaches be so much more effective than the others?

My answer begins with a consideration of the nature of struggling adolescent readers. They are bored out of their brains. They are likely to see school as demeaning, isolating, and unrewarding. All adolescents live for their friends. They crave mastery and respect. Remedial approaches have to be fantastic to overcome the negative aspects of having to be remediated in the first place.

Tutoring can make a big difference, because groups are small enough for students to make meaningful relationships with adults and with other kids, and instruction can be personalized to meet their unique needs, to give them a real shot at mastery.

Cooperative learning, however, had a larger average effect size than tutoring. Even though cooperative learning did not require smaller class sizes and extra daily instructional periods, it was much more effective than remedial instruction. Cooperative learning gives struggling adolescent readers opportunities to work with their peers, to teach each other, to tease each other, to laugh, to be active rather than passive. To them, it means joy. And joy is a basic skill.

Of course, joy is not enough. Kids must be learning joyfully, not just joyful. Yet in our national education system, so focused on testing and accountability, we have to keep remembering who we are teaching and what they need. More of the same, a little slower and a little louder, won’t do it. Adolescents need a reason to believe that things can be better, and that school need not cut them off from their peers. They need opportunities to teach and learn from each other. School must be joyful, or it is nothing at all, for so many adolescents.

Trans-Atlantic Concord: Tutoring by Paraprofessionals Works

Whenever I speak to skeptical audiences about the enormous potential of evidence-based reform in education, three of the top complaints I always hear are as follows.

  1. In high-quality, randomized experiments, nothing works.
  2. Since educational outcomes depend so much on context, even programs that do work somewhere cannot be assumed to work elsewhere.
  3. Even if a given approach is found to be effective in many contexts, it is unlikely to be scalable to serve large numbers of students and schools.

In light of these criticisms, I was delighted to see a recent blog by Jonathan Sharples at the Education Endowment Foundation (EEF), the main funder of randomized evaluations of educational programs in England (and a former colleague at the University of York). The blog summarizes results from six experiments in England that used what they call teaching assistants (we call them paraprofessionals or aides) to tutor struggling students one-to-one or in small groups, in reading or math, at various grade levels.


Sharples included a table summarizing the results, which I have adapted here:


What is interesting about this chart is that although every study was a third-party randomized experiment, the effect sizes fall within a range from moderately positive to very positive (+0.12 to +0.51).

Another interesting thing about the table is that it resembles findings in U.S. studies of tutoring by paraprofessionals. Here is a chart of such studies:


The contents of Tables 1 and 2 are heartening in providing relatively consistent positive effects in rigorous studies for replicable, pragmatic interventions for struggling students, a population of great substantive importance. Because paraprofessionals are relatively inexpensive and usually poorly utilized in their current roles, providing them with good training materials and software to work with individuals and small groups of students in dire need of help in reading and math just makes good sense.

However, think back to the criticisms so often thrown at evidence-based reform in general. The findings from tutoring and small-group teaching studies devastate those criticisms:

  1. Nothing works. Really? Not everything works, and it would be nice to have a larger set of positive examples. But tutoring by paraprofessionals (and also by teachers and well-supervised and trained volunteers) definitely works, over and over. There are numerous other programs also proven to work in rigorous studies.
  2. Nothing replicates. Really? Context matters, but here we have relatively consistent findings across the U.S. and England, two very different systems. The effects vary for one-to-one and small-group tutoring, reading and math, and other factors, and we can learn from this variation. But it is clear that across very different contexts, positive effects do replicate.
  3. Nothing scales. Really? Various proven forms of tutoring – by teachers, paraprofessionals, and volunteers – are working right now in schools across the U.S., U.K., and many other countries. Reading Recovery alone, a one-to-one model that uses certified teachers as tutors, works with thousands of teachers worldwide. With the slightest encouragement, proven tutoring models could be expanded to serve many more schools and students, at modest cost.

Proven tutoring models of all types should be a standard offering for every school. More research is always needed to find more effective and cost-effective strategies. But there is no reason whatsoever not to use what we have now. And I hope this example will help critics of evidence-based reform move on to better arguments.

Response to Intervention and Bob’s Law

The problem in education reform isn’t a lack of good ideas. It’s a lack of good ideas implemented with enough clarity, consistency and integrity to actually make a difference in rigorous experiments (and therefore in large-scale application). A recent large-scale evaluation of Response to Intervention (RTI) illustrates this problem once again.

Response to Intervention (RTI) is a strategy for helping students who are struggling to keep up with ordinary classroom teaching. The idea is that following initial instruction (Tier 1), teachers provide mild assistance to students who are having difficulties (Tier 2), such as small-group remediation. Those who continue to struggle might receive more intensive assistance (Tier 3), such as one-to-one tutoring.

RTI has been common in U.S. classrooms for 20 years, and was virtually mandated as part of No Child Left Behind. So it is distressing that a recent study by MDRC, a respected independent research organization, found no positive effects of Tier 2 services in grades 1-3 reading. In fact, there were slight negative effects for first graders receiving Tier 2 services.

Philosophically, I am a supporter of RTI, but I’m even more of a supporter of rigorous evidence. Yet here’s a very large, well-done (though not randomized) study of RTI that finds no benefits.

I think the findings of the RTI evaluation speak to a broader problem of education policy. Often, national, state or local policies promote or require uses of broadly defined strategies. RTI is a perfect example. Everyone understands the general idea, but there are thousands of ways to implement RTI in practice.

Studies of broad teaching concepts almost always find that they make little if any difference. The reason is that general concepts are implemented differently from class to class and school to school, and end up on average looking a lot like what teachers were doing before, or are doing in schools that do not claim to be implementing the broad concept. That is, the “experimental” classes are not terribly different from the “control” classes.

As a good example of this problem, in the 1970s and ’80s, Madeline Hunter was extremely popular, and she spoke everywhere suggesting effective classroom strategies. Yet several studies found that when teachers were given training and coaching in Madeline Hunter strategies, it made no difference in achievement. Why? The studies also found that control teachers were already using strategies much like those in the Hunter model. It may well be that Madeline Hunter’s theories were so popular precisely because they were appealing descriptions of what teachers already were doing. Everyone likes to hear that what they’ve always done turns out to be supported by research. So, exactly what made Hunter’s prescriptions popular also made them no more effective than ordinary teaching, because they were ordinary teaching.

In the case of RTI, the MDRC researchers documented some differences between schools using RTI and those that were not, but there was enormous overlap.

So here’s a proposal for what I’ll call Bob’s Law: General teaching strategies subject to substantially varying interpretations by individual teachers are likely to be transformed into practices much like ordinary teaching. For this reason, they are unlikely to produce better outcomes than the control group does.

This does not mean that ordinary teaching methods are bad, or that teaching methods informed by general concepts are bad, but what it does imply is that if you want to see marked improvements in student achievement across many teachers and schools, you have to have programs that are well-conceived, well-specified and well-supported by top-quality professional development and materials. Even then, not all programs work, but success is at least possible if programs bring about systematic and sensible change in teaching methods.

Returning to RTI, I remain hopeful that its strategies can improve student outcomes. However, the approaches to this concept that are likely to work are ones that are specific about all key aspects of the design and help teachers implement approaches that are markedly better than whatever they were using before.

Why Leave Learning to Chance?

Every year about four million kindergartners enter America’s schools. They’re all excited, eager and confident, because that’s the nature of kindergartners, but unfortunately, we adults know better. We know that among those wonderful five-year-olds, 65% will reach fourth grade reading below the “proficient” level on the National Assessment of Educational Progress (NAEP), and 31% will not even reach the “basic” level. We know which students in which neighborhoods are most likely to have these problems. Since 1980, the story has hardly changed.

Today, I’m writing this blog from an airplane flying from Baltimore to San Francisco. Flying was a risky business long ago, but today the chances are infinitesimal that my airplane will crash.

So here’s a question. Why is it ok to leave the reading success of children to chance? Why don’t we treat reading success the way we treat air safety, as something to ensure no matter what?

If you think we don’t yet know how to ensure the reading success of all children, you might be right, but I can tell you that we absolutely do know how to ensure a much higher level of success than we have now, with today’s teachers and today’s schools. I was recently reviewing research evaluating reading programs, and I found more than 60 different programs with moderate to strong evidence of effectiveness: one-to-one and one-to-small group tutoring, classroom methods, school-wide reforms, and technology. Over time, it’s certain that these approaches, and combinations of them, could become more and more effective, and we could approach 100% success.

Getting to 100% will require more than just better instruction. We are doing a study in high-poverty schools in Baltimore and found that while at least 21% of second and third graders need glasses, only 6% have them. I’m sure there are similar stories relating to hearing, dental, health, and mental health. Absenteeism is another blocker, and there are more. If we want to get to 100%, we have to deal with all of these.

Well sure, you might say, but how could we afford all of this? Fortunately, the most widespread reading problems can be solved inexpensively. The average annual per-pupil cost in the U.S. is about $11,000. The annual cost of our proven Success for All reading program is around $100 additional, or less than 1% of what we are already spending. Two pairs of eyeglasses (one to take home and one to leave at school), including the eye exam and glasses replacement, cost less than $50. Proven tutoring models provided by paraprofessionals can cost as little as $400 per student, but even at $2000 for one-to-one tutoring, that’s 18% of average per-pupil cost, and for only a minority of the class.
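For readers who want to check these percentages, here is a minimal sketch of the arithmetic. It uses only the dollar figures quoted above; the dictionary labels and the function name are my own, and the eyeglasses figure is the stated upper bound rather than an exact price.

```python
# Average annual per-pupil cost cited in the text (USD).
PER_PUPIL = 11_000

# Per-student costs quoted in the text; the labels are illustrative.
costs = {
    "Success for All (additional cost)": 100,
    "Two pairs of eyeglasses (upper bound)": 50,
    "Paraprofessional tutoring (low end)": 400,
    "One-to-one tutoring": 2_000,
}


def share_of_per_pupil(cost, per_pupil=PER_PUPIL):
    """Return a cost as a percentage of average per-pupil spending."""
    return 100 * cost / per_pupil


for name, cost in costs.items():
    print(f"{name}: ${cost:,} is {share_of_per_pupil(cost):.1f}% of per-pupil cost")
```

Only the one-to-one tutoring figure reaches double digits, and that cost applies only to the minority of students who need it.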

These modest expenditures on proven programs quickly pay back their costs in terms of reducing special education and retention, not to mention long-term benefits to children and society. Yet none of the 60 proven and promising programs I found is in truly widespread use.

On my airplane, of course, the situation is quite different. Pilots are carefully and extensively trained in proven methods. Technology is constantly developing to provide information and automated assistance to ensure safety and effectiveness. Back-up systems ensure that if things go wrong despite the best of preparation, disaster will not result. All of these systems are constantly evolving in response to development, evaluation, and implementation of innovations.

The reading success of a child is a very serious matter. It simply makes no sense to treat it any less seriously than we treat air safety. Just as on airplanes, we need systems to monitor children’s success, not to punish teachers but to know when and how to intervene if trouble arises.

Perhaps someday, we’ll put Boeing or Lockheed Martin in charge of our schools, and charge them with getting us as close as possible to 100% success in reading. I can see it now.

Proven approaches to:

Phonemic awareness? Check
Phonics? Check
Vocabulary? Check
Fluency? Check
Comprehension? Check
Vision? Check
Hearing? Check
Tutoring backup? Check

Ready for takeoff!

Of course we can solve this problem. All we have to do is to decide it must be solved and then do it. It is neither efficient nor ethical to keep accepting the number of reading disasters we experience in our schools.