Those Flying Finns: Is it Saunas or Reading That Make the Difference?

I recently attended a conference in Stockholm, at which there were several Finns and a lot of discussion about the “Finnish Miracle,” in which Finland was found to score at the top on PISA (Program for International Student Assessment). PISA periodically tests representative samples of fifteen-year-olds in math, science, and reading.

The Finnish Miracle became apparent in 2001, and has been talked to death since then. Much of what I heard at the conference was familiar. Finland is a small, homogeneous country in which teaching is an honored profession. I heard that Helsinki, the capital, hires 80 teachers a year, and gets thousands of applicants. Maybe these factors are all we need to know.

However, I heard something else that I knew but had forgotten.

Ten years before the Finnish PISA Miracle, there was an international test of reading, called the IEA Reading Literacy Study, which tested the reading skills of students ages 9 to 10 in 30 countries. The U.S. scored second on this test, behind (you guessed it) Finland. I looked it up, and discovered that the difference was huge. Finnish children scored 31% of a standard deviation ahead of the U.S.

A (Swedish) speaker at my conference, Jan-Eric Gustafsson, brought up the earlier IEA Reading Literacy study. He explained that throughout the 1980s, Finland had a relentless policy of ensuring that every child learned to read in the early grades. If they needed it, struggling students were given one-to-one tutoring focused on phonics as long as necessary to ensure success in this crucial subject. In light of their focus on early reading success, the outcomes on the IEA reading tests are more comprehensible.

Now fast forward to the PISA tests reported in 2001. The fifteen-year-olds who took the test were, of course, subject to the Finnish reading policy throughout their elementary years. Not only reading, but also math and science, are surely influenced by success in elementary reading.

It’s possible that Finland’s success in reading in the 1990s was equally a product of outstanding and honored teachers, a homogeneous society, and other factors (though these were also true in other Nordic countries that did not score nearly as well). Perhaps Finns eat a lot more smoked fish or spend a lot more time in saunas than other people, and these explain academic success. But it must be at least a partial explanation of Finland’s reading success that they focused substantial resources over a long time period on reading for all. In turn, their students’ success on PISA must be at least partially a result of their earlier success in reading.

One reason this all matters to the U.S. and other non-Finnish countries is that while we cannot all become Finns, we can ensure that virtually every child learns to read confidently and capably by third grade.

Many states have “Reading by Third Grade” laws that threaten to hold back third graders if they are not reading at grade level, and usually provide a last-chance summer school course to avert retention. Neither of these strategies (retention and last-chance summer school) has evidence of effectiveness. In contrast, I noted in a recent blog that there were 24 elementary programs for struggling readers that have strong, moderate, or promising evidence of effectiveness according to ESSA evidence standards.

The 24 programs were proven in our own country, and most have been widely and successfully applied. There is plenty of rationale for using these programs no matter what the Finns are doing or have done in the past. But if one of our goals is to keep up with or surpass our economic competitors in terms of education, to produce a capable workforce able to deal with complex problems of all kinds, then we need to provide our children with top-quality reading programs in the first place and effective support for struggling readers. It would be expensive to do this, perhaps, but certainly much cheaper than providing smoked fish and saunas to every U.S. family!


Perfect Implementation of Hopeless Methods: The Sinking of the Vasa

If you are ever in Stockholm, you must visit the Vasa Museum. It contains a complete warship launched in 1628 that sank 30 minutes later. Besides the ship itself, the museum contains objects and bones found in the wreck and carefully analyzed by scientists.

The basic story of the sinking of the Vasa has important analogies to what often happens in education reform.

After the Vasa sank, the king who had commissioned it, Gustav II Adolf, called together a commission to find out whose fault it was and to punish the guilty.

Yet the commission, after many interviews with survivors, found that no one did anything wrong. Three and a half centuries later, modern researchers came to the same conclusion. Everything was in order. The skeleton of the helmsman was found still gripping the steering pole, trying heroically to turn the ship’s bow into the wind to keep it from leaning over.

So what went wrong? The ship could never have sailed. It was built too top-heavy, with too much heavy wood and too many heavy guns on the top decks and too little ballast on the bottom. The Vasa was doomed, no matter what the captain and crew did.

In education reform, there is a constant debate about how much is contributed to effectiveness by a program as opposed to quality of implementation. In implementation science, there are occasionally claims that it does not matter what programs schools adopt, as long as they implement them well. But most researchers, developers, and educators agree that success only results from a combination of good programs and good implementation. Think of the relationship as multiplicative:

P × I = A

(Quality of program times quality of implementation equals achievement gain).

The reason the relationship might be multiplicative is that if either P or I is zero, achievement gain is zero. If both are very positive, then achievement gain is very, very positive.
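To make the multiplicative idea concrete, here is a minimal sketch in Python. The 0-to-1 scales and the function name `achievement_gain` are assumptions made purely for illustration; the model above is a way of thinking, not a measured formula, and it does not specify units or ranges.

```python
def achievement_gain(program_quality: float, implementation_quality: float) -> float:
    """Return P x I, the illustrative multiplicative model of achievement gain.

    Both inputs are on a hypothetical 0-to-1 scale (an assumption of this
    sketch, not something the model itself specifies).
    """
    return program_quality * implementation_quality


# A hopeless program gains nothing, however well it is implemented.
vasa = achievement_gain(program_quality=0.0, implementation_quality=1.0)

# A strong program that is never really implemented also gains nothing.
shelfware = achievement_gain(program_quality=1.0, implementation_quality=0.0)

# Only when both factors are high is the gain high.
both_strong = achievement_gain(program_quality=1.0, implementation_quality=1.0)

print(vasa, shelfware, both_strong)
```

Under an additive model (P + I), superb implementation could make up for a worthless program. The multiplicative form rules that out, which is the point of the comparison.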

In the case of the Vasa, P=0, so no matter how good implementation was, the Vasa was doomed. In many educational programs, the same is true. For example, programs that are not well worked out, not well integrated into teachers’ schedules and skill sets, or are too difficult to implement, are unlikely to work. One might argue that in order to have positive effects, a program must be very clear about what teachers are expected to do, so that professional development and coaching can be efficiently targeted to helping teachers do those things. Then we have to have evidence that links teachers’ doing certain things to improving student learning. For example, providing teachers with professional development to enhance their content knowledge may not be helpful if teachers are not clear how to put this new knowledge into their daily teaching.

Rigorous research, especially under funding from IES and i3 in the U.S. and from EEF in England, is increasingly identifying proven programs as well as programs that consistently fail to improve student outcomes. The patterns are not perfectly clear, but in general those programs that do make a significant difference are ones that are well-designed, practical, and coherent.

If you think implementation alone will carry the day, keep in mind the skeleton of the heroic helmsman of the Vasa, spending 333 years on the seafloor trying to push the Vasa’s bow into the wind. He did everything right, except for signing on to the wrong ship.

Twenty-four Proven Programs for Struggling Readers

One of the greatest impediments to evidence-based reform in education is the belief that there are very few programs that have been rigorously evaluated and found to be effective. People often make fun of the What Works Clearinghouse (WWC), calling it the Nothing Works Clearinghouse, because in its early days there were, in fact, few programs that met WWC standards.

If you believe in the “nothing works” formulation, I’ve got astonishing news for you. You might want to find a safe place to sit, and remove any eyeglasses or sharp objects, before reading any further, to avoid accidental injury.


I have been reviewing research on various programs for struggling readers in the elementary grades to find out how many meet the new ESSA evidence standards. The answer: at least 24. Of these, 14 met the “strong” ESSA criterion, which means that there was at least one randomized study with statistically significant positive effects. Eight met the “moderate” standard, which requires at least one quasi-experimental (i.e., matched) study with significant positive effects. Two met the “promising” standard, requiring at least one correlational study with positive effects.
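The three evidence levels described above can be sketched as a simple lookup. This is a rough illustration only: the design labels, the function name `essa_tier`, and the single yes/no flag for a qualifying positive effect are all assumptions of the sketch, and the real ESSA standards involve further conditions that are not modeled here.

```python
from typing import List, Optional, Tuple

# Study design -> ESSA evidence level, as summarized in the text.
# The string labels are hypothetical names chosen for this sketch.
TIER_BY_DESIGN = {
    "randomized": "strong",
    "quasi-experimental": "moderate",
    "correlational": "promising",
}
TIER_ORDER = ["strong", "moderate", "promising"]  # highest level first


def essa_tier(studies: List[Tuple[str, bool]]) -> Optional[str]:
    """Return the highest evidence level supported by any one study.

    Each study is (design, qualifying_effect), where qualifying_effect is
    True if the study found the kind of positive effect that level requires.
    Returns None if no study qualifies.
    """
    supported = {
        TIER_BY_DESIGN[design]
        for design, qualifying_effect in studies
        if qualifying_effect and design in TIER_BY_DESIGN
    }
    for tier in TIER_ORDER:
        if tier in supported:
            return tier
    return None


# Counts reported in the text for elementary struggling-reader programs.
programs_by_tier = {"strong": 14, "moderate": 8, "promising": 2}
total_programs = sum(programs_by_tier.values())
print(total_programs)
```

For instance, a hypothetical program with one matched study and one randomized study, both with qualifying effects, would be classed as “strong,” because a single qualifying study at the higher level is enough.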

I should hasten to explain that the numbers of proven programs will be higher for struggling-reader programs than for whole-class programs, because most of the struggling-reader programs are one-to-one or one-to-small-group tutoring. But still, the number and diversity of proven programs is impressive. Among the 24 programs, eight used one-to-one tutoring by teachers, paraprofessionals, or volunteers. Nine used small-group tutoring by teachers or paraprofessionals. In addition, one used computer-assisted instruction, and five used whole-school or whole-class methods and reported significantly positive effects on the students who had been in the lowest-achieving third or quarter of the classes at pretest. Two of the 24 programs, Reading Recovery (1-1 tutoring by teachers) and Success for All (whole-school approach), are well known and have been around a long time, but many others are much less well known. Of course, one-to-one tutoring, especially by teachers, can be very expensive, but whole-school and whole-class approaches tend to be relatively inexpensive on a per-pupil basis.

Here’s my point. Schools seeking proven, practical approaches to improving outcomes for their struggling readers have a wide array of attractive alternatives. Six of them, Reading Recovery, Success for All, Sound Partners (1-1 tutoring by paraprofessionals), Lindamood (small group tutoring by teachers), Targeted Reading Intervention (1-1 tutoring by teachers), and Empower Reading (small group tutoring by teachers), all have large effect sizes from randomized experiments and have been proven in anywhere from two to 28 studies.

It is important to note that there are also many programs for struggling readers that have been evaluated and found to be ineffective, including tutoring programs. It matters a lot which program you choose.

Every school and district has children who are struggling to learn to read, and all too often their solution is to make up their own approach for these students, or to purchase materials, software, or services from vendors who can present no credible evidence of effectiveness. If there were no proven solutions, such strategies might make sense, but how can they be justified when there are so many proven alternatives?

A better use of time and energy might be for educational leaders to review the proven programs for struggling readers, seek information about their benefits and costs, speak with educators who have used them, and perhaps arrange a visit to schools using programs being considered. Then they’d have a good chance of picking an approach that is likely to work if well implemented.

Soon, we will have information about proven programs in every subject and grade level, for all types of learners. Wouldn’t this be a good time to get into the habit of using proven programs to improve student outcomes?

The Sailor and the Sailboat: Leadership and Evidence

My one extravagance is that I live on the Chesapeake Bay and have a small sailboat. I love to sail, even if I’m not especially good at it, but sailing small boats teaches you a lot of important life lessons.

One of these lessons is that leadership is crucial, but leadership can only make a difference if leaders have the tools to translate leadership into outcomes.

Here’s what I mean from a sailing perspective. A sailboat is just a hull, sails, lines, a mast, a rudder, and a centerboard. When these are all in good working order, it still takes a good sailor to manage a small sailboat in heavy weather. However, when any one component is lacking, all hell breaks loose. For example, on my 11-foot sailboat, sometimes the rudder falls off in rough water. Without a rudder, it doesn’t matter how good a sailor you are. You aren’t going anywhere. Similarly, we once lost a mast in a heavy wind. Yikes!

Principals and superintendents in Title I schools are a lot like small-boat sailors in heavy weather, every single day. If all the structures and supports are in place, and if they have a great crew, capable school or district leaders can do wonders for their children.

Proven programs do not manage schools on their own. What they do is help provide the sails, mast, rudder, and lines known to work effectively with a good captain and crew.

Sometimes I hear educational leaders dismiss the importance of proven programs, saying that the only thing that matters is good leadership. But this is only half right. Great leadership is essential to make proven programs work, but proven, replicable programs and other infrastructure are equally essential to enable great leaders to have great results with kids.

So yes, recruit the best captains you can, and mentor them as much as possible. But give them, or enable them to acquire, sailboats known to work. Too many potentially great captains are given sailboats lacking a rudder or mast. When this happens, they’re sunk from the beginning.

Evidence Means Different Things in ESSA and NCLB

Whenever I talk or write about the new evidence standards in the Every Student Succeeds Act (ESSA), someone is bound to ask how this is different from No Child Left Behind (NCLB). Didn’t NCLB also emphasize using programs and practices “based on scientifically-based research?”

Though they look similar on the surface, evidence in ESSA is very different from evidence in NCLB. In NCLB, “scientifically-based research” just meant that a given program or practice was generally consistent with principles that had been established in research, and almost any program can be said to be “based on” research. In contrast, ESSA standards encourage the use of specific programs and practices that have themselves been evaluated. ESSA defines strong, moderate, and promising levels of evidence for programs and practices with at least one significantly positive outcome in a randomized, matched, or correlational study, respectively. NCLB had nothing of the sort.

To illustrate the difference, consider a medical example. In a recent blog, I told the story of how medical researchers had long believed that stress caused ulcers. Had NCLB’s evidence provision applied to ulcer treatment, all medicines and therapies based on reducing or managing stress, from yoga to tranquilizers, might be considered “based on scientifically based research” and therefore encouraged. Yet none of these stress-reduction treatments were actually proven to work; they were just consistent with current understandings about the origin of ulcers, which were wrong (bacteria, not stress, cause ulcers).

If ESSA were applied to ulcer treatment, it would demand evidence that a particular medicine or therapy actually improved or eliminated ulcers. ESSA evidence standards wouldn’t care whether a treatment was based on stress theory or bacteria theory, as long as there was good evidence that the actual treatment itself worked in practice, as demonstrated in high-quality research.

Getting back to education, NCLB’s “scientifically-based research” was particularly intended to promote the use of systematic phonics in beginning reading. There was plenty of evidence summarized by the National Reading Panel that a phonetic approach is a good idea, but most of that research was from controlled lab studies, small-scale experiments, and correlations. What the National Reading Panel definitely did not say was that any particular approach to phonics teaching was effective, only that phonics was a generically good idea.

One problem with NCLB’s “scientifically-based research” standard was that a lot of things go into making a program effective. One phonics program might provide excellent materials, extensive professional development, in-class coaching to help teachers use phonetic strategies, effective motivation strategies to get kids excited about phonics, effective grouping strategies to ensure that instruction is tailored to students’ needs, and regular assessments to keep track of students’ progress in reading. Another, equally phonetic program might teach phonics to students on a one-to-one basis. A third phonics program might consist of a textbook that comes with a free half-day training before school opens.

According to NCLB, all three of these approaches are equally “based on scientifically-based research.” But anyone can see that the first two, lots of PD and one-to-one tutoring, are way more likely to work. ESSA evidence standards insist that the actual approaches to be disseminated to schools be tested in comparison to control groups, not assumed to work because they correspond with accepted theory or basic research.

“Scientifically-based research” in NCLB was a major advance in its time, because it was the first time evidence had been mentioned so prominently in the main federal education law. Yet educators soon learned that just about anything could be justified as “based on scientifically-based research,” because there are bound to be a few articles out there supporting any educational idea. Fortunately, enthusiasm about “scientifically-based” led to the creation of the Institute of Education Sciences (IES) and, later, to Investing in Innovation (i3), which set to work funding and encouraging development and rigorous evaluations of specific, replicable programs. The good work of IES and i3 paved the way for the ESSA evidence standards, because now there are a lot more rigorously evaluated programs. NCLB never could have specified ESSA-like evidence standards because there would have been too few qualifying programs. But now there are many more.

Sooner or later, policy and practice in education will follow medicine, agriculture, technology, and other fields in relying on solid evidence to the maximum degree possible. “Scientifically-based research” in NCLB was a first tentative step in that direction, and the stronger ESSA standards are another. If development and research continue or accelerate, successive education laws will have stronger and stronger encouragement and assistance to help schools and districts select and implement proven programs. Our kids will be the winners.