What Works in Elementary Math?

Euclid, the ancient Greek mathematician, is considered the inventor of geometry. His king heard about it, and wanted to learn geometry, but being a king, he was kind of busy. He called in Euclid, and asked him if there was a faster way. “I’m sorry sire,” said Euclid, “but there is no royal road to geometry.”

Skipping forward a couple thousand years, Marta Pellegrini, of the University of Florence in Italy, spent nine months with our group at Johns Hopkins University and led a review of research on effective programs for elementary mathematics (Pellegrini, Lake, Inns & Slavin, 2018), which was recently released on our Best Evidence Encyclopedia (BEE). What we found was not so different from Euclid’s conclusion, but broader: There’s no royal road to anything in mathematics. Improving mathematics achievement isn’t easy. But it is not impossible.

Our review focused on 78 very high-quality studies (65 used random assignment). The 61 programs they evaluated were divided into eight categories: tutoring, technology, professional development for math content and pedagogy, instructional process programs, whole-school reform, social-emotional approaches, textbooks, and benchmark assessments.

Tutoring had the largest and most reliably positive impacts on math learning. Tutoring included one-to-one and one-to-small group services, and some tutors were certified teachers and some were paraprofessionals (teacher assistants). The successful tutoring models were all well-structured, and tutors received high-quality materials and professional development. Across 13 studies involving face-to-face tutoring, average outcomes were very positive. Surprisingly, tutors who were certified teachers (ES=+0.34) and paraprofessionals (ES=+0.32) obtained very similar student outcomes. Even more surprising, one-to-small group tutoring (ES=+0.32) was as effective as one-to-one (ES=+0.26).

Beyond tutoring, the category with the largest average impacts was instructional process programs: classroom organization and management approaches such as cooperative learning and the Good Behavior Game. The mean effect size was +0.25.


After these two categories, there were only isolated studies with positive outcomes. Fourteen studies of technology approaches had an average effect size of only +0.07. Twelve studies of professional development to improve teachers’ knowledge of math content and pedagogy found an average of only +0.04. One study of a social-emotional program called Positive Action found positive effects, but seven other SEL studies did not, and the mean for this category was +0.03. One study of a whole-school reform model called the Center for Data-Driven Reform in Education (CDDRE), which helps schools do needs assessments, and then find, select, and implement proven programs, showed positive outcomes (ES=+0.24), but three other whole-school models found no positive effects. Among 16 studies of math curricula and software, only two, Math in Focus (ES=+0.25) and Math Expressions (ES=+0.11), found significant positive outcomes. On average, benchmark assessment approaches made no difference (ES=0.00).
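To see the pattern at a glance, the mean effect sizes quoted above can be put side by side. The following is only a small sketch in Python: the numbers are the ones reported in this post, while the dictionary and function names are made up for illustration. Categories for which no category-level mean is given here (tutoring overall, whole-school reform, textbooks) are left out rather than guessed.

```python
# Mean effect sizes (ES) exactly as quoted in the post.
# Categories without a reported category-level mean are omitted on purpose.
category_means = {
    "instructional process programs": 0.25,
    "technology": 0.07,
    "professional development (content and pedagogy)": 0.04,
    "social-emotional approaches": 0.03,
    "benchmark assessments": 0.00,
}

# Tutoring is reported only as a breakdown by tutor type and group size.
tutoring_breakdown = {
    "certified teachers": 0.34,
    "paraprofessionals": 0.32,
    "one-to-small group": 0.32,
    "one-to-one": 0.26,
}


def rank_by_effect(effects):
    """Return (name, ES) pairs sorted from largest to smallest ES."""
    return sorted(effects.items(), key=lambda item: item[1], reverse=True)


def format_es(value):
    """Format an effect size with an explicit sign and two decimals."""
    return f"{value:+.2f}"


if __name__ == "__main__":
    print("Tutoring (breakdown):")
    for name, es in rank_by_effect(tutoring_breakdown):
        print(f"  {name}: ES={format_es(es)}")
    print("Other categories (means):")
    for name, es in rank_by_effect(category_means):
        print(f"  {name}: ES={format_es(es)}")
```

Even the smallest tutoring figure quoted here (one-to-one) sits above the best non-tutoring category mean, which is the gap the rest of this post tries to explain.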

Taken together, the findings of the 78 studies support a surprising conclusion. Few of the successful approaches had much to do with improving math pedagogy. Most were one-to-one or one-to-small group tutoring approaches that closely resemble tutoring models long used with great success in reading. A classroom management approach, PAX Good Behavior Game, and a social-emotional model, Positive Action, had no particular focus on math, yet both had positive effects on math (and reading). A whole-school reform approach, the Center for Data-Driven Reform in Education (CDDRE), helped schools do needs assessments and select proven programs appropriate to their needs, but CDDRE focused equally on reading and math, and had significantly positive outcomes in both subjects. In contrast, math curricula and professional development specifically designed for mathematics had only two positive examples among 28 programs.

The substantial difference in outcomes of tutoring and outcomes of technology applications is also interesting. The well-established positive impacts of one-to-one and one-to-small group tutoring, in reading as well as math, are often ascribed to the tutor’s ability to personalize instruction for each student. Computer-assisted instruction (CAI) is also personalized, and has been expected, largely on this basis, to improve student achievement, especially in math (see Cheung & Slavin, 2013). Yet in math, and also reading, one-to-one and one-to-small group tutoring, by certified teachers and paraprofessionals, is far more effective than the average for technology approaches. The comparison of outcomes of personalized CAI and (personalized) tutoring makes it unlikely that personalization is a key explanation for the effectiveness of tutoring. Tutors must contribute something powerful beyond personalization.

I have argued previously that what tutors contribute, in addition to personalization, is a human connection, encouragement, and praise. A tutored child wants to please his or her tutor, not by completing a set of computerized exercises, but by seeing a tutor’s eyes light up and voice respond when the tutee makes progress.

If this is the secret of the effect of tutoring (beyond personalization), perhaps a similar explanation extends to other approaches that happen to improve mathematics performance without using especially innovative approaches to mathematics content or pedagogy. Approaches such as PAX Good Behavior Game and Positive Action, targeted at behavior and social-emotional skills, respectively, focus on children’s motivations, emotions, and behaviors. In the secondary grades, a program called Building Assets, Reducing Risk (BARR) (Corsello & Sharma, 2015) has a similar focus on social-emotional development, not math, but it also has significant positive effects on math (as well as reading). A study in Chile of a program called Conecta Ideas found substantial positive effects in fourth-grade math by having students practice together in preparation for bimonthly math “tournaments” in competition with other schools. Both content and pedagogy were the same in experimental and control classes, but the excitement engendered by the tournaments led to substantial impacts (ES=+0.30 on national tests).

We need breakthroughs in mathematics teaching. Perhaps we have been looking in the wrong places, expecting that improved content and pedagogy will be the key to better learning. They will surely be involved, but perhaps it will turn out that math does not live only in students’ heads, but must also live in their hearts.

There may be no royal road to mathematics, but perhaps there is an emotional road. Wouldn’t it be astonishing if math, the most cerebral of subjects, turns out to depend as much on heart as on brain?

References

Cheung, A., & Slavin, R. E. (2013). The effectiveness of educational technology applications for enhancing mathematics achievement in K-12 classrooms: A meta-analysis. Educational Research Review, 9, 88-113.

Corsello, M., & Sharma, A. (2015). The Building Assets-Reducing Risks Program: Replication and expansion of an effective strategy to turn around low-achieving schools: i3 development grant final report. Biddeford, ME: Corsello Consulting.

Inns, A., Lake, C., Pellegrini, M., & Slavin, R. (2018, March 3). Effective programs for struggling readers: A best-evidence synthesis. Paper presented at the annual meeting of the Society for Research on Educational Effectiveness, Washington, DC.

Pellegrini, M., Inns, A., & Slavin, R. (2018, March 3). Effective programs in elementary mathematics: A best-evidence synthesis. Paper presented at the annual meeting of the Society for Research on Educational Effectiveness, Washington, DC.

Photo credit: By Los Angeles Times Photographic Archive, no photographer stated. [CC BY 4.0 (https://creativecommons.org/licenses/by/4.0)], via Wikimedia Commons

This blog was developed with support from the Laura and John Arnold Foundation. The views expressed here do not necessarily reflect those of the Foundation.


Succeeding Faster in Education

“If you want to increase your success rate, double your failure rate.” So said Thomas Watson, the founder of IBM. What he meant, of course, is that people and organizations thrive when they try many experiments, even though most experiments fail. Failing twice as often means trying twice as many experiments, leading to twice as many failures—but also, he was saying, many more successes.

[Photo: Thomas Watson]

In education research and innovation circles, many people know this quote, and use it to console colleagues who have done an experiment that did not produce significant positive outcomes. A lot of consolation is necessary, because most high-quality experiments in education do not produce significant positive outcomes. In studies funded by the Institute of Education Sciences (IES), Investing in Innovation (i3), and England’s Education Endowment Foundation (EEF), all of which require very high standards of evidence, fewer than 20% of experiments show significant positive outcomes.

The high rate of failure in educational experiments is often shocking to non-researchers, especially the government agencies, foundations, publishers, and software developers who commission the studies. I was at a conference recently at which a Peruvian researcher presented the devastating results of an experiment in which high-poverty, mostly rural schools in Peru were randomly assigned to receive computers for all of their students, or to continue with usual instruction. The Peruvian Ministry of Education was so confident that the computers would be effective that they had built a huge model of the specific computers used in the experiment and attached it to the Ministry headquarters. When the results showed no positive outcomes (except for the ability to operate computers), the Ministry quietly removed the computer statue from the top of their building.

Improving Success Rates

Much as I believe Watson’s admonition (“fail more”), there is another principle that he was implying, or so I expect: We have to learn from failure, so we can increase the rate of success. It is not realistic to expect government to continue to invest substantial funding in high-quality educational experiments if the success rate remains below 20%. We have to get smarter, so we can succeed more often. Fortunately, qualitative measures, such as observations, interviews, and questionnaires, are becoming required elements of funded research, making it easier for researchers to find out what happened and what went wrong. Was the experimental program faithfully implemented? Were there unexpected responses toward the program by teachers or students?

In the course of my work reviewing positive and disappointing outcomes of educational innovations, I’ve noticed some patterns that often predict that a given program is likely or unlikely to be effective in a well-designed evaluation. Some of these are as follows.

  1. Small changes lead to small (or zero) impacts. In every subject and grade level, researchers have evaluated new textbooks, in comparison to existing texts. These almost never show positive effects. The reason is that textbooks are just not that different from each other. Approaches that do show positive effects are usually markedly different from ordinary practices or texts.
  2. Successful programs almost always provide a lot of professional development. The programs that have significant positive effects on learning are ones that markedly improve pedagogy. Changing teachers’ daily instructional practices usually requires initial training followed by on-site coaching by well-trained and capable coaches. Lots of PD does not guarantee success, but minimal PD virtually guarantees failure. Sufficient professional development can be expensive, but education itself is expensive, and adding a modest amount to per-pupil cost for professional development and other requirements of effective implementation is often the best way to substantially enhance outcomes.
  3. Effective programs are usually well-specified, with clear procedures and materials. Programs rarely work if they are unclear about what teachers are expected to do, or if teachers are not helped to do it. In the Peruvian study of one-to-one computers, for example, students were given tablet computers at a per-pupil cost of $438. Teachers were expected to figure out how best to use them. In fact, a qualitative study found that the computers were considered so valuable that many teachers locked them up except for specific times when they were to be used. Teachers had neither specific instructional software nor professional development to help them create the needed software. No wonder “it” didn’t work. Other than the physical computers, there was no “it.”
  4. Technology is not magic. Technology can create opportunities for improvement, but there is little understanding of how to use technology to greatest effect. My colleagues and I have done reviews of research on effects of modern technology on learning. We found near-zero effects of a variety of elementary and secondary reading software (Inns et al., 2018; Baye et al., in press), with a mean effect size of +0.05 in elementary reading and +0.00 in secondary. In math, effects were slightly more positive (ES=+0.09), but still quite small, on average (Pellegrini et al., 2018). Some technology approaches had more promise than others, but it is time that we learned from disappointing as well as promising applications. The widespread belief that technology is the future must eventually be right, but at present we have little reason to believe that technology is transformative, and we don’t know which form of technology is most likely to be transformative.
  5. Tutoring is the most solid approach we have. Reviews of elementary reading for struggling readers (Inns et al., 2018) and secondary struggling readers (Baye et al., in press), as well as elementary math (Pellegrini et al., 2018), find outcomes for various forms of tutoring that are far beyond effects seen for any other type of treatment. Everyone knows this, but thinking about tutoring falls into two camps. One, typified by advocates of Reading Recovery, takes the view that tutoring is so effective for struggling first graders that it should be used no matter what the cost. The other, also perhaps thinking about Reading Recovery, rejects this approach because of its cost. Yet recent research on tutoring methods is finding strategies that are cost-effective and feasible. First, studies in both reading (Inns et al., 2018) and math (Pellegrini et al., 2018) find no difference in outcomes between certified teachers and paraprofessionals using structured one-to-one or one-to-small group tutoring models. Second, although one-to-one tutoring is more effective than one-to-small group, one-to-small group is far more cost-effective, as one trained tutor can work with 4 to 6 students at a time. Also, recent studies have found that tutoring can be just as effective in the upper elementary and middle grades as in first grade, so this strategy may have broader applicability than it has had in the past. The real challenge for research on tutoring is to develop and evaluate models that increase cost-effectiveness of this clearly effective family of approaches.
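The cost-effectiveness point in item 5 comes down to simple arithmetic, sketched below in Python. This is an illustration only: the session cost of $30 is a hypothetical figure, not one from any of the reviews, and the sketch assumes a tutor's time costs the same whether he or she works with one student or several. The function names are invented for this example.

```python
def cost_per_student(tutor_cost_per_session, group_size):
    """Tutor cost for one session, split evenly across the students served."""
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    return tutor_cost_per_session / group_size


def effect_per_dollar(effect_size, tutor_cost_per_session, group_size):
    """Effect size obtained per dollar of per-student tutoring cost."""
    return effect_size / cost_per_student(tutor_cost_per_session, group_size)


def break_even_ratio(group_size):
    """How many times larger the one-to-one effect would need to be than the
    small-group effect for the two formats to give equal effect per dollar.
    Under the equal-tutor-cost assumption, this is just the group size."""
    return float(group_size)


if __name__ == "__main__":
    HYPOTHETICAL_SESSION_COST = 30.0  # assumed dollars per tutor session
    for size in (1, 4, 6):
        per_student = cost_per_student(HYPOTHETICAL_SESSION_COST, size)
        print(f"group of {size}: ${per_student:.2f} per student per session")
```

Under these assumptions, one-to-one tutoring would have to be four to six times as effective as small-group tutoring to match it on effect per dollar, which is why even a real advantage for one-to-one can still leave small groups the more cost-effective choice.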

The extraordinary advances in the quality and quantity of research in education, led by investments from IES, i3, and the EEF, have raised expectations for research-based reform. However, the modest percentage of recent studies showing positive outcomes under current rigorous standards of evidence has caused disappointment in some quarters. That reaction is misplaced: all findings, whether immediately successful or not, should be seen as crucial information. Some studies identify programs ready for prime time right now, but the whole body of work can and must inform us about areas worthy of expanded investment, as well as areas in need of serious rethinking and redevelopment. The evidence movement, in the form it exists today, is completing its first decade. It’s still early days. There is much more we can learn and do to develop, evaluate, and disseminate effective strategies, especially for students in great need of proven approaches.

References

Baye, A., Lake, C., Inns, A., & Slavin, R. (in press). Effective reading programs for secondary students. Reading Research Quarterly.

Inns, A., Lake, C., Pellegrini, M., & Slavin, R. (2018). Effective programs for struggling readers: A best-evidence synthesis. Paper presented at the annual meeting of the Society for Research on Educational Effectiveness, Washington, DC.

Pellegrini, M., Inns, A., & Slavin, R. (2018). Effective programs in elementary mathematics: A best-evidence synthesis. Paper presented at the annual meeting of the Society for Research on Educational Effectiveness, Washington, DC.

Photo credit: IBM [CC BY-SA 3.0 (https://creativecommons.org/licenses/by-sa/3.0)], via Wikimedia Commons
