Ask any parents what their kids say when asked what they did in school today. Invariably, the kids respond, “Nuffin,” or some equivalent. My four-year-old granddaughter always says, “I played with my fwends.” All well and good.
However, in educational policy, policy makers often give the very same answer when asked, “What did the schools not using the (insert latest policy darling) do?”
“Nuffin’.” Or they say, “Whatever they usually do.” There’s nothing wrong with the latter answer if it’s true. But given the many programs now known to improve student achievement (see www.evidenceforessa.org), why don’t evaluators compare outcomes of new policy initiatives to those of proven educational programs known to improve the same outcomes the policy innovation is supposed to improve, perhaps at far lower cost per student? The evaluations should also compare to “business as usual,” but adding proven programs to evaluations of large policy innovations would help avoid declaring policy innovations to be successful when they are in fact just slightly more effective than “business as usual,” and much less effective or less cost-effective than alternative proven approaches. For example, when evaluating charter schools, why not routinely compare them to whole-school reform models that have similar objectives? When evaluating extending the school day or school year to help high-poverty schools, why not compare these innovations to using the same amount of additional money to hire tutors who use proven tutoring models to help struggling students? In evaluating policies in which students are held back if they do not read at grade level by third grade, why not compare these approaches to intensive phonics instruction and tutoring in grades K-3, which are known to greatly improve student reading achievement?
As one example of research comparing a policy intervention to a promising educational intervention, I recently saw a very interesting pair of studies from France. Ecalle, Gomes, Auphan, Cros, & Magnan (2019) compared two interventions applied in special priority areas with high poverty levels. Both interventions focused on reading in first grade.
One of the interventions involved halving class size, from approximately 24 students to 12. The other provided intensive reading instruction in small groups (4-6 children) to students who were struggling in reading, as well as less intensive interventions to larger groups (10-12 students). Low achievers got two 30-minute interventions each day for a year, while the higher-performing readers got one 30-minute intervention each day. In both cases, the focus of instruction was on phonics. In all cases, the additional interventions were provided by the students’ usual teachers.
The students in small classes were compared to students in ordinary-sized classes, while the students in the educational intervention were compared to students in same-sized classes who did not get the group interventions. Similar measures and analyses were used in both comparisons.
The results were nearly identical for the class size policy and the educational intervention. Halving class size had effect sizes of +0.14 for word reading and +0.22 for spelling. Results for the educational intervention were +0.13 for word reading, +0.12 for spelling, +0.14 for a group test of reading comprehension, +0.32 for an individual test of comprehension, and +0.19 for fluency.
These studies are less than perfect in experimental design, but they are nevertheless interesting. Most importantly, the class size policy required an additional teacher for each class of 24. Using Maryland annual teacher salaries and benefits ($84,000), that means the cost in our state would be about $3500 per student. The educational intervention required one day of training and some materials. There was virtually no difference in outcomes, but the differences in cost were staggering.
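The per-student figure above follows from simple division: one additional teacher for each class of 24 students. As a rough sketch of that arithmetic (the function and variable names are illustrative, and the calculation assumes the only added cost is the extra teacher's salary and benefits):

```python
# Figures taken from the paragraph above; names are illustrative only.
TEACHER_COST_PER_YEAR = 84_000  # Maryland annual salary plus benefits, in dollars
ORIGINAL_CLASS_SIZE = 24        # students per class before halving


def added_cost_per_student(teacher_cost: float, class_size: int) -> float:
    """Spread the cost of one extra teacher over the students in the original class."""
    return teacher_cost / class_size


cost = added_cost_per_student(TEACHER_COST_PER_YEAR, ORIGINAL_CLASS_SIZE)
print(f"Added cost per student: ${cost:,.0f}")
```

This ignores any extra classroom space or materials the additional class might need, so it is best read as a lower bound on the cost of the policy.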
The class size policy was mandated by the Ministry of Education. The educational intervention was offered to schools and provided by a university and a non-profit. As is so often the case, the policy intervention was simplistic, easy to describe in the newspaper, and minimally effective. The class size policy reminds me of a Florida program that extended the school schedule by an hour every day in high-poverty schools, mainly to provide more time for reading instruction. The cost per child was about $800 per year. The outcomes were minimal (ES=+0.05).
After many years of watching what schools do and reviewing research on outcomes of innovations, I find it depressing that policies mandated on a substantial scale are so often found to be ineffective. They are usually far more expensive than much more effective, rigorously evaluated programs that are, however, a bit more difficult to describe, and rarely arouse great debate in the political arena. It’s not that anyone is opposed to the educational intervention, but it is a lot easier to carry a placard saying “Reduce Class Size Now!” than to carry one saying “Provide Intensive Phonics in Small Groups with More Supplemental Teaching for the Lowest Achievers Now!” The latter just does not fit on a placard, and though easy to understand if explained, it does not lend itself to easy communication. Actually, there are much more effective first grade interventions than the one evaluated in France (see www.evidenceforessa.org). At a cost much less than $3500 per student, several one-to-one tutoring programs using well-trained teaching assistants as tutors would have been able to produce an effect size of more than +0.50 for all first graders on average. This would even fit on a placard: “Tutoring Now!”
I am all in favor of trying out policy innovations. But when kids in a proven-program comparison group are asked by their parents what they did in school today, they shouldn’t say, “Nuffin’.” They should say, “My tooter taught me to read. And I played with my fwends.”
Ecalle, J., Gomes, C., Auphan, P., Cros, L., & Magnan, A. (2019). Effects of policy and educational interventions intended to reduce difficulties in literacy skills in grade 1. Studies in Educational Evaluation, 61, 12-20.
This blog was developed with support from the Laura and John Arnold Foundation. The views expressed here do not necessarily reflect those of the Foundation.