Getting Below the Surface to Understand Disappointing Outcomes

Long ago, I toured West Germany, visiting some family friends near Hanover. They suggested I go see Duderstadt, a picturesque town nearby (pictured below).

My wife, Nancy, and I drove into Duderstadt and walked around. It was indeed gorgeous, but very strange. Not a person was in sight. Every shop was closed. In the center of the town was a beautiful church. We reasoned that churches are always open. We walked to the church door, and I stretched out my hand to open it, but when it was inches away, the door burst open. An entire wedding party streamed out into the street. The church was packed to the rafters with happy people, now following the bride and groom out of the church. Mystery solved.

If social scientists had come to Duderstadt when we did but failed to see the wedding, they might have drawn all sorts of false conclusions. An economist might see the empty shops and conclude that the economy of rural Germany is doomed, due to low productivity. A demographer might agree and blame this on the obviously declining workforce. But by looking just the thickness of a church door beneath the surface, all of them could have immediately understood what was happening.

My point here is a simple one. I am a quant. I believe in numbers and rigorous research designs. But at the same time, I also want to understand what is really going on, and the main numbers rarely tell the whole story.

I was thinking about this when I read the rather remarkable study by Carolyn Heinrich and her colleagues (2010), cited in my two previous blogs. Like many other researchers, she and her colleagues found near-zero impacts for Supplemental Educational Services. At the time the study took place, this was a surprise. How could all that additional instructional time after school not make a meaningful difference?

But instead of just presenting the overall (bad) findings, she poked around town, so to speak, to find out what was going on.

What she found was appalling, but also perfectly logical. Most eligible middle and high school students in Milwaukee who were offered after-school programs either failed to sign up, or if they did sign up, did not attend even a single day, or if they did attend a single day, attended irregularly thereafter. And why did they not sign up or attend? Most programs offered attractive incentives, such as iPods (very popular at the time), so about half of the eligible students did at least sign up. But after the first day, when they got their incentives, students faced drudgery. Heinrich et al. cite evidence that most instruction consisted either of teachers teaching immobile students or of students doing unsupervised worksheets. Heinrich et al.’s technical report contained a sentence (dropped in the published report) that I quoted previously and will quote again here: “One might also speculate that parents and students are, in fact, choosing rationally in not registering for or attending SES.”

A study of summer school by Borman & Dowling (2006) made a similar observation. K-1 students in Baltimore were randomly assigned to receive the opportunity to attend three years of summer school. The summer sessions ran seven weeks, six hours a day, and included 2½ hours of reading and writing instruction plus sports, art, and other enrichment activities. Most eligible students (79%) signed up and attended in the first summer, but fewer did so in the second summer (69%) and fewer still in the third (42%). The analyses focused on the students who were eligible for the first and second summers and found no impact on reading achievement. There was, however, a positive effect for the students who did show up and attend for two summers.

Many studies of summer school, after-school, and SES programs (which included both) have simply reported the disappointing outcomes without exploring why they occurred. Such reports are important, if well done, but they offer little understanding of why. Could after-school or summer school programs work better if we took into account the evidence on why they usually fail? Perhaps. For example, in my previous blog, I suggested that extended-time programs might do better if they provided one-to-one or small-group tutoring. However, there is only suggestive evidence that this might be true, and there are good reasons it might not be, because of the same attendance and motivation problems that may doom any program, no matter how good, when struggling students go to school during times when their friends are outside playing.

Econometric production function models predicting that more instruction leads to more learning are useless unless we take into account what students are actually being provided in extended-time programs and what their motivational state is likely to be. We have to look a bit below the surface to explain why disappointing outcomes are so often obtained, so we can avoid past mistakes and build on successes, rather than making the same errors over and over again.
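To make the point concrete, here is a minimal, purely illustrative sketch in Python. The per-hour learning gain and the engagement discount are invented numbers, not estimates from any study; the sign-up and attendance rates echo the Milwaukee findings described above. The point is only that a production function that ignores participation and engagement will wildly overpredict impact.

```python
# A purely illustrative "production function" sketch. The per-hour gain
# and engagement discount are invented; the sign-up and attendance rates
# echo the Milwaukee findings discussed above.

def naive_predicted_gain(hours_offered, gain_per_hour=0.002):
    """Effect size predicted if every offered hour were fully used."""
    return hours_offered * gain_per_hour

def adjusted_predicted_gain(hours_offered, signup_rate, attendance_rate,
                            engagement, gain_per_hour=0.002):
    """Effect size after discounting for sign-up, attendance, and the
    quality of the instruction actually delivered."""
    effective_hours = hours_offered * signup_rate * attendance_rate
    return effective_hours * engagement * gain_per_hour

hours = 120  # hypothetical: 2 hours per day after school, 60 days
print(naive_predicted_gain(hours))  # 0.24 -- looks worthwhile on paper
print(adjusted_predicted_gain(hours, signup_rate=0.5,
                              attendance_rate=0.34, engagement=0.5))
# ~0.02 -- essentially zero, which is what the studies actually found
```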

Correction

My recent blog, “Avoiding the Errors of Supplemental Educational Services,” started with a summary of the progress of the Learning Recovery Act. It was brought to my attention that my summary was not correct. In fact, the Learning Recovery Act has been introduced in Congress, but it is not part of the current reconciliation proposal moving through Congress and has not become law. The Congressional action cited in my last blog referred to a non-binding budget resolution, the recent passage of which facilitated the creation of the $1.9 trillion reconciliation bill that is now moving through Congress. Finally, while some funding within that reconciliation bill is expected to address the issues discussed in my blog, reconciliation rules will prevent the Learning Recovery Act from being included in the current legislation as introduced. I apologize for this error.

References

Borman, G. D., & Dowling, N. M. (2006). Longitudinal achievement effects of multiyear summer school: Evidence from the Teach Baltimore randomized field trial. Educational Evaluation and Policy Analysis, 28(1), 25–48. https://doi.org/10.3102/01623737028001025

Heinrich, C. J., Meyer, R. H., & Whitten, G. W. (2010). Supplemental Education Services under No Child Left Behind: Who signs up and what do they gain? Educational Evaluation and Policy Analysis, 32(2), 273–298.

Photo credit: Amrhingar, CC BY-SA 3.0, via Wikimedia Commons

This blog was developed with support from Arnold Ventures. The views expressed here do not necessarily reflect those of Arnold Ventures.

Note: If you would like to subscribe to Robert Slavin’s weekly blogs, just send your email address to thebee@bestevidence.org

Avoiding the Errors of Supplemental Educational Services (SES)

“The definition of insanity is doing the same thing over and over again, and expecting different results.” –Albert Einstein

Last Friday, the U.S. Senate and House of Representatives passed a $1.9 trillion recovery bill. Within it is the Learning Recovery Act (LRA). Both the overall bill and the Learning Recovery Act are timely and wonderful. In particular, the LRA emphasizes the importance of using research-based tutoring to help students who are struggling in reading or math. The linking of evidence to large-scale federal education funding began with the 2015 ESSA definition of proven educational programs, and the LRA would greatly increase the importance of evidence-based practices.

But if you sensed a “however” coming, you were right. The “however” is that the LRA requires investments of substantial funding in “school extension programs,” such as “summer school, extended day, or extended school year programs” for vulnerable students.

This is where the Einstein quote comes in. “School extension programs” sound a lot like Supplemental Educational Services (SES), part of No Child Left Behind that offered parents and children an array of services that had to be provided after school or in summer school.

The problem is, SES was a disaster. A meta-analysis of 28 studies of SES by Chappell et al. (2011) found a mean effect size of +0.04 for math and +0.02 for reading. A sophisticated study by Deke et al. (2014) found an effect size of +0.05 for math and -0.03 for reading. These effect sizes are just different flavors of zero. Zero was the outcome whichever way you looked at the evidence, with one awful exception: in the Deke et al. (2014) study, the lowest achievers and special education students actually performed significantly less well if they were in SES than if they qualified but did not sign up, with effect sizes around -0.20 for reading and math. Heinrich et al. (2010) also reported that the lowest achievers were least likely to sign up for SES, and least likely to attend regularly if they did. All three major studies found that outcomes did not vary much depending on which type of provider or program students received. Considering that the per-pupil cost was estimated at $1,725 in 2021 dollars, these outcomes are distressing. But more important is the fact that, despite the federal government’s willingness to spend quite a lot on these services, millions of struggling students in desperate need of effective assistance did not benefit.
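One way to see just how small these numbers are is to translate effect sizes into percentile terms. The quick sketch below is mine, not from any of these studies, and it assumes normally distributed test scores, a standard simplification:

```python
# Translate an effect size into the percentile that a student starting
# at a given percentile would reach, assuming normal score distributions.
from scipy.stats import norm

def percentile_after(effect_size, start_percentile=50):
    z = norm.ppf(start_percentile / 100) + effect_size
    return 100 * norm.cdf(z)

for d in (0.04, 0.02, 0.05, -0.03, -0.20):
    print(f"d = {d:+.2f}: 50th percentile -> {percentile_after(d):.1f}th")
# A d of +0.04 moves a median student to about the 51.6th percentile;
# a d of -0.20 applied at the median corresponds to about the 42nd.
```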

Why did SES fail? I have two major explanations. Heinrich et al. (2010), who added questionnaires and observations to find out what was going on, discovered that, at least in Milwaukee, attendance in SES after-school programs was appalling (as I reported in my previous blog). In the final year studied, only 16% of eligible students were attending: less than half signed up at all, and of those, average attendance in the remedial program was only 34%. Worse, the students in greatest need were least likely to attend.

From their data and other studies they cite, Heinrich et al. (2010) paint a picture of students doing boring, repetitive worksheets unrelated to what they were doing in their school-day classes. Students were enticed to sign up for SES services with incentives such as iPods, gift cards, or movie passes. Students often attended just enough to get their incentives, but then stopped coming. In 2006-2007, a new policy limited incentives to educationally related items, such as books and museum trips, and attendance dropped further. Restricting SES services to after school and summertime, when attendance is neither mandated nor anywhere near universal, meant that students who did attend were in school while their friends were out playing. This is hardly a way to engage students’ motivation to attend or to exert effort. Low-achieving students see after school and summertime as their free time, which they are unlikely to give up willingly.

Beyond the problems of attendance and motivation in extended time, there was another key problem with SES: none of the hundreds of programs offered to students was proven to be effective beforehand (or ever) in rigorous evaluations. And there was no mechanism to find out which of them were working well until very late in the program’s history. As a result, neither schools nor parents had any particular basis for selecting programs according to their likely impact. Program providers probably did their best, but there was no pressure on them to make certain that students benefited from SES services.

As I noted in my previous blog, evaluations of SES do not provide the only evidence that after-school and summer school programs rarely work for struggling students. Reviews of summer school programs by Xie et al. (2020) and of after-school programs (Dynarski et al., 2003; Kidron & Lindsay, 2014) have found similar outcomes, always for the same reasons: poor attendance and poor motivation among students in school when they would otherwise have free time.

Designing an Effective System of Services for Struggling Students

Two policies are needed to provide a system of services capable of substantially improving student achievement. One is to provide services during the ordinary school day and year, not after school or in summer school. The other is to strongly emphasize the use of programs proven to be highly effective in rigorous research.

Educational services provided during the school day are far more likely to be effective than those provided after school or in the summer. During the day, everyone expects students to be in school, including the students themselves. There are attendance problems during the regular school day, of course, especially in secondary schools, but these problems are much smaller than those in non-school time, and perhaps if students are receiving effective, personalized services in school and therefore succeeding, they might attend more regularly. Further, services during the school day are far easier to integrate with other educational services. Principals, for example, are far more likely to observe tutoring or other services if they take place during the day, and to take ownership of ensuring their effectiveness. School-day services also entail far fewer non-educational costs, as they do not require changing bus schedules, cleaning and securing schools for more hours each day, and so on.

The problem with in-school services is that they can disrupt the basic schedule. However, this need not be a problem. Schools could designate service periods for each grade level spread over the school day, so that tutors or other service providers can be continuously busy all day. Students should not be taken out of reading or math classes, but there is a strong argument that a student who is far below grade level in reading or math needs a reading or math tutor using a proven tutoring model far more than other classes, at least for a semester (the usual length of a tutoring sequence).

If schools are deeply reluctant to interrupt any of the ordinary curriculum, then they might extend their day to offer art, music, or other subjects during the after-school session. These popular subjects might attract students without incentives, especially if students have a choice of which to attend. This could create space for tutoring or other services during the school day. A schedule like this is virtually universal in Germany, which provides all sports, art, music, theater, and other activities after school, so all in-school time is available for academic instruction.

Use of proven programs makes sense throughout the school day. Tutoring should be the main focus of the Learning Recovery Act, because in this time of emergency need to help students recover from Covid school closures, nothing less will do. But in the longer term, adoption of proven classroom programs in reading, math, science, writing, and other subjects should provide a means of helping students succeed in all parts of the curriculum (see www.evidenceforessa.org).

In summer 2021, there may be a particularly strong rationale for summer school, assuming schools are otherwise able to open. The evidence is clear that doing ordinary instruction during the summer will not make much of a difference, but summer could be helpful if it is used as an opportunity to provide as many struggling students as possible with in-person, one-to-one or one-to-small-group tutoring in reading or math. In the summer, students might receive tutoring more than once a day, every day, for as long as six weeks. This could make a particularly big difference for students who essentially missed in-person kindergarten, first, or second grade, a crucial time for learning to read. Tutoring in reading is especially effective in those grades, because phonics is relatively easy for tutors to teach. Also, there is a large number of effective tutoring programs for grades K-2. Early reading failure is very important to prevent, and can be prevented with tutoring, so the summer months may be just the right time to help these students get a leg up on reading.

The Learning Recovery Act can make life-changing differences for millions of children in serious difficulty. If the LRA shifts its emphasis to the implementation of proven tutoring programs during ordinary school times, it is likely to accomplish its mission.

SES served a useful purpose in showing us what not to do. Let’s take advantage of these expensive lessons and avoid repeating the same errors. Einstein would be so proud if we heeded his advice.

References

Chappell, S., Nunnery, J., Pribesh, S., & Hager, J. (2011). A meta-analysis of Supplemental Education Services (SES) provider effects on student achievement. Journal of Education for Students Placed at Risk, 16(1), 1–23.

Deke, J., Gill, B., Dragoset, L., & Bogen, K. (2014). Effectiveness of supplemental educational services. Journal of Research on Educational Effectiveness, 7, 137–165.

Dynarski, M., et al. (2003). When schools stay open late: The national evaluation of the 21st Century Community Learning Centers program (first-year findings). Washington, DC: U.S. Department of Education.

Heinrich, C. J., Meyer, R. H., & Whitten, G. W. (2010). Supplemental Education Services under No Child Left Behind: Who signs up and what do they gain? Educational Evaluation and Policy Analysis, 32(2), 273–298.

Kidron, Y., & Lindsay, J. (2014). The effects of increased learning time on student academic and nonacademic outcomes: Findings from a meta‑analytic review (REL 2014-015). Washington, DC: U.S. Department of Education, Institute of Education Sciences, National Center for Education Evaluation and Regional Assistance, Regional Educational Laboratory Appalachia.

Xie, C., Neitzel, A., Cheung, A., & Slavin, R. E. (2020). The effects of summer programs on K-12 students’ reading and mathematics achievement: A meta-analysis. Manuscript submitted for publication.


Six Low-Cost or Free Ways to Make American Education the Best in the World


It does not take a political genius to know that for the foreseeable future, American education is not going to be rescued by a grand influx of new money. Certainly in the near term, the slow economic recovery, gridlock in Washington, and other factors mean that the path to substantial improvement in outcomes is going to be paved not with new gold, but with better use of the gold that’s already there.

No problem.

We already spend a lot of money on education. The task right now is to change how we spend federal, state, and local resources so that more money goes to programs and practices known to make a difference, rather than to investments with zero or unknown impacts on learning. Here are my top six suggestions for how to spend our education resources more effectively. (I’ll go into more detail on each of these in future blogs.)

1. Provide incentives for schools and districts to implement programs with strong evidence of effectiveness in competitive grants. In competitive grants in all parts of federal and state government, offer “competitive preference points” for applicants who promise to adopt and effectively implement programs proven to be effective. For example, schools proposing to implement programs identified as having “strong evidence of effectiveness” under the new EDGAR definitions might receive four extra points on a 100-point scale, while those meeting the criteria for “moderate evidence of effectiveness” might receive two points. Readers of this blog have seen me make this recommendation many times. Perfect example: School Improvement Grants for low-achieving schools. Cost: zero.

2. Provide incentives for schools and districts to implement programs with strong evidence of effectiveness in formula grants. The big money in federal and state education funding is in formula grants, which go to districts and schools based on, for example, levels of poverty, rather than competitive applications. The classic example is Title I. Schools have great freedom in how they use these funds, so how can they be encouraged to use them in more effective ways? The answer is to provide additional incentive funding if schools or districts commit to using proven programs with their allotted formula funds. For example, if schools agree to use a portion of their (formula-driven) Title I funds on a proven program, they may qualify for additional funds (not from the formula pot). This was the idea behind the Obey-Porter Comprehensive School Reform initiative of the late 1990s, which encouraged thousands of Title I schools to adopt whole-school reform models. Cost: This strategy could be done for perhaps 1% of the current $15 billion annual Title I budget, roughly $150 million per year.

3. Offer commitment to proven programs as an alternative to use of value-added teacher evaluation models. A central part of the current administration’s policies is incentivizing states and districts to adopt teacher evaluation plans that combine principal ratings of teachers with value-added scores based on students’ state reading and math tests. This is a required part of Race to the Top in those states that received this funding, and it is a required element of state applications for a waiver of elements of No Child Left Behind.

In practice, current teacher evaluation policies are intended to do two things. First, they insist that schools identify extremely ineffective teachers and help them find other futures. If done fairly and consistently, few oppose this aspect of teacher evaluation. Principals have evaluated teachers and identified those with serious deficits forever, and I am not arguing against continuing this type of evaluation.

The second purpose of the teacher evaluation policies is to improve teaching and learning for all teachers. This is the expensive and contentious part of the policies; in most states it requires a combination of frequent, structured observation by principals and “value-added” assessments of a given teacher’s students. The technical difficulties of both are substantial, and no study has yet shown any benefit to student learning as a result of going through the whole ordeal.

If the goal is better teaching and learning, why not require that all reform approaches meet the same evidence standards? If a school proposes to use a schoolwide strategy that (unlike current teacher evaluation policies) has strong evidence of effectiveness, the school should be permitted, even encouraged, to suspend aspects of the new evaluation model as long as it is implementing proven alternatives with fidelity and good outcomes. Cost: Modest, assuming proven programs are similar in cost to the expensive new teacher evaluation strategies.

4. Train and equip paraprofessionals as tutors. The most common expenditure of Title I funds is on paraprofessionals or aides, educators who do not usually have teaching degrees but perform all sorts of functions within schools other than class teaching. Paraprofessionals can be wonderful and capable people, but evidence in the U.S. and U.K. consistently finds that as they are most commonly used, they make little difference in student learning.

Yet there is also extensive evidence that paraprofessionals can be very effective if they are trained to provide well-structured one-to-one or one-to-small-group tutoring to students who are struggling in reading and math. Paraprofessionals are a multi-billion-dollar army, eager and able to make more of a difference. Let’s empower them to do so. Cost: Minimal (just training and materials).

5. Encourage schools to use Supplemental Educational Services (SES) funding on proven programs. As part of No Child Left Behind, Title I schools had to use substantial portions of their Title I dollars to provide Supplemental Educational Services (SES) to students in schools failing to meet standards. Study after study has found SES to be ineffective, and expenditures on SES are waning, yet SES remains a significant element of Title I funding, even in states with waivers. If districts could be encouraged to use SES funds on programs with evidence of effectiveness in improving achievement (such as training paraprofessionals and teachers to be tutors in reading and/or math), outcomes are sure to improve. Cost: Minimal.

6. Invest in research and development to identify effective uses of universal access to tablets or computers. Despite economic and political hard times, schools everywhere are moving rapidly toward providing universal, all-student access to tablets or computers. There is a lot of talk about blended learning, flipped learning, and so on, but little actual research and development is going on that could identify effective, replicable classroom strategies for making good use of these powerful tools. As it has done many times before, American education is about to spend billions on technology without first knowing which applications actually work. By setting aside a tiny percentage of the cost of the hardware and software, we could fund many innovators to create and rigorously evaluate approaches using all-student technology access, before we get stuck on ineffective solutions (again). Cost: Modest.

* * *

I’m sure there are many more ways we could shift existing funds to advance American education, but they all come down to one common recommendation: use what works. Collectively, the six strategies I’ve outlined, and others like them, could catapult American education to the top in international comparisons, greatly reduce education gaps, and prepare our students for the demands of a technological economy, all at little or no net cost, if we’re willing to also stop making ineffective investments. Moreover, all six of these prescriptions could be substantially underway in the next two years, during the remainder of the current administration. All could be done by the Department of Education alone, without congressional action. And again, I’m sure that others have many other examples of low-cost and no-cost solutions that I haven’t thought of or haven’t addressed here.

A revolution in American education does not necessarily require money, but it does require courage, leadership, and resolve. Those are resources our nation has in abundance. Let’s put them to work.

How To Do Lots of High-Quality Educational Evaluations for Peanuts


One of the greatest impediments to evidence-based reform in education is the difficulty and expense of doing large-scale randomized experiments. These are essential for several reasons. Large-scale experiments are important because when treatments are at the level of classrooms and schools, you need a lot of classrooms and schools to avoid having just a few unusual sites influence the results too much. Also, research finds that small-scale studies produce inflated effects, particularly because researchers can create special conditions on a small scale that they could not sustain on a large scale. Large experiments simulate the situations likely to exist when programs are used in the real world, not under optimal, hothouse conditions.
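A rough power calculation shows why scale matters so much. The sketch below uses the standard minimum-detectable-effect-size (MDES) approximation for a two-arm cluster-randomized trial; the intraclass correlation and school size are illustrative values I have chosen, not figures from any particular study:

```python
# Minimum detectable effect size (MDES) for a two-arm cluster-randomized
# trial, ignoring covariates. The ICC and school size are illustrative.
import math

def mdes(schools_per_arm, students_per_school, icc=0.15,
         multiplier=2.8):  # ~ (1.96 + 0.84): 80% power, alpha = .05
    var = 2 * (icc + (1 - icc) / students_per_school) / schools_per_arm
    return multiplier * math.sqrt(var)

for j in (5, 15, 30):
    print(f"{j:2d} schools per arm: MDES = {mdes(j, 100):.2f}")
# With 5 schools per arm, only effects of ~0.7 SD are detectable;
# with 30 per arm, effects around 0.25-0.30 SD come into reach.
```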

Randomized experiments are important because when schools or classrooms are assigned at random to use a program or continue to serve as a control group (doing what they were doing before), we can be confident that there are no special factors that favor the experimental group other than the program itself. Non-randomized, matched studies that are well designed can also be valid, but they have more potential for bias.

Most quantitative researchers would agree that large-scale randomized studies are preferable, but in the real world such studies done well can cost a lot – more than $10 million per study in some cases. That may be chump change in medicine, but in education, we can’t afford many such studies.

How could we do high-quality studies far less expensively? The answer is to attach studies to funding being offered by the U.S. Department of Education. That is, when the Department is about to hand out a lot of money, it should commission large-scale randomized studies to evaluate specific ways of spending these resources.

To understand what I’m proposing, consider what the Department might have done when No Child Left Behind (NCLB) required that low-performing schools offer after-school tutoring to low-achieving students, in its Supplemental Educational Services (SES) initiative. The Department might have invited proposals from established providers of tutoring services, which would have had to participate in research as a condition of special funding. It might have then chosen a set of after-school tutoring providers (I’m making these up):

Program A provides structured one-to-three tutoring.
Program B rotates children through computer, small-group, and individualized activities.
Program C provides computer-assisted instruction.
Program D offers small-group tutoring in which children who make progress get treats or free time for sports.

Now imagine that for each program, 60 qualifying schools were recruited for the studies. For the Program A study, half would get Program A and half would get the same funding to do whatever they wished (except Programs A to D), consistent with the national rules. The assignment to Program A or its control group would be at random. Programs B, C, and D would be evaluated in the same way.
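As a sketch of what the assignment step might look like (the school IDs and seed are made up; a real study would likely stratify by district or prior achievement before randomizing):

```python
# A minimal sketch of school-level random assignment for the hypothetical
# Program A study. School IDs and the seed are invented for illustration.
import random

schools = [f"school_{i:02d}" for i in range(60)]  # 60 recruited schools
rng = random.Random(2014)  # fixed seed makes the assignment auditable
rng.shuffle(schools)

treatment = sorted(schools[:30])  # offered Program A
control = sorted(schools[30:])    # same funding, business as usual

assert len(treatment) == len(control) == 30
print(treatment[:3], control[:3])
```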

Here’s why such a study would have cost peanuts. The costs of offering the program to the schools that got Programs A, B, C, or D would have been covered by Title I, as was true of all NCLB after-school tutoring programs. Further, state achievement tests, routinely collected in every state in grades 3-8, could have been obtained at pre- and posttest at little cost for data collection. The only costs would be for data management, analysis, and reporting, plus some questionnaires and/or observations to see what was actually happening in the participating classes. Peanuts.
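The analysis itself could be correspondingly simple: an ANCOVA-style regression of posttest on pretest plus a treatment indicator, run on the test scores states already collect. The sketch below uses simulated school-mean scores as stand-ins; a real analysis would also account for the clustering of students within schools (e.g., with multilevel models):

```python
# ANCOVA-style impact estimate on simulated school-mean test scores.
import numpy as np

rng = np.random.default_rng(0)
n = 60
treated = np.repeat([1, 0], n // 2)  # 30 treatment schools, 30 control
pre = rng.normal(500, 20, n)         # school-mean pretest scores
post = 0.9 * pre + 5 * treated + rng.normal(0, 8, n)  # simulated posttest

# Regress posttest on an intercept, the pretest, and treatment status.
X = np.column_stack([np.ones(n), pre, treated])
coef, *_ = np.linalg.lstsq(X, post, rcond=None)
print(f"Estimated impact: {coef[2]:.1f} score points")  # ~5, by construction
```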

Any time money is going out from the Department, such designs might be used. For example, in recent years a lot of money has been spent on School Improvement Grants (SIG), now called School Turnaround Grants. Imagine that various whole-school reform models were invited to work with many of the very low-achieving schools that received SIG grants. Schools would have been assigned at random to use Programs A, B, C, or D, or to control groups able to use the same amount of money however they wished. Again, various models could be evaluated. The costs of implementing the programs would have been provided by SIG (which was going to spend this money anyway), and the cost of data collection would have been minimal because test scores and graduation rates already being collected could have been used. Again, the costs of this evaluation would have just involved data management, analysis, and reporting. More peanuts.

Note that in such evaluations, no school gets nothing. All of them get the money. Only schools that want to sign up for the studies would be randomly assigned. Modest incentives might be necessary to get schools to participate in the research, such as a few competitive preference points in competitive proposals (such as SIG) or somewhat higher funding levels in formula grants (such as after-school tutoring). Schools that do not want to participate in the research could do what they would have done if the study had never existed.

Against the minimal cost, however, weigh the potential gain. Each U.S. Department of Education program that lends itself to this type of evaluation would produce information about how the funds could best be used. Over time, not only would we learn about specific effective programs, we’d also learn about the types of programs most likely to work. Also, additional promising programs could enter the evaluation over time, ultimately expanding the range of options for schools. Funding from the Institute of Education Sciences (IES) or Investing in Innovation (i3) might be used specifically to build up the set of promising programs for use in such federal programs and evaluations.

Ideally, the Department might continuously commission evaluations of this kind alongside any funding it provides for schools to adopt programs capable of being evaluated on existing measures. Perhaps the Department might designate an evaluation expert to sit in on early meetings to identify such opportunities, or perhaps it might fund an external “Center for Cost-Effective Evaluations in Education.”

There are many circumstances in which expensive evaluations of promising programs still need to be done, but constantly doing inexpensive studies where they are feasible might free up resources to do necessarily expensive research and development. It might also accelerate the trend toward evidence-based reform by adding a lot of evidence quickly to support (or not) programs of immediate importance to educators, to government, and to taxpayers.

Because of the central role government plays in education, and because government routinely collects a lot of data on student achievement, we could be learning a lot more from government initiatives and innovative programs. For just a little more investment, we could learn a lot about how to make the billions we spend on providing educational services a lot more effective. Very important peanuts, if you ask me.

How Government Can Support Effective Innovation in Education

There are few aspects of life more thoroughly dominated by government than education. This is particularly true of educational innovation. Innovative programs and materials do often come from the private sector, but they are adopted only if government supports them.

There are two theories of government as regards improvements in education. One emphasizes regulation. For example, many states decree which textbooks can be used. Most states use this authority just to maintain minimum standards (e.g., paper weight, accuracy, non-discriminatory language), but others use their textbook adoption authority to force specific practices, such as use of phonics or certain approaches to teaching about evolution. None, however, use textbook adoption to encourage use of proven programs, which is apparently less important than paper weight. At the national level, regulations relating to Title I, charter schools, special education, and much more are intended to drive practice in a particular direction. Again, evidence plays little role. For example, study after study has found little effect of supplemental educational services (usually after-school remediation), yet SES has continued as a part of NCLB, taking billions from schools’ Title I budgets.

The alternative theory of government-supported innovation emphasizes fostering change by setting evaluation standards and letting the private sector innovate, as well as funding much R&D directly, and then helping scale up proven approaches. In medicine, companies come up with new devices to cure diseases, and those found in rigorous research to significantly improve outcomes are taken up at all levels of medical practice, with government support. Similarly, the Air Force might specify detailed characteristics it wants in a new pilotless drone and would then find bidders capable of building such a plane, adopting particular prototypes only if the plane ultimately meets the standards.

All areas of government also use regulation to promote certain policies and practices, but in education, government almost never builds up practice from proven programs. As a result, innovation in technology, textbooks, professional development, and other areas is driven by fashions, fads, politics, and marketing, not evidence.

At long last, this is beginning to change. The U.S. Department of Education is supporting the evaluation and scale-up of proven programs in its Investing in Innovation (i3) program. Recently, the Department proposed new regulations defining “strong” and “moderate” levels of evidence supporting educational innovations. These and other developments have not yet created an evidence-based system, far from it, but they are serious starts in the right direction.

Creating and then scaling up effective practices takes time, but its advantage is that if scale-up ensures effective implementation and continued positive outcomes, evidence-based reform builds from success to success and can learn over time what works under which circumstances. In contrast, innovation by national or statewide regulation is always a massive gamble, and most evaluations of grand national or statewide policies find that they made little difference.

Government by regulation may be good for maintaining the current system, but real change in any field depends on research and development. Government policies for education need to balance regulation with innovation, evaluation, and dissemination of proven programs and practices.

What Else Could We Do With $800 Million?

Tutor students after class?
No! says every lad and lass
Yes! replies the ruling class
But will it help the children pass?

My colleague Steve Ross, writing in yesterday’s guest blog on Sputnik, refers to the noble intentions and disappointing outcomes of Supplemental Educational Services (SES). I wanted to add some additional perspectives on what we can learn from the many SES evaluations and their larger meaning for policy.

Ross notes that most SES programs would raise participating students only from the 25th to the 28th percentile, and a recent review of SES evaluations from Old Dominion University suggests the effect is even smaller. It is important to be clear that even this effect applies only to the students who were actually tutored, roughly 10-20 percent of students in most cases. So the effects of SES on the whole school were even smaller. It is entirely appropriate to focus on the students in greatest need, but SES could never have improved the achievement of entire schools to a substantial degree.
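As a back-of-the-envelope check (mine, assuming normally distributed scores), a move from the 25th to the 28th percentile corresponds to an effect size of only about +0.09 standard deviations:

```python
# Effect size implied by moving a student from the 25th to the 28th
# percentile, assuming normally distributed test scores.
from scipy.stats import norm

implied_d = norm.ppf(0.28) - norm.ppf(0.25)
print(f"Implied effect size: d = {implied_d:.2f}")  # about +0.09
```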

The lesson of SES is not “don’t do after-school tutoring.” I’m sure all of the SES providers had the best of intentions, and many of their models would succeed in rigorous evaluations if given the chance. Instead, the lesson for policy is, “focus on approaches that are proven and scalable.” At an annual cost of $800 million, SES has been using Title I funds that could have been supporting research-proven models in the school itself, rather than adding additional, hard-to-coordinate services after school. Rather than attempting to micromanage tens of thousands of Title I schools, the federal government’s responsibility is to help find out what works and then let struggling schools choose among effective options.

For $800 million, for example, more than 11,000 elementary schools could have chosen and implemented one of several proven, whole-school reform models. Proven cooperative learning models could have been implemented by 40,000 elementary and secondary schools. If they felt tutoring was what they needed, schools could have provided proven one-to-one or small-group phonics-focused tutoring to a far larger number of struggling readers using teachers or paraprofessionals already on the school staff during the school day, which would have been much more likely to be integrated with the rest of students’ instruction.

The ESEA renewal still has many steps to go through, and if there is any further consideration of continuing SES, I hope the available research and evidence will be part of that conversation.