Lessons from Innovators: Erikson Institute’s Early Math Initiative

The process of moving an educational innovation from a good idea to widespread effective implementation is far from straightforward, and no one has a magic formula for doing it. The William T. Grant and Spencer Foundations, with help from the Forum for Youth Investment, have created a community composed of grantees in the federal Investing in Innovation (i3) program to share ideas and best practices. Our Success for All program participates in this community. In this space, in partnership with the Forum for Youth Investment, I highlight observations from the experiences of i3 grantees other than our own, sharing the thinking of colleagues out on the front lines of evidence-based reform.

This blog post is based on an interview between the Forum for Youth Investment and Jennifer McCray, Project Director for the Erikson Institute’s Early Math Collaborative. Their i3 development project is entering its third year of implementation, and McCray’s team has begun to learn important lessons about how to develop an idea and move it into practice at both the teacher and school levels to ensure sustainability.

The goal of the Erikson Early Math Collaborative is to help teachers get better at teaching math to young children through intensive professional development, on-site coaching, school-based learning communities, and in-class support. Erikson’s i3 grant is supporting work in eight control and eight intervention schools. When asked about her goals for scale and sustainability, McCray, like many other i3 development project directors, is uncertain. “This gets back to the question of what our role should be,” she commented. “We know we want to help teachers and schools be better math educators. And we want to learn from what we are doing and document that learning. I’d even say we’d like to try this again in eight new schools so we can apply what we have learned. But I’m not sure that our role is to figure out how to scale this up. That is an important question, but I’m not sure we are best suited to answer it.” For Erikson and for a number of university-based i3 projects, development grants are really about learning: understanding what seems to work well and what does not, and sharing lessons with others who may be positioned to implement and potentially scale up these lessons. Some of Erikson’s initial lessons are summarized below.

Pay attention to the context of implementation

Originally, the project engaged teachers largely in isolation. That is, teachers would come to Erikson for training or to participate in learning labs. Sometimes professional development would be delivered at school sites, but little attention was paid to schools as the contexts in which the intervention was unfolding. After a year or so of implementation, the Erikson team began to see how important the individual school environment was to the success of their project. McCray muses, “We knew on paper how important the climate and culture were, but it took time to understand the processes at each school that could help our intervention thrive. For example, grade-level meetings were new to us. We were surprised by how rich and fruitful they turned out to be. We began to take more advantage of those meetings as a place to discuss teaching strategies.”

One story illustrates the value of this shift. One of their teachers was reluctant to try a piece of the intervention called “number strings,” which helps students see the relationship between numbers. “The teacher really didn’t want to do it,” recalls McCray, “but the whole grade-level team decided they were going to do it because they had identified flexible solutions to solving arithmetic problems as a priority. So the teacher did it even though she didn’t want to and was amazed at how well it worked and how much the students seemed to really like it. Now it has become a regular part of her classroom. What she learned isn’t just a little trick – it is a whole approach that she will hopefully keep using in her classroom forever.” With the grade-level team behind her, rather than just Erikson staff encouraging her to try something new, this teacher adopted a strategy that she might not have otherwise.

Think about sustainability at multiple levels

The team at Erikson knows they will need to begin to pull back next year and work out a thoughtful exit strategy from their current schools. Although coaching, training sessions at Erikson, and on-site support will continue to be major components of the project, they are thinking a lot about school-level sustainability. McCray noted, “Teachers really need more ownership of their learning process, so we are trying to create a professional learning community at the school level that institutionalizes and creates supports for teachers.” This year, Erikson will implement a new intervention – lesson study – a process of school-based, peer learning that involves teachers working together to do intensive lesson planning. McCray explains a lesson study this way: “A group of teachers starts by thinking about a unit and talking about which topics in the unit kids are not getting the way they should be. Then the teachers do research, find resources, and learn how others teach this topic. Then they plan a lesson and deliver it in front of their colleagues for feedback and discussion.” The hope is to instigate the formation of a local professional learning community that can foster school-level supports and embed reflective practice into the culture of the school.

Be prepared to prioritize

Because the Early Math Collaborative is intensive and requires a great deal of teachers’ time, Erikson knew from the outset that in order for it to be sustainable, they would have to think strategically about what is core and what can be optional. “We are aware that this is an expensive intervention,” McCray explains. “We always knew we would have to streamline for it to be sustainable beyond the i3 grant. We have learned to watch for which pieces seem to have more bang for the buck, which pieces seem to facilitate greater shifts in sustainability, and where there may be potential efficiencies. For example, we have found that when teachers have a chance to share with each other more and come together to discuss strategies, that helps with accountability and engagement more so than one-on-one coaching and PD.” Prioritizing is key.

A Purpose-Driven Government: Moneyball and Impact


In 2002, Billy Beane, General Manager of the Oakland A’s, famously used statistics to evaluate the performance of ballplayers and sign the most productive players for the least money, an approach that came to be known as “Moneyball.” John Bridgeland and Peter Orszag wonder whether government, too, can play “moneyball”; they made this argument in an article in The Atlantic over the summer. Bridgeland directed the White House Domestic Policy Council under George W. Bush, and Orszag directed the Office of Management and Budget early in the Obama administration, so they write from deep experience. In a nutshell, their point is that evidence of effectiveness currently matters very little in government funding decisions, but should matter much more. Readers of this column will not be surprised to hear me applaud this position.

However, I am concerned about one aspect of their argument: the notion that evaluations of federal funding streams are needed so that ineffective ones can be terminated. Sometimes this is true, but in reality most federal social funding supports purposes that most Americans value. Specific programs are then funded within these streams to accomplish the goals those purposes represent. In a time when many politicians are looking for things to cut to reduce taxes or deficits, it is dangerous to put everything on the evaluation operating table. Individual programs (such as “Scared Straight,” the delinquency prevention program shown to actually increase delinquency) can and should be evaluated and cut if they are ineffective. However, “reducing delinquency” is a valid purpose and a worthwhile investment of federal dollars. This purpose should not have to meet an evidence standard in the way that “Scared Straight,” or any other specific delinquency prevention program, should.

In an earlier blog, I discussed this distinction, contrasting Title I (a funding stream for the widely shared value of improving achievement in high-poverty schools) and specific uses of Title I funding (such as specific after-school tutoring programs). My argument was that a “moneyball” approach, in which rigorous research is used to determine the impacts of specific Title I expenditures, was necessary, feasible, and, in an ideal world, acceptable to politicians and taxpayers of all stripes, who share an interest in cost-effective government. However, Title I itself is a funding stream, not a specific program, and it exists because most Americans agree that schools serving children in poverty need extra help. Title I has specific rules and procedures, but at its core it provides funding for a valid purpose that probably will always exist. For this reason, giant evaluations of Title I are not valuable. Instead, we need substantial investment in development, evaluation, and dissemination of specific approaches proven to actually fulfill the good intentions of Title I funding.

The same logic would apply to all government social programs. Each has a set of outcomes that are worthy of federal investment: reducing delinquency, hunger, homelessness, unemployment, teen pregnancy, and so on. Few would argue that these goals are unimportant, and a great, modern nation should surely be investing in them. But in each case, that investment goes into many specific programs, and the effectiveness of these programs ultimately adds up to the effectiveness of the funding stream. Further, improving the evidence base for proven individual programs (and weeding out ineffective ones) is uniquely a federal role, which states and localities are unlikely to do very well. When federal R&D identifies effective social programs, that information potentially magnifies the effectiveness of social spending at all levels of government.

In the case of the original “Moneyball,” evaluating the whole of the Oakland A’s would not have made much difference; they were already being evaluated (harshly) in their league standings. Yet using statistical evaluations of current and potential players to improve the quality of players on their roster gradually improved their overall performance. In the same way, federal funding streams need to build up rosters of proven programs to improve their overall outcomes. A purpose-driven government does not cut funding for valid purposes just because they are not yet being adequately attained. Instead, it improves the programs that help achieve widely supported purposes, using good science and scaling up effective innovations.

Why Control Groups are Ethical and Necessary


A big reason that many educators don’t like to participate in experiments is that they don’t want to take a 50-50 chance of being assigned to a control group. That is, in randomized experiments, schools or teachers or students are often assigned at random to receive the innovative treatment (the experimental group) or to keep doing whatever they were doing (the control group). Educators often object, complaining that they do not think it is fair that they might be assigned to the control group.

This objection always comes up, and I’m always surprised by it. What the educators who are so concerned about being in the control group don’t seem to realize is that they are already – for all intents and purposes – in the control group, but we’re giving them the potential opportunity to receive the new services and move into the experimental group. If the coin comes up heads, they move to the experimental group; if the coin comes up tails, they simply continue doing what they’ve already been comfortably doing, perhaps for many years. Usually, schools can purchase materials and training to adopt an innovative program outside of the experiment; all the experiment typically offers is a 50-50 chance to get the treatment for free, or at a deep discount. So if schools want the treatment, and are willing to pay for it, they can get it. Ending up in the control group is not so bad, either. Incentives are usually offered to schools in the control group, so a school might receive several thousand dollars to do anything they want other than the experimental treatment. Further, many studies use a “delayed treatment” design in which the control group gets training and materials to implement the experimental program after the study is over, a year or two after the schools in the experimental group received the program. In this way, having been in the control group in the short term serves to improve the school in the long run.
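For readers who want to see the mechanics, the 50-50 coin-flip assignment described above can be sketched in a few lines of Python. This is only an illustration of the idea, not any study's actual procedure, and the school names are hypothetical:

```python
import random

def assign_schools(schools, seed=None):
    """Randomly split a list of schools 50-50 into treatment and control groups.

    A minimal sketch of the coin-flip assignment described above; in real
    experiments, assignment is often stratified so the groups are comparable.
    """
    rng = random.Random(seed)  # seeding makes the assignment reproducible/auditable
    shuffled = schools[:]      # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        "treatment": shuffled[:half],  # receives the innovative program
        "control": shuffled[half:],    # continues business as usual
    }

# Hypothetical example: four schools, split evenly at random.
groups = assign_schools(["School A", "School B", "School C", "School D"], seed=1)
```

In a "delayed treatment" design, the `control` list would simply receive the training and materials after the study period ends.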

But isn’t it unethical to deprive schools or children of effective methods? If the methods were so proven and so widely used that not to use them would truly deprive students, then this would be unethical. But every experiment has to be passed by an Institutional Review Board (IRB), usually located in a university. IRB regulations require that the control group receive a treatment that is at least “state of the art,” so that no one gets less than the current standard of best practice. The experiment is designed to test out innovations whose effectiveness has not yet been established and that are not yet standard best practice.

In fields that respect evidence, yesterday’s experimental treatment becomes the standard of practice, and thereby becomes the minimum acceptable offering for control groups. This cycle continues indefinitely, with experimental treatments being progressively compared to harder-to-beat control treatments as evidence-based practice advances. In other words, doing experiments using control groups to improve education based on evidence would put education into a virtuous cycle of innovation, evaluation, and progressive improvement like that which has transformed fields such as medicine, agriculture, and technology to the benefit of all.

Most educators would prefer not to be in the control group, but they should at least be consoled by the knowledge that control groups play an essential role in evidence-based reform. In fact, the importance of knowing whether or not new methods add to student outcomes is so great that one could argue that it is unethical not to agree to participate in experiments in which one might be assigned to the control group. In an education system offering many more opportunities to participate in research, individual schools or educators may be in control groups in some studies and experimental groups in others.

As teachers, principals, and superintendents get used to participating in experiments, they are losing some of their earlier reluctance. In particular, when educators are asked to play an active role in developing and evaluating new approaches and in choosing which experiments to volunteer for, they become more comfortable with the concept. And this comfort will enable education to join medicine, agriculture, technology, and other fields in which a respect for evidence and innovation drives rapid progress in meeting human needs.

If you like evidence-based reform in education, then be sure to tip your hat to the little-appreciated control group, without which most experiments would not be possible.

Tailoring Evidence-Based Reform to Different Problems


Not long ago, I gave a speech at the American Psychological Association’s convention in Honolulu (all right, fighting for evidence-based reform does have its pleasures). Readers of this column will not be surprised at anything I said, but I got one question that provoked some thought. My questioner wanted to know why I kept referring to such easy-to-define-and-measure problems as ensuring that children can read or understand algebra, rather than much more complex problems such as how to lead schools.

I didn’t say it at the time, but I think this is both a silly and a profound question. The silly part is its implication that if evidence can’t solve all problems, it is of little value. In fact, is there anyone out there who thinks that it is not important to identify effective and replicable approaches to teaching reading, algebra, and all the other relatively easy-to-define, easy-to-measure problems of education?

Yet solving these problems still leaves some very important but less-well-defined ones that may require different approaches. Those approaches should still be informed by evidence, though perhaps by different types of evidence than those produced by the design-build-evaluate-disseminate model that usually leads, if anything does, to proven and replicable approaches to reading or algebra.

Take leadership, which was my questioner’s example. I don’t think anyone will ever develop and evaluate a “principal protocol” for all schools, but it’s easy to imagine many innovations that could help principals be more effective, if they turned out to work in well-designed evaluations. If you break the principal’s role into its components, this is easy to see. For example, principals play a key role in teacher evaluation, and if anyone designs a teacher evaluation strategy that improves teacher performance overall, this should improve students’ performance, and that is easy to measure. Principals can play a key role in such schoolwide issues as attendance and behavior problems, and solutions for these exist and can readily be implemented and measured.

Principals play a leading role in managing resources, and the impact of each resource can have its own evidence base. Given a certain level of discretionary funding, should principals hire classroom aides, reduce class size, adopt particular programs, implement after-school programs, or purchase playground equipment? There is already evidence on most of these; the principal should be aware of this evidence and take it into account, and more evidence of this kind and better dissemination would be helpful.

Principals should be able to collaborate with staff to set goals, and then motivate and enable the staff to achieve the goals and monitor progress toward doing so. This is less cut-and-dried, but professional development for goal-setting and continuous progress toward targets is available, and the use of specific professional development models can be evaluated and replicated.

The kinds of leadership skills I suspect my questioner was thinking about are perhaps harder to measure and harder to influence. For example, positive relationships with staff, students, parents, and community leaders. Or the ability to make good decisions under pressure. Or the ability to communicate enthusiasm and high expectations for learning, or to serve as a positive role model and moral exemplar. Principals are probably more likely to learn these skills from observing their own principals and through other general life experience, but I would never rule out the possibility that professional development and coaching could build these skills as well. Research might also focus on identifying and selecting extraordinary leaders and keeping them on the job and growing in wisdom and capability over time.

Of course, if principals and school staffs chose and effectively implemented proven classroom programs, proven attendance and behavior programs, proven programs for struggling students, proven parent involvement programs, and so on, then the job of being principal would be a lot easier, and a lot more fulfilling. So rather than worrying about areas in which develop-evaluate-disseminate models don’t directly apply, it might be a good idea to use what we already know how to do, expand the range of proven approaches, and establish incentives and supports to create a school culture that respects and seeks proven solutions.

This may not solve every problem faced by every principal, but it would be a heck of a start, and then we could use different research and development methods to solve the remaining problems.

NOTE: You can obtain my APA address at http://www.youtube.com/watch?v=peiZVIJ0Gaw. I’m sorry you couldn’t be in Hawaii to hear it!