Education Innovation and Research: Innovating Our Way to the Top

How did America get to be the wealthiest and most powerful country on Earth?

To explain, let me tell you about visiting a remote mountain village in Slovakia. I arrived in the evening, as the ancient central square filled up with people. Every man, woman, and child had a cell phone. Invented in America.

In the local hospital, I’m sure that most medicines were invented in America, which does more medical research than all other nations combined. Local farmers probably planted seeds and used methods developed in the U.S. Everywhere in the world, everyone watches American movies, listens to American music, and on and on.

America’s brand, the source of our wealth, is innovation.

America has long led the world in creating wealth by creating new ideas and putting them into practice. Technology? Medicine? Agriculture? America dominates the world in each of these fields, and many more. The reason is that America innovates, constantly finding new ways to solve problems, cure diseases, grow better crops, and generally do things less expensively. I am often at Johns Hopkins Hospital, where the halls are full of patients from every part of the globe. They come to Johns Hopkins because of its reputation for innovation.

In education, we face daunting problems, especially in educating disadvantaged students. So to solve these problems, you’d naturally expect that we’d turn to the principle that has led to our success in so many fields – innovation.

The Every Student Succeeds Act (ESSA), passed by Congress and signed into law in December 2015, has taken just this view. In it, for the first time ever, is a definition of the evidence required for a program or practice to be considered “strong,” “moderate,” or “promising.” These definitions encourage educators to adopt proven programs, but for this to work, we have to have a steady stream of newly proven innovations appearing each year. This function is fulfilled by another part of ESSA, the Education Innovation and Research (EIR) grant program. The EIR provision, which was included in ESSA with bipartisan support, provides a tiered-evidence approach to research that will constantly add to the body of programs that meet the ESSA evidence requirements. Proposals are invited for “early-phase,” “mid-phase,” and “expansion” grants to support the development, validation, and scale-up of successful innovations that originate at the state and local levels.

Based on the U.S. Department of Education’s recent EIR grant application process, it appears (as is expected from a tiered-evidence design) that many early-phase grants of up to $3 million will be made, fewer mid-phase grants of up to $8 million, and very few expansion grants of up to $15 million, all over 5 years. Anyone can apply for an early-phase grant, but applicants must already have some evidence to support their program to get a mid-phase grant, and a great deal of very rigorous evidence to apply for an expansion grant. All three types of grants require third-party evaluations – which will serve to improve programs all along the spectrum of effectiveness – but mid-phase and expansion grants require large, randomized evaluations, and expansion grants additionally require national dissemination.

The structure of EIR grants is intended to make the innovation process wide open to educators at all levels of state and local governments, non-profits, businesses, and universities. It is also designed to give applicants the freedom to suggest the nature of the program they want to create, thus allowing for a broad range of field-driven ideas that arise to meet recognized needs. EIR does encourage innovation in rural schools, which must receive at least 25% of the funding, but otherwise there is considerable freedom, drawing diverse innovators to the process.

EIR is an excellent investment. If only a few of the programs it supports end up showing positive outcomes and scaling up to serve many students across the U.S., then EIR funding will make a crucial difference to the educational success of hundreds of thousands or millions of students, improving outcomes on a scale that matters at modest cost.

EIR provides an opportunity for America to solve its education problems just as it has solved problems in many other fields: through innovation. That is what America does when it needs rapid and widespread success, as it so clearly does in education. In every subject and grade level, we can innovate our way to the top. EIR is providing the resources and structure to do it.

This blog is sponsored by the Laura and John Arnold Foundation

Good Programs? Bad Programs? Show Me the Data!

Since launching Evidence for ESSA on February 28, I’ve gotten a lot of emails. In general, the responses to the website have been very positive. However, a small minority of emails have been really angry about the entire project.

The writers of these angry emails are upset that positive ESSA evidence levels were assigned to what they considered “bad programs” and less positive ESSA evidence levels were assigned to what they considered “good programs.” Of course, in each case I explain that we are only reviewing existing evidence for demonstrated impact on students’ learning and assigning ESSA evidence levels according to the standards defined by the evidence provisions included in the Every Student Succeeds Act (ESSA), which is now the law of the land. We are not assigning ESSA evidence levels to programs based on their “goodness” or “badness” on any dimension other than impact on achievement.

Critics of Evidence for ESSA were having none of it. In their minds, “good programs” are ones that adhere to well-established principles, have been endorsed by experts, or are aligned with state or national standards. “Bad programs” are ones that, in their view, violate these standards or fail to incorporate well-supported principles.

Expert opinions and standards are important, of course, but how about effectiveness? I asked how anyone could tell if a program was good or bad unless they knew if it actually benefitted students. This did no good. “Don’t you understand?” they asked. “Such-and-such experts or so-and-so standards support these programs, so they are good.” End of story.

But adhering to principles of good practice is not at all the same as demonstrated effectiveness. To understand this, imagine a textbook that meets every standard and conforms to all current conceptions of good practice, yet teachers are given only three hours of in-service to use it. An evaluation would probably find no improvement in learning. Now imagine a program built around the very same textbook that provides a week of training, in-school coaching once a month, videos to demonstrate program elements, and so on. This program is much more likely to work. The point is, the content of a curriculum is part of what might make it effective or ineffective. The professional development and other features are also essential. So declaring a program or curriculum “good” or “bad” based on content alone is misleading.

The conversations I am having with Evidence for ESSA critics illustrate the sea change being brought about by the ESSA evidence standards. Way back in . . ., well, 2016, educational programs were largely judged according to alignment with standards, state textbook and software reviews, correspondence with expert opinion, or most often, perhaps, based on leaders’ preferences, tips from nearby districts, and appeals from sales reps. Actual proven impact on students was hardly ever involved. Today, as the ESSA evidence standards begin to be implemented, evidence of effectiveness is beginning to get some respect. This is a good thing for students, teachers, parents, and our nation, but it is deeply uncomfortable for those who have long relied on curriculum content or opinion to drive selection of educational programs. Those are the people contacting me to complain about the “bad programs” being assigned positive ESSA evidence levels, the ones that, “bad” as they may be according to some people’s opinions, actually enhance student achievement.

For many years, in school principals’ and superintendents’ offices all over America, I’ve seen the following statement proudly displayed on the wall:

“In God we trust. All others bring data.”

At long last, this saying is beginning to apply to the critical choices educators make in selecting programs, books, software, and professional development. Good programs? Bad programs? Don’t tell me your opinions. Show me the data!

This blog is sponsored by the Laura and John Arnold Foundation

Evidence and Freedom

One of the strangest arguments I hear against evidence-based reform in education is that encouraging or incentivizing schools to use programs or practices proven to work in rigorous experiments will reduce the freedom of schools to do what they think is best for their students.

Freedom? Really?

To start with, consider how much freedom schools have now. Many districts and state departments of education have elaborate 100% evidence-free processes of restricting the freedom of schools. They establish lists of approved providers of textbooks, software, and professional development, based perhaps on state curriculum standards but also on current trends, fads, political factors, and preferences of panels of educators and other citizens. Many states have textbook adoption standards that consider paper weight, attractiveness, politically correct language, and other surface factors, but never evidence of effectiveness. Federal policies specify how teachers should be evaluated, how federal dollars should be utilized, and how students should be assessed. I could go on for more pages than anyone wants to read with examples of how teachers’ and principals’ choices are constrained by district, state, and federal policies, very few of which have ever been tested in comparison to control groups. Why do schools use this textbook or that software or the other technology? Because their district or state bought it for them, trained them in its use (perhaps), and gave them no alternative.

The evidence revolution offers the possibility of freedom, if the evidence now becoming widely available is used properly. The minimum principle of evidence-based reform should be this: “If it is proven to work, you are allowed to use it.”

At bare minimum, evidence of effectiveness should work as a “get out of jail free” card to counter whatever rules, restrictions, or lists of approved materials schools have been required to follow.

But permission is not enough, because mandated, evidence-free materials, software, and professional development may eat up the resources needed to implement proven programs. So here is a slightly more radical proposition: “Whenever possible, school staffs should have the right, by majority vote of the staff, to adopt proven programs to replace current programs mandated by the district or state.”

For example, when a district or state requires use of anything, it could make the equivalent in money available to schools to use to select and implement programs proven to be effective in producing the desired outcome. If the district adopts a new algebra text or elementary science curriculum, for instance, it could allow schools to select an alternative with good evidence of effectiveness for algebra or elementary science, as long as the school agrees to implement the program with fidelity and care, achieving levels of implementation like those in the research that validated the program.

The next level of freedom to choose what works would be to provide incentives and support for schools that select proven programs and promise to implement them with fidelity.

“Schools should be able to apply for federal, state, or local funds to implement proven programs of their choice. Alternatively, they may receive competitive preference points on grants if they promise to adopt and effectively implement proven programs.”

This principle exists today in the Every Student Succeeds Act (ESSA), where schools applying for school improvement funding must select programs that meet one of three levels of evidence: strong (at least one randomized experiment with positive outcomes), moderate (at least one quasi-experimental [matched] study with positive outcomes), or promising (at least one correlational study with positive outcomes). In seven other programs in ESSA, schools applying for federal funds receive extra competitive preference points on their applications if they commit to using programs that meet one of those three levels of evidence. The principle in ESSA – that use of proven programs should be encouraged – should be expanded to all parts of government where proven programs exist.

One problem with these principles is that they depend on having many proven programs in each area from which schools can choose. At least in reading and math, grades K-12, this has been accomplished; our Evidence for ESSA website describes approximately 100 programs that meet the top three ESSA evidence standards. More than half of these meet the “strong” standard.

However, we must have a constant flow of new approaches in all subjects and grade levels. Evidence-based policy requires continuing investments in development, evaluation, and dissemination of proven programs. The Institute of Education Sciences (IES), the Investing in Innovation (i3) program, and now the Education Innovation and Research (EIR) grant program, help fulfill this function, and they need to continue to be supported in their crucial work.

So is this what freedom looks like in educational innovation? I would argue that it is. Note that I did not say that programs lacking evidence should be forbidden. Mandating use of programs, no matter how well evaluated, is a path to poor implementation and political opposition. Instead, schools should have the opportunity and the funding to adopt proven programs. If they prefer not to do so, that is their choice. But my hope and expectation is that in a political system that encourages and supports use of proven programs, educators will turn out in droves to use better programs, and the schools that might have been reluctant at first will see and emulate the success their neighbors are having.

Freedom to use proven programs should help districts, states, and the federal government have confidence that they can at long last stop trying to micromanage schools. If policymakers know that schools are making good choices and getting good results, why should they want to get in their way?

Freedom to use whatever is proven to enhance student learning. Doesn’t that have a nice ring to it? Like the Liberty Bell?

This blog is sponsored by the Laura and John Arnold Foundation

Gambling With Our Children’s Futures

I recently took a business trip to Reno and Las Vegas. I don’t gamble, but it’s important to realize that casinos don’t gamble either. A casino license is permission to make a massive amount of money, risk free.

Think of a roulette table, for example, as a glitzy random number generator. You can bet on any of 38 numbers, and if your number comes up, you get back 36 times your bet. A fair payout on a 1-in-38 chance would be 38 times your bet; the difference between 38 and 36 is the “house percentage,” about 5.3% of every dollar wagered. So as long as the wheel is spinning and people are betting, the casino is making money, no matter what the result of any particular spin is. Over the course of days, weeks, or months, that small percentage becomes big money. The same principle works in every game in the casino.
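
To see the arithmetic, here is a minimal sketch in Python (purely illustrative; the numbers come straight from the 38-slot American wheel described above):

```python
# Expected value of a $1 straight-up bet on an American roulette wheel.
# There are 38 equally likely slots (1-36, 0, 00); a winning number
# returns 36 times the bet, but a fair payout would be 38 times.
slots = 38
payout = 36                       # total returned on a win
p_win = 1 / slots

expected_return = p_win * payout  # 36/38, about 0.947
house_edge = 1 - expected_return  # 2/38, about 0.053

print(f"Expected return per $1 bet: ${expected_return:.3f}")
print(f"House edge: {house_edge:.1%} of every dollar wagered")
```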

In educational research, we use statistics much as the casinos do, though for a very different purpose. We want to know what the effect of a given program is on students’ achievement. Think of each student in an experiment as a separate spin of the roulette wheel. If you have just a few students, or a few spins, the results may seem very good or very bad, on average. But when you have hundreds or thousands of students (or spins), the averages stabilize.

In educational experiments, some students usually get an experimental program and others serve as controls. If there are few students (spins) in each group, the differences between the groups are unreliable. But as the numbers get larger, the difference between experimental and control groups becomes a reliable estimate of the program’s true effect.

This explains why educational experiments should involve large numbers of students. With small numbers, differences could be due to chance.
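
To make the role of chance concrete, here is a small simulation sketch in Python (the group sizes, the 2,000 repetitions, and the assumption of no true program effect are all invented for illustration). With 20 students per group, chance alone produces some sizable apparent “effects”; with 1,000 per group, the estimates cluster tightly around zero.

```python
import random
import statistics

def simulated_effect_size(n_per_group, true_effect=0.0):
    """Simulate one experiment and return its estimated effect size
    (difference in means divided by a rough pooled standard deviation)."""
    program = [random.gauss(true_effect, 1.0) for _ in range(n_per_group)]
    control = [random.gauss(0.0, 1.0) for _ in range(n_per_group)]
    pooled_sd = statistics.pstdev(program + control)
    return (statistics.mean(program) - statistics.mean(control)) / pooled_sd

# Repeat the "experiment" 2,000 times at each sample size, with NO true effect.
for n in (20, 100, 1000):
    effects = [simulated_effect_size(n) for _ in range(2000)]
    print(f"n per group = {n:4d}: "
          f"largest chance 'effect' = {max(effects):+.2f}, "
          f"spread (SD) of estimates = {statistics.stdev(effects):.2f}")
```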

Several years ago, I wrote an article on the relationship between sample size and effect size in educational experiments. I found that small studies (e.g., fewer than 100 students in each group) reported much larger experimental-control differences (effect sizes) than big ones did. How could this be?

What I think was going on is that in small studies, effect sizes could be very positive or very negative (favoring the control group). When positive results are found, results are published and publicized. When results go the other way? Not so much. The studies may disappear.
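
Here is a rough sketch of that selective-publication story, again in Python and again with invented numbers: every simulated study evaluates the same program, which is assumed to have a modest true effect, but only the studies with positive results are “published.” The published small studies overstate the true effect much more than the published large studies do.

```python
import random
import statistics

TRUE_EFFECT = 0.10   # assumed true effect, in standard-deviation units

def one_study(n_per_group):
    """Simulate one experiment and return its estimated effect size."""
    program = [random.gauss(TRUE_EFFECT, 1.0) for _ in range(n_per_group)]
    control = [random.gauss(0.0, 1.0) for _ in range(n_per_group)]
    pooled_sd = statistics.pstdev(program + control)
    return (statistics.mean(program) - statistics.mean(control)) / pooled_sd

for n in (30, 1000):
    effects = [one_study(n) for _ in range(2000)]
    published = [e for e in effects if e > 0]   # suppose only positive results see print
    print(f"n per group = {n:4d}: "
          f"mean of all {len(effects)} studies = {statistics.mean(effects):+.2f}, "
          f"mean of the {len(published)} 'published' studies = {statistics.mean(published):+.2f}")
```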

To understand this, go back to the casino. Imagine that you bet on 20 spins, and you make big money. You go home and tell your friends you are a genius, or you credit your lucky system or your rabbit’s foot. But if you lose your shirt on 20 spins, you probably slink home and stay quiet about the whole experience.

Now imagine that you bet on 1000 spins. It is virtually certain that you will lose money (about 2/38 of what you bet, or roughly 5.3%, because of the 0 and 00 slots). This outcome is not interesting, but it tells you exactly how the system works.

In big studies in education, we can also produce reliable measures of “how the system works” by comparing hundreds or thousands of experimental and control students.

Critics of quantitative research in education seem to think we are doing some sort of statistical mumbo-jumbo with our computers and baffling reports. But what we are doing is trying to get to the truth, with enough “spins” of the roulette wheel to even out chance factors.

Ironically, what large-scale research in education is intended to do is to diminish the role of chance in educational decisions. We want to help educators avoid gambling with their children’s futures.

This blog is sponsored by the Laura and John Arnold Foundation