How do Textbooks Fit Into Evidence-Based Reform?

In a blog I wrote recently, “Evidence, Standards, and Chicken Feathers,” I discussed my perception that states, districts, and schools, in choosing textbooks and other educational materials, put a lot of emphasis on alignment with standards, and very little on evidence of effectiveness.  My colleague Steve Ross objected, at least in the case of textbooks.  He noted that it was very difficult for a textbook to prove its effectiveness, because textbooks so closely resemble other textbooks that showing a difference between them is somewhere between difficult and impossible.  Since the great majority of classrooms use textbooks (paper or digital) or sets of reading materials that collectively resemble textbooks, the control group in any educational experiment is almost certainly also using a textbook (or equivalents).  So as evidence becomes more and more important, is it fair to hold textbooks to such a difficult standard of evidence? Steve and I had an interesting conversation about this point, so I thought I would share it with other readers of my blog.


First, let me define a couple of key words.  Most of what schools purchase could be called commodities.  These include desks, lighting, carpets, non-electronic whiteboards, playground equipment, and so on. Schools need these resources to provide students with safe, pleasant, attractive places in which to learn. I’m happy to pay taxes to ensure that every child has all of the facilities and materials they need. However, no one should expect such expenditures to make a measurable difference in achievement beyond ordinary levels.

In contrast, other expenditures are interventions.  These include teacher preparation, professional development, innovative technology, tutoring, and other services clearly intended to improve achievement beyond ordinary levels.   Educators would generally agree that such investments should be asked to justify themselves by showing their effectiveness in raising achievement scores, since that is their goal.

By analogy, hospitals invest a great deal in their physical plants, furniture, lighting, carpets, and so on. These are all necessary commodities.   No one should have to go to a hospital that is not attractive, bright, airy, comfortable, and convenient, with plenty of parking.  These things may contribute to patients’ wellness in subtle ways, but no one would expect them to make major differences in patient health.  What does make a measurable difference is the preparation and training provided to the staff, medicines, equipment, and procedures, all of which can be (and are) constantly improved through ongoing research, development, and dissemination.

So is a textbook a commodity or an intervention?  If we accept that every classroom must have a textbook or its equivalent (such as a digital text), then a textbook is a commodity, just an ordinary, basic requirement for every classroom.  We would expect textbooks-as-commodities to be well written, up-to-date, attractive, and pedagogically sensible, and, if possible, aligned with state and national standards.  But it might be unfair and perhaps futile to expect textbooks-as-commodities to significantly increase student achievement in comparison to business as usual, because they are, in effect, business as usual.

If, somehow, a print or digital textbook, with associated professional development, digital add-ons, and so forth, turns out to be significantly more effective than alternative, state-of-the-art textbooks, then a textbook could also be considered an intervention, and marketed as such.  It would then be considered in comparison to other interventions that exist only, or primarily, to increase achievement beyond ordinary levels.

The distinction between commodities and interventions would be academic but for the appearance of the ESSA evidence standards.  The ESSA law requires that schools seeking school improvement funding select and implement programs that meet one of the top three standards (strong, moderate, or promising). It gives preference points on other federal grants, especially Title II (professional development), to applicants who promise to implement proven programs. Some states have applied more stringent criteria, and some have extended use of the standards to additional funding initiatives, including state initiatives.  These are all very positive developments. However, they are making textbook publishers anxious. How are they going to meet the new standards, given that their products are not so different from others now in use?

My answer is that I do not think it was the intent of the ESSA standards to forbid schools from using textbooks that lack evidence of effectiveness. To do so would be unrealistic, as it would wipe out at least 90% of textbooks.  Instead, the purpose of the ESSA evidence standards was to encourage and incentivize the use of interventions proven to be effective.  The concept, I think, was to assume that other funding (especially state and local funds) would support the purchase of commodities, including ordinary textbooks.  In contrast, the federal role was intended to focus on interventions to boost achievement in high-poverty and low-achieving schools.  Ordinary textbooks that are no more effective than any others are clearly not appropriate for those purposes, where there is an urgent need for approaches proven to have significantly greater impacts than methods in use today.

It would be a great step forward if federal, state, and local funding intended to support major improvements in student outcomes were held to tough standards of evidence.  Such programs should be eligible for generous and strategic funding from federal, state, and local sources dedicated to the enhancement of student outcomes.  But no one should limit schools in spending their funds on attractive desks, safe and fun playground equipment, and well-written textbooks, even though these necessary commodities are unlikely to accelerate student achievement beyond current expectations.

Photo credit: Laurentius de Voltolina [Public domain]

 This blog was developed with support from the Laura and John Arnold Foundation. The views expressed here do not necessarily reflect those of the Foundation.


Nevada Places Its Bets on Evidence

In Nevada, known as the land of big bets, taking risks is what they do. The Nevada State Department of Education (NDE) is showing this in its approach to the ESSA evidence standards. Of course, many states are planning policies to encourage use of programs that meet the ESSA evidence standards, but to my knowledge, no state department of education has taken as proactive a stance in this direction as Nevada.


Under the leadership of their state superintendent, Steve Canavero, Deputy Superintendent Brett Barley, and Director of the Office of Student and School Supports Seng-Dao Keo, Nevada has taken a strong stand: Evidence is essential for our schools, they maintain, because our kids deserve the best programs we can give them.

All states are asked by ESSA to require strong, moderate, or promising programs (defined in the law) for low-achieving schools seeking school improvement funding. Nevada has made it clear to its local districts that it will enforce the federal definitions rigorously, and only approve school improvement funding for schools proposing to implement proven programs appropriate to their needs. The federal ESSA law also provides bonus points on various other applications for federal funding, and Nevada will support these provisions as well.

However, Nevada will go beyond these policies, reasoning that if evidence from rigorous evaluations is good for federal funding, why shouldn’t it be good for state funding too? For example, Nevada will require ESSA-type evidence for its own funding program for very high-poverty schools, and for schools serving many English learners. The state has a reading-by-third-grade initiative that will also require use of programs proven to be effective under the ESSA regulations. For all of the discretionary programs offered by the state, NDE will create lists of ESSA-proven supplementary programs in each area in which evidence exists.

Nevada has even taken on the holy grail: textbook adoption. It is not politically possible for the state to require that textbooks have rigorous evidence of effectiveness to be considered state approved. As in the past, texts will be state adopted if they align with state standards. However, the state list of aligned programs will now include two key pieces of information: the ESSA evidence level and the average effect size. Districts will not be required to take this information into account, but by putting it on the state adoption lists, state leaders hope to encourage district leaders to pay attention to the evidence in selecting textbooks.
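For readers unfamiliar with that second figure, here is a minimal sketch of how an "average effect size" for a program might be computed. All the numbers are hypothetical, and the computation shown (treatment-control difference in means divided by the control group's standard deviation, averaged across studies) is one common convention; the actual method Nevada uses is not specified in this post.

```python
# Hypothetical sketch: the kind of "average effect size" a state list
# might report for a textbook program. All data here are invented.

def effect_size(treat_mean, control_mean, control_sd):
    """Standardized mean difference for one study (one common convention)."""
    return (treat_mean - control_mean) / control_sd

# Invented results from three studies of one program:
# (treatment mean, control mean, control SD) on a state test scale.
studies = [
    (512.0, 505.0, 35.0),
    (498.0, 496.0, 40.0),
    (520.0, 510.0, 50.0),
]

sizes = [effect_size(t, c, sd) for t, c, sd in studies]
average = sum(sizes) / len(sizes)
print([round(es, 2) for es in sizes], round(average, 2))
```

A summary figure like this compresses a lot of information, which is exactly why listing it next to each adopted text could nudge district decisions without mandating anything.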

The Nevada focus on evidence takes courage. NDE has been deluged with concerns from districts, from vendors, and from providers of professional development services. To each, NDE has made the same response: we need to move our state toward use of programs known to work. The difficult changes to new partnerships and new materials are worth it if they provide Nevada's children better programs, which will translate into better achievement and a chance at a better life. Seng-Dao Keo describes the evidence movement in Nevada as a moral imperative: delivering proven programs to Nevada's children and then working to see that they are well implemented and actually produce the outcomes Nevada expects.

Perhaps other states are making similar plans. I certainly hope so, but it is heartening to see one state, at least, willing to use the ESSA standards as they were intended to be used, as a rationale for state and local educators not just to meet federal mandates, but to move toward use of proven programs. If other states also do this, it could drive publishers, software producers, and providers of professional development to invest in innovation and rigorous evaluation of promising approaches, as it increases use of approaches known to be effective now.

NDE is not just rolling the dice and hoping for the best. It is actively educating its district and school leaders on the benefits of evidence-based reform, and helping them make wise choices. With a proper focus on needs assessment, easy access to information, and assistance with high-quality implementation, promoting the use of proven programs should be less of a gamble and more like Nevada's Hoover Dam: a sure thing.

This blog was developed with support from the Laura and John Arnold Foundation. The views expressed here do not necessarily reflect those of the Foundation.

Photo by: Michael Karavanov [CC BY-SA 3.0 (https://creativecommons.org/licenses/by-sa/3.0)], via Wikimedia Commons

Money and Evidence

Many years ago, I spent a few days testifying in a funding equity case in Alabama. At the end of my testimony, the main lawyer for the plaintiffs drove me to the airport. “I think we’re going to win this case,” he said, “But will it help my clients?”

The lawyer’s question has haunted me ever since. In Alabama, then and now, there are enormous inequities in education funding between rich and poor districts, due to differences in property tax receipts. There are corresponding differences in student outcomes. The same is true in most states. To a greater or lesser degree, most states and the federal government provide some funding to reduce inequalities, but in most places poor districts still have to tax themselves at a higher rate to produce education funding that is significantly lower than that of their wealthier neighbors.

Funding inequities are worse than wrong; they are repugnant. When I travel in other countries and try to describe our system, it usually takes me a while to get people outside the U.S. to even understand what I am saying. “So schools in poor areas get less than those in wealthy ones? Surely that cannot be true.” In fact, it is true in the U.S., but in all of our peer countries, national or at least regional funding policies ensure basic equality in school funding, and in most cases I know about they then add additional funding on top of equalized funding for schools serving many children in poverty. For example, England has long had equal funding, and the Conservative government added “Pupil Premium” funding, in which each disadvantaged child brings additional funds to his or her school. Pupil Premium is sort of like Title I in the U.S., if you can imagine Title I adding resources on top of equal funding, which it does in only a few U.S. states.

So let’s accept the idea that funding inequity is a BAD THING. Now consider this: Would eliminating funding inequities eliminate achievement gaps in U.S. schools? This gets back to the lawyer’s question. If we somehow won a national “case” that required equalizing school funding, would the “clients” benefit?

More money for disadvantaged schools would certainly be welcome, and it would certainly create the possibility of major advances. But the impact of significant additional funding depends entirely on what schools do with the added dollars. Of course you’d have to increase teachers’ salaries and reduce class sizes to draw highly qualified teachers into disadvantaged schools. But you’d also have to spend a significant portion of new funds helping schools implement proven programs with fidelity and verve.

Again, England offers an interesting model. Twenty years ago, achievement in England was very unequal, despite equal funding. Children of immigrants from Pakistan and Bangladesh, Africans, Afro-Caribbeans, and other minorities performed well below White British children. The Labour government implemented a massive effort to change this, starting with the London Challenge and continuing with a Manchester Challenge and a Black Country Challenge in the post-industrial Midlands. Each “challenge” provided substantial professional development to school staffs, as well as organizing achievement data to show school leaders that other schools with exactly the same demographic challenges were achieving far better results.

Today, children of Pakistani and Bangladeshi immigrants are scoring at the English mean. Children of African and Afro-Caribbean immigrants are just below the English mean. Policy makers in England are now turning their attention to White working-class boys. But the persistent and substantial gaps we see as so resistant to change in the U.S. are essentially gone in England.

Today, we are getting even smarter about how to turn dollars into enhanced achievement, due to investments by the Institute of Education Sciences (IES) and Investing in Innovation (i3) program in the U.S. and the Education Endowment Foundation (EEF) in England. In both countries, however, we lack the funding to put into place what we know how to do on a large enough scale to matter, but this need not always be the case.

Funding matters. No one can make chicken soup out of chicken feathers, as we say in Baltimore. But funding in itself will not solve our achievement gap. Funding needs to be spent on specific, high-impact investments to make a big difference.

Accelerating the Pace of Innovation

The biggest problem in evidence-based reform in education is that there are too few replicable programs with strong evidence of effectiveness available to educators. The evidence provisions of the Every Student Succeeds Act (ESSA) encourage the use of programs that have strong, moderate, or promising evidence of effectiveness, and they require School Improvement efforts (formerly SIG) to include approaches with evidence that meets these definitions. There are significant numbers of programs that do meet these definitions, but not enough to give educators multiple choices of proven programs for each subject and grade level. The Institute of Education Sciences (IES), the Investing in Innovation (i3) program, the National Science Foundation (NSF), and England’s Education Endowment Foundation (EEF) have all been supporting rigorous evaluations of replicable programs at all levels, and this work (and work funded by others) is progressively enriching the offerings of programs that are both proven to be effective and ready for widespread dissemination. However, progress is slow. The large-scale randomized experiments demanded by these funders are expensive and may take many years to complete. As in any scientific field (such as medicine), most experiments do not show positive outcomes for innovative treatments. At a time when demand is starting to pick up, the supply needs to keep pace.

Given that money is not being thrown at education research by Congress or other funders, how can promising innovations be evaluated, made ready for dissemination, and taken to scale? First, existing funders need to be supported adequately to continue the good work they are doing. Grants for Education Innovation and Research (EIR) will pick up where i3 ends, and IES needs to maintain its leadership in supporting development and evaluation of promising programs in all subjects and grade levels. The National Science Foundation should invest far more in creating, evaluating, and disseminating proven STEM approaches. All of this work, in fact, is in need of increased funding and publicity to build political and public support for the entire enterprise.

However, there are several additional avenues that might be pursued to increase the number of proven, ready-to-disseminate approaches. One promising model is low-cost randomized evaluations of interventions supported by government or other funding. Both IES and the Laura and John Arnold Foundation are offering support for such studies. For example, imagine that a school district is introducing a digital textbook to its schools but can only afford to provide the program to 30 schools each year. If the district finds 60 schools willing to receive the program and randomly assigns half of them to start in a given year, then it is spending no more on digital textbooks than it planned to spend. If state test scores can be obtained and used as pre- and post-tests, then the measurement costs nothing. The only costs of studying the effects of the digital textbooks might be the costs of data analysis, perhaps some questionnaires or observations to find out what schools did with the digital textbooks, and a report. Such a study would be very inexpensive, might produce results within a year or two, and would be evaluating something that is appealing to schools and ready to go.
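The design just described can be sketched in a few lines of code. This is a simulation with invented numbers, not an analysis of real data: 60 volunteer schools, half randomly assigned to receive the digital textbook, with state test scores serving as pretest and posttest. A real study would typically use an ANCOVA adjusting posttests for pretests; simple gain scores are used here to keep the sketch short.

```python
# Minimal simulation of the low-cost randomized design described above.
# All numbers (score scale, assumed program effect) are invented.
import random
import statistics

random.seed(1)

# 60 volunteer schools; randomly assign 30 to start in year one.
schools = list(range(60))
random.shuffle(schools)
treatment, control = set(schools[:30]), set(schools[30:])

# Simulated school-level state test means: (pretest, posttest).
data = {}
for s in range(60):
    pre = random.gauss(500, 30)
    gain = 12 if s in treatment else 5   # assumed true program effect
    data[s] = (pre, pre + random.gauss(gain, 10))

# Compare average gains; a real analysis would use ANCOVA instead.
t_gain = statistics.mean(data[s][1] - data[s][0] for s in treatment)
c_gain = statistics.mean(data[s][1] - data[s][0] for s in control)
c_sd = statistics.stdev(data[s][1] - data[s][0] for s in control)
print(f"estimated effect size: {(t_gain - c_gain) / c_sd:.2f}")
```

The point of the sketch is the economics, not the statistics: the random assignment costs nothing, the outcome measure (state tests) already exists, so the only new expense is the analysis itself.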

Beyond these existing strategies, others might be considered to speed up the proven programs process. One example might be to build on Small Business Innovation Research (SBIR) grants. At $1 million over two years, these grants, limited to for-profit companies, are often too small to develop and evaluate promising approaches (usually, technology applications). IES or other funders might proactively look for promising SBIR projects and encourage them to apply for larger funding to complete development and do rigorous evaluations. One advantage of SBIRs is that they are usually created by small, ambitious, undercapitalized companies, which are motivated to take their programs to scale.

Another strategy that might work could be to fund “aggregators” whose job would be to identify promising approaches from any source, help assemble partnerships if necessary, and then help prepare applications for funding. This could help young innovators with great ideas combine their efforts, create more complete and powerful innovations, and subject them to rigorous evaluations. In addition to SBIR-funded projects, promising program elements might be found in projects funded by private foundations or agencies outside of education. They might be components of IES or i3 projects that produced promising but not conclusive outcomes in their evaluations, perhaps due to insufficient sample size. Aggregators might link up programs with broad reach but limited technology with brash technology start-ups in need of access to markets. If the goal is finding promising but incomplete efforts and helping them reach effectiveness and scale, every source should be fair game.

Government has made extraordinary progress in promoting the development, rigorous evaluation, and scale-up of proven programs. However, its success has led to a demand for proven programs that it cannot fulfill at the usual pace. Current grant programs at IES and i3/EIR should continue, but in addition we need innovative strategies capable of greatly accelerating the pace of development, evaluation, and scale up.

Educationists and Economists

I used to work part time in England, and I’ve traveled around the world a good bit speaking about evidence-based reform in education and related topics. One of the things I find striking in country after country is that at the higher levels, education is not run by educators. It is run by economists.

In the U.S., this is also true, though it’s somewhat less obvious. The main committees in Congress that deal with education are the House Education and the Workforce Committee and the Senate Health, Education, Labor, and Pensions (HELP) Committee. Did you notice the words “workforce” and “labor”? That’s economists. Further, politicians listen to economists, because they consider them tough-minded, data-driven, and fact-friendly. Economists see education as contributing to the quality of the workforce, now and in the future, and this makes them influential with politicians.

A lot of the policy prescriptions that get widely discussed and implemented broadly are the sorts of things economists love to dream up. For example, they are partial to market incentives, new forms of governance, rewards and punishments, and social impact bonds. Individual economists, and the politicians who listen to them, take diverse positions on these policies, but the point is that economists rather than educators often set the terms of the debates on both sides. As one example, educators have been talking about long-term impacts of quality preschool for 30 years, but when Nobel Prize-winning economist James Heckman took up the call, preschool became a top priority of the Obama Administration.

I have nothing against economists. Some of my best friends are economists. But here is why I am bringing them up.

Evidence-based reform is creating a link that did not exist before between educationists and economists, and thereby to the politicians who listen to them. Evidence-based reform speaks the language that economists insist on: randomized evaluations of replicable programs and practices. When an educator develops a program, successfully evaluates it at scale, and shows it can be replicated, this gives economists a tangible tool that they can show will make a difference in policy. Other research designs are simply not as respected or accepted. But an economist with a proven program in hand has a measurable, powerful means to affect policy and help politicians make wise use of resources.

If we want educational innovation and research to matter to public policy, we have to speak truth to power, in the language of power. And that language is increasingly the language of rigorous evidence. If we keep speaking it, our friends the economists will finally take evidence from educational research seriously, and that is how policy will change to improve outcomes for children on a grand scale.

R&D That Makes a Difference

Over the course of my career, I’ve written a lot of proposals. I’ve also reviewed a lot, and mostly I’ve seen funded projects crash and burn, or produce a scholarly article or two that is never heard of again.

As evidence becomes more important in educational policy and practice, I think it’s time to rethink the whole process of funding for development, evaluation, and dissemination.

Here’s how the process works now at the federal level. The feds put out a Request for Proposals (RFP) in the Federal Register. It specifies the purpose of the grant, who is eligible, funding available, deadlines, and most importantly, the criteria on which the proposals will be judged. Proposal writers know that they must follow those criteria very carefully to make it easy for readers to know that each criterion has been satisfied.

The problem with the whole proposal system lies in the perception that each proposal starts with a perfect score (usually 100), and is then marked down for any deficiencies. To oversimplify, reviewers nitpick, and if there is much left after the nits have been picked, the proposal wins.

What this system rewards is enormous care and OCD-level attention to detail. It does not reward creativity, risk, insight, or actual utility for schools. Yet funding grants that do not move forward practice at any significant scale do not do much good in an applied field like education (in related fields such as psychology, purely basic research might justify such approaches, but in education this is a hard argument to make). Maybe our collective inability to do research that affects practice on a broad scale explains some of the lack of enthusiasm our political leadership has for research.

So what would I propose as an alternative? I’m so glad you asked. I’d propose that RFPs be explicitly structured to ask not, “Why shouldn’t we fund this proposal,” but, “Why should we?” That is, proposal writers should be asked to make a case for the potential importance of their work. Here’s a model set of evaluation standards to illustrate what I mean.

A. Significance
1. What are you planning to create?
2. What national problem does your proposed program potentially solve?
3. What outcomes do you expect to achieve, and why are these important?
4. Based on prior research by yourself and others, what is the likelihood that your program will produce the outcomes you expect?
5. What is the likelihood that, if your program is successful, it will work on a significant scale? What is your experience with working at scale or scaling up proven programs in educational settings?
6. In what way is your program creative or distinctive? How might it spark new thinking or development to solve longstanding problems in education?

B. Capabilities
1. Describe the organizational capabilities of the partners to this proposal, as well as the capabilities of the project leadership. Consider capabilities in the following areas:
a. development
b. roll-out, piloting
c. evaluation
d. reporting
e. scale-up
f. communications, marketing
2. Timelines, milestones

C. Evaluation
1. Research questions
2. Design, analysis

D. Impact
Given all you’ve written so far, summarize in one page why this project will make a substantial difference in educational practice and policy.

If we want research and development to produce useful solutions to educational problems, we have to ask the field for just that, and reward those able to produce, evaluate, and disseminate such solutions. Ironically, the federal funding stream closest to the ideal I’ve described is the Investing in Innovation (i3) program, which Congress may be about to shut down. i3 is at least focused on pragmatic solutions rather than theory-building and it has high standards of evidence. But if i3 survives or if it is replaced by another initiative to support innovation, development, evaluation, and scale-up of proven programs, I’d argue that it needs to focus even more on pragmatic issues of effectiveness and scale. Reviewers should be exclaiming, “I get it!” rather than “I gotcha!”

The Evidence or The Morgue

Many years ago, when I was a special education teacher, I had a summer job at a residential school for emotionally disturbed children. The school happened to be located in a former tuberculosis sanitarium. Later, I heard from other teachers who had worked in schools housed in one-time sanitaria as well.

How did it come about, one might ask, that tuberculosis sanitaria across the country became available for use as schools? The answer is that researchers cured the disease. The sanitaria were no longer needed for their original purpose, so they were turned into schools.

One feature of the former sanitaria is that they all had morgues. We used ours to store curriculum materials, because it had very sturdy and useful sliding horizontal cabinets. This arrangement led to a certain amount of macabre humor, but the morgue reminded us that what the sanitaria had once done was deadly serious indeed.

I was recalling my summer in the sanitarium after reading about the latest developments in the reauthorization of ESEA. Both the House and the Senate have now passed bills that eliminate the Investing in Innovation (i3) program and cut funding for the Institute of Education Sciences. In their place, the bills have a lot of language about state and local control, and about identifying and publicizing individual schools that are doing a particularly good job so their good works can help inspire and influence other schools. None of this would bother me if the legislation contained a clear commitment to rigorous research, development, and dissemination, but this may or may not be the case.

The Senate bill, which passed with bipartisan support last week, does authorize an evidence-based innovation fund. Modeled on the successful Small Business Innovation Research (SBIR) program, which funds innovation and evaluation in 11 different government agencies, this initiative would provide flexible funding for a broad range of field-driven projects and allow schools, districts, non-profits, and small businesses to develop and grow innovative programs to improve student achievement. Grants would be awarded using a tiered evidence framework based on an applicant’s proven effectiveness. The provision was initially offered and accepted as a bipartisan amendment during the Senate HELP Committee markup of its bill. However, the House bill has no comparable provision, and I have to wonder if the Senate provision will survive the grueling conference process and make it into the final bill.

Try to imagine what would have happened if tuberculosis research had been treated the way education research has been treated in the House version of the ESEA reauthorization bill. Individual sanitaria with lower death rates might be recognized. States and localities might try out ideas to make the sanitaria more effective, but few if any states or localities would be large enough to do the necessary sustained R & D. “Best practices” would be constrained by the current system, so they might involve better ways for sanitarium staff to do exercises with patients, for example, rather than experimenting with medications or other treatments. The disease would never have been cured. The morgues would still be used for unfortunate patients, not for curriculum materials.

The U.S. spends hundreds of billions of dollars every year on education. What student, parent, teacher, administrator, or policy maker does not want those billions used to make as much of a difference as possible? The pursuit of knowledge about how to improve educational outcomes is obviously important, but it is rarely very high on anyone’s priority list.

Fortunately, medicine and other fields long ago decided that research was in the national interest, and that investments in research were the most reliable way forward in improving important outcomes. In medicine, the choice is stark: either the evidence prevails or the morgue does. Yet in education, anyone with eyes to see knows what happens when children fail to learn. Most of the children who cannot read end up unemployed. Many end up in prison, and all too many in the morgue. We know enough now to be able to say that the great majority of reading failure, for example, is preventable. Yet we choose not to prevent it. What does this say about us as a people, as a society, as a political system?

I hope our leaders in Congress approve the Senate language on evidence, or something similar, and reinstate and fund programs that have the greatest promise in identifying and disseminating effective approaches to key problems. The lives of a generation of vulnerable children depend on their wisdom and courage at this critical juncture.