The WWC’s 25% Loophole

I am a big fan of the concept of the What Works Clearinghouse (WWC), though I have concerns about various WWC policies and practices. For example, I have written previously with concerns about WWC’s acceptance of measures made by researchers and developers and WWC’s policy against weighting effect sizes by sample sizes when computing mean effect sizes for various programs. However, there is another WWC policy that is a problem in itself, but this problem is made more serious in light of recent Department of Education guidance on the ESSA evidence standards.

The WWC Standards and Procedures 3.0 manual sets rather tough standards for programs to be rated as having positive effects in studies meeting standards “without reservations” (essentially, randomized experiments) and “with reservations” (essentially, quasi-experiments, or matched studies). However, the WWC defines a special category of studies for which all caution is thrown to the winds. Such studies are called “substantively important,” and are treated as though they met WWC standards. Quoting from Standards and Procedures 3.0: “For the WWC, effect sizes of +0.25 standard deviations or larger are considered to be substantively important…even if they might not reach statistical significance…” The “effect size greater than +0.25” loophole (the >0.25 loophole, for short) is problematic in itself, but could lead to catastrophe for the ESSA evidence standards that now identify programs that meet “strong,” “moderate,” and “promising” levels of evidence.

The problem with the >0.25 loophole is that studies that meet the loophole criterion without meeting the usual methodological criteria are usually very, very, very bad studies, usually with a strong positive bias. These studies are often very small (far too small for statistical significance). They usually use measures made by the developers or researchers, or ones that are excessively aligned with the content of the experimental group but not the control group.

One example of the >0.25 loophole is a Brady (1990) study accepted as “substantively important” by the WWC. In it, 12 students in rural Alaska were randomly assigned to Reciprocal Teaching or to a control group. The literacy treatment was built around specific science content, but the control group never saw this content. Yet one of the outcome measures, focused on this content, was made by Mr. Brady, and two others were scored by him. Mr. Brady also happened to be the teacher of the experimental group. The effect size in this awful study was an extraordinary +0.65, though outcomes in other studies assessed on measures more fair to the control group were much smaller.

Because the WWC does not weight studies by sample size, this tiny, terrible study had the same impact in the WWC summary as studies with hundreds or thousands of students.
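
To see how much weighting can matter, here is a minimal sketch comparing an unweighted mean effect size with one weighted by sample size. Only the first study (12 students, effect size +0.65) comes from the example above. The other two studies, and the function name `mean_effect_sizes`, are hypothetical and are there purely for illustration. This is not the WWC's actual computation.

```python
def mean_effect_sizes(studies):
    """Return (unweighted mean, sample-size-weighted mean) of effect sizes.

    `studies` is a list of (effect_size, n_students) pairs.
    """
    unweighted = sum(es for es, _ in studies) / len(studies)
    total_n = sum(n for _, n in studies)
    weighted = sum(es * n for es, n in studies) / total_n
    return unweighted, weighted


# (effect size, number of students)
studies = [
    (0.65, 12),    # the 12-student study described above
    (0.10, 400),   # hypothetical larger study (assumed for illustration)
    (0.05, 1500),  # hypothetical larger study (assumed for illustration)
]

unweighted, weighted = mean_effect_sizes(studies)
print(f"unweighted mean: {unweighted:+.3f}")
print(f"weighted by n:   {weighted:+.3f}")
```

With these made-up numbers, the unweighted mean lands above +0.25 while the weighted mean stays well below +0.10, because the 12-student study counts as one third of the unweighted average but well under 1% of the students.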

For the ESSA evidence standards, the >0.25 loophole can lead to serious errors. A single study meeting standards makes a program qualify for one of the top-three ESSA standards (strong, moderate, or promising). There can be financial consequences for schools using programs in the top three categories (for example, use of such programs is required for schools seeking school improvement grants). Yet a single study meeting the standards, including the awful 12-student study of Reciprocal Teaching, qualifies the program for the ESSA category, no matter what is found in all other studies (unless there are qualifying studies with negative impacts). Also, the loophole works in the negative direction too, so a small, terrible study could find an effect size less than -0.25, and no amount or quality of positive findings could make that program meet WWC standards.

The >0.25 loophole is bad enough for research that already exists, but for the future, the problem is even more serious. Program developers or commercial publishers could do many small studies of their programs or could commission studies using developer-made measures. Once a single study exceeds an effect size of +0.25, the program may be considered validated forever.
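
A rough calculation shows why running many small studies is such an effective strategy. The sketch below assumes a program with no true effect at all, two equal groups of six students (the size of the 12-student study mentioned earlier), and a normal approximation in which the standard error of the effect size is about sqrt(2 / n per group). These are my simplifying assumptions, not WWC procedure, and the function names are hypothetical. For samples this small the approximation is crude, so treat the results as ballpark figures.

```python
import math


def prob_exceeds(threshold, n_per_group, true_effect=0.0):
    """Approximate P(observed effect size > threshold) for one study.

    Assumes a normal sampling distribution with standard error
    sqrt(2 / n_per_group), ignoring the small extra term that
    depends on the size of the effect itself.
    """
    se = math.sqrt(2.0 / n_per_group)
    z = (threshold - true_effect) / se
    # Upper-tail probability of the standard normal distribution.
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def prob_at_least_one(p, k):
    """P(at least one of k independent studies exceeds the threshold)."""
    return 1.0 - (1.0 - p) ** k


p = prob_exceeds(0.25, n_per_group=6)
print(f"one study:  {p:.2f}")
for k in (3, 5, 10):
    print(f"{k} studies: {prob_at_least_one(p, k):.2f}")
```

Under these assumptions, a single tiny study of a program that does nothing has roughly a one-in-three chance of clearing +0.25 by luck alone, and with five such studies the chance that at least one clears it is better than four in five.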

To add to the problem, recent guidance from the U.S. Department of Education defines the ESSA “promising” category in a way that specifically allows programs to qualify if they can report statistically significant or substantively important outcomes. The guidance refers to the WWC standards for the “strong” and “moderate” categories, and the WWC standards themselves allow for the >0.25 loophole (even though this is not mentioned or implied by the law itself, which consistently requires statistically significant outcomes, not “substantively important” ones). In other words, programs that meet WWC standards for “positive” or “potentially positive” based on substantively important evidence alone explicitly do not meet ESSA standards, which require statistical significance. Yet the recent regulations do not recognize this problem.

The >0.25 loophole began, I’d assume, when the WWC was young and few programs met its standards. It was jokingly called the “Nothing Works Clearinghouse.” The loophole was probably added to increase the numbers of included programs. This loophole produced misleading conclusions, but since the WWC did not matter very much to educators, there were few complaints. Today, however, the WWC has greater importance because of the ESSA evidence standards.

Bad loopholes make bad laws. It is time to close this loophole, and eliminate the category of “substantively important.”

Immigrants and Evidence

My grandfather was an immigrant from Argentina, by way of Ellis Island. My three children were all adopted from Chile, so I’d experienced naturalization before. But last week, for the first time, I saw a naturalization ceremony for adults. My oldest son married a wonderful Russian woman, and she just became a U.S. citizen.

The whole experience was quite impressive. Perhaps fifty people from 18 different countries all over the globe were sworn in. The staff couldn’t have been more welcoming. They showed a video, just a slide show, showing pictures of immigrants over time. A new citizen from Mexico volunteered to read the Pledge of Allegiance—so worn by constant usage to most of us, but full of meaning and promise to this group: “…with liberty and justice for all.” Stop and think what those words must mean to immigrants from places in which these concepts do not exist. By my count, in 15 of the 18 countries from which these new citizens came, you could be arrested for criticizing the government.

In history, and up to the present, immigrants come to America for many reasons and in many circumstances, but they know for sure that the streets of America are not made of gold. For most, they are made of hard work, long hours in two or three menial jobs, not to mention cultural disruption, hardship, and all too often, discrimination. Perhaps life is materially better in America, perhaps it’s not. So why do so many come to our shores?

The answer for most: they come for their children, not for themselves. Even for children they don’t have yet. It’s the second or third generation, not the first, that most benefits from immigration. My grandfather from Argentina arrived with little education, no money, and no English. He became a sign painter. But my father, helped by the New York City Public Schools and then the GI Bill, went to college and graduate school, and became a clinical psychologist.

There are two key factors in every immigrant’s story of triumph. One is the determination of loving parents. But the second is the school. The children of immigrants who succeed in school achieve the American Dream, for themselves and for our country. That’s the way things should happen, in a country founded on an ideology of the perfectibility of mankind through the powerful impact of opportunity and education.

For all of us as educators, this is a weighty responsibility. We have to see the promise in every child, immigrant or native born, and then do our part to make that promise a reality.

As researchers, developers, publishers, principals, teachers, and citizens, the responsibility for children’s futures requires that we do whatever it takes to see that all students succeed. Using proven programs is, of course, a part of this. It’s simply not good enough to have a list of excuses to explain why we cannot help far more of our most at-risk students to succeed. Sure, innovation is hard. It takes money, time, effort, and breaking of long-established routines. Many educators would prefer to just use the textbook because it’s easy. Others would prefer to make up their own, untested approaches. But schools were not built for us educators. They were built for the kids, and we owe it to every one of them to use proven strategies with enthusiasm, care, knowledge, and skill. This means developing and validating approaches specifically for the children of immigrants, but also improving instructional practices for all students.

A school full of the children of immigrants is full of wonderful stories yet to be told, versions of the same stories of triumph we tell of our own families. We cannot do any less than we are able to do to see that these stories come to pass. Immigrants do not ask for any guarantees, for themselves or for their children, but they do ask for opportunity. Enhancing the effectiveness of our schools is the best way we have to give them that opportunity and to thereby build the nation we want. And need.

This blog is sponsored by the Laura and John Arnold Foundation

The High School Graduation Miracle

High school graduation rates have skyrocketed in recent years. From 2006 to 2013, U.S. graduation rates increased from 73% to 82%. Yet over this same time period, the National Assessment of Educational Progress (NAEP) reports that the reading and math achievement of 12th graders has not budged at all.

How can these two apparently contradictory facts be reconciled? The unavoidable conclusion is that many students who were not graduating before are graduating now, or put another way, high school graduates today have lower skills than did students just a few years ago.

I don’t know exactly why this is happening, but I have a few guesses. One is that the use of what is called “credit recovery” has increased dramatically. Credit recovery means providing students who failed a given course another opportunity to pass. Apparently these courses are much easier to pass than the initial course. For example, a July 2, 2017 article in the LA Times described a credit recovery program in which a student could raise his grade from F to C in one week during the winter break. The report followed one student, who never did any lab work, but was seen copying a food pyramid from the Internet onto a worksheet. Credit recovery courses are often offered online, in which case students can take them at home. Does this worry you? It does me.

Another possibility is that as graduation has become a focus of school accountability in many states and districts, teachers come under pressure to let marginal students pass. Unlike other accountability measures, graduation is determined by students’ grades, course credits, and other indicators that are subjective. Teachers may reason that passing such students benefits the students, the school, and themselves. So why not?

There is nothing wrong in principle with higher graduation rates, but if they are accomplished by lowering standards, then a high school diploma becomes even less valued than diplomas were in the past. This is unfair to students who work hard and pass their courses fairly, and it may contribute to cynicism throughout the system.

Further, reducing graduation standards undermines the efforts of administrators and teachers who truly want to improve student achievement as a way to improve graduation rates. If it’s a lot easier to provide credit recovery classes or to lower standards, then genuine reformers may be discouraged.

I hope there is some more optimistic explanation for the increase in high school graduation contrasted with the lack of gains in achievement. I’d love to believe that graduation rates are truly going up because of better schools and teachers, harder-working students, or other factors. Graduation is important for students, but for our society and our economy, it matters more what students can actually do. Letting students graduate without adequate skills is something we should not let pass.

Research and Development Saved Britain. Maybe They Will Save U.S. Education

One of my summer goals is to read the entire six-volume history of the Second World War by Winston Churchill. So far, I’m about halfway through the first volume, The Gathering Storm, about the period leading up to 1939.

The book is more or less a wonderfully written rant about the Allies’ shortsightedness. As Hitler built up his armaments, Britain, France, and their allies maintained a pacifist insistence on reducing theirs. Only in the mid-thirties, when war was inevitable, did Britain start investing in armaments, but even then at a very modest pace.

Churchill was a Member of Parliament but was out of government. However, he threw himself into the one thing he could do to help Britain prepare: research and development. In particular, he worked with top scientists to develop the capacity to track, identify, and shoot down enemy aircraft.

When the 1940 Battle of Britain came and German planes tried to destroy and demoralize Britain in advance of an invasion, the inventions by Churchill’s group were a key factor in defeating them.

Churchill’s story is a good analogue to the situation of education research and development. In the current environment, the best-evaluated, most effective programs are not in wide use in U.S. schools. But the research and development that creates and evaluates these programs is essential. It is useful right away in hundreds of schools that do use proven programs already. But imagine what would happen if federal, state, or local governments anywhere decided to use proven programs to combat their most important education problems at scale. Such a decision would be laudable in principle, but where would the proven programs come from? How would they generate convincing evidence of effectiveness? How would they build robust and capable organizations to provide high-quality professional development, materials, and software?

The answer is research and development, of course. Just as Churchill and his scientific colleagues had to create new technologies before Britain was willing to invest in air defenses and air superiority at scale, so American education needs to prepare for the day when government at all levels is ready to invest seriously in proven educational programs.

I once visited a secondary school near London. It’s an ordinary school now, but in 1940 it was a private girls’ school. A German plane, shot down in the Battle of Britain, crash landed near the school. The girls ran out and captured the pilot!

The girls were courageous, as was the British pilot who shot down the German plane. But the advanced systems the British had worked out and tested before the war were also important to saving Britain. In education reform we are building and testing effective programs and organizations to support them. When government decides to improve student learning nationwide, we will be ready, if investments in research and development continue.

How Is Standards-Based Instruction Standard in Classroom Practice?

Education policy, goes an old saying, is like a storm at sea. Crashing waves, thunder, lightning, and cross-currents at the surface, but 10 fathoms down, nothing ever changes.

In our time, one of the epic storms in education relates to the Common Core State Standards (CCSS) and other college- and career-ready standards that resemble CCSS. Common Core was intended to move instructional practices and student outcomes toward problem-solving, higher-order thinking, and contextualized knowledge, and away from rote learning, memorization, and formulas. We are now five years into this reform. How is it changing what teachers actually do in the classroom?

Fortunately, there is a research center working to understand this question: the Center for Standards, Alignment, Instruction, and Learning (C-SAIL) at the University of Pennsylvania. C-SAIL, funded by the Institute of Education Sciences (IES) at the U.S. Department of Education, is carrying out several studies to examine how state policies more or less aligned with Common Core or college- and career-ready standards are changing teachers’ instruction.

Recently, C-SAIL researchers Adam Edgerton, Morgan Polikoff, and Laura Desimone published an article on representative state-wide surveys of teachers in Kentucky, Ohio, and Texas. Kentucky and Ohio were early, enthusiastic adopters of Common Core, while Texas was one of just two states (with Virginia) that have always refused to adopt Common Core standards, giving up a shot at substantial Race to the Top grants for their obstinacy. However, Texas has its own standards intended to be college- and career-ready.

In all three states, teachers reported teaching many objectives emphasized by their states’ standards, but a roughly equal number of objectives de-emphasized by the standards. This was true in reading and math and in elementary and secondary schools.

C-SAIL is doing studies in which they will obtain logs and other more detailed information on daily lessons. Perhaps they will find that teachers are increasingly teaching content aligned with their state standards. But from the C-SAIL survey, it seems unlikely that these differences will be profound enough to greatly affect students’ achievement, which is, after all, the ultimate goal.

There is a lot to admire in the Common Core and other college- and career-ready standards, and perhaps the best parts are in fact changing practices. However, turning around a country the size of the U.S., with 50 states, 14,000 school districts, 120,000 public schools, and a tradition of local autonomy is no easy feat. One big problem is that issues that rise to the stormy surface are buffeted by political currents, so they often don’t last long enough to make widespread change.

C-SAIL provides a useful service in monitoring how policy changes practice, so we can at least learn from the college- and career-ready standards movement (within which Common Core plays an important part). Journalists love to write about the crashing waves at the surface of education reform, but we need independent, scientific organizations like C-SAIL to tell us what’s really happening 10 fathoms down, where the kids are.