Preschool is Not Magic. Here’s What Is.

If there is one thing that everyone knows about policy-relevant research in education, it is this: Participation in high-quality preschool programs (at age 4) has substantial and lasting effects on students’ academic and life success, especially for students from disadvantaged homes. The main basis for this belief is the findings of the famous Perry Preschool program, which randomly assigned 128 disadvantaged youngsters in Ypsilanti, Michigan, to receive intensive preschool services or not to receive these services. The Perry Preschool study found positive effects at the end of preschool, and long-term positive impacts on outcomes such as high school graduation, dependence on welfare, arrest rates, and employment (Schweinhart, Barnes, & Weikart, 1993).

But prepare to be disappointed.

Recently, a new study reported a very disappointing set of outcomes. Lipsey, Farran, and Durkin (2018) published a large, randomized evaluation of Tennessee’s statewide preschool program, in which 2,990 four-year-olds were randomly assigned either to participate in preschool or not. As in virtually all preschool studies, children who were randomly assigned to preschool scored much better at the end of the preschool year than those assigned to the control group. But these gains diminished in kindergarten, and by first grade no positive effects could be detected. By third grade, the control group actually scored significantly higher than the former preschool students in math and science, and non-significantly higher in reading!

Jon Baron of the Laura and John Arnold Foundation wrote an insightful commentary on this study, noting that when such a large, well-done, long-term, randomized study is reported, we have to take the results seriously, even if they disagree with our most cherished beliefs. At the end of Baron’s brief summary was a commentary by Dale Farran and Mark Lipsey, two of the study’s authors, telling the story of the hostile reception their paper received in the early childhood research community and the difficulties they had getting this exemplary experiment published.

Clearly, the Tennessee study was a major disappointment. How could preschool have no lasting effects for disadvantaged children?

Having participated in several research reviews on this topic (e.g., Chambers, Cheung, & Slavin, 2016), as well as some studies of my own, I have several observations to make.

Although this may have been the first large, randomized evaluation of a state-funded preschool program in the U.S., there have been many related studies with the same results. These include a large, randomized study of 5,000 children assigned to Head Start or not (Puma et al., 2010), which also found positive outcomes at the end of the pre-k year, but only scattered lasting effects afterwards. Very similar results (positive pre-k outcomes with little or no lasting impact) have been found in a randomized evaluation of a national program in England called Sure Start (Melhuish, Belsky, & Leyland, 2010), and in one in Australia (Claessens & Garrett, 2014).

Ironically, the Perry Preschool study itself failed to find lasting impacts until students were in high school. That is, its outcomes were similar to those of the Tennessee, Head Start, Sure Start, and Australian studies for the first twelve years of the study. So I suppose it is possible that someday the participants in the Tennessee study will show a major benefit of having attended preschool. However, this seems highly doubtful.

It is important to note that some large studies of preschool attendance do find positive and lasting effects. However, these are invariably matched, non-experimental studies of children who happened to attend preschool, compared to others who did not. The problem with such studies is that it is essentially impossible to statistically control for all the factors that would lead parents to enroll their child in preschool, or not to do so. So lasting effects of preschool may just be lasting effects of having the good fortune to be born into the sort of family that would enroll its children in preschool.

What Should We Do if Preschool is Not Magic?

Let’s accept for the moment the hard (and likely) reality that one year of preschool is not magic, and is unlikely to have lasting effects of the kind reported by the Perry Preschool study (and by no other randomized study). Do we give up?

No. I would argue that rather than considering preschool magic-or-nothing, we should think of it the same way we think about any other grade in school. That is, a successful school experience should be not one terrific year, but fourteen years (pre-k to 12) of great instruction using proven programs and practices.

First comes the preschool year itself, or the two year period including pre-k and kindergarten. There are many programs that have been shown in randomized studies to be successful over that time span, in comparison to control groups of children who are also in school (see Chambers, Cheung, & Slavin, 2016). Then comes reading instruction in grades K-1, where randomized studies have also validated many whole-class, small group, and one-to-one tutoring methods (Inns et al., 2018). And so on. There are programs proven to be effective in randomized experiments, at least for reading and math, for every grade level, pre-k to 12.

The time has long passed since all we had in our magic hat was preschool. We now have quite a lot. If we improve our schools one grade at a time and one subject at a time, we can see accumulating gains, ones that do not require waiting for miracles. And then we can work steadily toward improving what we can offer children every year, in every subject, in every type of school.

No one ever built a cathedral by waving a wand. Instead, magnificent cathedrals are built one stone at a time. In the same way, we can build a solid structure of learning using proven programs every year.

References

Baron, J. (2018). Large randomized controlled trial finds state pre-k program has adverse effects on academic achievement. Straight Talk on Evidence. Retrieved from http://www.straighttalkonevidence.org/2018/07/16/large-randomized-controlled-trial-finds-state-pre-k-program-has-adverse-effects-on-academic-achievement/

Chambers, B., Cheung, A., & Slavin, R. (2016). Literacy and language outcomes of balanced and developmental-constructivist approaches to early childhood education: A systematic review. Educational Research Review, 18, 88-111.

Claessens, A., & Garrett, R. (2014). The role of early childhood settings for 4-5 year old children in early academic skills and later achievement in Australia. Early Childhood Research Quarterly, 29(4), 550-561.

Inns, A., Lake, C., Pellegrini, M., & Slavin, R. (2018). Effective programs for struggling readers: A best-evidence synthesis. Paper presented at the annual meeting of the Society for Research on Educational Effectiveness, Washington, DC.

Lipsey, M. W., Farran, D. C., & Durkin, K. (2018). Effects of the Tennessee Prekindergarten Program on children’s achievement and behavior through third grade. Early Childhood Research Quarterly. https://doi.org/10.1016/j.ecresq.2018.03.005

Melhuish, E., Belsky, J., & Leyland, R. (2010). The impact of Sure Start local programmes on five year olds and their families. London: Jessica Kingsley.

Puma, M., Bell, S., Cook, R., & Heid, C. (2010). Head Start impact study: Final report.  Washington, DC: U.S. Department of Health and Human Services.

Schweinhart, L. J., Barnes, H. V., & Weikart, D. P. (1993). Significant benefits: The High/Scope Perry Preschool study through age 27 (Monographs of the High/Scope Educational Research Foundation No. 10). Ypsilanti, MI: High/Scope Press.

This blog was developed with support from the Laura and John Arnold Foundation. The views expressed here do not necessarily reflect those of the Foundation.

Little Sleepers: Long-Term Effects of Preschool

In education research, a “sleeper effect” is not a way to get all of your preschoolers to take naps. Instead, it is an outcome of a program that appears not immediately after the end of the program, but some time afterwards, usually a year or more. For example, the mother of all sleeper effects was the Perry Preschool study, which found positive outcomes at the end of preschool but no differences throughout elementary school. Then positive follow-up outcomes began to show up on a variety of important measures in high school and beyond.

Sleeper effects are very rare in education research. To see why, consider a study of a math program for third graders that found no differences between program and control students at the end of third grade, but then a large and significant difference popped up in fourth grade or later. Long-term effects of effective programs are often seen, but how can there be long-term effects if there are no short-term effects on the way? Sleeper effects are so rare that many early childhood researchers have serious doubts about the validity of the long-term Perry Preschool findings.

I was thinking about sleeper effects recently because we have just added preschool studies to our Evidence for ESSA website. In reviewing the key studies, I found myself once again reading an extraordinary 2009 study by Mark Lipsey and Dale Farran.

The study randomly assigned Head Start classes in rural Tennessee to one of three conditions. Some were assigned to use a program called Bright Beginnings, which had a strong pre-literacy focus. Some were assigned to use Creative Curriculum, a popular constructivist/developmental curriculum with little emphasis on literacy. The remainder were assigned to a control group, in which teachers used whatever methods they ordinarily used.

Note that this design is different from the usual preschool studies frequently reported in the newspaper, which compare preschool to no preschool. In this study, all students were in preschool. What differed is only how they were taught.

The results immediately after the preschool program were not astonishing. Bright Beginnings students scored best on literacy and language measures (average effect size = +0.21 for literacy, +0.11 for language), though the differences were not significant at the school level. There were no differences at all between Creative Curriculum and control schools.
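For readers less familiar with the metric: an effect size here is a standardized mean difference (Cohen’s d), the treatment-control difference in mean scores divided by the pooled standard deviation, so +0.21 means the treatment group scored about a fifth of a standard deviation higher than controls. A minimal sketch in Python, using made-up scores rather than any data from the study:

```python
import statistics

def cohens_d(treatment, control):
    """Standardized mean difference (Cohen's d with pooled sample SD)."""
    m_t = statistics.mean(treatment)
    m_c = statistics.mean(control)
    sd_t = statistics.stdev(treatment)  # sample SD (n-1 denominator)
    sd_c = statistics.stdev(control)
    n_t, n_c = len(treatment), len(control)
    # Pooled SD weights each group's variance by its degrees of freedom.
    pooled_sd = (((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2)
                 / (n_t + n_c - 2)) ** 0.5
    return (m_t - m_c) / pooled_sd

# Illustrative only: a treatment mean 1 point higher with a pooled SD
# of 2.0 yields d = +0.5.
```

By this yardstick, effects in the +0.10 to +0.30 range are modest in statistical terms but can be educationally meaningful when achieved at scale.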

Where the outcomes became interesting was in the later years. Ordinarily in education research, outcomes measured after a treatment has ended diminish over time. In the Bright Beginnings/Creative Curriculum study, the outcomes were measured again when students were in third grade, four years after they left preschool. Most students could be located, because the outcome measure was the Tennessee standardized test, so scores could be found as long as students remained in Tennessee schools.

On third grade reading, former Bright Beginnings students scored significantly better than former controls, and the difference was substantial (effect size = +0.27).

In a review of early childhood programs at www.bestevidence.org, our team found that across 16 programs emphasizing literacy as well as language, effect sizes did not diminish in literacy at the end of kindergarten, and they actually doubled on language measures (from +0.08 in preschool to +0.15 in kindergarten).

If sleeper effects (or at least maintenance on follow-up) are so rare in education research, why did they appear in these studies of preschool? There are several possibilities.

The most likely explanation is that it is difficult to measure outcomes among four-year-olds. They can be squirrelly and inconsistent. If a pre-kindergarten program had a true and substantial impact on children’s literacy or language, measures at the end of preschool may not detect it as well as measures a year later, because kindergartners, and kindergarten skills, are easier to measure.

Whatever the reason, the evidence suggests that effects of particular preschool approaches may show up later than the end of preschool. This observation, and specifically the Bright Beginnings evaluation, may indicate that in the long run it matters a great deal how students are taught in preschool. Until we find replicable models of preschool, or pre-k to 3 interventions, that have long-term effects on reading and other outcomes, we cannot sleep. Our little sleepers are counting on us to ensure them a positive future.

This blog is sponsored by the Laura and John Arnold Foundation.

You Can Step Twice in the Same River: Systems in Education

You can never step twice in the same river. At least that is what the Greek philosopher Heraclitus said a long time ago, when Socrates was just a pup. What he meant, of course, was that a river is constantly changing, for reasons large and small, so the river you waded across yesterday, or even a minute ago, is not the same one you wade in now.

This proposition is both obvious and wrong.  Sure, rivers are never 100% the same.  But does it matter?  Imagine, for example, that you somehow drained all the water out of a river.  Within a few days or weeks, it would entirely revive itself.  The reason is that a river is not a “thing.”  It is a system.  In other words, a river exists because there is a certain level of rainfall or groundwater or water from upstream, and then a certain topography (rivers are in low-lying areas, compared to surrounding land).  Those factors create the river, and as long as they exist, the river exists.  So when you wade into a river, you are wading into a system, and (sorry, Heraclitus) it is always the same system, because even if the river is higher or lower or muddier or clearer than usual, the system is always the same, unless something pretty dramatic happens upstream.

So why am I rattling on about rivers?  The point I hope to make is that genuine and lasting change in a school depends on changing the system in which the school operates, not just small parts of the school that will be swept away if the system stays unchanged.

Here’s what I mean from an education reform perspective.  Teachers’ daily practices in classrooms are substantially determined by powerful systems.  Whatever innovations you introduce in a school, no matter how effective in the short term, will be eliminated and forgotten if the rest of the system does not change.  For example, if a school implements a great new math program but does not solve classroom management or attendance problems, the school may not maintain its math reform.  Lasting change in math, for example, might require attending to diversity in achievement levels by providing effective tutoring or small-group assistance.  It might require providing eyeglasses to children who need them.  It might require improving reading performance as well as math.  It might require involving parents.  It might require constant monitoring of students’ math performance and targeted responses to solve problems.  It might require recruiting volunteers, or making good use of after school or summer time.  It might require mobilizing department heads or other math leaders within the school to support implementation, and to help maintain the effective program when (predictable) turmoil threatens it.  Policy changes at the district, state, and national levels may also help, but I’m just focusing for the moment on aspects of the system that an individual school or district can implement on its own.  Attending to all of these factors at once may increase the chances that in five or ten years, the effective program remains in place and stays effective, even if the original principal, department head, teachers, and special funds are no longer at the school.

It’s not that every school has to do all of these things to improve math performance over time, but I would argue that lasting impact will depend on some constellation of supports that change the system in which the math reform operates.  Otherwise, the longstanding system of the school will return, washing away the reform and taking the school back to its pre-reform behaviors and policies.

A problem in all of this is that educational development and research often work against systemic change.  In particular, academic researchers are rewarded for publishing articles, and it helps if they evaluate approaches that purely represent a given theory.  Pragmatically, an approach with many components may be more expensive and more difficult to put in place.  As a result, a lot of proven programs available to educators are narrow, focused on the main objective but not on the broader system of the school.  This may be fine in the short run, but in the long run the narrowly focused treatment may not maintain over time.

Seen as a system, a river will never change its course until the key elements that determine its course themselves change.  Unless that happens, we’ll always be stepping into the same river, over and over again, and getting the same results.

Keep Up the Good Work (To Keep Up the Good Outcomes)

I just read an outstanding study that contains a hard but crucially important lesson. The study, by Woodbridge et al. (2014), evaluated a behavior management program for students with behavior problems. The program, First Step to Success, has been successfully evaluated many times. In the Woodbridge et al. study, 200 children in grades 1 to 3 with serious behavior problems were randomly assigned to experimental or control groups. On behavior and achievement measures, students in the experimental group scored much higher, with effect sizes of +0.44 to +0.87. Very impressive.

The researchers came back a year later to see if the outcomes were still there. Despite the substantial impacts seen at posttest, none of the three prosocial/adaptive behavior measures, only one of the three problem/maladaptive behavior measures, and none of the four academic achievement measures showed lasting positive outcomes.

These findings were distressing to the researchers, but they contain a message. In this study, students passed from teachers who had been trained in the First Step method to teachers who had not. The treatment is well-established and inexpensive. Why should it ever be seen as a one-year intervention with a follow-up? Instead, imagine that all teachers in the school learned the program and all continued to implement it for many years. In this circumstance, it would be highly likely that the first-year positive impacts would be sustained and most likely improved over time.

Follow-up assessments are always interesting, and for interventions that are very expensive it may be crucial to demonstrate lasting impacts. But in education, effective treatments can often be maintained for many years, creating more effective school-wide environments and lasting impacts over time. Much as we might like to have one-shot treatments with long-lasting impacts, this does not correspond to the nature of children. The personal, family, or community problems that led children to have difficulties at a given point in time are likely to lead to problems in the future, too. But the solution is clear: keep up the good work to keep up the good outcomes!

Preschools and Evidence: A Child Will Lead Us

These are exciting times for people who care about preschool, for people who care about evidence, and especially for people who care about both. President Obama has advocated expanding high-quality preschool opportunities, Bill de Blasio, the new mayor of New York City, is proposing new taxes on the wealthy for this purpose, and many states are moving toward universal preschool, or at least considering it. The recently passed omnibus budget included $250 million for states to add to or improve their preschool programs.

What is refreshing is that after thirty years of agreement among researchers that it’s only high-quality preschools that have long-term positive effects, the phrase “high quality” has become part of the political dialogue. At a minimum, “high quality” means “not just underpaid, poorly educated preschool teachers.” But beyond this, “high quality” is easy to agree on, difficult to define.

This is where evidence comes in. We have good evidence about long-term effects of very high-quality preschool programs compared to no preschool, but identifying exceptionally effective, replicable programs (in comparison to run-of-the-mill preschools) has been harder.

The importance of identifying preschool programs that actually work is being recognized not only in academia, but in the general press as well. In the January 29 New York Times, Daniel Willingham and David Grissmer advocated local and national randomized experiments to find out what works in preschool. On January 30, Nicholas Kristof wrote about rigorous research supporting long-term effects of preschool. Two articles on randomized experiments in education would be a good week for Education Week, much less the New York Times.

With President Obama, John Boehner, and the great majority of Americans favoring expansion of high-quality preschools, this might be an extraordinarily good time for the U.S. Department of Education to sponsor development and evaluation of promising preschool models. At the current rate it will take a long time to get to universal pre-K, so in the meantime let’s learn what works.

The U.S. Department of Education did such a study several years ago, called Preschool Curriculum Evaluation Research (PCER), in which various models were compared to ordinary preschool approaches. PCER found that only a few models did better than their control groups, but there was a clear pattern to the ones that did. These were models that provided teachers with extensive professional development and materials with a definite structure designed to build vocabulary, phonemic awareness, early math concepts, and school skills. They were not just an early introduction of kindergarten; they focused on play, themes, rhymes, songs, stories, and counting games with specific purposes well understood by teachers.

In a new R&D effort, innovators might be asked to create new, practical models, perhaps based on the PCER findings, and evaluate them in rigorous studies. Within a few years, we would have many proven approaches to preschool, ones that would justify the optimism being expressed by politicians of all stripes.

Historically, preschool is one of the few areas of educational practice or policy in which politicians and the public consider evidence to have much relevance. Perhaps if we get this one right, they will begin to wonder, if evidence is good for four year olds, why shouldn’t we consult it for the rest of education policy? If evidence is to become important for all of education, perhaps it has to begin with a small child leading us.

Education Innovation: What It Is and Why We Need More of It

NOTE: This is a guest post from Jim Shelton, Assistant Deputy Secretary of the Office of Innovation and Improvement at the U.S. Department of Education.

Whether for reasons of economic growth, competitiveness, social justice or return on tax-payer investment, there is little rational argument over the need for significant improvement in U.S. educational outcomes. Further, it is irrefutable that the country has made limited improvement on most educational outcomes over the last several decades, especially when considered in the context of the increased investment over the same period. In fact, the total cost of producing each successful high school and college graduate has increased substantially over time instead of decreasing – creating what some argue is an inverted learning curve.

This analysis stands in stark contrast to the many anecdotes of teachers, schools and occasionally whole systems “beating the odds” by producing educational outcomes well beyond “reasonable” expectations. And, therein lies the challenge and the rationale for a very specific definition of educational innovation.

Education not only needs new ideas and inventions that shatter the performance expectations of today’s status quo; to make a meaningful impact, these new solutions must also “scale,” that is, grow large enough to serve millions of students and teachers, or large portions of specific under-served populations. True educational innovations are those products, processes, strategies, and approaches that improve significantly upon the status quo and reach scale.

Systems and programs at the local, state, and national levels, in their quest to improve, should be in the business of identifying and scaling what works. Yet we traditionally have lacked the discipline, infrastructure, and incentives to systematically identify breakthroughs, vet them, and support their broad adoption – a process referred to as a field scan. Programs like the Department of Education’s Investing in Innovation Fund (i3) are designed as field scans, but i3 is tiny in comparison to both the need and the opportunity. To achieve our objectives, larger funding streams will need to drive the identification, evaluation, and adoption of effective educational innovations.

Field scans are only one of three connected pathways to education innovation, and they build on the most recognized pathway – basic and applied research. The time to produce usable tools and resources from this pathway can be long – just as in medicine where development and approval of new drugs and devices can take 12-15 years – but, with more and better leveraged resources, more focus, and more discipline, this pathway can accelerate our understanding of teaching and learning and production of performance enhancing practices and tools.

The third pathway focuses specifically on accelerating transformational breakthroughs, which require a different approach – directed development. Directed development processes identify cutting-edge research and technology (technology generically, not specifically software or hardware) and use a uniquely focused approach to accelerate the pace at which specific game-changing innovations reach learners and teachers. Directed development within the federal government is most associated with DARPA (the Defense Advanced Research Projects Agency), which used this unique and aggressive model of R&D to produce technologies that underlie the Internet, GPS, and unmanned aircraft (drones). Education presents numerous opportunities for such work. For example: (1) providing teachers with tools that identify each student’s needs and interests and match them to the optimal instructional resources, or (2) cost-effectively achieving the two standard deviations of improvement that one-to-one human tutors generate. In 2010, the President’s Council of Advisors on Science and Technology recommended the creation of an ARPA for Education to pursue directed development in these and other areas of critical need and opportunity.

Each of these pathways – the field scan, basic and applied research, and directed development – will be essential to improving and ultimately transforming learning from cradle through career. If done well, this work will redefine “the possible” and reclaim American educational leadership while addressing inequity at home and abroad. At that point, we may be able to rely on a simpler definition of innovation:

“An innovation is one of those things that society looks at and says, if we make this part of the way we live and work, it will change the way we live and work.”

-Dean Kamen

-Jim Shelton

Note: The Office of Innovation and Improvement at the U.S. Department of Education administers more than 25 discretionary grant programs, including the Investing in Innovation Program, Charter Schools Program, and Technology in Education.

A Commitment to Research Yields Improvements in Charter Network

Note: This is a guest post by Richard Barth, CEO and President of the KIPP Foundation.

In his inaugural post for this blog, Robert Slavin wrote, “We did not manage our way to the moon, we invented our way to the moon.” I hear echoes of this statement throughout my work. Like other national charter school leaders, I am committed to making sure innovation can blossom and spread, throughout our own network and public schools nationwide.

But along with innovation we must insist on research and results. Across the 31 KIPP regions nationally, for example, we give schools autonomy to innovate as they see fit, as long as they can demonstrate that they are producing results for our students.

So how does a charter network like ours make sure schools are producing results? Not only do we assess our own schools on a regular basis, with publications like our yearly Report Card, but we also make a practice of inviting independent researchers to evaluate our results.

By building a solid body of evidence for what works – including independent reports about student achievement in our schools – we are able to set and maintain a high bar for achievement. The evidence then helps us build on what is working and make adjustments where the research has identified areas where we need to improve. For example, a study by Mathematica found that KIPP middle school students make statistically significant gains in math and reading, even though students enter KIPP with lower average test scores than their neighboring peers in district schools. The same Mathematica report also found that KIPP schools serve fewer special-education and Limited English Proficient (LEP) students than the average for neighboring district schools. This is a challenge for many charter schools and something we are making a priority throughout our network. So where we find we are doing well in both the number of students served and their results – like the KIPP Academy Lynn near Boston, Mass., which is highlighted in a 2010 working paper from the National Bureau of Economic Research – we have an opportunity to zero in on what’s working and spread this news to our network and to charter schools nationwide.

As more of our students move on to college, research can also help us keep tabs on how they are faring. We are just starting to examine the college completion rates of our students. In April we released our first-ever College Completion Report, which looked at the college graduation rates of KIPP’s earliest graduates from the mid-1990s. Thirty-three percent of these KIPP students had finished college by their mid-twenties, which is above the national average and four times the rate of their peers from low-income communities. This is far short of our goal of 75 percent, which is the average college completion rate for kids from affluent families.

By sharing these results we hope to encourage a national dialogue about how to improve college completion rates in America, especially among low-income students. But we need school districts and charter schools to start publicly reporting college completion rates fully, including those of eighth-grade graduates; counting only high school graduates or college freshmen fails to give us a true picture.

This process of improvement is hard work; there’s no question. But by committing to research and accountability, we can set off a more vigorous and transparent conversation among public educators across the country about what we need to do to ensure success for all of our schools and students.

-Richard Barth

KIPP, the Knowledge Is Power Program, is a national network of free, open-enrollment, college-preparatory public charter schools. There are currently 109 KIPP schools in 20 states and the District of Columbia serving more than 32,000 students.