Great Tutors Could Be a Source of Great Teachers

In a recent blog, I wrote about findings of three recent reviews of research on tutoring, contained within broader reviews of research on effective programs for struggling elementary readers, struggling secondary readers, and math. The blog reported the astonishing finding that in each of the reviews, outcomes for students tutored by paraprofessionals (teaching assistants) were as good as, and usually somewhat better than, outcomes for students tutored by teachers.

It is important to note that the paraprofessionals tutoring students usually had BAs, one indicator of high quality. But since paras are generally paid about half as much as teachers, using them enables schools to serve twice as many struggling students at about the same cost as hiring teacher tutors. And because there are teacher shortages in many areas, such as inner cities and rural locations, sufficient teachers may not be available in some places at any cost.

In my earlier blog, I explained all this, but now I’d like to expand on one aspect of the earlier blog I only briefly mentioned.

If any district or state decided to invest substantially in high-quality paraprofessional tutors and train them in proven one-to-one and one-to-small group tutoring strategies, it would almost certainly increase the achievement of struggling learners and reduce retentions and special education placements. But it could also provide a means of attracting capable recent university graduates into teaching.

Imagine that districts or states recruited students graduating from local universities to serve in a “tutor corps.” Those accepted would be trained and mentored to become outstanding tutors. From within that group, tutors who show the greatest promise would be invited to participate in a fast-track teacher certification program. This would add coursework to the paraprofessionals’ schedules, while they continued to tutor at other times. In time, the paraprofessionals would be given opportunities to do brief classroom internships, and then student teaching. Finally, they would receive their certification and be assigned to a school in the district or state.

There are several features worth noting about this proposal. First, the paraprofessionals would be paid throughout their teacher training, because at all points they would be providing valuable services to children. This would make it easier for recent university graduates to take courses leading to certification, which could expand the number of promising recent graduates who might entertain the possibility of becoming teachers. Paying teacher education candidates (as tutors) throughout their time in training could open the profession to a broader range of talented candidates, including diverse candidates who could not afford traditional teacher education.

Second, the whole process of recruiting well-qualified paraprofessionals, training and mentoring them as tutors, selecting the best of them to become certified, and providing coursework and student teaching experiences for them, would be managed by school districts or states, not by universities. School districts and states have a strong motivation to select the best teachers, see that they get excellent training and mentoring, and proceed to certification only when they are ready. Coursework might be provided by university professors contracted by the district, or by qualified individuals within the district or state. Again, because the district or state has a strong interest in having these experiences be optimal for their future teachers, they would be likely to take an active role in ensuring that coursework and coaching are first rate.

One important advantage of this system would be that it would give school, district, and state leaders opportunities to see future teachers operate in real schools over extended periods of time, first as tutors, then as interns, then as student teachers. At the end of the process, the school district or state should be willing to guarantee that all who succeed in this demanding sequence will be offered a job. They should be able to do this with confidence, because school and district staff would have seen the candidate work with real children in real schools.

The costs of this system might be minimal. During tutoring, internships, and student teaching, teacher candidates are providing invaluable services to struggling students. The only additional cost would entail providing coursework to meet state or district requirements. But this cost could be modest, and in exchange for paying for or providing the courses, the district or state would gain the right to select instructors of very high quality and insist on their effectiveness in practice. These are the schools’ own future teachers, and they should not be satisfied with less than stellar teacher education.

The system I’m proposing could operate alongside traditional programs provided by universities. School districts or states might in fact create partnerships in which all teacher education candidates would serve as tutors as part of their teacher education, in which case university-led and district-led teacher education may essentially merge into one.

This system is more obviously attuned to the needs of elementary schools than secondary schools, because historically tutors have rarely been used in the secondary grades. Yet recent evidence from studies in England (http://www.bestevidence.org/reading/mhs/mhs_read.htm) has shown positive effects of tutoring in reading in the middle grades, and it seems likely that one-to-one or one-to-small group tutoring would be beneficial in all major subjects. As in elementary school, it may keep students who are far behind grade level in a given subject out of special education and able to keep up with their classmates. If paraprofessional tutors can work in the secondary grades, this could form the basis for a teacher certification plan like the one I have described.

Designing teacher certification programs around the needs of recent BAs sounds like Teach for America, and in many ways it is. But this system would, I’d argue, attract large numbers of talented young people, and they would be more likely than TFA grads to stay in teaching for many years.

The main reason schools, districts, and states should invest in tutoring by paraprofessionals is to serve the large number of struggling learners who exist in every district. But in the course of doing this, districts could also take control of their own destinies and select and train the teachers they need. The result would be better teachers for all students, and a teaching profession that knows how to use proven programs to ensure the success of all.

This blog was developed with support from the Laura and John Arnold Foundation. The views expressed here do not necessarily reflect those of the Foundation.

Effect Sizes: How Big is Big?

An effect size is a measure of how much an experimental group exceeds a control group, controlling for pretests. As every quantitative researcher knows, the formula is (XT – XC)/SD: the adjusted treatment mean minus the adjusted control mean, divided by the unadjusted standard deviation. If this is all gobbledygook to you, I apologize, but sometimes we research types just have to let our inner nerd run free.
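
To make the arithmetic concrete, here is a minimal sketch in Python. The scores are made up purely for illustration, the pretest adjustment is a simple regression of posttest on pretest standing in for a full ANCOVA, and the denominator uses the control group’s unadjusted posttest SD (a pooled SD is another common choice).

```python
from statistics import mean, stdev

# Hypothetical scores, invented for illustration only.
pre_t  = [42, 55, 38, 60, 47, 51]   # treatment group pretests
post_t = [58, 70, 50, 74, 63, 66]   # treatment group posttests
pre_c  = [44, 53, 40, 59, 48, 50]   # control group pretests
post_c = [52, 63, 47, 68, 57, 60]   # control group posttests

# Step 1: regress posttest on pretest for all students, then adjust each
# group's posttest mean to the grand pretest mean (a simple covariate
# adjustment standing in for ANCOVA).
pre_all, post_all = pre_t + pre_c, post_t + post_c
grand_pre, grand_post = mean(pre_all), mean(post_all)
slope = (sum((x - grand_pre) * (y - grand_post)
             for x, y in zip(pre_all, post_all))
         / sum((x - grand_pre) ** 2 for x in pre_all))

adj_treatment = mean(post_t) - slope * (mean(pre_t) - grand_pre)
adj_control   = mean(post_c) - slope * (mean(pre_c) - grand_pre)

# Step 2: divide the adjusted difference by an unadjusted posttest SD
# (here, the control group's).
effect_size = (adj_treatment - adj_control) / stdev(post_c)
print(f"Effect size: {effect_size:+.2f}")
```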

Effect sizes have come to be accepted as a standard indicator of the impact an experimental treatment had on a posttest. As research becomes more important in policy and practice, understanding them is becoming increasingly important.

One constant question is how important a given effect size is. How big is big? Many researchers still use a rule of thumb from Cohen to the effect that +0.20 is “small,” +0.50 is “moderate,” and +0.80 or more is “large.”  Yet Cohen himself disavowed these standards long ago.

High-quality experimental-control comparison research in schools rarely gets effect sizes as large as +0.20, and only one-to-one tutoring studies routinely get to +0.50. So Cohen’s rule of thumb demands effect sizes far larger than those typically reported in rigorous school research.

An article by Hill, Bloom, Black, and Lipsey (2008) considered several ways to determine the importance of effect sizes. They noted that students learn more each year (in effect sizes) in the early elementary grades than do high school students. They suggested that therefore a given effect size for an experimental treatment may be more important in secondary school than the same effect size would be in elementary school. However, in four additional tables in the same article, they show that actual effect sizes from randomized studies are relatively consistent across the grades. They also found that effect sizes vary greatly depending on methodology and the nature of measures. They end up concluding that it is most reasonable to determine the importance of an effect size by comparing it to effect sizes in other studies with similar measures and designs.

A study that Alan Cheung and I did (2016) reinforces the importance of methodology in determining what is an important effect size. We analyzed all findings from 645 high-quality studies included in all reviews in our Best Evidence Encyclopedia (www.bestevidence.org). We found that the most important factors in effect sizes were sample size and design (randomized vs. matched). Here is the key table.

Effects of Sample Size and Designs on Effect Sizes

Design          Small sample    Large sample
Matched            +0.33           +0.17
Randomized         +0.23           +0.12

What this table shows is that matched studies with small sample sizes (fewer than 250 students) have much higher effect sizes, on average, than large randomized studies (+0.33 vs. +0.12). These differences say nothing about the impact on children; they are due entirely to differences in study design.

If effect sizes are so different due to study design, then we cannot have a single standard to tell us when an effect size is large or small. All we can do is note when an effect size is large compared to those of similar studies. For example, imagine that a study finds an effect size of +0.20. Is that big or small? If it were a matched study with a small sample size, +0.20 would be a rather small impact. If it were a randomized study with a large sample size, it might be considered quite a large impact.
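
To make the “compare like with like” logic concrete, here is an illustrative sketch that checks a new effect size against the design-specific averages in the table above. The function name and output wording are placeholders of my own; this is not the “expected effect size” formula mentioned below, only the comparison described in this paragraph.

```python
# Average effect sizes by design and sample size, taken from the
# Cheung & Slavin table above ("small" means fewer than 250 students).
AVERAGE_ES = {
    ("matched", "small"): 0.33,
    ("matched", "large"): 0.17,
    ("randomized", "small"): 0.23,
    ("randomized", "large"): 0.12,
}

def compare_to_similar_studies(effect_size, design, sample_size):
    """Say whether an effect size is above or below the average for
    studies with a similar design and sample size."""
    benchmark = AVERAGE_ES[(design, sample_size)]
    verdict = "above" if effect_size > benchmark else "below"
    return (f"ES = {effect_size:+.2f} is {verdict} the average of "
            f"{benchmark:+.2f} for {sample_size}-sample {design} studies.")

# The +0.20 example from the text: modest next to similar small matched
# studies, but relatively large next to similar large randomized studies.
print(compare_to_similar_studies(0.20, "matched", "small"))
print(compare_to_similar_studies(0.20, "randomized", "large"))
```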

Beyond study methods, a good general principle is to compare like with like. Some treatments may have very small effect sizes, but they may be so inexpensive, or may affect so many students, that a small effect is still important. For example, principal or superintendent training may affect very many students, and benchmark assessments may be so inexpensive that a small effect size is worthwhile and compares favorably with equally inexpensive means of solving the same problem.

My colleagues and I will be developing a formula that lets researchers and readers enter the features of a study and get an “expected effect size,” so they can determine more accurately whether a given effect size should be considered large or small.

Not long ago, it would not have mattered much how large effect sizes were considered, but now it does. That’s an indication of the progress we have made in recent years. Big indeed!

This blog was developed with support from the Laura and John Arnold Foundation. The views expressed here do not necessarily reflect those of the Foundation.

New Findings on Tutoring: Four Shockers

One-to-one and one-to-small group tutoring have long existed as remedial approaches for students who are performing far below expectations. Everyone knows that tutoring works, and nothing in this blog contradicts this. Although different approaches have their champions, the general consensus is that tutoring is very effective, and the problem with widespread use is primarily cost (and for tutoring by teachers, availability of sufficient teachers). If resources were unlimited, one-to-one tutoring would be the first thing most educators would recommend, and they would not be wrong. But resources are never unlimited, and the numbers of students performing far below grade level are overwhelming, so cost-effectiveness is a serious concern. Further, tutoring seems so obviously effective that we may not really understand what makes it work.

In recent reviews, my colleagues and I examined what is known about tutoring. Beyond the simple conclusion that “tutoring works,” we found some big surprises, four “shockers.” Prepare to be amazed! Further, I propose an explanation to account for these unexpected findings.

We have recently released three reviews that include thorough, up-to-date reviews of research on tutoring. One is a review of research on programs for struggling readers in elementary schools by Amanda Inns and colleagues (2018). Another is a review on programs for secondary readers by Ariane Baye and her colleagues (2017). Finally, there is a review on elementary math programs by Marta Pellegrini et al. (2018). All three use essentially identical methods, from the Best Evidence Encyclopedia (www.bestevidence.org). In addition to sections on tutoring strategies, all three also include other, non-tutoring methods directed at the same populations and outcomes.

What we found challenges much of what everyone thought they knew about tutoring.

Shocker #1: In all three reviews, tutoring by paraprofessionals (teaching assistants) was at least as effective as tutoring by teachers. This was found for reading and math, and for one-to-one and one-to-small group tutoring.  For struggling elementary readers, para tutors actually had higher effect sizes than teacher tutors. Effect sizes were +0.53 for paras and +0.36 for teachers in one-to-one tutoring. For one-to-small group, effect sizes were +0.27 for paras, +0.09 for teachers.

Shocker #2: Volunteer tutoring was far less effective than tutoring by either paras or teachers. Some programs using volunteer tutors provided them with structured materials and extensive training and supervision. These found positive impacts, but far less than those for paraprofessional tutors. Volunteers tutoring one-to-one had an effect size of +0.18, while paras had an effect size of +0.53. Because of the need for recruiting, training, supervision, and management, and also because the more effective tutoring models provide stipends or other pay, volunteers were not much less expensive than paraprofessionals as tutors.

Shocker #3: Inexpensive substitutes for tutoring have not worked. Everyone knows that one-to-one tutoring works, so there has long been a quest for approaches that simulate what makes tutoring work. Yet so far, no one, as far as I know, has found a way to turn lead into tutoring gold. Although tutoring in math was about as effective as tutoring in reading, a program in which online math tutors in India and Sri Lanka tutored students in England over the Internet, for example, had no effect. Technology has long been touted as a means of simulating tutoring, yet even when computer-assisted instruction programs have been effective, their effect sizes have been far below those of the least expensive tutoring models, one-to-small group tutoring by paraprofessionals. In fact, in the Inns et al. (2018) review, no digital reading program was found to be effective with struggling readers in elementary schools.

Shocker #4: Certain whole-class and whole-school approaches work as well as or better than tutoring for struggling readers, on average. In the Inns et al. (2018) review, the average effect size for one-to-one tutoring approaches was +0.31, and for one-to-small group approaches it was +0.14. Yet the mean for whole-class approaches, such as Ladders to Literacy (ES = +0.48), PALS (ES = +0.65), and Cooperative Integrated Reading and Composition (ES = +0.19), averaged +0.33, similar to one-to-one tutoring by teachers (ES = +0.36). The mean effect size for comprehensive tiered school approaches, such as Success for All (ES = +0.41) and Enhanced Core Reading Instruction (ES = +0.22), was +0.43, higher than any category of tutoring (note that these models include tutoring as part of an integrated response to intervention approach). Whole-class and whole-school approaches work with many more students than do tutoring models, so these impacts are obtained at a much lower cost per pupil.

Why does tutoring work?

Most researchers and others would say that well-structured tutoring models work primarily because they allow tutors to fully individualize instruction to the needs of students. Yet if this were the only explanation, then other individualized approaches, such as computer-assisted instruction, would have outcomes similar to those of tutoring. Why is this not the case? And why do paraprofessionals produce at least equal outcomes to those produced by teachers as tutors? None of this squares with the idea that the impact of tutoring is entirely due to the tutor’s ability to recognize and respond to students’ unique needs. If that were so, other forms of individualization would be a lot more effective, and teachers would presumably be a lot more effective at diagnosing and responding to students’ problems than would less highly trained paraprofessionals. Further, whole-class and whole-school reading approaches, which are not completely individualized, would have much lower effect sizes than tutoring.

My theory to account for the positive effects of tutoring in light of the four “shockers” is this:

  • Tutoring does not work due to individualization alone. It works due to individualization plus nurturing and attention.

This theory begins with the fundamental and obvious assumption that children, perhaps especially low achievers, are highly motivated by nurturing and attention, perhaps far more than by academic success. They are eager to please adults who relate to them personally.  The tutoring setting, whether one-to-one or one-to-very small group, gives students the undivided attention of a valued adult who can give them personal nurturing and attention to a degree that a teacher with 20-30 students cannot. Struggling readers may be particularly eager to please a valued adult, because they crave recognition for success in a skill that has previously eluded them.

Nurturing and attention may explain the otherwise puzzling equality of outcomes obtained by teachers and paraprofessionals as tutors. Both types of tutors, using structured materials, may be equally able to individualize instruction, and there is no reason to believe that paras will be any less nurturing or attentive. The assumption that teachers would be more effective as tutors depends on the belief that tutoring is complicated and requires the extensive education a teacher receives. This may be true for very unusual learners, but for most struggling students, a paraprofessional may be as capable as a teacher in providing individualization, nurturing, and attention. This is not to suggest that paraprofessionals are as capable as teachers in every way. Teachers have to be good at many things: preparing and delivering lessons, managing and motivating classes, and much more. However, in their roles as tutors, teachers and paraprofessionals may be more similar.

Volunteers certainly can be nurturing and attentive, and can be readily trained in structured programs to individualize instruction. The problem, however, is that studies of volunteer programs report difficulties in getting volunteers to attend every day and to avoid dropping out when they get a paying job. This may be less of a problem when volunteers receive a stipend; paid volunteers are much more effective than unpaid ones.

The failure of tutoring substitutes, such as individualized technology, is easy to predict if the importance of nurturing and attention is taken into account. Technology may be fun, and may be individualized, but it usually separates students from the personal attention of caring adults.

Whole-Class and Whole-School Approaches.

Perhaps the biggest shocker of all is the finding that for struggling readers, certain non-technology approaches to instruction for whole classes and schools can be as effective as tutoring. Whole-class and whole-school approaches can serve many more students at much lower cost, of course. These classroom approaches mostly use cooperative learning, phonics-focused teaching, or both, and the whole-school models, especially Success for All, combine these approaches with tutoring for students who need it.

The success of certain whole-class programs, of certain tutoring approaches, and of whole-school approaches that combine proven teaching strategies with tutoring for students who need more, argues for response to intervention (RTI), the policy that has been promoted by the federal government since the 1990s. So what’s new? What’s new is that the approach I’m advocating is not just RTI. It’s RTI done right, where each component of  the strategy has strong evidence of effectiveness.

The good news is that we have powerful and cost-effective tools at our disposal that we could be putting to use on a much more systematic scale. Yet we rarely do this, and as a result far too many students continue to struggle with reading, even ending up in special education due to problems schools could have prevented. That is the real shocker. It’s up to our whole profession to use what works, until reading failure becomes a distant memory. There are many problems in education that we don’t know how to solve, but reading failure in elementary school isn’t one of them.

Practical Implications.

Perhaps the most important practical implication of this discussion is a realization that benefits similar to or greater than those of one-to-one tutoring by teachers can be obtained in other ways that can be cost-effectively extended to many more students: using paraprofessional tutors, using one-to-small group tutoring, or using whole-class and whole-school tiered strategies. It is no longer possible to say with a shrug, “of course tutoring works, but we can’t afford it.” The “four shockers” tell us we can do better, without breaking the bank.

 

References

Baye, A., Lake, C., Inns, A., & Slavin, R. E. (2017). Effective reading programs for secondary students. Manuscript submitted for publication. Also see: Baye, A., Lake, C., Inns, A., & Slavin, R. E. (2017, August). Effective reading programs for secondary students. Baltimore, MD: Johns Hopkins University, Center for Research and Reform in Education.

Inns, A., Lake, C., Pellegrini, M., & Slavin, R. (2018). Effective programs for struggling readers: A best-evidence synthesis. Paper presented at the annual meeting of the Society for Research on Educational Effectiveness, Washington, DC.

Pellegrini, M., Inns, A., & Slavin, R. (2018). Effective programs in elementary mathematics: A best-evidence synthesis. Paper presented at the annual meeting of the Society for Research on Educational Effectiveness, Washington, DC.

This blog was developed with support from the Laura and John Arnold Foundation. The views expressed here do not necessarily reflect those of the Foundation.

Photo by Westsara (Own work) [CC BY-SA 4.0 (https://creativecommons.org/licenses/by-sa/4.0)], via Wikimedia Commons

 

Nevada Places Its Bets on Evidence

In Nevada, known as the land of big bets, taking risks is what they do. The Nevada State Department of Education (NDE) is showing this in its approach to ESSA evidence standards. Of course, many states are planning policies to encourage use of programs that meet the ESSA evidence standards, but to my knowledge, no state department of education has taken as proactive a stance in this direction as Nevada.

 

Under the leadership of their state superintendent, Steve Canavero, Deputy Superintendent Brett Barley, and Director of the Office of Student and School Supports Seng-Dao Keo, Nevada has taken a strong stand: Evidence is essential for our schools, they maintain, because our kids deserve the best programs we can give them.

All states are asked by ESSA to require strong, moderate, or promising programs (defined in the law) for low-achieving schools seeking school improvement funding. Nevada has made it clear to its local districts that it will enforce the federal definitions rigorously, and only approve school improvement funding for schools proposing to implement proven programs appropriate to their needs. The federal ESSA law also provides bonus points on various other applications for federal funding, and Nevada will support these provisions as well.

However, Nevada will go beyond these policies, reasoning that if evidence from rigorous evaluations is good for federal funding, why shouldn’t it be good for state funding too? For example, Nevada will require ESSA-type evidence for its own funding program for very high-poverty schools, and for schools serving many English learners. The state has a reading-by-third-grade initiative that will also require use of programs proven to be effective under the ESSA regulations. For all of the discretionary programs offered by the state, NDE will create lists of ESSA-proven supplementary programs in each area in which evidence exists.

Nevada has even taken on the holy grail: Textbook adoption. It is not politically possible for the state to require that textbooks have rigorous evidence of effectiveness to be considered state approved. As in the past, texts will be state adopted if they align with state standards. However, on the state list of aligned programs, two key pieces of information will be added: the ESSA evidence level and the average effect size. Districts will not be required to take this information into account, but by listing it on the state adoption lists, state leaders hope to alert district leaders to pay attention to the evidence when selecting textbooks.

The Nevada focus on evidence takes courage. NDE has been deluged with concern from districts, from vendors, and from providers of professional development services. To each, NDE has made the same response: we need to move our state toward use of programs known to work. The difficult changes to new partnerships and new materials are worth it if they provide Nevada’s children better programs, which will translate into better achievement and a chance at a better life. Seng-Dao Keo describes the evidence movement in Nevada as a moral imperative, delivering proven programs to Nevada’s children and then working to see that they are well implemented and actually produce the outcomes Nevada expects.

Perhaps other states are making similar plans. I certainly hope so, but it is heartening to see one state, at least, willing to use the ESSA standards as they were intended to be used, as a rationale for state and local educators not just to meet federal mandates, but to move toward use of proven programs. If other states also do this, it could drive publishers, software producers, and providers of professional development to invest in innovation and rigorous evaluation of promising approaches, as it increases use of approaches known to be effective now.

NDE is not just rolling the dice and hoping for the best. It is actively educating its district and school leaders on the benefits of evidence-based reform, and helping them make wise choices. With a proper focus on needs assessment, easy access to information, and assistance in ensuring high-quality implementation, promoting the use of proven programs should be more like Nevada’s Hoover Dam: a sure thing.

This blog was developed with support from the Laura and John Arnold Foundation. The views expressed here do not necessarily reflect those of the Foundation.

Photo by: Michael Karavanov [CC BY-SA 3.0 (https://creativecommons.org/licenses/by-sa/3.0)], via Wikimedia Commons

Lessons from China

Recently I gave a series of speeches in China, organized by the Chinese University of Hong Kong and Nanjing Normal University. I had many wonderful and informative experiences, but one evening stood out.

I was in Nanjing, the ancient capital, and it was celebrating the weeks after the Chinese New Year. The center of the celebration was the Temple of Confucius. In and around it were lighted displays exhorting Chinese youth to excel on their exams. Children stood in front of these displays to have their pictures taken next to characters saying “first in class,” never second. A woman with a microphone recited blessings and hopes that students would do well on exams. After each one, students hit a huge drum with a long stick, as an indication of accepting the blessing. Inside the temple were thousands of small silk messages, bright red, expressing the wishes of parents and students that students will do well on their exams. Chinese friends explained what was going on, and told me how pervasive this spirit was. Children all know a saying to the effect that the path to riches and a beautiful wife is through books. I heard that perhaps 70% of urban Chinese students go to after-school cram schools to ensure their performance on exams.

The reason Chinese parents and students take test scores so seriously is obvious in every aspect of Chinese culture. On an earlier trip to China I toured a beautiful house, from hundreds of years ago, in a big city. The only purpose of the house was to provide a place for young men of a large clan to stay while they prepared for their exams, which determined their place in the Confucian hierarchy.

As everyone knows, Chinese students do, in fact, do very well on their exams. I would note that these data come in particular from urban Eastern China, such as Shanghai. I’d heard about but did not fully understand the policies that contribute to these outcomes. In China’s big cities, where the best schools in the country are, students can attend neighborhood schools only if they were born there or their families own apartments. In a country where a small apartment in a big city can easily cost a half million dollars (U.S.), this is no small selection factor. If parents work in the city but do not own an apartment, their children may have to remain in the village or small city they came from, living with grandparents and attending non-elite schools. Chinese cities are growing so fast that the majority of their inhabitants come from the rest of China. This matters because admirers of Chinese education often cite the amazing statistics from the rich and growing Eastern Chinese cities, not the whole country. It’s as though the U.S. reported test scores on international comparisons only from suburbs in the Northeastern states from Maryland to New England, the wealthiest and highest-achieving part of our country.

I do not want to detract in any way from the educational achievements of the Chinese, but only to put them in context. First, the Chinese themselves have doubts about test scores as the only important indicators, and admire Western education for its broader focus. But just sticking to test scores, China and other Confucian cultures such as Japan, South Korea, and Singapore have been creating a culture valuing test scores since Confucius, about 2500 years ago. It would be a central focus of Chinese culture even if PISA and TIMSS did not exist to show it off to the world.

My only point is that when American or European observers hold up East Asian achievements as a goal to aspire to, these achievements do not exist in a cultural vacuum. Other countries can potentially achieve what China has achieved, in terms of test scores and other indicators, but they cannot achieve it in the same way. Western culture is just not going to spend the next 2500 years raising its children the way the Chinese do. What we can do, however, is to use our own strengths, in research, development, and dissemination, to progressively enhance educational outcomes. The Chinese can and will do this, too; that’s what I was doing traveling around China speaking about evidence-based reform. We need not be in competition with any nation or society, as expanding educational opportunity and success throughout the world is in the interests of everyone on Earth. But engaging in fantasies about how we can move ahead by emulating parts of Chinese culture that they have been refining since Confucius is not sensible.

Precisely because of their deep respect for scholarship and learning and their eagerness to continue to improve their educational achievements, the Chinese are ideal collaborators in the worldwide movement toward evidence-based reform in education. Colleagues at the Chinese University of Hong Kong and the Nanjing Normal University are launching Chinese-language and Asian-focused versions of our newsletter on evidence in education, Best Evidence in Brief (BEiB). We and our U.K. colleagues have been distributing BEiB for several years. We welcome the opportunity to share ideas and resources with our Chinese colleagues to enrich the evidence base for education for children everywhere.

This blog was developed with support from the Laura and John Arnold Foundation. The views expressed here do not necessarily reflect those of the Foundation.

What if a Sears Catalogue Married Consumer Reports?

When I was in high school, I had a summer job delivering Sears catalogues. I borrowed my mother’s old Chevy station wagon and headed out fully laden into the wilds of the Maryland suburbs of Washington.

I immediately learned something surprising. I thought of a Sears catalogue as a big book of advertisements. But the people to whom I was delivering them often saw it as a book of dreams. They were excited to get their catalogues. When a neighborhood saw me coming, I became a minor celebrity.

Thinking back on those days brought to mind our Evidence for ESSA website (www.evidenceforessa.org). I realized that what I wanted it to be was a way to communicate to educators the wonderful array of programs they could use to improve outcomes for their children. Sort of like a Sears catalogue for education. However, it provides something that a Sears catalogue does not: Evidence about the effectiveness of each catalogue entry. Imagine a Sears catalogue that was married to Consumer Reports. Where a traditional Sears catalogue describes a kitchen gadget, “It slices and dices, with no muss, no fuss!”, the marriage with Consumer Reports would instead say, “Effective at slicing and dicing, but lots of muss. Also fuss.”

If this marriage took place, it might take some of the fun out of the Sears catalogue (making it a book of realities rather than a book of dreams), but it would give confidence to buyers, and help them make wise choices. And with proper wordsmithing, it could still communicate both enthusiasm, when warranted, and truth. But even more, it could have a huge impact on the producers of consumer goods, because they would know that their products would need to be rigorously tested and found to be able to back up their claims.

In enhancing the impact of research on the practice of education, we have two problems that have to be solved. Just like the “Book of Dreams,” we have to help educators know the wonderful array of programs available to them, programs they may never have heard of. And beyond the particular programs, we need to build excitement about the opportunity to select among proven programs.

In education, we make choices not for ourselves, but on behalf of our children. Responsible educators want to choose programs and practices that improve the achievement of their students. Something like a marriage of the Sears catalogue and Consumer Reports is necessary to address educators’ dreams and their need for information on program outcomes. Users should be both excited and informed. Information usually does not excite. Excitement usually does not inform. We need a way to do both.

In Evidence for ESSA, we have tried to give educators a sense that there are many solutions to enduring instructional problems (excitement), and descriptions of programs, outcomes, costs, staffing requirements, professional development, and effects for particular subgroups, for example (information).

In contrast to Sears catalogues, Evidence for ESSA is light (Sears catalogues were huge, and ultimately broke the springs on my mother’s station wagon). In contrast to Consumer Reports, Evidence for ESSA is free.  Every marriage has its problems, but our hope is that we can capture the excitement and the information from the marriage of these two approaches.

This blog was developed with support from the Laura and John Arnold Foundation. The views expressed here do not necessarily reflect those of the Foundation.

Picture source: Nationaal Archief, the Netherlands

 

How Classroom-Invented Innovations Can Have Broad Impacts

When I was in high school, I had an after-school job at a small electronics company that made and sold equipment, mostly to the U.S. Navy. My job was to work with another high school student and our foreman to pack and unpack boxes, do inventories, basically whatever needed doing.

One of our regular tasks was very time-consuming. We had to test solder extractors to be sure they were working. We’d have to heat up each one for several minutes, touch a bit of solder to it, and wipe off any residue.

One day, my fellow high school student and I came up with an idea. We took 20 solder extractors and lined them up on a work table with 20 electrical outlets. We then plugged them in. By the time we’d plugged in #20, #1 was hot, so we could go back and test it, then #2, and so on. An hour-long job was reduced to 10 minutes. We were being paid the princely sum of $1.40 an hour, so we were saving the company big bucks. Our foreman immediately saw the advantages, and he told the main office about our idea.

Up in the main office, far from the warehouse, was a mean, mean man. He wore a permanent scowl. He had a car with mean, mean bumper stickers. I’ll call him Mr. Meanie.

Mr. Meanie hated everyone, but he especially hated the goofy, college-bound high school students in the warehouse. So he had to come see what we were doing, probably to prove that it was a dumb idea.

Mr. Meanie came and asked me to show him the solder extractors. I laid them out, same as always, and everything worked, same as always, but due to my anxiety under Mr. Meanie’s scowl, I let one of the cords touch its neighboring solder extractor. It was ruined.

Mr. Meanie looked satisfied (probably thinking, “I knew it was a dumb idea”), and left without a word. But as long as I worked at the company, we never again tested solder extractors one at a time (and never scorched another cord). My guess is that long after we were gone, our method remained in use despite Mr. Meanie. We’d overcome him with evidence that no one could dispute.

In education, we employ some of the smartest and most capable people anywhere as teachers. Teachers innovate, and many of their innovations undoubtedly improve their own students’ outcomes. Yet because most teachers work alone, their innovations rarely spread or stick even within their own schools. When I was a special education teacher long ago, I made up and tested out many innovations for my very diverse, very disabled students. Before heading off for graduate school, I wrote them out in detail for whoever was going to receive my students the following year. Perhaps their next teachers received and paid attention to my notes, but probably not, and they could not have had much impact for very long. More broadly, there is just no mechanism for identifying and testing out teachers’ innovations and then disseminating them to others, so they have little impact beyond the teacher and perhaps his or her colleagues and student teachers, at best.

One place in the education firmament where teacher-level innovation is encouraged, noted, and routinely disseminated is in comprehensive schoolwide approaches, such as our own Success for All (SFA). Because SFA has its own definite structure and materials, promising innovations in any school or classroom may immediately apply to the roughly 1000 schools we work with across the U.S. Because SFA schools have facilitators within each school and coaches from the Success for All Foundation who regularly visit in teachers’ classes, there are many opportunities for teachers to propose innovations and show them off. Those that seem most promising may be incorporated in the national SFA program, or at least mentioned as alternatives in ongoing coaching.

As one small example, SFA constantly has students take turns reading to each other. There used to be arguments and confusion about who goes first. A teacher in Washington, DC noticed this and invented a solution. She appointed one student in each dyad to be a “peanut butter” and the other to be a “jelly.” Then she’d say, “Today, let’s start with the jellies,” and the students started right away without confusion or argument. Now, 1000 schools use this method.

A University of Michigan professor, Don Peurach, studied this very aspect of Success for All and wrote a book about it, called Seeing Complexity in Public Education (Oxford University Press, 2011). He visited dozens of SFA schools, SFA conferences, and professional development sessions, and interviewed hundreds of participants. What he described is an enterprise engaged in sharing evidence-proven practices with schools and at the same time learning from innovations and problem solutions devised in schools and communicating best practices back out to the whole network.

I’m sure that other school improvement networks do the same, because it just makes sense. If you have a school network with common values, goals, approaches, and techniques, how does it keep getting better over time if it does not learn from those who are on the front lines? I’d expect that such very diverse networks as Montessori and Waldorf schools, KIPP and Success Academy, and School Development Program and Expeditionary Learning schools, must do the same. Each of the improvements and innovations contributed by teachers or principals may not be big enough by itself to move the needle on achievement outcomes, but collectively they keep programs moving forward as learning organizations, solving problems and improving outcomes.

In education, we have to overcome our share of Mr. Meanies trying to keep us from innovating or evaluating promising approaches. Yet we can overcome blockers and doubters if we work together to progressively improve proven programs. We can overwhelm the Mr. Meanies with evidence that no one can dispute.

This blog was developed with support from the Laura and John Arnold Foundation. The views expressed here do not necessarily reflect those of the Foundation.