School Accountability and Summer Learning Loss


In recent years, researchers have developed statistical methods for using a new source of evidence, standardized test score data, to estimate school effectiveness, despite significant challenges to valid causal inference (for an overview of these challenges, see McCaffrey, 2003; Reardon & Raudenbush, 2009). So-called “value-added” models typically draw on district administrative data on students, teachers, and schools to isolate the school’s impact on test score performance from the many other factors that influence student test scores but lie outside the school’s control.
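
To fix ideas, a minimal specification of the kind such models often use (illustrative notation only, not the exact model estimated in this project) regresses a student’s current score on a prior score, observed student characteristics, and a school effect:

A_{ist} = \lambda A_{i,s,t-1} + X_{it}\beta + \theta_s + \varepsilon_{ist},

where A_{ist} is the test score of student i in school s in year t, X_{it} collects student characteristics, and the school effect \theta_s is interpreted as the school’s value-added.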

In practice, however, estimating a school’s value-added is far from straightforward. Much has been written about the statistical challenges of isolating a school’s causal impact on student test scores, but one obvious problem with typical value-added models has received less attention: the impact of students’ “summer learning loss.” Almost all value-added models estimate school effects using spring-to-spring outcome data, simply because statewide annual testing occurs on that timeline. As a result, schools are credited with learning gains made both during the school year and during the prior summer. In essence, schools are evaluated in part on what happens to students before their first day in the classroom, which runs counter to the fundamental goals of an accountability system.
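
The concern can be stated as a simple accounting identity (our notation, for illustration). The annual spring-to-spring gain decomposes into a summer component and a school-year component:

A^{spring}_{it} - A^{spring}_{i,t-1} = \underbrace{\left(A^{fall}_{it} - A^{spring}_{i,t-1}\right)}_{\text{summer}} + \underbrace{\left(A^{spring}_{it} - A^{fall}_{it}\right)}_{\text{school year}}.

A value-added model that conditions only on the prior spring score folds the summer term into the school’s estimated effect, whereas a model that conditions on the fall score credits the school only with the school-year term.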

This project investigates the extent to which spring-to-spring testing timelines bias school value-added estimates by conflating summer and school-year learning. Using a unique dataset that contains both fall and spring standardized test scores, we examine patterns in school-year versus summer learning. We estimate value-added from traditional spring-to-spring data as well as from competing models that predict fall-to-spring test score gains. We then examine whether schools are ranked differently under the two testing timelines and whether certain kinds of schools are especially affected by test timing. The project discusses whether this problem is of sufficient magnitude to caution against value-added measures based solely on spring-to-spring test score data. Because states do not currently require both fall and spring testing, the implications for federal accountability planning are significant.
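
As a rough illustration of the comparison we have in mind, the sketch below (hypothetical variable names and simulated placeholder data, not the project’s actual data or code) estimates school fixed effects under both testing timelines and checks how strongly the resulting school rankings agree:

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf
    from scipy.stats import spearmanr

    # Simulated placeholder data standing in for district administrative records:
    # one row per student, with prior-spring, fall, and current-spring scores.
    rng = np.random.default_rng(0)
    n_schools, n_students = 30, 3000
    df = pd.DataFrame({
        "school_id": rng.integers(0, n_schools, n_students),
        "frl": rng.integers(0, 2, n_students),  # example student covariate
    })
    true_effect = rng.normal(0, 0.2, n_schools)[df["school_id"].to_numpy()]
    df["spring_prev"] = rng.normal(0, 1, n_students)
    df["fall"] = df["spring_prev"] - 0.1 + rng.normal(0, 0.3, n_students)  # summer dip
    df["spring"] = df["fall"] + true_effect + rng.normal(0, 0.3, n_students)

    def school_effects(data, pretest):
        # School fixed effects from a regression of the spring score on a pretest
        # and a covariate; the fixed effects serve as the value-added estimates.
        fit = smf.ols(f"spring ~ {pretest} + frl + C(school_id)", data=data).fit()
        return fit.params[fit.params.index.str.startswith("C(school_id)")]

    va_spring_to_spring = school_effects(df, "spring_prev")  # conventional timeline
    va_fall_to_spring = school_effects(df, "fall")           # excludes summer learning

    # Do the two testing timelines rank schools similarly?
    rho, _ = spearmanr(va_spring_to_spring.to_numpy(), va_fall_to_spring.to_numpy())
    print(f"Spearman rank correlation between timelines: {rho:.3f}")

The actual analysis would of course use richer controls and real district data, but even this toy version makes the comparison concrete: the further the rank correlation falls below one, the more the choice of testing timeline reshuffles which schools appear effective.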