This paper examines the properties of two nonexperimental study designs that can be used in educational evaluation: the comparative interrupted time series (CITS) design and the difference-in-difference (DD) design. The paper looks at the internal validity and precision of these two designs, using the example of the federal Reading First program as implemented in a midwestern state.
This paper presents a conceptual framework for designing and interpreting research on variation in program effects. The framework categorizes the sources of program effect variation and helps researchers integrate the study of variation in program effectiveness and program implementation.
An Empirical Assessment Based on Four Recent Evaluations
This reference report, prepared for the National Center for Education Evaluation and Regional Assistance of the Institute of Education Sciences (IES), uses data from four recent IES-funded experimental design studies that measured student achievement using both state tests and a study-administered test.
This paper provides practical guidance for researchers who design and analyze studies that randomize schools to measure intervention effects on student academic outcomes when information on the middle level of clustering (classrooms) is missing. Such studies involve three levels of clustering: students in classrooms in schools.
No universal guideline exists for judging the practical importance of a standardized effect size, a measure of the magnitude of an intervention’s effects. This working paper argues that effect sizes should be interpreted using empirical benchmarks, and it presents three types of benchmarks in the context of education research.
Empirical Guidance for Studies That Randomize Schools to Measure the Impacts of Educational Interventions
This paper examines how controlling statistically for baseline covariates (especially pretests) improves the precision of studies that randomize schools to measure the impacts of educational interventions on student achievement.