This paper illustrates how to design an experimental sample for measuring the effects of
educational programs when whole schools are randomized to a program group or a control
group. Questions addressed by the paper include: How many schools should be
randomized? How many students per school are needed? What is the best mix of program
and control schools? And how do data on aggregate school-level measures of past student
performance or individual student-level measures improve the precision of program
impact estimates? Empirical analyses based on extensive data from two urban school
districts are used to address each question, and the statistical theory underlying these
analyses is presented in an appendix. The paper was prepared to help design the
evaluation of a national educational program, and it was circulated by the U.S.
Department of Education as methodological background for two recent Requests for
Evaluation Proposals.