A previous post described how school enrollment processes that contain naturally occurring lotteries provide researchers with exciting opportunities to learn about the effects of policies and programs. This follow-up post presents a few methodological issues common to lottery-based analyses — constrained statistical power, imperfect compliance, and restricted generalizability — and briefly discusses how they can be addressed.
A two-stage, multilevel approach to random assignment is an intriguing way to test a complex set of interventions — such as interventions children experience in sequence as they move from preschool to kindergarten. In the first stage of such a design, groups (schools or centers) of participants are randomly assigned to program and control groups. In the second stage, individuals within those groups (students or children) are randomized to two different conditions. This design creates up to four groups — a group that receives both interventions, a group that receives the first intervention but not the second, a group that receives the second intervention but not the first, and a control group. By comparing the outcomes of participants in these groups, the combined and relative effects of the two interventions can be tested.
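As a rough illustration of the two-stage design described above, the sketch below randomizes sites first and then individuals within every site, yielding the four groups. The function name `two_stage_assignment`, the even 50/50 splits at both stages, and the input layout are assumptions made for illustration; they are not taken from the post, and participant identifiers are assumed to be unique across sites.

```python
import random


def two_stage_assignment(sites, seed=0):
    """Assign participants to the four cells of a two-stage design.

    `sites` maps a site id to a list of participant ids (hypothetical layout).
    Returns a dict: participant id -> (site id, gets_first, gets_second).
    """
    rng = random.Random(seed)

    # Stage 1: randomize whole sites to the first intervention (assumed 50/50).
    site_ids = sorted(sites)
    rng.shuffle(site_ids)
    program_sites = set(site_ids[: len(site_ids) // 2])

    assignment = {}
    for site_id in sorted(sites):
        # Stage 2: randomize individuals within each site to the second
        # intervention (again an assumed 50/50 split).
        members = list(sites[site_id])
        rng.shuffle(members)
        cutoff = len(members) // 2
        for position, person in enumerate(members):
            gets_first = site_id in program_sites
            gets_second = position < cutoff
            assignment[person] = (site_id, gets_first, gets_second)
    return assignment


if __name__ == "__main__":
    demo_sites = {
        f"site{s}": [f"site{s}_child{i}" for i in range(4)] for s in range(4)
    }
    for person, cell in sorted(two_stage_assignment(demo_sites).items()):
        print(person, cell)
```

Comparing mean outcomes across the four `(gets_first, gets_second)` cells is what lets the combined and relative effects be estimated; a real analysis would also need to account for the clustering of individuals within sites.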
In a recently published three-article set, MDRC researchers and colleagues discuss quantifying cross-site impact variation using data from multisite randomized trials. The papers show that the extent to which the effects of an intervention vary across settings has important implications for policy, practice, science, and research design. This post distills some key considerations for research design and for reporting and interpreting cross-site impact variation.
Community initiatives that aim to make neighborhoods safer, improve schools, and preserve affordable housing have long emphasized the importance of building good relationships between participating organizations. To understand how the structure of such relationships contributed to outcomes in one such initiative, MDRC applied the methodology of social network analysis to data from a survey administered to organizations in nine Chicago neighborhoods.
Researchers across disciplines have long taken advantage of natural experiments to study the effects of policies at full scale. In the past decade, rapid growth in the number of charter schools and school district choice systems has provided education researchers with exciting opportunities to do the same: to use naturally occurring pockets of randomization to rigorously study the effects of policy-relevant education reforms that are already in place, often on a large scale.
Many researchers are concerned about a crisis in the credibility of social science research because of insufficient replicability and transparency in randomized controlled trials and in other kinds of studies. In this post we discuss some of the ways that MDRC strives to address these issues and ensure the rigor of its work.
Researchers are often interested in testing the effectiveness of an intervention on multiple outcomes, for multiple subgroups, at multiple points in time, or across multiple treatment groups. The resulting multiplicity of statistical hypothesis tests can increase the likelihood of spurious findings: that is, finding statistically significant effects that do not in fact exist. Without the use of a multiple testing procedure (MTP) to counteract this problem, the probability of false positive findings increases, sometimes dramatically, with the number of tests. Yet the use of an MTP can result in a substantial loss of statistical power, greatly reducing the probability of detecting effects when they do exist.
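To make the multiplicity problem concrete, here is a minimal sketch. It assumes independent tests when computing the familywise error rate, and it uses the Bonferroni and Holm adjustments as two common examples of an MTP; the post does not say which procedures it covers, so these choices and all function names are illustrative.

```python
def familywise_error_rate(alpha, num_tests):
    """Chance of at least one false positive across independent tests
    when every null hypothesis is true and each test uses level `alpha`."""
    return 1.0 - (1.0 - alpha) ** num_tests


def bonferroni_adjust(p_values):
    """Bonferroni-adjusted p-values: multiply each by the number of tests."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def holm_adjust(p_values):
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


if __name__ == "__main__":
    for m in (1, 5, 10, 20):
        print(m, round(familywise_error_rate(0.05, m), 3))
    raw = [0.01, 0.04, 0.03]
    print(bonferroni_adjust(raw))
    print(holm_adjust(raw))
```

With 10 independent tests at the 0.05 level, the chance of at least one false positive is roughly 40 percent. The adjusted p-values are never smaller than the raw ones, which is the source of the power loss: an effect that would clear 0.05 on its own may no longer do so after adjustment.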
The Subprime Lending Data Exploration Project is a “big data” project designed to produce policy-relevant insights using an administrative data set that covers nearly 50 million individuals who have applied for or used subprime credit. The data set contains information on borrower demographics, loan types and terms, account types and balances, and repayment histories. To investigate whether there were distinct groups of borrowers in terms of loan usage patterns and outcomes, we used a data discovery process called K-means clustering.
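For readers unfamiliar with K-means clustering, the sketch below shows the basic algorithm (often called Lloyd's algorithm) in plain Python: assign each point to its nearest centroid, recompute each centroid as the mean of its members, and repeat until nothing changes. The function names, the random initialization, and the toy two-feature data are assumptions for illustration only; they do not reflect the project's actual data, features, or software.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, iterations=100, seed=0):
    """Cluster `points` (a list of numeric tuples) into `k` groups.

    Returns (labels, centroids). Initial centroids are k distinct points
    drawn at random, which is one simple choice among several.
    """
    rng = random.Random(seed)
    centroids = [tuple(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                # Keep an empty cluster's centroid where it was.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # Converged: centroids stopped moving.
        centroids = new_centroids
    return labels, centroids


if __name__ == "__main__":
    # Made-up points with two features each, purely for demonstration.
    toy = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    print(k_means(toy, k=2))
```

In practice, features measured on different scales are usually standardized first, and the number of clusters `k` has to be chosen by the analyst, typically by comparing solutions across several values.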
Across policy domains, practitioners and researchers are benefiting from greater access to more detailed and more frequent data, along with the increased computing power needed to work with large, longitudinal data sets. There is growing interest in using such data as a case management tool, to better understand patterns of behavior, better manage caseload dynamics, and better target individuals for interventions. In particular, predictive analytics, which has long been used in business and marketing research, is gaining currency as a way for social service providers to identify individuals who are at risk of adverse outcomes. MDRC has multiple predictive analytics efforts under way, which we summarize here while highlighting our methodological approach.