MDRC is well known as a leader in advancing rigorous research methods in program improvement and evaluation and in sharing what we learn with the field. The Reflections on Methodology blog, edited by MDRC Chief Economist Charles Michalopoulos, presents posts by MDRC researchers and colleagues on the refinement and practical use of cutting-edge research methods being employed across our organization.

Lessons Learned from Career Pathways and Child First

Social services programs are increasingly looking to forecast which participants are likely to reach major milestones. Some explore advanced predictive modeling, but the Center for Data Insights (CDI) has found that such methods come with trade-offs. This post outlines CDI’s approach to predictive analytics, using illustrations from two studies.

Multiple testing procedures reduce the likelihood of false positive findings, but can also reduce the probability of detecting true effects. This post introduces two open-source software tools from the Power Under Multiplicity Project that can help researchers plan analyses for randomized controlled trials using multiple testing procedures.

Attempting to Correct for Follow-Up Selection Bias

A companion post discussed a kind of selection bias that can lead meta-analyses to overestimate the longer-term effects of many types of interventions. This post describes a way to use information on short-term outcomes to estimate how much the effects on long-term outcomes are overstated.

Detecting Follow-Up Selection Bias in Studies of Postsecondary Education Programs

Meta-analyses pool results from multiple published studies to determine the likely effect of a type of intervention. This post discusses a kind of selection bias that can lead meta-analyses to overestimate the longer-term effects of many types of interventions.

Individual growth modeling allows researchers to examine individual research subjects’ trajectories over time. This post describes how the approach was used to test whether the growth in students’ academic skills slowed down during the summer between preschool and kindergarten, and how that pattern varied among students of different demographic groups.

Several jurisdictions have instituted procedures meant to affect the use of bail. To determine whether those policies have had effects, a past trend can be used to extrapolate what would have happened had business continued as usual. This post discusses how researchers did such an extrapolation in Mecklenburg County, North Carolina.
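The basic logic of extrapolating a past trend can be sketched in a few lines of Python. This is a simplified illustration, not the model used in the study: it assumes the pre-policy trend is linear, fits it by ordinary least squares, and treats the gap between observed and projected values as the estimated effect. The function names and the numbers in the usage example are hypothetical.

```python
def fit_linear_trend(values):
    """Fit y = intercept + slope * t by least squares, with t = 0, 1, 2, ..."""
    n = len(values)
    times = range(n)
    mean_t = sum(times) / n
    mean_y = sum(values) / n
    s_tt = sum((t - mean_t) ** 2 for t in times)
    s_ty = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, values))
    slope = s_ty / s_tt
    intercept = mean_y - slope * mean_t
    return intercept, slope


def estimated_effects(pre_policy, post_policy):
    """Observed post-policy values minus the extrapolated pre-policy trend."""
    intercept, slope = fit_linear_trend(pre_policy)
    first_post_period = len(pre_policy)
    effects = []
    for k, observed in enumerate(post_policy):
        projected = intercept + slope * (first_post_period + k)
        effects.append(observed - projected)
    return effects


if __name__ == "__main__":
    # Hypothetical outcome series, for illustration only.
    pre = [10.0, 12.0, 14.0, 16.0]
    post = [15.0, 17.0]
    print(estimated_effects(pre, post))
```

In practice the credibility of the estimate depends on how well the pre-policy trend predicts what would have happened, so a longer baseline and checks of the model's fit matter more than the arithmetic shown here.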

An earlier post in this series discussed considerations for reporting and interpreting cross-site impact variation and for designing studies to investigate such cross-site variation. This post discusses how those ideas were applied to address two broad questions in the Mother and Infant Home Visiting Program Evaluation.

Part I of this two-part post discussed MDRC’s work with practitioners to construct valid and reliable measures of implementation fidelity to an early childhood curriculum. Part II examines how those data can reveal associations between levels of fidelity and gains in children’s academic skills.

Lessons from the Grameen America Evaluation

A previous Reflections on Methodology post discussed the process used to select a research design in the evaluation of the Grameen America program, which uses a group-based model to provide loans to low-income women living in the United States who are seeking to start or expand a small business. We decided to use a random assignment design for the study. Our next step was to implement it.

A well-implemented random assignment research design produces unbiased estimates of program effects, but random assignment is not always feasible. Some individuals, organizations, or communities may view it as unfair or may be reluctant to deny their neediest participants access to an intervention that could prove beneficial. In other cases, random assignment is not possible because the program is already being implemented. Therefore, it is imperative that the evaluation field continue to pursue alternative rigorous designs. One such approach that has seen widespread interest in recent years is regression discontinuity design (RDD).

Elementary schools often assess whether students are on track to read at grade level at future points, much as high schools assess whether students are on track to graduate, by relying on a set of indicators, or predictors, such as literacy screening tests. Each test produces a useful composite score and scores for subsections of the tests. Educators can then identify students with a low likelihood of future reading success and recommend interventions to help them improve. We wondered whether the combined scores and subscores from all the reading assessments administered over the years could provide more accurate information. This is where predictive analytics can provide substantial value.

Lessons from the Grameen America Formative Evaluation

This post discusses the process that we used to select a research design in the evaluation of the Grameen America program, a microfinance model that provides loans to low-income women living in the United States who are seeking to start or expand a small business. The first step was to determine whether it was a strong enough program to study — specifically, whether participants were receiving loans and persisting in the program. Once the study’s worth and feasibility were established, we used the information we had gathered to determine the most appropriate research design.

Social and education policy researchers often work in partnership with the practitioners who develop and implement the programs being studied. Engagement with those who design and use a program is particularly helpful in developing measures of implementation fidelity — that is, whether a program is being implemented as designed. There are two goals here: Researchers want to understand why an intervention did or did not produce effects, and practitioners want to use the information to make improvements beyond the study period.

Andrew Leigh’s irresistibly readable new book Randomistas takes readers on a rollicking tour of disciplines in which randomized controlled trials (RCTs) have revolutionized the way we build knowledge. From medicine to social policy to crime control, RCTs have helped to debunk myths and improve the lives of millions. We were proud to see that MDRC, and its former president Judith Gueron, figure prominently in the chapter on “Pioneers of Randomisation.” Leigh takes on — and mostly demolishes — the most commonly repeated myths about random assignment.

Social policy evaluations usually draw inferences using classical statistical methods (also known as frequentist inference). An evaluator may, for example, compare outcomes for program and control groups and make statements about whether estimates are statistically significant. This approach has two important shortcomings, which Bayesian analysis can address.

A previous post described how school enrollment processes that contain naturally occurring lotteries provide researchers with exciting opportunities to learn about the effects of policies and programs. This follow-up post presents a few methodological issues common to lottery-based analyses — constrained statistical power, imperfect compliance, and restricted generalizability — and briefly discusses how they can be addressed.

A two-stage, multilevel approach to random assignment is an intriguing way to test a complex set of interventions — such as interventions children experience in sequence as they move from preschool to kindergarten. In the first stage of such a design, groups (schools or centers) of participants are randomly assigned to program and control groups. In the second stage, individuals within those groups (students or children) are randomized to two different conditions. This design creates up to four groups — a group that receives both interventions, a group that receives the first intervention but not the second, a group that receives the second intervention but not the first, and a control group. By comparing the outcomes of participants in these groups, the combined and relative effects of the two interventions can be tested.
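The two stages described above can be sketched as follows. This is a minimal illustration under simple assumptions (half of the groups and half of the individuals within each group are assigned to each condition); the function name, the data layout, and the school and student identifiers are hypothetical and are not drawn from any MDRC study.

```python
import random


def two_stage_assignment(groups, seed=0):
    """Assign groups, then individuals within groups, to conditions.

    groups: dict mapping a group id to a list of individual ids.
    Returns a dict mapping each individual id to a pair of booleans:
    (gets_first_intervention, gets_second_intervention).
    """
    rng = random.Random(seed)

    # Stage 1: randomize whole groups (schools or centers).
    group_ids = sorted(groups)
    rng.shuffle(group_ids)
    first_stage_treated = set(group_ids[: len(group_ids) // 2])

    # Stage 2: randomize individuals within every group.
    assignment = {}
    for group_id in sorted(groups):
        members = list(groups[group_id])
        rng.shuffle(members)
        cutoff = len(members) // 2
        for position, person in enumerate(members):
            assignment[person] = (group_id in first_stage_treated,
                                  position < cutoff)
    return assignment


if __name__ == "__main__":
    # Hypothetical example: four schools with ten students each.
    schools = {f"school_{g}": [f"student_{g}_{i}" for i in range(10)]
               for g in range(4)}
    result = two_stage_assignment(schools)
    cells = {}
    for conditions in result.values():
        cells[conditions] = cells.get(conditions, 0) + 1
    print(cells)
```

The four distinct pairs in the result correspond to the four groups described above: both interventions, the first only, the second only, and neither.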

In a recently published three-article set, MDRC researchers and colleagues discuss quantifying cross-site impact variation using data from multisite randomized trials. The papers show that the extent to which the effects of an intervention vary across settings has important implications for policy, practice, science, and research design. This post distills some key considerations for research design and for reporting and interpreting cross-site impact variation.

Community initiatives that aim to make neighborhoods safer, improve schools, and preserve affordable housing have long emphasized the importance of building good relationships between participating organizations. To understand how the structure of such relationships contributed to outcomes in one such initiative, MDRC applied the methodology of social network analysis to data from a survey administered to organizations in nine Chicago neighborhoods.

Researchers across disciplines have long taken advantage of natural experiments to study the effects of policies at full scale. In the past decade rapid growth in the number of charter schools and school district choice systems has provided education researchers with exciting opportunities to do the same: to use naturally occurring pockets of randomization to rigorously study the effects of policy-relevant education reforms that are already in place, often on a large scale.


Many researchers are concerned about a crisis in the credibility of social science research because of insufficient replicability and transparency in randomized controlled trials and in other kinds of studies. In this post we discuss some of the ways that MDRC strives to address these issues and ensure the rigor of its work.

Researchers are often interested in testing the effectiveness of an intervention on multiple outcomes, for multiple subgroups, at multiple points in time, or across multiple treatment groups. The resulting multiplicity of statistical hypothesis tests can increase the likelihood of spurious findings: that is, finding statistically significant effects that do not in fact exist. Without the use of a multiple testing procedure (MTP) to counteract this problem, the probability of false positive findings increases, sometimes dramatically, with the number of tests. Yet the use of an MTP can result in a substantial loss of statistical power, greatly reducing the probability of detecting effects when they do exist.
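A short calculation shows how quickly the problem grows. The sketch below assumes the tests are independent and that every null hypothesis is true, which is a simplification; it also uses the Bonferroni correction as one familiar example of an MTP, not necessarily the procedure discussed in the post. The function names are our own.

```python
def familywise_error_rate(num_tests, alpha=0.05):
    """Probability of at least one false positive across independent tests
    when every null hypothesis is true and each test uses level alpha."""
    return 1.0 - (1.0 - alpha) ** num_tests


def bonferroni_threshold(num_tests, alpha=0.05):
    """Per-test significance level under the Bonferroni correction."""
    return alpha / num_tests


if __name__ == "__main__":
    for m in (1, 5, 10, 20):
        print(m,
              round(familywise_error_rate(m), 3),
              bonferroni_threshold(m))
```

With 10 independent tests at the 0.05 level, the chance of at least one false positive is about 0.40. The Bonferroni correction holds that chance at or below 0.05 by testing each hypothesis at 0.005, and that stricter threshold is the source of the power loss.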

The Subprime Lending Data Exploration Project is a “big data” project designed to produce policy-relevant insights using an administrative data set that covers nearly 50 million individuals who have applied for or used subprime credit. The data set contains information on borrower demographics, loan types and terms, account types and balances, and repayment histories. To investigate whether there were distinct groups of borrowers in terms of loan usage patterns and outcomes, we used a data discovery process called K-means clustering.
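For readers unfamiliar with the technique, here is a minimal pure-Python sketch of K-means (Lloyd's algorithm). It is an illustration only, not the project's code: the starting centers are supplied by the caller to keep the run deterministic, the features are not standardized, and the two-feature borrower records in the usage example are hypothetical.

```python
import math


def kmeans(points, initial_centers, max_iter=100):
    """Cluster points (tuples of numbers) around the given starting centers.

    Returns (labels, centers), where labels[i] is the index of the center
    closest to points[i] once the assignments stop changing.
    """
    centers = [tuple(c) for c in initial_centers]
    labels = []
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest center.
        labels = [min(range(len(centers)),
                      key=lambda j: math.dist(p, centers[j]))
                  for p in points]

        # Update step: move each center to the mean of its members.
        new_centers = []
        for j, old_center in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members)))
            else:
                new_centers.append(old_center)  # keep an empty cluster in place

        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


if __name__ == "__main__":
    # Hypothetical records: (number of loans, missed payments).
    borrowers = [(1, 1), (1, 2), (2, 1), (9, 9), (9, 10), (10, 9)]
    labels, centers = kmeans(borrowers, initial_centers=[(0, 0), (10, 10)])
    print(labels)
    print(centers)
```

Real applications usually standardize the features first and try several values of K and several starting points, since the result depends on both.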

Across policy domains, practitioners and researchers are benefiting from greater access to more detailed and more frequent data and to the computing power needed to work with large, longitudinal data sets. There is growing interest in using such data as a case management tool: to better understand patterns of behavior, manage caseload dynamics, and target individuals for interventions. In particular, predictive analytics, which has long been used in business and marketing research, is gaining currency as a way for social service providers to identify individuals who are at risk of adverse outcomes. MDRC has multiple predictive analytics efforts under way, which we summarize here while highlighting our methodological approach.