Evaluating Programs for Community College Students
How Do We Know What Works?
A paper prepared for the White House Summit on Community Colleges
After a brief introduction on community colleges as a pathway to higher education and earnings, I describe a few approaches to evaluating the effectiveness of policies and programs designed to benefit students, along with issues to consider in determining standards of evidence. I then present three examples of programs that have been carefully studied and conclude with thoughts on bringing programs to scale. While there is much to be done to improve student outcomes, there is also reason for optimism. Many states and colleges are piloting reforms, and there is a growing body of evidence on strategies that work.
Community College as a Pathway to Higher Education and Earnings
According to the most recent data from the U.S. Department of Education, community colleges enrolled 6.7 million students in 2007-2008, more than one-third of all students enrolled in higher education institutions. In part because of their open admissions policies and relatively low cost, community colleges enroll larger percentages of nontraditional, low-income, and minority students than four-year colleges and universities. A primary reason why people pursue a college education is to boost future earnings. Over a lifetime, a worker with an associate’s degree will earn nearly $500,000 more than someone with no education beyond a high school diploma. Individuals who earn a bachelor’s degree will do even better, earning roughly $1.1 million more than someone with an associate’s degree and $1.6 million more than a high school graduate.
Anticipated future earnings are not enough to hold students in school or ensure progress. Longitudinal research by the U.S. Department of Education indicates that six years after entering community college, only 23 percent of degree-seeking students had completed an associate’s degree and 13 percent had completed a bachelor’s degree. An additional 17 percent had not earned a degree but were still enrolled at a college or university. Note that these figures capture enrollment at any institution, not just the community college where students began their studies.
The low success rates among community college students are due to many factors. A major problem is that a majority of students require developmental (or remedial) coursework in English or math before they can go on to college-level courses. Unfortunately, pass rates in developmental courses are extremely low; for example, two-thirds of students assigned to developmental math never complete it. A second problem is that most community college students have work or family obligations that compete with school. Research shows that part-time attendance in college, 35 hours or more of work each week, and responsibility for dependents are among the major “risk factors” associated with low persistence and completion.
Approaches to Evaluating Effectiveness and Determining Standards of Evidence
Over the past decade, efforts by government, regional accreditation agencies, and foundation-led initiatives, such as Achieving the Dream, have made community colleges aware of the need to pay closer attention to students’ academic performance and progress toward degrees. Most community colleges track basic measures like retention and graduation, and some even gather information on students who transfer to other colleges or enter the labor market. Such reporting is important for monitoring institutional performance and setting goals for improvement, but is not sufficient for evaluating the effectiveness of any particular policy or program. Community colleges are complex organizations, and many factors may affect how students perform. External conditions also matter; the current economic recession, for example, has led to a surge in college enrollments but probably has made it harder for recent graduates to find jobs.
To evaluate the effectiveness of a community college policy or program, it is essential to introduce a counterfactual, that is, some means of determining what would have happened if the policy or program did not exist. To illustrate, suppose a community college implements a new advising system in order to improve student retention. For evaluation purposes, it is not enough to know that 70 percent of students who went through the advising system were retained the following year; it is quite possible that students would have done just as well without it. The evaluator’s job is to find an appropriate comparison group to estimate the “value added” of the new advising system. If similar students who did not get the advising were retained at a rate of 50 percent, then the value added or impact of the program is the difference between the two retention rates (70 percent – 50 percent = 20 percentage points, or a 40 percent increase in retention relative to the comparison group’s rate of 50 percent).
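The arithmetic in the advising example can be written out as a short calculation. The sketch below is only an illustration: the function name `program_impact` is hypothetical, and the two rates are the ones used in the example above, not results from an actual study.

```python
def program_impact(program_rate: float, comparison_rate: float) -> tuple[float, float]:
    """Return the impact as (percentage-point difference, relative percent change).

    Both rates are expressed in percent, e.g. 70.0 for 70 percent.
    """
    point_difference = program_rate - comparison_rate
    # Relative change is measured against the comparison group's rate.
    relative_change = point_difference * 100 / comparison_rate
    return point_difference, relative_change


# Rates from the advising illustration: 70 percent retained with the
# program, 50 percent retained among similar students without it.
points, relative = program_impact(70.0, 50.0)
print(f"Impact: {points:.0f} percentage points ({relative:.0f} percent increase)")
```

Reporting both numbers is useful because they answer different questions: the percentage-point difference describes how many more students out of 100 were retained, while the relative change describes how large that gain is compared with what would have happened anyway.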
In situations where a policy or program is being introduced as a pilot, or when there is insufficient capacity to serve everyone, most evaluators agree that the best approach to measuring effectiveness is the randomized control trial. A group of individuals that is targeted for an intervention (such as the new advising system mentioned above) is sorted into two groups: a program group that receives the intervention and a control group that does not. The sorting is done using a lottery-like process, so that every individual has an equal chance of ending up in either the program or the control group. The strength of a randomized control trial is that it ensures that the composition of the two groups is virtually identical at the beginning of the study, not only in observable characteristics, like age and gender, but also in unobservable characteristics, like motivation. By tracking both groups over time and comparing their outcomes, researchers can be confident that any differences are due to the intervention and not to one group starting off more advantaged than the other.
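The lottery-like sorting described above can be sketched in a few lines. This is a minimal illustration, not the procedure used in any particular study: the function `randomly_assign`, the student identifiers, and the seed value are all hypothetical.

```python
import random


def randomly_assign(student_ids, seed=None):
    """Split students into equal-sized program and control groups by lottery.

    Shuffling the full list and cutting it in half gives every student
    the same chance of landing in either group.
    """
    rng = random.Random(seed)  # a fixed seed makes the lottery reproducible
    shuffled = list(student_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"program": shuffled[:half], "control": shuffled[half:]}


# Hypothetical pool of 100 students targeted for the intervention.
students = [f"student_{i:03d}" for i in range(1, 101)]
groups = randomly_assign(students, seed=2010)
print(len(groups["program"]), len(groups["control"]))
```

Because assignment depends only on the shuffle and not on anything about the students themselves, characteristics such as motivation should be spread across the two groups in roughly equal measure when the sample is reasonably large.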
Randomized control trials require skill to implement well and are still uncommon in postsecondary education research. More often, researchers will try to construct a comparison group that resembles the program group as much as possible on observable characteristics, like age and educational history. For example, a group of freshmen who participated in a new advising program will be compared to a similar group of freshmen who did not participate. The weakness of such designs is that the two groups may differ in ways that are not readily observed. This is a particular concern if the program was voluntary and served students who were already motivated to succeed. Because matched comparison group designs can never completely remove the possibility of underlying differences between the program and comparison groups, the evidence that comes out of such studies is not as reliable as that from a randomized control trial.
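One simple way to construct such a comparison group is exact matching on observable characteristics. The sketch below assumes hypothetical student records and field names (`age_band`, `needs_dev_english`); real studies typically use more characteristics and more sophisticated statistical methods.

```python
from collections import defaultdict


def match_comparison_group(program, pool, keys):
    """Pair each program student with an unused pool student who has the
    same values on every characteristic in `keys`.

    Program students with no exact match are left out of the result.
    """
    available = defaultdict(list)
    for student in pool:
        available[tuple(student[k] for k in keys)].append(student)

    pairs = []
    for student in program:
        candidates = available[tuple(student[k] for k in keys)]
        if candidates:
            pairs.append((student, candidates.pop(0)))  # each match is used once
    return pairs


# Hypothetical records; none of these values come from an actual study.
program = [
    {"id": "P1", "age_band": "17-20", "needs_dev_english": True},
    {"id": "P2", "age_band": "21-24", "needs_dev_english": False},
    {"id": "P3", "age_band": "25+", "needs_dev_english": True},
]
pool = [
    {"id": "C1", "age_band": "17-20", "needs_dev_english": True},
    {"id": "C2", "age_band": "21-24", "needs_dev_english": False},
    {"id": "C3", "age_band": "17-20", "needs_dev_english": False},
]

pairs = match_comparison_group(program, pool, ["age_band", "needs_dev_english"])
matched_ids = [(p["id"], c["id"]) for p, c in pairs]
print(matched_ids)
```

The sketch also shows the limits of the design: the match uses only what is recorded. Nothing in the records captures motivation, so two students who look identical on paper may still differ in ways that affect their outcomes.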
Neither randomized control trials nor matched comparison designs are well-suited to policies or programs that affect an entire college or population: for example, a new requirement that all entering students attend an orientation to inform them about college procedures and resources. In these situations, evaluators may try to compare outcomes before and after the policy or program was introduced. The best designs to evaluate such policies, known as interrupted time-series, will collect observations on an outcome of interest at many points before and after the policy went into effect to determine whether a significant change occurs after the point of policy implementation. Unfortunately, such designs can never eliminate the possibility that undetected changes in student composition or the environment may also affect the trends. In sum, randomized control trials produce the best evidence that an intervention caused a change in outcomes. They are not, however, the only approach to evaluation, nor are they feasible in all situations.
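The before-and-after logic of an interrupted time-series design can be illustrated with a simplified calculation: fit a trend to the observations collected before the policy, project that trend forward, and compare the projection with what was actually observed afterward. The sketch below is a stand-in for a full interrupted time-series model, which would also test whether the change is statistically significant. The retention figures are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of a straight line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return slope, mean_y - slope * mean_x


# Hypothetical retention rates (percent) for successive entering cohorts.
pre_policy = [50.0, 51.0, 52.0, 53.0]   # cohorts 0-3, before the policy
post_policy = [58.0, 59.0, 60.0]        # cohorts 4-6, after the policy

slope, intercept = fit_line(range(len(pre_policy)), pre_policy)

# Project the pre-policy trend into the post-policy period and measure
# how far the observed rates sit above or below that projection.
gaps = []
for offset, observed in enumerate(post_policy):
    cohort = len(pre_policy) + offset
    projected = intercept + slope * cohort
    gaps.append(observed - projected)

average_gap = sum(gaps) / len(gaps)
print(f"Average gap from projected trend: {average_gap:.1f} percentage points")
```

Projecting the earlier trend matters because outcomes may already have been rising before the policy began; a simple before-and-after average would credit the policy with that existing growth. Even so, a gap of this kind cannot rule out other changes that happened at the same time, such as a shift in the mix of entering students.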
Examples of Promising Programs that Have Been Carefully Studied
In the mid-2000s, MDRC conducted randomized control trials of several programs designed to improve student outcomes at community colleges. The studies — which fell under the umbrella of the Opening Doors Demonstration — were supported by a consortium of foundations, the U.S. Department of Labor, and the U.S. Department of Education.
One of the Opening Doors programs consisted of a Learning Communities intervention at Kingsborough Community College in Brooklyn, New York. The Learning Communities targeted incoming freshmen, the great majority of whom required developmental English. Students in Learning Communities were placed into groups of 15-25 that took three courses together: an English course geared toward their level of proficiency; a regular college course like introductory psychology or sociology; and a student success course, taught by a college counselor, that covered effective study habits and other skills necessary to succeed in college. Faculty who taught in the Learning Communities were expected to coordinate assignments and meet periodically to review student progress. The idea was to build social cohesion among students and faculty and to help students apply the concepts and lessons across the courses.
More than 1,500 students participated in the Learning Communities evaluation and were, as noted, randomly assigned to either a program group that participated in Learning Communities or a control group that took regular, unlinked courses. The students were young (mostly 17 to 20 years old), low-income, and highly diverse in terms of race and ethnicity. The research team tracked program and control group members for two years and found that students in the Learning Communities were more likely to feel integrated at school and be engaged in their courses. They also passed more courses and earned more credits during their first semester, moved more quickly through developmental English courses, and were more likely to take and pass an English skills assessment test that was required for graduation. It is important to note that these effects, while statistically significant, were generally modest. For example, after four semesters, students in the program group earned an average of 33.2 college credits, compared with an average of 30.8 credits for the control group (a difference of less than one course). Moreover, contrary to expectations, the Learning Communities did not have an immediate effect on persistence. Kingsborough is only one test, however, and a new set of randomized control trials is underway in five states to build more evidence on this type of program.
Another Opening Doors study in Louisiana tested the effectiveness of a performance-based scholarship targeted to low-income parents who attended two community colleges in the New Orleans area: Delgado Community College and the Louisiana Technical College-West Jefferson. The scholarships offered $1,000 for each of two semesters ($2,000 total) if students stayed in college at least half-time and maintained a “C” or better average. The scholarships were paid in increments at the beginning, middle, and end of the semester, and program counselors monitored students’ academic performance. Eligibility was limited to students who were parents and whose household income was below 200 percent of the federal poverty level.
A little more than 1,000 students enrolled in the Louisiana study, mostly mothers in their 20s with one or two children. Half were randomly assigned to a program group that was eligible for the scholarship, while the other half were placed in a control group that was not eligible. Both the program and the control groups continued to receive federal Pell grants and other aid for which they qualified. The evaluation found that performance-based scholarships gave students a substantial boost. For example, students in the program group were more likely to register for college and attend full-time, even though only half-time enrollment was required to receive a scholarship. They were also more likely to stay in college. In the second semester of the program, 65 percent of the program group registered for courses, compared with 50 percent of the control group. And, finally, students in the program group completed more course credits than those in the control group, earning on average 3.5 more credits (a little more than one college course) over four semesters. As with Learning Communities, a new set of randomized control trials is taking place in six states to build more evidence on performance-based scholarships in other settings and with other types of students.
A good example of a matched comparison design is provided by an evaluation of Washington State’s Integrated Basic Education and Skills Training (I-BEST) program, conducted by the Community College Research Center (CCRC). I-BEST aspires to help community college students gain proficiency in English and math and also prepare them for specific occupational fields, such as nursing, early childhood education, and automotive repair. What is unusual about I-BEST is that the English and math instruction is integrated into the occupational curriculum, rather than taught separately as in the more conventional approach. For example, students in nursing programs attend English classes that emphasize medical terminology and writing used in health care settings; students in automotive repair learn how to read manuals and use instruments needed to diagnose and correct engine trouble.
Researchers compared the academic outcomes for 900 I-BEST participants with those of more than 31,000 students in regular developmental education courses who were matched on the basis of demographic characteristics, educational background, and enrollment patterns. The study found that I-BEST students had higher persistence rates, earned more credits toward a college credential, earned more occupational certificates, and showed greater improvements on tests. Because it was not a randomized control trial, the study could not eliminate the possibility that students who enrolled in I-BEST were more motivated or had other characteristics that may have distinguished them from students in other developmental courses. For this reason, Abt Associates is now planning a randomized control trial of I-BEST, with funding from the U.S. Department of Health and Human Services.
Bringing Effective Program Strategies to Scale
Though rigorous research on community college interventions is a relatively recent phenomenon, there are examples of institutions that are taking proven practices to scale. Kingsborough Community College offers a prime case: over several years, it has expanded its Learning Communities to serve two-thirds of entering freshmen and plans to grow the program further. Kingsborough’s progress has been made possible by several factors, including strong commitment by its president; ongoing training and support for faculty who teach in learning communities; and a key funder, the Robin Hood Foundation, that has provided support over a sufficiently long period to help the college institutionalize the program using regular college revenues. Kingsborough recently won a grant from the U.S. Department of Education’s Fund for the Improvement of Postsecondary Education to help other community colleges adapt its approach.
The performance-based scholarship program in Louisiana did not continue beyond the demonstration, in part because the aftermath of Hurricane Katrina shifted the state’s attention to more pressing needs. Nonetheless, the positive findings prompted other states and institutions to develop versions of performance-based scholarships — some of which, as noted above, are undergoing further rigorous evaluation. If the findings are positive, the prospects for expansion are good. A considerable amount of public and private money already goes to financial aid programs, and some of these dollars may be designated for performance-based scholarships. Government or private sources may also create an incentive for institutions to implement the approach by offering start-up grants and matching funds. Finally, “how-to” guides and training for scholarship providers and financial aid administrators may encourage further take-up.
The examples above underscore how credible evidence on programs can focus attention on an idea and create the will to bring effective programs to scale. At the same time, they show that running and testing a pilot project is only the beginning. States and colleges require strong leaders who can articulate a vision, set goals, and mobilize resources to continue building on the reforms. States and colleges must also pay attention to the faculty and staff responsible for policy and program implementation. Training and professional development are key; ultimately, many individuals — and not just a few champions — need to “own” the ideas and apply the research lessons to their own context. Finally, dependable funding is essential.
States and colleges may benefit from two types of funding from government and private sources. One is for program innovation and testing, to continue the search for policies and programs that will lead to greater student success. A second is for adopting or expanding practices already proven to be effective. Of course, grant funding cannot be expected to last indefinitely and ought to include incentives or requirements for states and institutions to designate matching funds and develop long-term plans for sustainability. Evaluators can help by providing policymakers and program administrators with information on costs and by analyzing the cost-effectiveness of strategies. For example, it may be possible to justify increased funding for interventions that accelerate time-to-degree or lead to other improved outcomes based on cost savings.
 Snyder, T.D., and Dillow, S.A. (2010). Digest of Education Statistics 2009 (NCES 2010-013). Washington, DC: National Center for Education Statistics, Institute of Education Sciences, U.S. Department of Education. Table 187.
 Provasnik, S., and Planty, M. (2008). Community Colleges: Special Supplement to the Condition of Education 2008 (NCES 2008-033). Washington, DC: National Center for Education Statistics, Institute of Education Sciences, U.S. Department of Education.
 Carnevale, A., Smith, N., and Strohl, J. (2010). Help Wanted: Projections of Jobs and Education Requirements through 2018. Washington, DC: Georgetown University Center on Education and the Workforce.
 Berkner, L., He, S., and Cataldi, E.F. (2002). Descriptive Summary of 1995–96 Beginning Postsecondary Students: Six Years Later (NCES 2003–151). Washington, DC: National Center for Education Statistics, U.S. Department of Education.
 Bailey, T., Jeong, D.W., and Cho, S. (2010). “Referral, Enrollment, and Completion in Developmental Education Sequences in Community College.” Economics of Education Review, 29(2), 255-270.
 Choy, S. (2002). Findings from the Condition of Education 2002: Nontraditional Undergraduates (NCES 2001-012). Washington, DC: National Center for Education Statistics, Institute of Education Sciences, U.S. Department of Education.
 Shadish, W., Cook, T., and Campbell, D. (2002). Experimental and Quasi-Experimental Designs for General Causal Inference. New York: Houghton Mifflin.
 Scrivener, S., et al. (2008). A Good Start: Two-Year Effects of a Freshmen Learning Community Program at Kingsborough Community College. New York: MDRC.
 Richburg-Hayes, L., et al. (2009). Rewarding Persistence: Effects of a Performance-Based Scholarship Program for Low-Income Parents. New York: MDRC.
 Jenkins, D., Zeidenberg, M., and Kienzl, G. (2009). Educational Outcomes of I-BEST, Washington State Community and Technical College System’s Integrated Basic Education and Skills Training Program: Findings from a Multivariate Analysis. New York: Community College Research Center.