Impact evaluations of pilot programs are often used to learn what is effective, and many researchers claim that the results of these evaluations can inform scale-up. But estimated effect sizes from pilot programs cannot always be credibly extrapolated to settings in which the program is implemented at scale. For example, evidence from Kenya shows that placing an additional contract teacher to tutor lagging students was very effective in improving learning outcomes when a non-governmental organization (NGO) implemented it as a pilot program. However, the same program no longer improved learning outcomes when the government implemented it at scale (Bold et al., 2013). Previous evaluations of NGO-supported pilot programs in India also demonstrated positive effects on learning outcomes of (1) grouping children by ability level and (2) focusing on skills appropriate to that level. However, evidence suggests that mainstreaming these changes into government schools at scale led to important implementation challenges, although eventually the program was scaled up successfully (Banerjee et al., 2016).
So how can evaluations contribute to the successful scale-up of education programs? Such positive contributions require mixed-methods studies that combine impact evaluations, which establish causality, with process evaluations, which examine how the program can be effectively implemented at scale. Impact evaluations remain important for credibly determining the effectiveness of development programs. However, most impact evaluations do not examine the implementation of programs in sufficient detail. And even when they do, the resulting papers often do not clearly document the use of qualitative methods. This needs to change. Qualitative research requires different standards for rigor, but systematic (not necessarily random) sampling approaches and data analysis methods are as important in qualitative research as in quantitative research. At the same time, quantitative researchers need to get better at reporting details about how the programs being evaluated are implemented, and at selecting appropriate measurement tools.
Rigorous mixed-methods research is particularly important in protracted crisis settings, where interventions often focus on achieving intangible results that are hard to measure using quantitative research. In their assessment of the explanatory power of two randomized controlled trials in conflict settings, Burde et al. (2012) argue that “[w]hen properly designed and executed, randomized trials can produce robust and significant findings even in the most difficult circumstances. Had they relied exclusively on quantitative methods, however, the studies discussed here would not have fared as well in explaining why these programs had the impact they had. Mixed methods enhance explanatory power for studies that explore impact and cause-and-effect questions.” This argument shows the importance of examining the causal chain and its underlying structures in order to understand the pathways behind the theory of change. The importance of using theory to inform the scale-up of development programs through impact evaluations is powerfully illustrated by Deaton & Cartwright (2016), who argue that “Without knowing why things happen and why people do things, we run the risk of worthless casual (“fairy story”) causal theorizing.”
American Institutes for Research (AIR) conducted a mixed-methods cluster-randomized controlled trial (cluster-RCT) to determine the short-term impact of a gender socialization program for teachers on teachers’ knowledge, attitudes, and behavior related to gender norms in Karamoja in Northern Uganda. The quantitative findings of this cluster-RCT suggest that the program has positive effects on teachers’ knowledge about the difference between gender and sex, and that it changes teachers’ attitudes toward gender roles. The qualitative research, however, provides a more nuanced understanding of how the teacher training program influences teachers’ attitudes toward gender roles. Specifically, teachers are constrained in adopting attitudes that are supportive of gender equality because these attitudes transgress the gender norms in the community, at least in the short term. As a result, teachers can change only basic practices associated with gender equality and not more complex practices (Chinen et al., 2016). Various teachers learned that men and women can do the same activities. However, the same teachers felt it was difficult to implement the lessons learned. Teachers are often not high-ranking members of the communities where they teach because they are often not from Karamoja. This lack of power complicates the adoption of practices that transgress gender norms in Karamoja. Developing a comprehensive theory of change was crucial for both the design and the implementation of this evaluation.
AIR plans to conduct similar mixed-methods research in partnership with innovation teams funded under the Humanitarian Education Accelerator (HEA). The HEA program was set up by the Department for International Development (DFID), the United Nations Children’s Fund (UNICEF), and the United Nations High Commissioner for Refugees (UNHCR). It aims to generate rigorous evidence on how to transform high-potential pilot projects into scalable educational initiatives for refugees and displaced communities worldwide. To support HEA’s goal, AIR will combine impact evaluations with process evaluations to contribute to the successful scale-up of effective education programs in refugee settings. Partnering with the innovation teams will be particularly important for learning jointly about the effectiveness of their programs. Joint learning will enable the innovation teams to set up effective monitoring and evaluation systems and to use the findings of the evaluation to inform the design of their programs. We expect that the results of the research funded under the HEA will contribute to the design and structure of emerging financing platforms for education interventions in protracted crisis settings, such as the World Bank Global Crisis Response Platform and the Education Cannot Wait Fund.