Of motorbikes, M&E, and measuring impact—insights from HEA’s external evaluator


Watch Thomas discuss the challenges in the field of impact evaluation.

When you think of social science researchers, you probably don’t imagine them crossing rivers and deserts, and navigating hard-to-reach places on motorbikes.
But the work of Thomas de Hoop, Senior Researcher at the American Institutes for Research (AIR), and his colleagues can sometimes involve those activities. That is because a lot of their work deals with marginalized populations, and making sure their needs and views get reflected can turn into a bit of an adventure.
Thomas has just started collaborating with the HEA as an external evaluator. He and his colleagues will help the cohort build capacity for conducting high-quality process and impact evaluations. Since the collaboration has just started, we’ll have to stay tuned to see what logistical challenges come up for Thomas this time. But on a more traditional, technical front, he and his team have already been quite busy.
What AIR does
AIR evaluates a large number of programs in developing countries, ranging from education and early childhood development to nutrition and cash transfer programs. Its work spans a wide range of settings, including refugee contexts, in countries such as Algeria, Ethiopia, Lebanon, and Pakistan.
In its work with HEA, the AIR team wants to deliver high-quality research and evaluation. And, equally important, it wants to build the capacity of the innovation teams to do as much as possible themselves.
“We hope to build the implementers’ capacity for monitoring and evaluation,” Thomas says. “We’re coming in as externals and we might not be available five years from now. We want to help them improve over time, to learn to collect more relevant and reliable information. Evaluation is really learning by doing, and we want to work together with the teams as much as possible.”
He adds, “We want to partner with them in the HEA and be a resource for them to get the data they need and find the best ways to collect those data.”
How does AIR do research?
AIR uses both quantitative and qualitative methods. This means conducting large-scale household surveys as well as in-depth interviews and focus group discussions. AIR researchers always team up with local researchers, both to gain better access to local populations and to build capacity.
In addition, all of their evaluations are based on a theory of change, which they develop along with the implementers.
Theory of change
According to Thomas, you can think of a theory of change as a causal chain that links the activities of the program to its outputs and then to its intermediate and final outcomes. It’s not fixed; it can change over time as new information comes in.
“We build a theory of change by starting with information about initial conditions of the context. We look at what kinds of activities the innovators are implementing, and the different pathways through which these activities can achieve their final goals.
“These will be different for the three projects, because they’re all working in different contexts. Refugee contexts already differ a lot from regular developing country contexts. Then there are differences between different refugee camps.”
Thomas adds that it’s important to be prepared for the fact that things don’t always go as planned.
“We do process evaluations to see what’s getting implemented that was planned, what’s not being implemented as planned, and how the programs can improve. To do this, we do interviews with households influenced by the program, as well as other stakeholders.
“There are two things that really matter here for effective collaboration with the innovation teams: That the innovators know their programs very well and provide input. And that we co-create the theories of change with them. We want everyone to be on the same page; this will make sure the evaluations will be helpful.
“We also like to work with the innovators to identify the key decisions they need to make over the next months and years, in order to scale successfully. We want to figure out which of those decisions are most crucial and where they have the most doubts.”
What are the challenges and opportunities for evaluation in the refugee context?
Data collection can be challenging because local researchers may have limited capacity.
Another challenge is the mobility issue. “Often, you try to collect data from the same people over time, but many people in refugee contexts are migrating and it’s not clear where they live now, or where they will be two years later. This makes it very challenging.”
On the other hand, mobile phone ownership is very high in the refugee context, which might present opportunities for innovative data collection to track these people.
What happened at bootcamp?
Thomas and his colleagues took part in the first HEA bootcamp in October 2016, to meet one-on-one with the cohort and start planning for the work ahead. According to Thomas, even in those few days they made a lot of progress. He explained that he and his team primarily want to make two contributions through their evaluations:
- Address knowledge gaps on what works to improve education outcomes in the refugee context, as there is not a lot of evidence on this so far.
- Identify and address research questions that are relevant for the innovation teams and use the answers to resolve the scaling challenges the cohort members are facing.
The two goals require different research methods, Thomas explains, but they are complementary because “we only want to scale programs that actually do improve education outcomes.”
At bootcamp, cohort members and AIR team members came up with two types of research questions, which they will be exploring together going forward:
- The effect of their programs on education outcomes for refugees
- The barriers to scaling these programs and keeping them effective
Thomas explained that each of the cohort teams faces different, specific challenges.
“My first thought when I heard about scaling is that it involves expansion, i.e. reaching a larger number of beneficiaries in the same context. Which is the case for WUSC. But in the case of Kepler, it also means adaptation, i.e. the program started in Kigali and they want to adapt the program to make it equally effective in the refugee context. As for War Child, it’s more about adaptation of the program to other countries – will the approach they used in Sudan also work in Jordan and Lebanon?
“It’s very important that the groups learn from each other. In spite of their differences, they face many similar challenges. For example, gender balance—making sure women apply to their programs and attend school—is a challenge for all of them.”
Main lesson from bootcamp?
“We learned that a lot is feasible. We just have to be creative in coming up with our research designs. For example, we’re looking at relatively small programs that want to scale. Estimating their effect on education outcomes will be challenging, because we’ll have to rely on statistically small samples. This can be partly resolved by using mixed methods and by relying on a larger number of control schools and more data collection rounds.”
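To see why more control schools and more data collection rounds help when the program itself is small, here is a rough illustration. It is not AIR's actual design or calculation, just a minimal sketch of a textbook minimum detectable effect (MDE) for a simple comparison of treatment and control means. The function name, the school counts, the assumed correlation between rounds (`rho`), and the standardized outcome (`sd=1.0`) are all hypothetical choices made for this example, and the sketch ignores clustering of students within schools.

```python
from math import sqrt
from statistics import NormalDist


def minimum_detectable_effect(n_treat, n_control, rounds=1, rho=0.5,
                              sd=1.0, alpha=0.05, power=0.8):
    """Smallest true effect a two-sided test would detect with the given power.

    Assumptions (illustrative only): each school contributes the average of
    `rounds` follow-up measurements, every pair of rounds within a school has
    correlation `rho`, and the outcome has standard deviation `sd`.
    """
    normal = NormalDist()
    # Critical value for the test plus the quantile for the desired power.
    multiplier = normal.inv_cdf(1 - alpha / 2) + normal.inv_cdf(power)
    # Variance of one school's average over equally correlated rounds.
    per_school_variance = sd ** 2 * (1 + (rounds - 1) * rho) / rounds
    # Standard error of the difference between treatment and control means.
    standard_error = sqrt(per_school_variance * (1 / n_treat + 1 / n_control))
    return multiplier * standard_error


if __name__ == "__main__":
    # Hypothetical scenarios: the number of program schools stays at 10.
    scenarios = [
        ("10 program, 10 control, 1 round", (10, 10, 1)),
        ("10 program, 30 control, 1 round", (10, 30, 1)),
        ("10 program, 30 control, 3 rounds", (10, 30, 3)),
    ]
    for label, (n_t, n_c, r) in scenarios:
        mde = minimum_detectable_effect(n_t, n_c, rounds=r)
        print(f"{label}: MDE = {mde:.2f} standard deviations")
```

Under these assumptions, the detectable effect shrinks as control schools and rounds are added, even though the number of program schools never changes. The gain from extra rounds depends on `rho`: the more a school's results resemble each other from round to round, the less each additional round adds.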