Figure: Balancing Inclusiveness, Rigour and Feasibility (diagram labels: Multi-stage sampling; Collect and link data of open systems; Participatory mixed-methods; Inclusiveness; Feasibility)
Source: Presentation given by the authors at the IDEAS Global Assembly, 29 October 2015, Bangkok.
of methods, facilitation of processes, data collation, cross-validation and causal analysis. This is to ensure consistency and responsiveness to the purposes and constraints of the evaluation, necessary to establish sufficient confidence among stakeholders in its findings and conclusions (Rogers 2009; Stern et al. 2012). Inclusiveness involves meaningful engagement of stakeholders with diverse perspectives, which has an intrinsic empowering value while also enhancing credibility of the evaluation through triangulation and cross-validation of evidence. Feasibility concerns the budget and capacity needed to meet expectations of rigour and inclusiveness and to enhance learning (Chambers 2015).

PIALA was first piloted in Vietnam (IFAD and BMGF 2014) and then in Ghana (MOFA/GOG, IFAD and BMGF 2015). Both pilots used these three standards for framing the evaluation: data collection and linking; synthesising findings; and analysing and debating programme contributions. Insights from the first pilot enabled the second to address some of the challenges encountered. This paper describes key trade-offs and lessons. First, we present PIALA as a response to the main challenges of impact evaluation in complex environments. Then, we discuss our insights about the possible trade-offs. We conclude by presenting reflections on how to balance rigour, inclusiveness and feasibility.

2 PIALA’s response to challenges in impact evaluation

Challenges in impact evaluation
IFAD-funded government programmes are implemented in rather complex ways and environments that challenge mainstream evaluation. Four challenges are common to many contexts.

First, clean ‘control’ groups are rarely found and quantitative estimation of net attributable impacts on ‘treated’ compared to ‘control’ groups is often impossible or inadequate. This is the case in IFAD programmes where institutional and policy work ‘contaminate’ entire populations, where other donors and influences augment ‘causal density’, where self-targeting mechanisms make ‘treatments’ highly diverse, and where innovations have emergent results (Woolcock 2013). Hence the need for different ways of arriving at rigorous causal inference that better fit complex environments (Befani 2012; Guijt and Roche 2014).

Second, programme managers and even funders can feel threatened by traditional types of evaluation that focus on performance against pre-set targets, whereas in complex environments there is less control over results (e.g. similar processes can lead to different outcomes). This hinders solid debate and learning about impact. By seeking to understand programme contributions to impact, alongside many other influences, and by taking a broader systemic perspective, fear of failure can partially be sidestepped (Eyben et al. 2015).

Third, it is important to analyse and understand development impacts more systemically when seeking transformational or systemic change that is more inclusive and sustainable and grounded in rights and democracy (Eyben 2008). The tendency in mainstream evaluation practice is to slice programmes into measurable parts and then look at intervention and effect for each in isolation (Befani, Ramalingam and Stern 2015). In Ghana, for instance, in some studies of specific programme mechanisms conducted before the PIALA study, such a reductionist perspective resulted in quite perilous
CDI PRACTICE PAPER 14 February 2016 www.ids.ac.uk/cdi
PAGE 3 PRACTICE PAPER CDI
Table 1 PIALA methods and processes

1 Outlining of design options and budget implications (full scale–full scope; full scale–limited scope; or limited scale–full scope)
Purposes:
■ Enable commissioners to decide on scope and scale of evaluation

2 Reconstruction and visualisation of programme Theory of Change (ToC)
Purposes:
■ Identify causal claims and assumptions
■ Formulate evaluation questions
■ Create shared understanding among stakeholders of programme theory and broader influences

3 Multi-stage sampling with ‘open systems’ as principal sample unit (e.g. value chain systems)
Purposes:
■ Enable systemic inquiry and comparative analysis of impact on livelihoods and household poverty

4 Selection of methods for data collection, and drafting of ‘how-to’ guidance and templates for each method and for quality monitoring
Purposes:
■ Enable rigorous use of methods and facilitation of processes
■ Enable systematic data quality monitoring and reflective practice

5 Data collection on changes and causes in household food and income through:
■ Household survey
■ Generic change analysis in gender-specific groups (social mapping, timeline, wealth and wellbeing ranking, causal flow mapping)
Purposes:
■ Collect and triangulate data on impacts
■ With intended beneficiaries visually reconstruct and discuss causal flow of changes in livelihoods affecting household wealth and wellbeing

6 Data collection on livelihood changes and causes through:
■ Generic change analysis (see above)
■ Livelihood analysis in gender-specific groups (livelihood change matrix, causal flow mapping, SenseMaker)
Purposes:
■ Collect and triangulate data on effects of livelihood changes on household food and income
■ Visualise and discuss with intended beneficiaries causal flow of changes and causes in different areas affecting their livelihood

7 Data collection on reach and effects of selected programme mechanisms through:
■ Livelihood analysis (see above)
■ Constituent Feedback (CF) in mixed groups (questionnaire for discussing and anonymous scoring)
■ Semi-structured interviews with service providers and officials (CF-linked questionnaire)
Purposes:
■ Collect and triangulate data on effects of programme mechanisms on changes and causes in various areas affecting livelihoods
■ With intended beneficiaries discuss and anonymously score reach, benefits, outcomes of mechanisms

8 Data linking and quality monitoring using a standard data collection tool, and team reflections on quality of methods, processes and evidence using a standard questionnaire
Purposes:
■ Enable instant data processing and cross-checking to identify gaps and weaknesses
■ Ensure robust evidence (inclusive, sufficient, consistent, rigorous)

9 Local and national participatory sensemaking using a workshop model consisting of design principles and methods for enabling voice and facilitating cross-validation and contribution scoring

10 Configurational analysis using standardised data collation and scoring tools
Purposes (methods 9 and 10):
■ Probe to fill remaining data gaps
■ Enable stakeholders to understand impact systemically
■ Engage stakeholders in valuing programme contributions and identifying priority investment areas
Source: Drawn from the Root and Tuber Improvement and Marketing Programme (RTIMP) impact evaluation report in the Ghana pilot
(cf. MOFA/GOG, IFAD and BMGF 2015).
Depth versus breadth of inquiry (inclusiveness versus rigour and feasibility)
PIALA’s mixed-methods approach pursues depth, through focused participatory inquiry of ‘open systems’, and breadth, through representative household surveys. To enable data linking, households are sampled within the sample of these ‘systems’. However, there are trade-offs when using this approach at scale, as participatory research becomes onerous and limited resources can further compromise rigour.

Synthesising evidence, and analysing and debating contribution claims
In this phase, data are ‘zipped up’ again along the ToC to show what evidence upholds or refutes the assumed contribution claims. This leads to answering causal questions about what has produced which observed outcomes and impacts, for whom and why (BetterEvaluation 2014). Participatory sensemaking and configurational analysis (cf. Table 1) form the backbone of this phase.
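
The nested sampling described under ‘Depth versus breadth of inquiry’ (households drawn only from within the sampled ‘open systems’, so that survey and participatory data can be linked by system) can be illustrated with a minimal Python sketch. The sampling frame, identifiers and sample sizes are hypothetical, and simple random draws at both stages are an assumption made for illustration; this is not the pilots’ actual sampling design.

```python
import random

def multi_stage_sample(frame, n_systems, n_households, seed=0):
    """Stage 1: draw systems. Stage 2: draw households inside each drawn system.

    Keying the result by system id is what makes later data linking possible.
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    chosen_systems = rng.sample(sorted(frame), n_systems)
    sample = {}
    for system_id in chosen_systems:
        households = frame[system_id]
        k = min(n_households, len(households))  # never ask for more than exist
        sample[system_id] = rng.sample(households, k)
    return sample

# Hypothetical sampling frame: system id -> household ids within that system
frame = {
    "system_A": ["A01", "A02", "A03", "A04"],
    "system_B": ["B01", "B02", "B03"],
    "system_C": ["C01", "C02", "C03", "C04", "C05"],
}

sample = multi_stage_sample(frame, n_systems=2, n_households=3)
for system_id, households in sample.items():
    print(system_id, households)
```

Because every sampled household carries the identifier of its system, household survey records can later be joined to the participatory evidence collected for the same system.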
Figure 2 Part of the RTIMP configurational analysis
Column groups: Contributions of RTIMP Components 1, 2 and 3 → Improved Livelihoods (I2); Contribution Claim of RTIMP Component 3 → Enhanced Processing (O3); Contribution Claim of RTIMP Component 2 → Enhanced Production (O2); Contribution Claim of RTIMP Component 1 → Enhanced Market-Linking (O3)
Tano North (Apesika) (CZ): 1 1 1 1 3 6 5 5 5 5 4 4 5 5 5
Techiman (CZ): 1 1 1 1 4 5 5 5 5 5 4 4 5 5 5
Birim Central (CZ): 1 1 1 1 3 3 4 5 5 4 3 4 4 4 5
Nkwanta South (NZ): 1 1 1 0 3 4 5 5 4 5 3 3 5 4 5
Upper West Akim (CZ): 1 1 1 1 2 4 4 5 5 4 3 3 5 4 5
Ashanti Mampong (CZ): 1 1 1 1 3 4 5 5 5 5 3 3 5 4 5
West Gonja (Damongo) (NZ): 1 1 1 0 3 4 5 5 4 5 3 3 5 4 5
Abura Asebu Kwamankese (SZ): 1 1 1 1 3 3 5 5 5 6 3 3 5 4 4
Nanumba North (NZ): 1 1 N/A N/A 5 5 5 3 3 5 4 5
Central Gonja (NZ): 1 1 N/A 2 3 5 5 4 5 2 2 5 4 5
Zones: NZ = Northern Zone; CZ = Central Zone; SZ = Southern Zone
Key: Gari; HQCF; Yam; PCF; Other
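
As a rough illustration of how a matrix like Figure 2 can be compared across supply chains, the Python sketch below groups chains by their binary pattern of mechanism presence and summarises an evidence score within each pattern. All records, names and scores are hypothetical and are not taken from the RTIMP evaluation; the grouping and the simple summary statistics are assumptions for illustration, not the authors’ scoring tools, which rested on detailed explanatory evidence scored for ‘strength’ and ‘consistency’.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records, NOT taken from the RTIMP evaluation.
# "presence": binary codes for four programme mechanisms (1 = formally present).
# "score": an illustrative evidence score for a single causal link.
chains = [
    {"name": "chain_1", "presence": (1, 1, 1, 1), "score": 5},
    {"name": "chain_2", "presence": (1, 1, 1, 1), "score": 4},
    {"name": "chain_3", "presence": (1, 1, 1, 0), "score": 3},
    {"name": "chain_4", "presence": (1, 1, 1, 0), "score": 3},
    {"name": "chain_5", "presence": (1, 1, 0, 0), "score": 2},
]

def group_by_configuration(records):
    """Group supply chains that share the same pattern of mechanism presence."""
    groups = defaultdict(list)
    for record in records:
        groups[record["presence"]].append(record)
    return dict(groups)

def summarise(groups):
    """For each configuration, list its chains and the spread of their scores."""
    summary = {}
    for config, members in groups.items():
        scores = [m["score"] for m in members]
        summary[config] = {
            "chains": [m["name"] for m in members],
            "mean_score": mean(scores),
            "min_score": min(scores),
            "max_score": max(scores),
        }
    return summary

summary = summarise(group_by_configuration(chains))
for config, stats in sorted(summary.items(), reverse=True):
    print(config, stats)
```

Comparing the summaries across configurations shows where chains with different treatment patterns differ in their scores and where they do not, which is the kind of similarity-and-difference reading a configurational comparison relies on.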
Classical counterfactual or configurational counterfactual (rigour versus feasibility)
Mainstream impact evaluation assumes that comparative data analysis from treated and non-treated sites is both accessible (thus feasible) and necessary (thus rigorous) to reach generalisable conclusions about impact on rural household poverty. However, where this is not the case (which is quite common), other forms of rigorous analysis are needed.

In the Vietnam pilot, concerns about heterogeneity in programme treatment and sample limitations made it difficult to align findings with the ToC. These concerns and limitations made us hesitant to generalise certain findings regarding programme contribution. Rigour and feasibility appeared as an ‘either/or’ type of trade-off.

In Ghana, this trade-off was solved by choosing a different causal approach combining ‘configurational’ with ‘generative’ perspectives (Punton and Welle 2015; Stern et al. 2012). We used systemic heterogeneity as the basis for identifying and analysing programme contributions. Instead of a classic counterfactual inquiry of household-level impact, we employed a counterfactual approach that looked at the effects of different patterns of treatment that combined presence/non-presence, functional conditions and differentiated effects of programme mechanisms on livelihoods and households.

We developed a configurational analysis method to compare the evidence collected for each causal link in each of the three contribution claims in the ToC across our sample of supply chains (first column in Figure 2). For each supply chain, the formal presence of programme mechanisms such as Farmer Field Forum (FFF) or Micro-Enterprise Fund (MEF) was inputted as a binary code (next four columns in Figure 2). We scored the causal link between the contribution claims and the impact claim (columns on ‘livelihood improvements’), and the evidence for this link, on ‘strength’ and ‘consistency’.

Similarly, we scored the evidence for each causal link in which a mechanism operated, and the reach and performance of the mechanism in each contribution claim (columns on ‘enhanced processing’, ‘enhanced production’ and ‘enhanced market-linking’). The scoring was done based on detailed explanatory evidence collected for each of the claims independently (with different sets of methods and different groups). The analysis then looked at similarities and differences of various configurations of clusters of scores across the supply chains supported by the evidence (MOFA/GOG, IFAD and BMGF 2015).

Not all contexts will allow for such a thorough and detailed configurational analysis, as it requires high-quality data and analytical capacity. While there will always be a tension between rigour and feasibility, the approach can be adapted to bring rigorous analysis within reach, in contexts where classical counterfactual approaches cannot be applied.

Involving international experts or investing in local capacity (rigour versus feasibility)
Undertaking a rigorous aggregated multi-causal analysis demands high-level analytical skills. In contexts where local research institutions do not yet have these skills, conducting impact evaluations of complex programmes such as those funded by IFAD becomes less feasible.

In both pilots, researchers were not experienced with rigorous aggregated multi-causal analysis. Because of the methodological innovation, the authors took responsibility for the final analytical product. Ideally, however, the national research coordinator should undertake the aggregated analysis and final reporting as part of delivery of the evaluation.

To ensure sufficient analytical and reporting capacity, we see two options. The first is to work – as we did in Vietnam – with international impact evaluation specialists leading on final analysis and reporting. This option is not optimal for fostering in-country capacity and responsibility for conducting rigorous impact evaluations.

A second option involves investing in research partnerships with in-country research firms, thus strengthening local competencies. Integrating PIALA in programme design as part of an impact-oriented monitoring and evaluation (M&E) process would increase cost-effectiveness, lay the foundation for better knowledge for policy and decision-making, and create more democratic space for stakeholders to influence decisions (Guijt 2014; Peersman et al. forthcoming).

We shifted towards the second option in Ghana, ensuring that those involved in the tendering process fully understood the requirements and including far more days for in-country supportive supervision of the pilot. The research coordinator was strongly involved in the entire evaluation process, leading to stronger ownership and responsibility than in the case of Vietnam. IFAD’s country programme manager also actively engaged in design and sensemaking. The Ghana initiative was thus experienced as more of a joint learning journey – a partnership rather than a technical consultancy.

Degree of participation in sensemaking (inclusiveness versus rigour and feasibility)
Rigorous facilitation of participatory sensemaking – i.e. being responsive to local conditions and dynamics, while consistently employing the same set of models and tools in every locality – is essential for enhancing credibility and confidence in evaluation findings (i.e. rigour) as well as generating solid debate and systemic learning among key stakeholders (i.e. inclusiveness). Again, this makes feasibility more elusive.

Engaging beneficiaries, service providers and decision-makers in collective sensemaking of emerging evidence before turning to final analysis and reporting has both instrumental and empowering value (MOFA/GOG, IFAD and BMGF 2015). Doing this in all researched localities and at programme level helps to improve and strengthen the evidence, overcome bias, and create ownership of evaluation findings among stakeholders. For this to succeed, it is crucial to design and facilitate the processes carefully, in ways that enable all participants to critically engage and express their views in the presence of power-holders, and adopt a systemic perspective in valuing programme contributions to impact (cf. Section 2). A participatory sensemaking workshop model was developed that was first piloted in Vietnam and further expanded and improved on in Ghana.

Using this model, in Vietnam, we organised six village-level workshops with 180 participants and one provincial workshop with 100 participants, while in Ghana, there were 23 district workshops with 650 participants and one national workshop with 100 participants. Participants were purposively sampled from the research participants. Beneficiaries comprised more than 70 per cent of those
attending local workshops, and more than 30 per cent at provincial/national workshops. The workshops were quite successful in both pilots. Participants gained a more complete picture of the development processes. There were lively debates about programme contribution to impact and priority areas for future investment. Critical to this success were the time and resources invested in organising the workshops, and the capacity to rigorously design and facilitate them. When operating on a shoestring, the number of workshops and participants may need to be limited. But this undoubtedly has implications for rigour and inclusiveness.

First, being clear about how to meet each of the standards is crucial. Rigour is arguably the standard around which most was achieved in Ghana as we learned from what had happened in Vietnam and became much clearer about what it entailed. Rigour involved being thorough and careful methodologically and analytically, as well as being thoughtful about whose voices informed the findings. Inclusiveness was considered instrumental to arrive at greater rigour, while also being essential for learning and for influencing future policy and practice. In both PIALA pilots, considerable space was created for stakeholders to cross-validate and debate emerging evidence. Rigour also involved sampling thoroughness and appropriate method selection, which affects the ability to conduct a ‘full scale – full scope’ evaluation in a way that permits rigorous configurational analysis. This was achieved in Ghana, we argue, despite the classic counterfactual (using control groups) not being feasible.

Second, being clear with commissioners upfront about which standards are essential to serve which purposes helps decisions to reduce problematic trade-offs and accept those that remain inevitable. We sought to pursue all three standards equally as part of the piloting process, but made clear choices. For example, the importance of inclusiveness for both analytical quality and uptake of findings in Ghana led us to invest more in participatory sensemaking, instead of additional participatory data collection on poverty characteristics for designing the household survey. Another example from Vietnam involved the unavoidable presence of programme staff during fieldwork, which was necessary to make it feasible. Undoubtedly, this must have had some influence on what villagers chose to share. Yet we were able to mitigate excesses by rigorous facilitation and triangulation of different processes. These examples show the win-wins that can be achieved from putting serious thought into balancing rigour, inclusiveness and feasibility by carefully reflecting on potential losses in value for money if one were to be prioritised over the other.

To conclude, trade-offs are clearly not absolute, and it is worthwhile exploring win-wins in any context to reduce losses and enhance the evaluation’s value for money. One does not have to forfeit inclusiveness completely if rigour is deemed a non-negotiable. Nor does rigour have to be compromised totally when budgets and capacities are restricted. Each of the standards can be conceived of as a gradient. For example, inclusiveness can be approached from a minimalist perspective – ensuring enough to cross-validate key findings but perhaps cutting short on the aspiration of more collective learning and empowering forms of inclusiveness. Similarly, operating constraints may mean that it is not feasible to pursue more detailed surveys in larger samples to build more airtight statistical rigour, yet still permit building sufficient confidence in findings to stand up to scrutiny.

Being inclusive and rigorous does, however, make PIALA more demanding of time, capacity and budget. Evaluations of smaller programmes with smaller budgets and limited scale will not need large samples and are therefore likely to be cheaper, but may still produce little value without sufficient capacity. Arguably, impact evaluation is always difficult, so there are no real shortcuts. But win-wins are more likely where quality standards are clearly defined and where sufficient guidance is provided for every step in the evaluation process – and above all, where there is a commitment to building learning and research capacity.
Notes
1 IFAD is a specialist UN agency providing loans and support to governments for smallholder agricultural development.
2 The piloting of PIALA was made possible with financing from IFAD and BMGF. However, this paper does not represent the views of the funders; it presents views of the individual authors.
3 In dominant evaluation practice, rigour connotes the controlled avoidance of bias through statistical procedure (Befani, Barnett and Stern 2014). This narrow definition is inadequate for mixed participatory methods evaluations. The premise is that bias cannot be avoided by a single method or procedure but can be mitigated through triangulation of different methods and perspectives (Camfield, Duvendack and Palmer-Jones 2014).
4 The paper builds on methodological reflections conducted with stakeholders and researchers that were documented by one of the authors (Adinda Van Hemelrijck) as part of the IFAD and BMGF innovation project and her doctoral study (cf. IFAD and BMGF 2013a, 2015; Van Hemelrijck 2014).
Centre for Development Impact (CDI)
The Centre is a collaboration between IDS (www.ids.ac.uk), Itad (www.itad.com) and the University of East Anglia (www.uea.ac.uk).
The Centre aims to contribute to innovation and excellence in the areas of impact assessment, evaluation and learning in development. The Centre’s work is presently focused on:
(1) Exploring a broader range of evaluation designs and methods, and approaches to causal inference.
(2) Designing appropriate ways to assess the impact of complex interventions in challenging contexts.
(3) Better understanding the political dynamics and other factors in the evaluation process, including the use of evaluation evidence.

This CDI Practice Paper was written by Adinda Van Hemelrijck and Irene Guijt.
The opinions expressed are those of the author and do not necessarily reflect the views of IDS or any of the institutions involved. Readers are encouraged to quote and reproduce material from issues of CDI Practice Papers in their own publication. In return, IDS requests due acknowledgement and quotes to be referenced as above.

© Institute of Development Studies, 2016
ISSN: 2053-0536
AG Level 2 Output ID: 323