7 Additional Impact Evaluation Designs and Essential Tools for Better Project Evaluations
Pages 177-198



From page 177...
... Also addressed were some of the objections that the committee's field teams heard about the viability of adopting randomized evaluations more generally. While concerns about the impracticality of randomized evaluations must be taken seriously, in principle many of them could be dealt with through creative project design and/or greater flexibility in the selection of units for treatment or the timing of project rollout.
From page 178...
... Our approach, therefore, is not to assess the quality of current monitoring plans but rather to assess and illustrate instances where additional information that could reveal the impact of DG projects is currently not being collected but could readily be acquired. The first finding of the analysis was that all 10 of the activities examined used M&E plans that omitted collection of crucial information that would be needed if USAID sought to make impact evaluations of those activities. The committee does not mean to criticize current M&E plans, which focus on acquiring important information for program management and resource allocation.
From page 179...
... Nonetheless, on the positive side, 5 of the 10 activities were found to be, in principle, amenable to using randomized evaluation designs to determine project impacts; 4 other activities were found to be amenable to collection of baseline or nonrandom comparison group data that would significantly improve USAID's ability to know whether or not the activity in question had a positive impact. Seven of the 10 activities were found to be amenable to changes in how outcomes were measured that by themselves would markedly strengthen the monitoring they were already doing. The measurement changes alone were judged to be capable of bringing the average ability to provide inferences about project outcomes from 1 to 3 on the 10-point scale, while the shift to collecting data for impact evaluation designs was found to be capable of raising the average score for making sound inferences of project effects to over 7.
From page 180...
... Perhaps even more important, fully 9 out of 10 of these activities were found to be suitable for some form of the impact evaluation designs described in Chapter 5. Given that none of these activities in Uganda are currently collecting the kind of information needed for such impact evaluations, but 9 out of 10 could potentially do so, USAID appears to have a great deal of choice and flexibility in deciding how much, and whether, to increase the number of programs and the amount of information it collects to determine the effects of its DG activities.
From page 181...
... to be treated is greater than one, all three of these attributes of impact evaluation are possible. The major difference between randomized evaluations and other methodologies lies in the degree to which project designers need to concern themselves with the number and selection of control units.
From page 182...
... When randomization is not possible, but selection of multiple treatment and control areas is, conditions are ideal for the "second-best" method of large-N nonrandomized designs. This sort of design is often referred to as "difference in difference" (DD; Bertrand et al. 2004).
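The DD logic can be illustrated with a minimal sketch. The attendance rates below are hypothetical (chosen to echo the consultant's-report example discussed later in the chapter), not drawn from any actual USAID data:

```python
def diff_in_diff(treat_pre, treat_post, ctrl_pre, ctrl_post):
    """Difference-in-difference estimate: the change in the treated
    group minus the change in the control group, which nets out
    trends common to both groups."""
    return (treat_post - treat_pre) - (ctrl_post - ctrl_pre)

# Hypothetical shares of citizens attending municipal meetings,
# before and after the project, in treated and control areas.
effect = diff_in_diff(treat_pre=0.10, treat_post=0.15,
                      ctrl_pre=0.10, ctrl_post=0.12)
print(f"estimated project effect: {effect:+.2f}")  # +0.03
```

The point of the subtraction is that the 2-point rise in the control areas is treated as the background trend, so only the remaining 3 points of the treatment-area change are attributed to the project.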
From page 183...
... The national-level control group, however, could be used to show differences between the nation and the project areas in terms of not only poverty, degree of urbanization, and so forth but also many of the project impact measurements that USAID requires to determine project success or failure. For example, if a project goal is to increase participation of rural women in local government, comparisons could be made between the baseline and the national averages, and then, following the DD logic, ...

[Footnote: The committee believes, but was unable to document, that this method has been utilized in some other programs in Africa.]
From page 184...
... For many years USAID focused a considerable component of its DG projects in Guatemala on institution building at the national level, especially the legislature. Surveys carried out by the Latin American Public Opinion Project as part of its Americas Barometer studies found a deep distrust in those institutions, despite years of effort and investment.
From page 185...
... For example, if the consultant's report states that the baseline study finds 10 percent of respondents attending municipal meetings in both the control and experimental areas, and the end-of-project survey finds that the treatment area has risen to 15 percent but the control group has also risen to 12 percent, it would be important to know if the change in the treatment group is statistically significant and if the increase in the control group was also significant. Thus USAID needs to be certain it has hired qualified individuals and obtained an appropriate level of statistical analysis so that the results are useful for determining project impact.
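The significance check described here is a standard two-proportion z test. A minimal sketch follows; the sample size of 500 respondents per survey round is an assumption for illustration, since the report does not specify one:

```python
import math

def two_prop_z(p1, n1, p2, n2):
    """Two-sample z statistic for a difference in proportions,
    using the pooled proportion for the standard error."""
    p = (p1 * n1 + p2 * n2) / (n1 + n2)            # pooled proportion
    se = math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# Hypothetical samples of 500 respondents per survey round.
z_treat = two_prop_z(0.15, 500, 0.10, 500)   # treatment: 10% -> 15%
z_ctrl  = two_prop_z(0.12, 500, 0.10, 500)   # control:   10% -> 12%
print(f"treatment z = {z_treat:.2f}, control z = {z_ctrl:.2f}")
```

With these assumed sample sizes the treatment-area change clears the conventional 5 percent threshold (|z| > 1.96) while the control-area change does not, which is exactly the distinction the text says a qualified analyst must draw.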
From page 186...
... Since in many cases missions will not be able to select their treatment areas randomly, the "national control" sample offers a reasonable way of measuring project impact. Finally, it is important to add that survey samples should not be used when little is known about the expected project impact. Surveys are best used when researchers already have a good idea of how to measure the expected impact.
From page 187...
... Is this due to the effect of party-strengthening activities supported by USAID? Or is it due to some other factor, such as a change from an electoral system with preferential voting to closed party lists, which would tend to strengthen party discipline, including, perhaps, that of local parties?
From page 188...
... And as long as they include a control group and sound pre- and postmeasurements, even nonrandomized designs can provide the basis for credible impact evaluations; in principle they can offer considerably more information for assessing project effects than is usually obtained in current DG M&E activities.

Supporting an Inclusive Political System in Uganda

Another example is the project sponsored by USAID's Uganda mission to promote the development of an inclusive political system.
From page 189...
... that tracks trends both before and after a program is implemented and explicitly identifies untreated units for which comparable outcomes could be measured would provide much greater confidence in any inferences about the project's actual effects. If there are large amounts of data, the techniques described earlier (propensity score matching, regression discontinuity)
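Of the techniques mentioned, propensity score matching can be sketched compactly. The function below assumes propensity scores have already been estimated (e.g., by a logistic regression of treatment status on observed covariates) and simply pairs each treated unit with the nearest untreated unit; all numbers are hypothetical:

```python
def nn_match(treated, untreated):
    """Nearest-neighbor matching on a precomputed propensity score.
    Each element is a (propensity_score, outcome) pair.  Returns the
    average treated-minus-matched-control outcome difference."""
    diffs = []
    for score, outcome in treated:
        # closest untreated unit by propensity score
        _, ctrl_outcome = min(untreated, key=lambda u: abs(u[0] - score))
        diffs.append(outcome - ctrl_outcome)
    return sum(diffs) / len(diffs)

# Hypothetical (score, outcome) pairs for treated and untreated units.
treated   = [(0.8, 0.60), (0.6, 0.55), (0.7, 0.58)]
untreated = [(0.75, 0.50), (0.55, 0.52), (0.3, 0.40)]
print(f"matched estimate: {nn_match(treated, untreated):.3f}")
```

The design choice matching exploits is that, conditional on the propensity score, treated and untreated units with similar scores are comparable, so their outcome gap is a more defensible effect estimate than a raw treated-versus-untreated average.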
From page 190...
... The project sponsored fact-finding, monitoring, and supervisory field visits to 35 districts, including a number in Northern Uganda, where many members of parliament and parliamentary staff rarely venture. Again, the goals of the project are worthy and the activities appear to be well conceived; however, the project is not amenable to randomized evaluation.
From page 191...
... As with the projects described previously, an evaluation design that furnishes more information for assessing impact than the current M&E approach is possible. First, assessing the impact of these initiatives would require some measurement of outcomes among a control group of members of parliament who were not exposed to the field visits, public dialogues, and consultative workshops.
From page 192...
... As with the two other projects described earlier, implementing the proposed changes involves trade-offs, but the team concluded that, if USAID wished to learn more about the precise effectiveness of these programs, there is substantial opportunity to develop impact evaluations on these activities, even without using randomized designs.

What to Do When There Is Only One Unit of Analysis

Many USAID projects involve interventions designed to affect a single unit of analysis.
From page 193...
... The latter can be measured fairly easily using public opinion polls administered before and after the period during which technical assistance was offered and then comparing the results. However, measuring the degree to which the judiciary is transparent and accountable is much more difficult.
From page 194...
... Answering the second question requires the existence of high-quality baseline data, preferably stretching back as far in time as possible so as to be able to distinguish general trends from project effects.
From page 195...
... A second strategy for improving causal inference in an N = 1 design is to look beyond the narrow outcome that the project was designed to affect and try to identify other outcomes that would be consistent with positive project impact. The example provided earlier from Uganda of using the success of projects targeting the disabled to verify the effectiveness of completely separate projects designed to promote the empowerment of marginalized citizens illustrates this technique.
From page 196...
... The impact evaluation designs described in this report, and the examples presented in the previous two chapters, suggest that in principle there is considerable scope for USAID to improve its ability to answer this question. The committee would neither expect nor recommend that the agency undertake impact evaluations of all of its activities.
From page 197...
... Process evaluations, the kinds of case studies discussed in Chapter 4, and more informal lessons from the field obtained by DG staff, implementers, nongovernmental organizations, and independent researchers provide important insights, valuable hypotheses, and illustrations of how programs are received and respond to changing conditions. The committee believes that USAID needs to develop organizational characteristics that will provide both incentives for more varied evaluations of its projects and mechanisms to help agency staff absorb, discuss, and continually learn from a variety of sources about those factors that affect the impact of DG programs.

