

Suggested Citation: "Appendix E: Field Visit Summary Report." National Research Council. 2008. Improving Democracy Assistance: Building Knowledge Through Evaluations and Research. Washington, DC: The National Academies Press. doi: 10.17226/12164.



Appendix E
Field Visit Summary Report

Note: Some of the material in this Appendix also appears in Chapters 6 and 7.

Overview of National Academies' Mission and Tasks

The field visits were part of a larger project conducted by the National Academies (NA) for the U.S. Agency for International Development (USAID), the purpose of which was to develop an overall research and analytic design that will lead to specific findings and recommendations for the Strategic and Operational Research Agenda (SORA) of the democracy and governance (DG) programs. These findings and recommendations were developed through the vetting of a variety of methodologies for assessing and evaluating democracy assistance programs.

Objectives of Field Visits

In support of these overall project objectives, the field visits were intended to serve two major purposes:

1. To collect information for the NA committee to inform its recommendations, in particular to increase members' understanding of:
• how USAID programs are developed and implemented in the field, as background for its recommendations to improve program evaluation and understanding of program successes and failures;
• what data, evidence, and other resources are primarily or uniquely available in the mission or in country to support improved program evaluation; and
• the perspectives of mission personnel and USAID implementers regarding the feasibility of potential options for improving program evaluation.

2. To provide an opportunity to explore a "proof of concept" of the committee's preliminary recommendations, in particular the feasibility of introducing more rigorous approaches to program evaluation.

Selection of Field Visit Sites

Three countries were selected as the sites of the field visits conducted by teams of consultants and staff: Albania, Peru, and Uganda. The selection was based primarily on the stage of program development within a country's DG portfolio, the breadth of USAID programming, and the depth of USAID programming (as determined by long-term funding in multiple program areas of interest; see "Current and Recent USAID Projects at the Time of Field Visits" at the end of this appendix for a list of the major DG projects in each country). In each country selected, the DG staff were at the stage of developing new projects, offering an optimal opportunity to explore options for program design that may be more or less suited for various research methodologies. The NA field team members (see "Consultant Biographies" at the end of this appendix) were thus able to understand a variety of projects at the stage of their inception, the point at which new methodologies can be most effectively designed to maximize confidence about the impact of projects and the conditions under which that impact occurs. These considerations guided the selection of cases across geographically and politically distinct regions of the world (Central Europe/Post-Communist, Latin America/Post-Military Rule, Africa/Post-Conflict).
While there is no single point at which DG programs can be most effectively designed, implemented, or evaluated, the initial stages of development and design provide the most fruitful points at which innovative yet feasible options may be considered. Each field team therefore selected one or more projects and worked closely with USAID Mission DG officers, project implementers, and local partners through a series of in-depth conversations to understand the various opportunities and challenges presented by newly proposed program designs, data collection, and more rigorous evaluation techniques. A fuller discussion of these proposed program designs in each country visited follows.

KEY OBSERVATIONS AND FINDINGS FROM FIELD VISITS

There are ample opportunities for improving the methodology of program monitoring and evaluation within the DG sector. This is in large part due to the well-developed existing USAID evaluation procedures. To maximize these opportunities, various approaches to evaluation must be selected based on program goals and program designs. This should involve the provision of assistance (e.g., visits by specialists in program monitoring and evaluation (M&E) from USAID/Washington to missions) during the project conceptualization stage as well as subsequent stages of M&E development.

Improvements in program evaluation need not be expensive. Maximizing existing mechanisms (surveys and other data collection systems) and strategically targeting sample populations and control groups can result in more robust findings at a cost savings overall.

By improving program evaluation, the impact of USAID programs can be more accurately assessed and documented. Creating knowledge of program impacts through rigorous evaluation is the best way to identify and take advantage of lessons learned. Institutional knowledge gained through these experiences should be shared within and beyond the mission to affect learning on a broader, agency-wide basis.

Building on Current Tools and Approaches

Several current practices of mission staff demonstrate the necessary willingness to maximize reasonable opportunities for learning and provide the basis for more solid inferences over time. Currently, as a part of ongoing DG programs, mission staff collect regular and systematic information about those who receive training through USAID-funded programs. This approach to data collection should be encouraged and expanded to complement other more rigorous methodologies described below. Similarly, implementers working with USAID have developed elaborate mechanisms for quarterly data collections pertinent to their programs.
To maximize the potential represented by these mechanisms, data collected should be directed toward understanding outcomes and impacts rather than outputs. Similarly, data from mechanisms created by local implementers should be strategically collected and analyzed to maximize cost benefits and efficiencies. For example, collecting local government data in the form of smaller, cost-effective samples from municipalities would be beneficial. Furthermore, this information should be fully transferable to USAID for learning purposes. Most important, these mechanisms should be consistent with key program design elements requiring consideration at the initial stages of program development.

Note: This text is drawn from memos prepared for the committee by three of its field consultants, Thad Dunning, Yale University (Peru); Devra Cohen Moehler, Cornell University (Uganda); and Dan Posner, University of California at Los Angeles (Albania), and reflects their judgments and assessments.

Measurement of Outcome Indicators

Indicators gathered in connection with past programs tend to be measures of "outputs" or very proximate outcomes. Examples of these output indicators include, in the context of a decentralization program, the number of relevant municipal officials trained by the implementer or the percent of target municipalities that agree to an assistance plan. Although these output measures may be useful and necessary for monitoring the performance of local implementers or for assessing short-term progress in implementing a program, they are less helpful for measuring the outcomes that the programs hope to promote. To improve assessment of the impact of USAID programs on ultimate objectives, it is important to gather data to the extent possible on outcome variables. One example gathered in connection with the decentralization program was the percentage of local citizens who rate the quality of local government services as "good" or "very good."

Controls

Most program evaluations involve indicators gathered only or mostly on "treated" units (those groups, individuals, or organizations who were assisted by USAID). Sometimes this is unavoidable, as when a program works with only one unit or actor (e.g., the Congress). At other times, however, it is possible to find comparison units that would be useful for assessing the impact of U.S. interventions. Using control groups is invaluable for attributing impact to a USAID program.
For example, without a control group it is impossible to know if the change in local party development is a result of a USAID intervention or another factor such as change in national party law, economic growth, or better media coverage.

Gathering outcome measurements on control units need not be prohibitively costly. The cost of modifying the 2003 and 2005 national surveys in Peru conducted by the Latin American Public Opinion Project (LAPOP) to include a sample of residents in control group municipalities would likely have run around $15,000 per survey, a small investment when compared to the $20 million cost of the program over five years.
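
To put the cost comparison above in proportion, the arithmetic can be worked through in a few lines of Python. This is only an illustrative calculation using the two figures quoted in the text; it assumes both surveys (2003 and 2005) would have been modified, and the variable names are invented for the example.

```python
# Worked arithmetic using the figures quoted in the text.
survey_cost_usd = 15_000        # estimated added cost per modified survey
n_surveys = 2                   # assumption: both the 2003 and 2005 surveys are modified
program_cost_usd = 20_000_000   # five-year cost of the program

added_cost_usd = survey_cost_usd * n_surveys
share_of_program = added_cost_usd / program_cost_usd

print(f"Added survey cost: ${added_cost_usd:,}")
print(f"Share of program budget: {share_of_program:.2%}")
```

Under these assumptions the added data collection amounts to a fraction of one percent of the program budget.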

Opportunities for Randomization

Comparisons across units or groups with which USAID partners worked and those with which they did not are only partially informative about the impact of USAID interventions. For example, differences across these groups could reflect preexisting differences and unobserved confounders, rather than the impact of the intervention. Similarly, selection bias could account for the variation in performance between the treatment and control groups.

One of the ways that social scientists sometimes approach this difficulty is through random assignment of units to treatment. In the context of decentralization, for example, the municipalities with which USAID implementers work could be determined by lottery. Subsequent differences between treated and untreated municipalities are likely to be due to the intervention, since other factors will be roughly balanced across the two groups of municipalities.

Randomization is not feasible for many kinds of programs, and there can be a range of practical obstacles; yet these are also often surmountable. In addition, experimental designs need not be expensive; additional costs can be offset by savings introduced by appropriate designs.

SAMPLE PROPOSED PROGRAM EVALUATION DESIGNS FROM THREE FIELD VISITS

Selected Designs from Albania: Rule of Law Programs

A major part of USAID's DG-related activities in Albania involved increasing the effectiveness and fairness of legal sector institutions. With one possible exception, none of these rule of law activities are amenable to randomized evaluation.
This is because they each deal with either (a) technical assistance to a single unit (e.g., the Inspectorate of the High Council of Justice, the Inspectorate of the Ministry of Justice, the High Inspectorate for the Declaration and Audit of Assets, the Citizen's Advocacy Office, and the National Chamber of Advocates), (b) support for the preparation of a particular piece of legislation (e.g., the Freedom of Information Act and Administrative Procedures Code, a new conflict of interest law, and a new press law), or (c) support for a single activity, such as the implementation of an annual corruption survey. For a randomized evaluation of the efficacy of these activities to be possible they would have to be, in principle, implementable across a large number of units, which these are not. There is only one Inspectorate of the High Council of Justice, only one conflict of interest law being prepared, and only one National Chamber of Advocates being supported, so it is not possible to compare outcomes where these activities are supported with outcomes where they are not, and certainly not across multiple units.

Note: In addition to this group of selected projects discussed here, several others were analyzed by the field teams.

The best (indeed, only) way to evaluate the success of these activities is to identify the outcomes they are designed to affect, measure these outcomes both before and after the activities have been undertaken, and compare these measures. The trick, however, is to find appropriate measures of the outcomes that the activities are designed to affect, and this is frequently far from straightforward.

For example, the goal of the technical assistance to the Inspectorates of the High Council of Justice and the Ministry of Justice is to improve the transparency and accountability of the judiciary and to increase public confidence in judicial integrity. The latter can be measured fairly easily using public opinion polls that probe respondents' trust in the judiciary and perceptions of its integrity (these would be administered before and after the period during which technical assistance was offered, and the results of the polls compared). However, measuring the degree to which the judiciary is transparent and accountable is much more difficult.

Part of the problem stems from the fact that transparency and accountability can only be ascertained vis-à-vis an (unknown) set of activities that should be brought to light and an (unknown) level of malfeasance that needs to be addressed. For example, suppose that, following the implementation of the programs designed to support the Inspectorate of the High Council of Justice, we observe that three judges are brought up on charges of corruption. Should this be taken as a sign that the activities worked in generating greater accountability?
Compared to a baseline of no prosecutions, the answer is probably yes, to at least some degree. But knowing just how effective the activities were depends on whether there were just three corrupt judges who should have been prosecuted or whether there were, in fact, twenty, in which case prosecuting the three only scratched the surface of the problem, or whether the prosecutions might be selective with the targets chosen for political reasons. Parallel problems affect other rule of law initiatives, such as efforts to improve the ability of lawyers to police themselves.

A slightly different evaluation problem arises with respect to the activities designed to support the drafting of various pieces of legislation. One fairly straightforward measure of success in this area is simply whether or not the law was actually drafted, and, if so, whether it included language that will demonstrably strengthen the rule of law. But assessing whether or not USAID's support had any impact requires weighing the counterfactual question: Would the legislation have been drafted without USAID's support and what would it have looked like? If the answers to these questions are that the legislation would not have been drafted or that the language in the resulting law would not have been optimal, then we can judge the support from USAID to have been successful to the extent that the result we observe is better than this counterfactual outcome. The broader problem, however, is that achieving the overarching strategic objective of strengthening the rule of law will involve more than just getting legislation drafted but also getting it passed and then having it enforced. The point is that the measurable outcome of the USAID-sponsored activity is several steps removed from the true goals of the intervention, and any assessment of "success" in these areas must be interpreted in this light. This is equally true with respect to other activities, such as technical assistance to aid the Albanian government in the establishment of a copyright office or an office of patents and trademarks. Whether these institutions, once created, will have any impact on protecting intellectual property will depend on much more than whether or not a formal office designed to do so has been established.

The larger point that this discussion hints at is that many of the activities in the rule of law area involve the creation of laws or the strengthening of institutions whose existence is a prerequisite for a legal system that works, and that supports democracy and market reform. Whether or not these laws and institutions actually have a positive impact on these outcomes can only be ascertained after they have been created or made sufficiently strong to work properly. In this context, evaluating the efficacy of the resources spent on such activities may not make much sense, since the impact will only be meaningful after this initial, necessary foundation-building stage.
Supporting the writing of laws and the setting up of institutions such as inspectorates, citizens' advocacy offices, and attorneys' associations may simply be necessary investments, even if it is very difficult to know whether or not they have had, or will have, an impact on the ultimate outcomes that USAID wants to affect.

The one activity area within rule of law that might be amenable to randomized evaluation, at least in principle, is the support for rule of law-oriented nongovernmental organizations (NGOs). The problem here is that the preferred method of selecting NGOs for support is through a small grants competition, whereas a truly rigorous evaluation of the impact of support would require randomly choosing NGOs for funding. One possible solution would be to hold a small grants competition and, having ranked the applications from best to worst, work down the list funding every other one. Then, data would need to be collected on the quality of the performance and/or the impact in its area of focus of every NGO on the list (both those that were funded and those that were not), and a comparison could then be made across those groups. The problem, again, however, is to figure out what, precisely, to measure (which will depend, in any case, on the particular goals that the NGO sets for itself). Also, unless the small-grants competition generates a very large number of high-quality applications, this method is not likely to generate very useful results. The need for a large number of funded and nonfunded NGOs will be increased by the likelihood that NGOs will propose different sets of activities, so "success" will have two possible sources (the difficulty of the tasks that the NGO sets out to accomplish and the benefits of having received the small grant), and the sample of NGOs analyzed will need to be large enough to permit the impact of funding to be detected through the "noise" of the random variation in task difficulty.

Selected Designs from Peru: Decentralization, Rule of Law, and Political Parties

Decentralization

USAID/Peru launched a program in 2002 to support national decentralization policies initiated by the Peruvian government. Over a five-year period, the Pro-Decentralization (PRODES) program was intended to
• support the implementation of mechanisms for citizen participation with subnational governments (such as "participatory budgeting");
• strengthen the management skills of subnational governments in selected regions of Peru; and
• increase the capacity of nongovernmental organizations in these same regions to interact with their local government.

With the exception of some activities relating to national-level policies, all interventions under the program took place in seven selected subnational regions (also called departments): Ayacucho, Cusco, Huanuco, Junin, Pasco, San Martin, and Ucayali. These seven regions contain 61 provinces, which in turn contain 536 districts.
Workshops on participatory budgeting, training of civil-society organizations, and other interventions took place at the regional, provincial, and district levels.

Note: As discussed elsewhere, the regions were nonrandomly selected for programs because they share high poverty rates, significant indigenous populations, and narcotics-related activities, and because a number of the departments were strongholds for the Shining Path movement in the 1980s.

Note: Peru has 24 departments plus one "constitutional province"; the 24 departments in turn comprise 194 provinces and 1,832 districts. Provinces and districts are often both called "municipalities" in Peru and both have mayors. Sometimes two or more districts combine to form a city, however.

The ultimate goal of the program was to promote "increased responsiveness of sub-national elected governments to citizens at the local level in selected regions." This outcome is potentially measurable on different units of observation. For example, government capacity and responsiveness could be measured at the district or provincial level (through expert appraisals or other means), while citizens' perceptions of government responsiveness may be measured at the individual level (through surveys). Experimental designs could be used to study the impact of the decentralization program, and the cost of appropriately designed experimental evaluations could in fact be far below the amounts actually spent on monitoring and evaluation.

Best-possible designs. We discuss best-possible designs from the perspective of program evaluation. First, we discuss what an ideal ex ante design for the decentralization program might have been in 2002, when the program was begun. Second, we discuss how an experimental design might be employed in a second phase of the program, given that all the municipalities in the seven regions were already treated in the first phase.

A "tabula rasa" design. We assume that the decentralization program will be implemented in the seven nonrandomly chosen regions in which USAID commonly works; inferences about the effect of the intervention will then be made to the districts and provinces that comprise these regions. The simplest design would involve randomization of treatment at the district level. Districts in the treatment group would be invited to receive the full bundle of interventions associated with the decentralization program (e.g., training in participatory budgeting, assistance for civil society groups, and so on); control districts would receive no interventions.
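
The district-level lottery described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in any actual program: the district identifiers are invented placeholders for the 536 districts, the even 268/268 split is an assumption made for the example, and `assign_by_lottery` is a hypothetical helper name.

```python
import random

def assign_by_lottery(units, n_treated, seed):
    """Randomly split units into a treatment group and a control group."""
    rng = random.Random(seed)   # a fixed seed makes the lottery reproducible and auditable
    shuffled = list(units)
    rng.shuffle(shuffled)
    treated = sorted(shuffled[:n_treated])
    control = sorted(shuffled[n_treated:])
    return treated, control

# Hypothetical identifiers standing in for the 536 districts in the seven regions.
districts = [f"district_{i:03d}" for i in range(1, 537)]

# Assumption for illustration: half of the districts are invited to treatment.
treated, control = assign_by_lottery(districts, n_treated=268, seed=2002)
print(len(treated), len(control))
```

The same function could be applied to a list of provinces instead of districts if the lottery were run at the provincial level.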
There are two disadvantages to randomizing at the district level, however. One is that some of the relevant interventions in fact take place at the provincial level. Another is that district mayors and other actors may more easily become aware of treatments in neighboring districts. For both of these reasons, it may be useful to randomize instead at the provincial level. Then, all districts in a province that were randomly selected for treatment would be invited to receive the bundle of interventions.

Note: Relevant subnational authorities include members of regional councils, provincial mayors, and mayors of districts.

Note: Some interventions also occurred at the regional level, particularly toward the end of the program, yet these interventions constitute a relatively minor part of the program.

Several different kinds of outcome measures can be gathered. Survey evidence on citizens' perceptions of local government responsiveness will be useful; so may be evaluations of municipal governance capacity taken across all municipalities in the seven regions (both treated and untreated). A difference in average outcomes across groups at the end of the program (for example, differences in the percentage of residents who say government services are "good" or "very good," or the percentage who say the government responds "almost always" or "on the majority of occasions" to what the people want) can then be reliably attributed to the effect of the bundle of interventions, if the difference is bigger than might reasonably arise by chance.

One feature of this design that may be perceived as a disadvantage is the fact that treated municipalities are subject to a bundle of interventions; thus, if we observe a difference across treated and untreated groups, we may not know which particular intervention was responsible (or most responsible) for the difference. Did training in participatory budgeting matter most? Assistance to civil society groups? Or some other aspect of the bundle of interventions? This problem arises as well in some medical trials and other experiments involving complex treatments, where it may not be clear exactly what aspect of treatment is responsible for differences in average outcomes across treatment and control groups.

It seems preferable at this stage to design an evaluation plan that would allow USAID to know with some confidence whether a program financed by USAID makes any difference. Bundling the interventions may provide the best chance to estimate a causal effect of treatment.
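
One way to judge whether a difference in average outcomes between treated and control units is "bigger than might reasonably arise by chance" is a permutation test, sketched below. This is an illustrative technique chosen for the example, not one named in the text. The outcome values are made up, the function names are hypothetical, and the analysis is run on province-level averages on the assumption that provinces were the unit of randomization.

```python
import random

def difference_in_means(treated, control):
    """Average outcome among treated units minus average among control units."""
    return sum(treated) / len(treated) - sum(control) / len(control)

def permutation_p_value(treated, control, n_permutations=5000, seed=0):
    """Share of random relabelings whose absolute difference is at least the observed one."""
    rng = random.Random(seed)
    observed = abs(difference_in_means(treated, control))
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign treatment labels at random
        diff = abs(difference_in_means(pooled[:n_treated], pooled[n_treated:]))
        if diff >= observed - 1e-12:  # small tolerance for floating-point rounding
            extreme += 1
    return extreme / n_permutations

# Made-up data: share of residents in each province rating services "good" or "very good".
treated_provinces = [0.42, 0.51, 0.47, 0.55, 0.49, 0.53]
control_provinces = [0.38, 0.41, 0.44, 0.36, 0.45, 0.40]

effect = difference_in_means(treated_provinces, control_provinces)
p_value = permutation_p_value(treated_provinces, control_provinces)
print(round(effect, 3), p_value)
```

A small p-value indicates that a difference this large would rarely appear if the lottery alone, rather than the interventions, were driving the gap.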
Once this question is answered, one might then want to ask what aspect of the bundle of interventions made a difference, using further experimental designs. However, another possibility discussed below is to implement a more complex design in which different municipalities would be randomized to receive different bundles of interventions.

The intention-to-treat principle can be used to analyze the results of the experiment. Some municipalities assigned to treatment may refuse to sign participation agreements or otherwise may not cooperate with the local contractor; these municipalities may be akin to noncompliers in a medical trial. In this context, estimating the "effect of treatment on the treated" may be of interest.

Note: Standard errors may need to be adjusted to account for the clustering of treated districts within provinces.

It may be worth choosing pilot districts at random as well. In the first phase of the implemented decentralization program, only 145 municipalities were incorporated in the program in the first year, out of 536 that were eventually incorporated. Comparing municipal capacity across incorporated and unincorporated municipalities at the end of the pilot period may not lead to useful results; the incorporated municipalities were chosen for their high degree of capacity. It would be much more meaningful to randomly assign municipalities for inclusion in the pilot phase. To the extent it is necessary to include some municipalities with high ex ante management capacity and resources, this may be accomplished through stratified sampling of municipalities.

Second-phase design. USAID/Peru is preparing to roll out a second five-year phase of the decentralization program, again in the seven regions in which it typically works. At this point, all municipalities in the seven regions were already treated (or at least targeted for treatment) in the first phase. This may raise some special considerations for the second-phase design.

Our understanding is that there are at least two possibilities for the actual implementation of the second phase of the program; which option is chosen will depend on the available budget and other factors. One is that all 536 municipalities are again targeted for treatment. As in the first-phase design, this would not allow municipalities in the seven regions to be partitioned into a treatment group and a control group. In this case, the best option for an experimental design may be to randomly assign different treatments (bundles of interventions) to different municipalities. While such an approach will not allow us to compare treated and untreated cases, it will allow us to assess the relative effects of different bundles of interventions.
This may be quite useful, particularly for assessing the question raised above about which aspect of a given bundle of interventions has the most impact on outcomes. Do workshops on participatory budgeting matter more than training civil society organizations (CSOs)? Randomly assigning workshops to some municipalities and training to others would allow us to find out.

A second possibility for the second phase of the program is to reduce the number of municipalities treated, for budgetary reasons. Suppose the number of municipalities were to be reduced by half. The best option in this case is probably to randomize the control municipalities out of treatment, leaving half assigned to treatment and the other half in control. Those municipalities assigned to treatment would be offered the full menu of interventions in the decentralization program.

Of course, randomizing some municipalities out of treatment is sure to encounter displeasure among authorities in control municipalities. Yet if the budget only allows for 268 municipalities assigned to treatment and 268 to control, this displeasure will arise whether or not the allocation of continued treatment is randomized. In fact, as discussed below, it may be that using a lottery to determine which municipalities are invited to stay in the program is perceived as the fairest method of allocating scarce resources.

Cost of evaluation under the best-possible designs. The need to gather outcome measures on control units (both through surveys of residents in untreated municipalities and through independent evaluations of municipal capacity in control districts) will mean an additional cost of program evaluation. However, it is worth bearing in mind that such additional costs would have likely represented only a small fraction of the cost of the overall program as well as of the portion of overall costs going to evaluation. For example, adding 500 respondents from appropriately chosen control municipalities would likely cost no more than $10,000, a small amount compared to the overall program budget.

In addition, with appropriate design modifications, there might be substantial net savings. One possibility for cost savings would involve substantially limiting the volume of output/outcome indicators gathered by each of the local subcontractors. For example, measures could be sampled across local jurisdictions, rather than gathered quarterly on each of 536 municipalities. A related idea is that local subcontractors could be asked to gather the indicators and report on them each quarter with some positive probability; but they would not actually have to do so in each quarter.

Other Examples: Rule of Law, Political Parties, and Extractive Industries

Several of the programs planned under the new Peru strategic assessment might also be amenable to randomized designs. In this section, we briefly review possibilities for experimental designs afforded by programs related to the rule of law, political parties, and extractive industries.
Rule of law. Most of the interventions under the rule of law programs implemented were not amenable to randomization across units. However, there were one or two interventions that could in principle have been randomized. For example, after the passage of a new penal code, some judges in district courts were switched to the new system of judging cases while others were left to clear the backlog of cases that had already entered the courts under the old system. Under the observational (nonexperimental) evaluation plan that was actually adopted, cases administered by judges under the new system were compared to cases administered under the old system. Comparisons were made across groups with respect to variables such as the average time to disposition of the court cases. This nonexperimental design represented a valuable evaluation plan: There was a comparison made across treated and untreated units on an outcome measure of interest. In this and similar examples, the data seemed to show a substantial effect of treatment.

Note: For reasons discussed above, it may also be useful to conduct the randomization at the provincial rather than district level.

However, judges were nonrandomly assigned to stay in the old system or migrate to the new one (the chief judge apparently decided who would move). This raises the possibility that characteristics of judges who stayed or migrated are partially or wholly responsible for differences in the average time to disposition [10]. In principle, it would have been possible to assign district court judges to the old and new systems at random. While the research design idea is straightforward, however, it was likely to be politically difficult: Chief judges may not want to relinquish power over these assignments.

Political parties. One idea under the new political parties program is to provide assistance to the major national-level parties in opening or strengthening local offices in selected municipalities. At this point, however, the parties themselves would choose where to open offices, so the design is nonexperimental.

Moreover, if outcomes are not tracked in municipalities in which USAID partners do not support local party offices (i.e., controls), inferences may be especially misleading. Suppose measures are taken today and in five years of local party strengthening and an increase is found. Is this due to the effect of local-party-strengthening activities supported by USAID? Perhaps.
Yet it could be due to some other factor, such as a change from an electoral system with preferential voting to closed party lists, which would tend to strengthen party discipline and, perhaps, local parties; such a change is currently being considered in Peru.[11] The point is that without data on controls, it will be impossible to separate the effect of USAID local activities from the effect of the law.

10. While data were not available, it would have been helpful to compare the difference in time to resolution, before and after the switch of systems, among judges who switched and judges who did not; this would have required pre- and postswitch data on both groups of judges. While still nonexperimental, this comparison would lend greater confidence to the claim that the switch in systems had a causal effect on the time to resolution of court cases.

11. In the current electoral system, there is proportional representation at the department level, and voters vote for party lists but can indicate which candidate on the list they prefer; according to a range of research on the topic, this can create incentives for candidates to cultivate personal reputations and also makes the party label less important to candidates. Under a closed-list system, voters simply vote for the party ticket, and party leaders may decide the order of candidates on the list. This may tend to increase party discipline and cohesion (as well as the internal power of party elites).

At a minimum, then, it would be advisable to consider gathering data on control municipalities. In addition, while an experimental approach may not be deemed feasible in this instance, it is possible in principle, and it would provide a stronger basis for impact attribution than a nonexperimental approach.

Under an experimental design, USAID or the local implementer would select municipalities in which to establish or strengthen local parties randomly, from a set of acceptable municipalities. Local parties would have to accept that USAID or the contractor would select the municipalities. There may be ways to overcome any resistance to such a plan, however; for instance, a party such as Unidad Nacional (the rightist party whose candidate in the 2001 and 2006 presidential elections was Lourdes Flores Nano) has almost no base outside Lima and might accept any help it can get to broaden that base. Another obstacle is that parties may want to target certain kinds of municipalities, for example, those where they already have some support. It may be helpful for this purpose to stratify municipalities (for example, by past levels of electoral support for each party) and conduct the randomization within strata.

Outcome indicators might include the municipal vote share of each party in subsequent elections, with comparisons being made across treated and untreated municipalities; there may be other, harder-to-measure outcomes of interest, too. Inferences may be complicated if more than one party opens or strengthens an office in the same municipality (i.e., if there are two parties and both are strengthened locally, party vote shares may be unchanged). This concern may be lessened by the fragmentation of the party system and by the current local dominance of regional parties.
In recent regional elections, for example, 23 different regional parties won office across Peru's 24 departments; these regional parties differ from the national parties whose local roots USAID seeks to strengthen.

Extractive industries. There is currently a very small pilot program that seeks to promote dialogue in two mining communities among the State, companies, and local citizens, with the larger goal of "decreasing the probability of social conflict."

This program has the advantage of possessing a relatively easy-to-measure outcome variable, social conflict (compared to, say, transparency). For example, this variable might conceivably be proxied by the annual number of local marches/demonstrations. However, without comparing mining communities with which USAID works to those with which it does not, it will be difficult to evaluate the causal impact of the program on decreasing the probability of social conflict.

In a future rollout of the program, mining communities with which USAID might work could be randomly selected from the set of eligible mining communities. This would provide the most secure basis for attaching a causal interpretation to a finding that, for example, there were fewer marches and demonstrations in communities in which USAID worked than in those in which it did not work.

Selected Designs from Uganda: Civil Society, Parliamentary Strengthening, and Anticorruption

Large and Small Grants to CSOs

In the proposed Strengthening Democratic Linkages in Uganda program, USAID proposes to provide at least $100,000 per year for grants to CSOs to enable them to monitor local governments and help improve representation and service delivery at the local level.[12] These grants are thought to have two main effects: (1) to develop a more robust civil society by increasing the capacity of the CSOs that are awarded the grants, and (2) to improve the performance of government service delivery by increasing civic input and oversight of government officials.

Across carefully matched subcounties, large grants, small grants, and no grants will be allocated randomly to local CSOs working on HIV/AIDS. The goal is to compare the effects of large grants to CSOs (treatment group) versus small grants to CSOs (partial control group) in order to determine the effects of increases in CSO funding. Providing small grants to the partial control group allows USAID to assess independently the effect of greater monetary resources, while controlling for the nonmonetary effects of receiving a USAID grant (such as public recognition, special accounting requirements, and outside monitoring). It also facilitates the collection of equivalent data from CSOs in both the treatment and partial control groups.
Both the treatment group and the partial control group will also be compared to CSOs in matching subcounties where no grants are awarded (full control group) to evaluate the total effect of awarding a grant.

Carefully matched groups of three subcounties will be purposively selected so that the subcounties within each group are similar along a number of dimensions that are measurable and likely to be associated with CSO capacity and government service delivery for HIV/AIDS programs. Selection criteria might include the type, size, budget, and experience of the HIV/AIDS-related CSOs already working in the subcounties, as well as the subcounties' size, urban population, wealth, voting patterns, background of key officials, location, ethnic composition, number and type of health facilities, and infection rates. The most important criteria to ensure comparability should be determined in consultations with experts. Grouped subcounties might be next to each other, but immediate proximity is not necessary (or even desirable).[13]

12. The Strengthening Multi-Party Democracy in Uganda program also provides for $100,000 per year for grants to CSOs, although for a somewhat different purpose.

In each subcounty, one CSO working in HIV/AIDS will be selected with the aim of finding similar CSOs across the three subcounties in the group. One subcounty in each group will be randomly assigned to receive a large CSO grant to monitor HIV/AIDS services in the subcounty. Another subcounty in the group will be randomly selected to receive a small CSO grant for HIV/AIDS. The remaining subcounty in the group will act as the pure control and receive no grant. This will be repeated for at least 50 groups, and preferably more.[14]

It is important to ensure that (1) the large grant provides a significant increase to the existing budget of the CSOs, and the small grants do not, and (2) the CSOs spend their grants entirely on HIV/AIDS activities within the selected subcounty and there is no contamination (sharing of resources or expertise) across subcounties. It would probably work best to select CSOs that work only in a single subcounty to prevent the supplementing or siphoning off of funds to the treatment sites due to the grant. CSOs in both treatment and partial control groups should receive equivalent technical assistance and training on how to use the grant money and how to monitor and improve service delivery. USAID interactions with the CSOs in the treatment group and partial control group should be equivalent throughout.

Evaluation. The primary question for evaluation purposes is: What are the effects of monetary grants on the organizational capacity of CSOs and on the ability of CSOs to monitor and improve government service delivery?
The best possible evaluation for this type of project would be a large N randomized controlled field experiment. Because a large N study would require sizeable grants to at least 50 CSOs and additional monitoring and measurement, the costs are greater than those currently envisioned for CSO grants within the Linkages program. However, this design offers substantial benefits over a small N experiment and is of general interest to USAID.

13. Instead of grouping subcounties in sets of three, it might be more feasible to use an alternative stratified sampling procedure whereby all the subcounties in the sample are stratified into types according to key factors and then subcounties within each stratum are randomly assigned into each of the three categories.

14. Depending on the districts chosen for Linkages, it may be possible to randomly select all the treatment and control subcounties from within the 10 districts.
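The assignment step of the matched-group design described above (one large grant, one small grant, and one pure control within each group of three matched subcounties) can be sketched in Python as follows. This is only an illustrative sketch: the function name, the arm labels, the subcounty labels, and the seed are assumptions introduced here, and the matching of subcounties into groups is taken as already done by experts.

```python
import random

# Hypothetical arm labels for the three conditions described in the text.
ARMS = ("large_grant", "small_grant", "no_grant")


def assign_within_matched_groups(groups, seed=None):
    """Randomly give each subcounty in a matched group a different arm.

    groups -- iterable of 3-tuples of subcounty labels (already matched)
    seed   -- optional seed so the draw can be reproduced and audited
    """
    rng = random.Random(seed)
    assignment = {}
    for group in groups:
        if len(group) != len(ARMS):
            raise ValueError(
                f"expected {len(ARMS)} subcounties per group, got {len(group)}"
            )
        arms = list(ARMS)
        rng.shuffle(arms)  # random order of the three arms for this group
        assignment.update(zip(group, arms))
    return assignment


# Hypothetical labels: 50 matched groups, subcounties "1a", "1b", "1c", ...
matched_groups = [(f"{g}a", f"{g}b", f"{g}c") for g in range(1, 51)]
grants = assign_within_matched_groups(matched_groups, seed=12164)
```

Because every group contributes exactly one subcounty to each arm, comparisons between arms are balanced on whatever criteria were used to form the groups. Fixing and recording the seed is one simple way to let outside parties verify that the allocation was in fact a lottery.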

Measurement. Data should be collected before the grants are awarded, after the money is given (or at several points during the grant period), and two years after the end of the grant in order to assess both short-term and medium-term effects of the monetary infusion. Equivalent data should be collected about CSOs and service delivery in the treatment, partial-control, and full-control subcounties. The ability of USAID to collect comparable data in the partial control group should be facilitated by the fact that the CSOs are receiving some funds from USAID. Because the data collection is intrusive and time consuming, USAID may have to provide a small fee or incentive to the CSOs in the pure control group, which receive no grants, to obtain similar data from them.

In order to study the effect of grants and increased resources on the organizational capacity of the CSOs, data should be collected on the budget, activities, operations, and planning of the CSOs. In addition, pre- and postintervention surveys can be conducted with CSO employees, volunteers, government officials and employees, and stakeholders to evaluate changes in the activities, effectiveness, and reputation of the CSOs.

In order to evaluate the effect of grants on government service delivery, data can be collected on HIV/AIDS services and outcomes within each subcounty. Many of these data may already be collected by the government (such as through the periodic National Service Delivery Survey conducted by the Uganda Bureau of Statistics (UBOS), though perhaps USAID would need to fund an oversampling in treatment and control subcounties), or perhaps they can be collected in collaboration with other donor projects such as the President's Emergency Plan for AIDS Relief. Special attention should be given during the research design stage to determining the government activities that are likely to be affected by greater CSO involvement and how those activities might be accurately measured.
Additional data collection could be done through surveys of service recipients or randomized checks on facilities and services. In addition, money-tracking studies of local government and government agencies could be conducted to evaluate the level of corruption in HIV/AIDS projects within the selected subcounties.

Possible alternatives

1. The grants could be given for an issue other than HIV/AIDS. Selected issues must be ones where (a) the government plays a major role in providing services and (b) there are measurable outcomes of service delivery.

2. The intervention can be carried out at either the district level or the village level instead of at the middle subcounty level. At higher levels of local government, CSOs are denser and better organized. While the ability of CSOs to effect change in government may be greater at higher levels, the size of the grant needed to make a detectable difference will also be larger. Furthermore, it may be too difficult to find similar groups, and to protect units from contamination by other donors, at higher levels of government.

3. If additional funds cannot be secured to conduct a large N randomized controlled experiment, a small N experiment could be conducted with the available funds, although with significantly less power to accurately evaluate the effects of CSO grants. In order to increase the number of possible comparisons, and to help control for the effect of context with a small number of treatment sites, a variation on the above design may be warranted. The inclusion of a second issue area may facilitate analysis in a small N context. For example, in each subcounty, one CSO working on education and one working on HIV/AIDS will be selected with the aim of finding similar CSOs across subcounty groups and issues. One subcounty will be randomly assigned to receive a large education grant and a small HIV/AIDS grant, and another subcounty will receive a large HIV/AIDS grant and a small education grant. Figure E-1 provides an illustration.

FIGURE E-1 Comparison of large and small grants to education and HIV/AIDS CSOs. Legend: E = large education grant; e = small education grant; A = large HIV/AIDS grant; a = small HIV/AIDS grant. [Map of five subcounty groups (1 to 5), each with subcounties a, b, and c, not reproduced.]

This research design affords several useful comparisons. Within a single subcounty, changes in the education CSO versus the HIV/AIDS CSO (one of which got a large grant and the other of which got a small grant) can be compared, and the degree of change in each sector can be evaluated. Within each subcounty group, the education CSOs (one with a large grant, one with a small grant, and one with no grant) can be compared and the changes in educational outcomes across the grouped subcounties can be compared. In addition, within each subcounty group, the HIV/AIDS CSOs (one with a large grant, one with a small grant, and one with no grant) can be compared and the changes in HIV/AIDS outcomes across the grouped subcounties can be compared. The repetition of these comparisons across a number of different groups will help the researchers to parse out the effects of the grants from contextual factors.

Training and Assistance for a Random Selection of New Members of Parliament

The Strengthening Democratic Linkages in Uganda program seeks to enhance the knowledge, expertise, and resources of members of parliament (MPs) so they can more effectively operate in a multiparty parliament, legislate and perform oversight functions, foster sustainable development, and engage constituents, civil society, and local governments.

The entire group of new MPs (approximately 150) will be randomly divided into two groups. USAID can explain that it has only enough resources to work with half the group at a time and that the fairest way to decide is by lottery. To ensure that the partisan makeup of the treated group is equivalent to that of the control group, USAID will probably want to stratify by party affiliation. It may also want to stratify by other key factors such as previous political experience, committee assignment, and gender, and randomly assign MPs within strata to ensure that the treatment and control groups are equivalent along critical dimensions.
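As a rough illustration of the stratified lottery described above, the following Python sketch splits a roster of new MPs into treatment and control groups at random within each stratum. The roster, the party labels, the stratum definition (party and gender), and the function name are all hypothetical; a real design would stratify on whichever factors USAID and its partners judge most important.

```python
import random
from collections import defaultdict


def stratified_lottery(units, stratum_of, seed=None):
    """Split units into treatment and control at random within each stratum.

    units      -- list of dicts, each with a unique "id"
    stratum_of -- function mapping a unit to a hashable, sortable stratum label
    seed       -- optional seed so the lottery can be reproduced and audited
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)

    assignment = {}
    for label in sorted(strata):
        members = strata[label]
        rng.shuffle(members)
        n_treated = len(members) // 2
        # In an odd-sized stratum, a coin flip decides which arm gets the extra MP.
        if len(members) % 2 == 1 and rng.random() < 0.5:
            n_treated += 1
        for position, unit in enumerate(members):
            assignment[unit["id"]] = "treatment" if position < n_treated else "control"
    return assignment


# Hypothetical roster of 150 new MPs; labels and counts are invented for illustration.
roster = [
    {"id": f"MP{i:03d}", "party": party, "gender": gender}
    for i, (party, gender) in enumerate(
        [("Party A", "F")] * 21 + [("Party A", "M")] * 60
        + [("Party B", "F")] * 10 + [("Party B", "M")] * 35
        + [("Independent", "F")] * 7 + [("Independent", "M")] * 17
    )
]
lottery = stratified_lottery(roster, lambda mp: (mp["party"], mp["gender"]), seed=2007)
```

Within every stratum the two groups differ in size by at most one MP, so the partisan and gender makeup of the treated group closely mirrors that of the control group by construction rather than by chance.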
The treatment group will receive intensive personalized training and assistance from technical personnel. This assistance may take the form of group trainings on key issues, weekly or bimonthly individual meetings with trained legal assistants, regular research assistance on topics chosen by the MP, secretarial services, and/or repeated meetings with CSO representatives. The control group will not receive these additional services (at least initially). It is important to ensure that the intervention (1) is deemed useful by the MPs so that they continue to participate fully in the program for its duration; (2) is significant enough that the effects, if there are any, can be measured; and (3) is limited to the MPs in the treatment group alone and not easily passed on to those in the control group. For example, if the treatment was the distribution of a newsletter each week to the treatment group, then it is very likely that many legislators in the control group would gain access to the newsletter and receive the same treatment as those in the treatment group.

Measurement. Jeremy Weinstein and Macartan Humphreys, in cooperation with the African Leadership Initiative, are currently producing annual scorecards for all of Uganda's MPs recording their behavior in the parliament, in committee, and in their constituencies. These scorecards could be used to compare the behavior of MPs in the treatment and control groups. In addition, surveys could be conducted with MPs to measure the knowledge and reported behavior of new MPs and to assess perceptions of fellow MPs. Surveys could also be conducted with parliamentary staff, civil service leaders, key stakeholders, or constituents to assess the reputation and influence of different legislators. Perhaps other measures of MP involvement (such as visits to the library) can be collected. Eventually, for those who run for reelection, the vote results could be used to evaluate popularity.

Evaluation. For the purposes of evaluation, the most important question is: What are the effects of technical training and assistance on the ability of individual legislators to operate more actively, effectively, and independently in parliament?

Possible alternatives

1. To reduce costs of the intervention, a smaller number of MPs can be selected to be in the treatment group. The required number depends on the intensity of the intervention, the quality of the measures, and the heterogeneity of the group, but a treatment group of 50 MPs may be sufficient.

2. If it is not politically feasible to provide benefits to only some of the new MPs, then the treatment could be conducted in a rollout fashion. Half (or one-third) of the MPs would receive the treatment for the first several years, and the other group would receive the treatment in the later part of the term. The interventions with each group would have to be timed to fit with the collection of data for the scorecards.

3. Returning MPs could also be included in the experiment, although returning MPs are more experienced and thus less likely to be affected by additional assistance. Their inclusion also adds to the heterogeneity of the population. The intervention activities (and the associated costs) would have to be greater, and/or more widespread, in order to discern an effect.
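The trade-off in the first alternative above (a smaller treatment group lowers cost but reduces the ability to detect an effect) can be made concrete with a standard back-of-the-envelope calculation of the minimum detectable effect for a difference in means. The sketch below is an illustration under stated assumptions, not part of the proposed design: it uses a normal approximation, a two-sided test at the 5 percent level, 80 percent power, a common outcome standard deviation, and it ignores stratification, attrition, and spillovers. The function name and the choice of a standardized outcome (standard deviation of 1) are assumptions introduced here.

```python
from math import sqrt
from statistics import NormalDist


def minimum_detectable_effect(n_treat, n_control, sd=1.0, alpha=0.05, power=0.80):
    """Smallest true difference in means detectable with the given power.

    Normal approximation for a two-sided test comparing two independent
    groups that share the same outcome standard deviation `sd`.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value of the test
    z_power = z.inv_cdf(power)          # quantile for the desired power
    standard_error = sd * sqrt(1 / n_treat + 1 / n_control)
    return (z_alpha + z_power) * standard_error


# About 150 new MPs: an even split versus 50 treated and 100 controls.
mde_even_split = minimum_detectable_effect(75, 75)
mde_fifty_treated = minimum_detectable_effect(50, 100)
```

Under these assumptions, the even split can detect an effect of roughly 0.46 standard deviations of the outcome measure and the 50/100 split roughly 0.49, which suggests that treating only 50 MPs costs relatively little in statistical power as long as outcomes are still measured for all of the remaining new MPs.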

Revised Remuneration Policies to Fight Corruption

The Strengthening Capacity to Fight Corruption in Uganda Program suggests that "the Government of Uganda will consider increased pay for key personnel, through the implementation of an enhanced remuneration package for anti-corruption investigators and prosecutors." The revised remuneration policies would "enable performance (job evaluation) based salary structures for anti-corruption prosecutors, investigators, and other officers within GOU entities such as the DEI, DPP and the CID fraud squad."

The effects of changes in remuneration policies are of general interest to USAID. Although the implementation of the program cannot be manipulated to create contemporaneous control or comparison groups, the effects can still be evaluated effectively with a temporal comparison, before and after the intervention. The main consideration is to try to ensure that exogenous shocks do not take place during the period of measurement. For that reason we suggest that such an intervention could only be accurately evaluated if it took place some time before the other proposed reforms in the Request for Proposal for Strengthening Capacity to Fight Corruption in Uganda. Perhaps the changes in remuneration could be implemented immediately, while the other interventions are still in the planning stage.

Measurement. The main comparison is before the change in remuneration policies versus after the change. To evaluate the effect of changes in remuneration policies on recruitment and retention, the qualifications of the current employees will be assessed. In addition, the qualifications of all those who apply, and of former employees who sought alternative employment, should also be assessed.
To evaluate the effect of the remuneration policy changes on the effectiveness of anticorruption activities, the number of malpractices that are detected, effectively investigated, prosecuted, punished, and publicized before and after the changes can be compared.

Evaluation. The primary question from the perspective of evaluation is: How do changes in remuneration policies affect recruitment and retention of qualified personnel and the performance of employees?

Possible alternatives. If time permits, it would be better to stagger the changes in remuneration policies by types of civil servants or grades. For example, prosecutors could receive the new remuneration packages several months before the investigators. Thus, if there is an external shock, it is less likely to affect the outcomes of every subject of the study in the same way.

Current and Recent USAID Projects at the Time of Field Visits

Albania (March 2007)

New
Local Government: RFP issued

Current/Recently Ended
Local Government (2004-end July 2007): Urban Institute
Rule of Law (2004-end July 2007): Casals
Political Parties and Civic Participation (2004-September 2007): NDI/IREX/Partners Albania
Anti-Corruption/MCC Threshold (2006-2008): Chemonics

Peru (June 2007)

Current
Pro Decentralization (PRODES): ARD, Inc.
Political parties/Elections: NDI/Transparencia
Congress Program: United Nations Development Program and George Washington University
LAPOP Survey "Democracy Political Culture in Peru, 2006": Vanderbilt University

Not Included in Field Visit
Conflict Mitigation in Mining: CARE
Human Rights National Coordinator Institutional Development and Therapy Attention to Victims of Torture and Political Violence: Human Rights National Coordinator and Center for Psycho-Social Attention
Trafficking in Persons: Capital Humano y Social Alternativo

Uganda (June 2007)

New
Democratic Linkages (within and among parliament, selected local governments, and CSOs): Center for Legislative Development SUNY Albany
MCC Threshold (anti-corruption and civil society to improve procurement systems and build capacity to more effectively investigate and prosecute corruption cases)
Political parties and politically active CSOs (capacity building): designing project

Recent/Soon to End
Decentralization (to end December 2007): ARD

Not Included in Field Visit
Community Resilience and Dialogue (September 2002-September 2007): International Rescue Committee

Consultant Biographies

Albania

Team Members: David Black, USAID; Rita Guenther, National Academies; Jo Husbands, National Academies; Karen Otto, consultant; Daniel Posner, consultant.

Karen Otto, a former USAID direct hire, is a monitoring and evaluation specialist/consultant with a strong background in democracy and governance (especially rule of law). She has developed 70 performance monitoring plans for proposals and ongoing development projects in a wide array of areas, particularly DG. She has evaluated the performance of many development projects and the operations of all federal courts in the United States, and has developed a formal evaluation system for the Administrative Office of the U.S. Courts to review courts under its jurisdiction. Ms. Otto has been a court administrator in federal, state, and municipal courts in the United States. She has been a rule of law advisor in USAID and a project manager for DG projects overseas. She has personal experience in many of the areas involved in DG activities: court administration (she was a court administrator), media (she was a journalist), judicial disciplinary system (she was an inspector in a judicial inspection service), etc.

Daniel Posner, associate professor of political science at the University of California, Los Angeles, conducts research in the following four broad areas: ethnic politics, ethnicity and economic development, political change in Africa, and social capital and civil society. His research in this area is motivated by a number of questions: When and why do some ethnic identities (and ethnic cleavages) matter for politics, and when do they not?
Why, when people think about who they are, do they see themselves (and others) as members of particular ethnic groups, and why do the groups that they see themselves as part of have the sizes and physical locations that they do? How can we reconcile what we know about the fluidity and context dependence of ethnic identities and ethnic cleavages with the need to measure social diversity and code individuals by their group affiliations? Why does ethnicity matter for collective action? How well are people able to identify the ethnic backgrounds of others? He approaches each of these questions with a combination of theory and the collection of original data (including experimental data).

Peru

Team Members: Moises Arce, consultant; Tabitha Benney, National Academies; David Black, USAID; Thad Dunning, consultant; Rita Guenther, National Academies.

Moises Arce is an associate professor in the Department of Political Science at the University of Missouri. His research focuses on the politics of market reform, comparative political economy, and Latin American politics (Peru). He has received funding from the National Science Foundation, the Social Science Research Council, and the Fulbright Scholar Program. His publications include the book Market Reform in Society: Post-Crisis Politics and Economic Change in Authoritarian Peru, and articles in the Journal of Politics, Comparative Politics, Comparative Political Studies, and the Latin American Research Review. He previously taught at Louisiana State University. He received his Ph.D. in 2000 from the University of New Mexico.

Thad Dunning is assistant professor of political science and a research fellow at the Whitney and Betty MacMillan Center for International and Area Studies at Yale. His current research focuses on the influence of natural resource wealth on political regimes; other recent articles investigate the influence of foreign aid on democratization and the role of information technology in economic development. He conducts field research in Latin America and has also written on a range of methodological topics, including econometric corrections for selection effects and the use of natural experiments in the social sciences.
Dunning's previous work has appeared in International Organization, the Journal of Conflict Resolution, Studies in Comparative International Development, Geopolitics, and in a forthcoming Handbook of Methodology (Sage Publications). In 2006-2007, he was teaching an undergraduate lecture course and a seminar on ethnic politics and a graduate seminar on formal models of comparative politics. He received a Ph.D. in political science and an M.A. in economics from the University of California, Berkeley.

Uganda

Team Members: Mark Billera, USAID; Mame-Fatou Diagne, consultant; John Gerring, committee member; Jo Husbands, National Academies; Devra Cohen Moehler, consultant.

Mame-Fatou Diagne is a Ph.D. candidate in economics at the University of California, Berkeley. A native of Senegal, she graduated from the Institut d'Etudes Politiques de Paris and received a Master of International Affairs from Columbia University. She has worked as an emerging markets economist for Societe Generale in Paris and for Standard and Poor's in London, where she was the principal analyst for South Africa and other African-rated sovereigns. Her current areas of research are development, public and labor economics, and particularly, the economics of education and political economy in Africa.

Devra Cohen Moehler is an assistant professor of political science at Cornell University. She recently returned to Cornell from two years as a Harvard Academy Scholar at the Harvard Academy for International and Area Studies. Her research interests include political communications, education and democratization, consequences of political participation, political behavior, comparative constitution-making, law and development, cross-national survey research, and the international refugee regime. Her dissertation, based on research conducted in Uganda, focused on the effects of citizen participation in Ugandan constitution making in creating "distrusting democrats." She received her Ph.D. in political science from the University of Michigan and a B.A. in development studies from the University of California, Berkeley.


