3 The Stakes in Research Assessment
Pages 40-66

From page 40...
... work to set priorities within these bounds by identifying, supporting, maintaining, and nurturing scientifically vital, mission-relevant areas of research, including selectively promoting areas in the manager's mission area on the basis of judgments about their prospects for advancing that mission. Science managers may also recommend changes in larger organizational priorities and practices deemed necessary to permit specific programs to achieve their objectives more effectively in the context of the larger agency mission.
From page 41...
... We discuss the limitations both of traditional expert judgment and of quantitative approaches, recognizing the particular difficulties of comparing different kinds of fields and of assessing scientific progress in interdisciplinary or transdisciplinary fields. Finally, we discuss the ways in which debates about the best methods for priority setting, in the context of a movement for government accountability, raise deeper questions about the balance of influence and power among researchers, program managers, advisory councils, extramural scientists, and other interested parties.
From page 42...
... We return to this issue later in the chapter.

BRIEF HISTORY OF FEDERAL SCIENCE PRIORITY SETTING

Since the end of World War II, the salience of the issues of priority setting and retrospective assessment in U.S.
From page 43...
... . Priority setting is also implicit in the strategic planning undertakings of federal agencies, in which selected fields are chosen for emphasis, with explicit or implicit decisions made not to fund other areas or to alter relative distributions of support among areas.
From page 44...
... . The analysis derived from this framework, coupled with several major empirical studies on private and social rates of return to research in agriculture, health, and technological innovations, provided empirical support for the conclusion that governmental support of fundamental research yielded net social benefits.
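For readers unfamiliar with the term, the social rate of return invoked here has a standard textbook definition (summarized for convenience; it is not quoted from this report): it is the discount rate rho at which the discounted stream of social benefits from a research investment just equals the discounted stream of its costs,

    \sum_{t=0}^{T} \frac{B_t}{(1+\rho)^t} = \sum_{t=0}^{T} \frac{C_t}{(1+\rho)^t},

where B_t and C_t are the social benefits and costs attributable to the research in year t. Estimates of rho well above the opportunity cost of public funds are conventionally read as evidence of net social benefit.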
From page 45...
... The PART therefore looks at all factors that affect and reflect program performance including program purpose and design; performance measurement, evaluations, and strategic planning; program management; and program results. Because the PART includes a consistent series of analytical questions, it allows programs to show improvements over time, and allows comparisons between similar programs" (available: http://www.whitehouse.
From page 46...
... With all these advances, the prospect of using quantitative analysis systematically to channel public funds to their most productive scientific uses appears more attainable than before. A third factor making priority setting more salient involves the dynamics of science itself, especially the widespread consensus that the greatest opportunities for advances in science now involve the crossing of traditional disciplinary boundaries and the creation of new fields.
From page 47...
... .

DEBATE OVER PRIORITY SETTING AND ASSESSMENT MECHANISMS

As already noted, peer review has long been the dominant approach in federal research agencies for evaluating the past performance and future potential of research areas and for setting priorities.
From page 48...
... Devolution of decision-making authority (or, in this case, of recommendations) to peer review panels is the "special mechanism" by which the social contract for science "balances responsibilities between government and science" and thus fosters accountability (Guston and Keniston, 1994:8).6 Outside reviews of research agencies' efforts to assess programs and set priorities have generally endorsed the clinical, deliberative methods of expert review as the best way to assess research fields.
From page 49...
... . GPRA and PART are similar in that both provide agencies with pressures or incentives to move toward more quantitative methods for setting priorities or assessing performance.7 Despite the skepticism that researchers have at times expressed about applying quantitative approaches to assessment of their work -- for example, concerns about the spawning of "LPUs" (least publishable units)
From page 50...
... Questions continue to be raised by policy makers, research administrators, practicing scientists, and specialists in program evaluation about the reliability and validity of the basic data series; about errors in measurement; about the ability of actors in the scientific enterprise to manipulate or "game" several mainstream quantitative techniques, for example, by pooling citations; and about the applicability of techniques used to study the workings of the scientific enterprise to evaluation and priority setting (e.g., van Raan, 2005; Weingart, 2005; Monastersky, 2005).10 Particular quantitative methods have also been criticized.
From page 51...
... Important advances often appear unexpectedly and from unlikely sources; long time lags may occur between a scientific development and its application; findings are used in ways not conceived of either by researcher or sponsor; findings deemed interesting but not significant take on new import when combined with newer findings or applied to newly emerging situations.11 Critics of quantification also challenge the value of retrospective assessments for research priority setting. Although past performance is often seen as the best predictor of future performance for individual researchers, the recent performance of a research field may or may not be a good predictor of whether additional investment in that field is likely to lead to great advances or to less productive elaboration of past work.
From page 52...
... Scientific innovation is heavily concentrated in the far upper tail of accomplishment in science: thus, criteria that are effective in discriminating reasonably good from reasonably bad normal science are likely to have little power, and may even be counterproductive, in predicting events, trends, or productivity in the upper tail of the relevant distribution, where breakthroughs occur. Yet another reservation about current efforts to quantify the performance of research investments is that few agencies systematically treat the development of human capital as an output complementary to conventionally measured research outputs.
From page 53...
... To the extent that different disciplines in a program manager's portfolio have different publication patterns among these four literatures, quantitative measures based on bibliographic, journal-centered databases may be biased indicators for comparing scientific output. Identifying these concerns is not equivalent to ruling out the utility of quantitative methods, including those based on bibliometric data.
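To make the cross-field bias concrete, the sketch below shows one way analysts attempt to correct for it: field normalization, in which each paper's citation count is divided by the average count for papers in its own field, so that a score of 1.0 means "typical for the field." The data and names are hypothetical, and the report does not endorse this particular indicator; the sketch only illustrates why raw counts can mislead when fields with different citation norms are compared.

    from collections import defaultdict
    from statistics import mean

    # Hypothetical records: (paper_id, field, citation_count).
    papers = [
        ("p1", "molecular biology", 120),
        ("p2", "molecular biology", 80),
        ("p3", "mathematics", 12),
        ("p4", "mathematics", 3),
    ]

    # Baseline: mean citation count per field.
    counts_by_field = defaultdict(list)
    for _, field, cites in papers:
        counts_by_field[field].append(cites)
    baseline = {f: mean(c) for f, c in counts_by_field.items()}

    # Field-normalized score: 1.0 = average for the paper's own field.
    for pid, field, cites in papers:
        print(f"{pid} ({field}): raw={cites}, normalized={cites / baseline[field]:.2f}")

On these toy data the mathematics paper p3, with a tenth of the raw citations of p1, scores higher once field norms are applied (1.60 versus 1.20). Note that normalization of this kind still inherits the coverage bias of the underlying journal database: fields whose results appear mainly in books, reports, or practice literatures remain undercounted, which is precisely the concern raised above.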
From page 54...
... The situation is the same for other quantitative indicators. Research in some fields leads to patentable inventions, while in others it may lead to improved practices or new policies.12 Research in some fields leads to new drugs or medical procedures, whereas research in other fields leads to less readily quantifiable medical benefits, such as improved diagnostic categories or ways of interpreting diagnostic tests.
From page 55...
... This concern is rooted in the methodological issues described above, in the idea that the judgments that emerge from expert review panels provide a more thoughtful and nuanced assessment of scientific progress than any available quantitative method can, and in a worry that quantification entails a shift of power and influence over priority setting from working scientists to government officials following bureaucratic procedures. This last concern, to adapt Oscar Wilde's comment, is about the possible ascendancy of nonscientists who know the price of everything and the value of nothing.
From page 56...
... Some agencies seek to develop, validate, and apply quantitative measures of research output and its value. Others see available quantitative metrics as hopelessly inadequate for their assessment purposes and believe that expert judgment is the only valid and appropriate way to evaluate the past performance or future potential of research (see National Research Council, 1999)
From page 57...
... The result may be, as has been increasingly asserted with regard to both NIH and NSF, that review panels are inherently too conservative, favoring well-crafted but incremental mainstream science over radically new or transformative research ideas.14 Strong criticism by one or two review panel members, particularly on specialized matters in those members' areas of expertise, may be enough to defeat an idea. Similarly, panels have been criticized as favoring science in established disciplines over interdisciplinary proposals (National Research Council, 2005b:Chapter 6)
From page 58...
... to yield significant new advances in fundamental knowledge that will not only illuminate the field from which they come, but also spill over to enrich other fields or even create new ones. Although peer review panels in many fields believe that there is much high-quality work in those fields, it is possible that the experts in
From page 59...
... An increased emphasis on the use of quantitative indicators in science policy decisions, especially to the extent that indicators can be developed by technicians who are not researchers in the relevant fields, can easily weaken the influence of scientists vis-à-vis agency science managers, or of scientists in general vis-à-vis nonscientist decision makers in government. Bibliometrics provides a good example of the issue of power, latent in many current discussions of the use of "objective" measures of research
From page 60...
... "because first of all the very attempt to measure research performance by ‘outsiders,' i.e., non-experts in the field under study, conflicted with the firmly established wisdom that only the experts themselves were in the position to judge the quality and relevance of research and the appropriate mechanism to achieve that, namely peer review, was functioning properly." Adopting any method of assessing research potentially affects who has the ability to influence the setting of broad research priorities, the contours of specific programs with respect to subfields and methodologies, and decisions concerning which
From page 61...
... In short, decisions about quantification of scientific progress have a power dimension, whether or not this is within the awareness of those involved, and a shift in power relations can have significant consequences for the directions of science. Such a shift may be viewed as good for science -- for example, if research managers have a better overview than scientists of opportunities across many fields, or a better appreciation of which research directions are most likely
From page 62...
... To the extent that these tools emphasize routinized measurement of easily quantified attributes of research, they shift power away from the judgments of the scientific community and toward others, such as those who devise the indicators and those who can find ways to game the assessment system. Efforts to gain approval for assessment mechanisms that rely more on the judgments of scientists, apart from claims that they provide better quality assessments, are in part efforts to prevent a loss of influence by scientists over science priority setting.
From page 63...
... In designing methods for assessment and priority setting, then, it makes sense to avoid framing either-or choices between mechanical, quantitative, bureaucratized decision making led by science managers and qualitatively informed, nuanced choices dominated by scientists. The proper questions to ask in guiding research assessment and priority setting are not whether to use quantitative measures, but what the appropriate roles of quantitative measures and of deliberative peer review processes should be, and how the perspectives of scientists and science managers should be combined to provide wise guidance for science policy decisions.
From page 64...
... By way of contrast, formalized peer review systems are core features of NSF and NIH, with a key distinction being that NSF program managers oversee both program development and the panel review process, whereas NIH separates responsibility for program development and operations from the review process. The peer review process at NIH operates primarily out of the Center for Scientific Review, which organizes review groups that often cut across programs and even institutes, and which generates ratings that are intended to evaluate proposals on a unitary scale that is the same across programs and institutes.
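As a concrete illustration of how ratings from separately calibrated panels can be placed on a unitary scale, consider converting each panel's raw scores into within-panel percentiles. This is a minimal sketch with hypothetical scores and a hypothetical function, not a description of NIH's actual procedure; it shows only the general principle that ranking a proposal against its own review group removes differences in how harshly or leniently panels score.

    def to_percentiles(raw_scores):
        """Map raw panel scores (lower = better) to within-panel
        percentiles, making differently calibrated panels comparable."""
        ranked = sorted(raw_scores, key=raw_scores.get)  # best first
        n = len(ranked)
        return {pid: 100.0 * (i + 0.5) / n for i, pid in enumerate(ranked)}

    # Panel A scores harshly, panel B leniently, yet the percentile
    # ranking within each panel comes out on the same 0-100 scale.
    panel_a = {"A1": 2.1, "A2": 3.4, "A3": 4.0}
    panel_b = {"B1": 1.2, "B2": 1.5, "B3": 2.9}
    print(to_percentiles(panel_a))  # A1 ~16.7, A2 50.0, A3 ~83.3
    print(to_percentiles(panel_b))  # B1 ~16.7, B2 50.0, B3 ~83.3

The midpoint formula (i + 0.5)/n is a common percentile-ranking convention that keeps the best proposal from landing at exactly 0 and the worst at exactly 100.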
From page 65...
... Translating research along the continuum of basic, clinical, and applied research and, ultimately, to patient care almost always involves long periods; the linkages between these stages are seldom straightforward. Still, more comprehensive and transparent measurement tools would provide policy makers, the public, the scientific community, and patients with a more complete understanding of the role of government-sponsored research and help inform federal policy" (quoted in Journal of the American Medical Association, 2002)
From page 66...
... . Legislators and other authoritative oversight bodies are increasingly asking public agencies for quantitative measures of research performance, and in so doing can generate all kinds of mischief." 14.

