Uses of Peer Review
Although OST has chosen to apply its new peer review program almost exclusively to the evaluation of projects at various stages of development, the basic principles of peer review (i.e., expert, independent, external, technical) also can be used to evaluate the technical merit of proposals, the balance of programs, and even program "needs." The evaluation of proposals is by far the most common and well-established form of peer review. Peer review also can be used to evaluate the technical merit of individual projects, or groups of projects, at various stages of development (as is done within OST), and, when effectively focused and defined in terms of the purpose of a review, to evaluate a whole program or aspects of a program (including program "needs"). This chapter addresses how the defining characteristics of peer review developed in Chapter 2 apply to various types of evaluations. As examples, the committee discusses four specific applications of peer review:
- proposal evaluations;
- project maturity evaluations:
  - entrance into applied research,
  - entrance into engineering development,
  - entrance into demonstration, and
  - entrance into implementation;
- program balance evaluations; and
- "needs" evaluations.
In this chapter, the committee provides a brief overview of each of these applications, discusses specific features or considerations for each review type, and describes a number of model review programs from other governmental and nongovernmental organizations (Boxes 3.1 to 3.8). It must be emphasized that these are illustrative, not prescriptive, examples; they are offered to illustrate how the principles of peer review might be applied broadly to different types of reviews, not to prescribe particular additional reviews for OST. Although some of the examples are framed in terms of OST's program, the types of reviews could be applied to many kinds of technology development programs. They also can be used for benchmarking OST approaches with other peer review programs (see Chapter 7).
Proposal Evaluations
Peer review is the most common method for evaluating the technical merit of proposed research and development projects. There are a number of well-known models for such reviews, including those of NSF and NIH (see Kostoff, 1997b; NSF, 1997; and OTA, 1991 for detailed descriptions of these, and other approaches for the review of research proposals). Because proposal evaluation often involves the review of large numbers of proposals, such evaluations typically employ a small number of fixed review criteria and generally utilize mail reviews. In most cases, review panels also meet as a group to supplement evaluations received by mail.
Organizations that support applied research projects, such as those supported by OST, often choose to employ a two-stage selection process: (1) a peer review to assess the technical merit of the proposals, and (2) a "relevance" or "commercial viability" review to assess the potential applicability of the proposed project to program needs. Box 3.1 describes an example of such a two-stage selection process employed by NSF's Small Business Innovation Research Program. The EM Science Program is an example of an OST program that employs such a two-stage selection process (see NRC, 1997a).
Although many federal agencies such as NSF and NIH administer their own peer review processes, others choose to contract out part of the review process to other organizations with experience administering peer reviews. Such arrangements can be beneficial to federal programs that do not have sufficient staff expertise to administer effective, credible peer review programs or that choose to maintain an extra degree of independence from the peer review process. Examples of two such arrangements are described briefly in Boxes 3.2 and 3.3. OST has chosen to establish a similar arrangement with the American
Box 3.1 Peer Review of Proposals in NSF's Small Business Innovation Research Program
The NSF uses a broad-based peer review process in all of its granting programs. The majority of these are very early-stage basic research programs that are not analogous to OST's typical project. However, one NSF program that has similarities to OST's and is a good model for reviewing developmental programs is SBIR.1 This program was established by Congress especially to support innovation by small businesses and to get these innovations to the end user. The SBIR program consists of three phases: Phase I determines scientific, technological, and commercial merit; Phase II develops the concepts further, up to demonstration; and Phase III involves commercialization of the demonstrated technology (without funding support from NSF). Peer reviews are conducted on projects at both Phase I and Phase II.
The SBIR program employs a two-stage evaluation process. The first stage uses peer review to assess the technical merit of the proposals. Following an initial screening of all proposals by NSF staff (who ensure that they meet the minimum requirements for the SBIR program and decide which program unit should review each proposal), all proposals are peer reviewed by a group of qualified disciplinary scientists and engineers to assess their technical merit.
The second stage in the evaluation process is an examination by a group of experts from a commercial or applications perspective. These experts are selected on the basis of their expertise as representing the user or consumer perspective. Because SBIR programs are intended to deliver commercial innovations to the marketplace, this second review is designed to evaluate not the technical basis of the proposal, but rather the need for and acceptance of the innovation in the competitive marketplace. NSF program managers then use the results from both the peer review and the commercial review to decide which projects will be supported.
The most important element in the NSF SBIR review process is the combination of two distinct, early-stage evaluations to assess technical merit and end user applicability. NSF program managers have discovered that although the process takes two steps, it has increased both the proportion of proposals that advance from Phase I to Phase II and the likelihood of commercial success, that is, acceptance by the end user.
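The two-stage logic described in Box 3.1 can be sketched in code. The sketch below is illustrative only: the score fields, the merit threshold, and the number of funded awards are invented assumptions, not NSF's actual SBIR procedure.

```python
# Illustrative sketch of a two-stage proposal evaluation (hypothetical
# score fields and thresholds; not NSF's actual SBIR procedure).

def select_proposals(proposals, merit_floor=3.0, top_n=2):
    """Stage 1: screen proposals on peer-reviewed technical merit.
    Stage 2: rank the survivors by commercial-review score and
    fund the top N."""
    # Stage 1: peer review of technical merit (assumed mean panel score).
    technically_sound = [p for p in proposals if p["merit"] >= merit_floor]
    # Stage 2: commercial/applications review of the survivors only.
    ranked = sorted(technically_sound,
                    key=lambda p: p["commercial"], reverse=True)
    return [p["id"] for p in ranked[:top_n]]

proposals = [
    {"id": "A", "merit": 4.2, "commercial": 3.1},
    {"id": "B", "merit": 2.5, "commercial": 4.8},  # fails technical screen
    {"id": "C", "merit": 3.6, "commercial": 4.4},
    {"id": "D", "merit": 3.9, "commercial": 2.2},
]
print(select_proposals(proposals))  # → ['C', 'A']
```

Note that proposal B, despite the highest commercial score, is never ranked: the design choice in the two-stage model is that commercial promise cannot compensate for inadequate technical merit.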
Box 3.2 Peer Review of Proposals in NASA's Office of Life and Microgravity Sciences and Applications
NASA's Office of Life and Microgravity Sciences and Applications (OLMSA) funds basic research and technology development related to life and microgravity sciences, in support of its mission to advance knowledge, improve the quality of life on Earth, and strengthen the foundations for continuing the exploration and utilization of space. OLMSA uses peer review in its grants award process to evaluate the scientific merit of proposals. OLMSA staff then evaluates the proposals for relevance to the program's objectives. OLMSA uses a contractor, Information Dynamics, Incorporated (IDI), to support the solicitation and peer review process, with subcontracted support by the Universities Space Research Association. IDI maintains a strong working relationship with NASA, but the review process remains independent.
Peer reviews of research proposals are conducted by panels of experts selected by IDI on the basis of expertise, peer review experience, and panel balance. NASA may recommend reviewers and can exclude proposed reviewers if a reviewer has a known conflict of interest with the program (although this occurs only rarely). Prior to the review panel meeting, primary and secondary reviewers are assigned to each proposal. These reviewers (as well as any others who so desire) write evaluations of their assigned proposals and submit them to IDI. The peer review meetings at which proposals are discussed are closed so that reviewers are not influenced by external forces. At the review panel meeting, the panel as a whole assesses and ranks all proposals and writes its final evaluation. The panel's evaluations and rankings are used as important input to NASA program managers in building their research program.
Society of Mechanical Engineers and the Institute for Regulatory Science (see Chapter 4).
Project Maturity Evaluations
Evaluations of project maturity refer to peer reviews conducted as a project develops from a research idea to a technology that can be demonstrated and ultimately deployed. OST's current peer review process fits this review type. Another example of this type of review is summarized in Box 3.4. The objectives of the review, criteria for review, and panel expertise required should all change as a project moves through the maturation process, as the following examples note.
Box 3.3 Peer Review of Proposals by the American Institute of Biological Sciences
AIBS administers a variety of peer reviews in the areas of life and biomedical sciences for federal and state government agencies, as well as private institutions. To accomplish this, it uses various types of peer review, including mail review, panel meetings, and evaluations involving site visits. AIBS peer reviews have ranged from mail reviews of a single proposal to multipanel review of more than 2,000 proposals for the 1993 Army Breast Cancer Program. To identify potential peer reviewers, AIBS maintains a database of about 7,000 people recommended by members of AIBS and 42 affiliate societies. Conflict of interest and confidentiality are important considerations for the peer review program, and reviewers sign confidentiality agreements and conflict-of-interest statements. AIBS uses NIH and NSF standard rules for conflict of interest to avoid both real and perceived conflicts. AIBS peer review panel meetings are held in closed sessions to encourage frank and open discussion of proposals.
One important general point for all types of project maturity peer reviews is the need to distinguish clearly between the technical focus of peer reviews and nontechnical factors that play important roles in decisions to continue to support specific projects. Clearly, many nontechnical issues, such as public acceptability and relative DOE needs, should be considered by program managers when making decisions on which projects to support. These nontechnical issues should not be confused with appropriate criteria for peer reviews, however, because they could sidetrack reviewers into issues that are beyond their expertise or are difficult to resolve within the time constraints of a two- to three-day review. Program managers can incorporate such input into their decision-making process, however, in a variety of ways, including reviews (such as commercial viability reviews or "relevance reviews") conducted separately or in parallel with peer reviews (see Box 3.1). In the case of OST, the stage-gate reviews of its Technology Investment Decision Model (TIDM) provide such an opportunity to incorporate nontechnical input into the decision-making process.
In the following sections, the committee describes four specific stages of technology development in which peer review could be applied to evaluate the technical merit of a project. For each stage, the committee discusses the kinds of specific technical issues that could be of particular concern at this point of technology development. Also included is a description of some of the nontechnical factors that would be involved in decisions regarding which projects to support at each stage of development.
Entrance into the Applied Research Stage
Basic research may produce a concept thought to have relevance to solving a site-specific need. Program managers then may desire to see whether the concept has enough promise to justify moving into a program of applied research (e.g., Gate 1 of OST's TIDM; see Chapter 4 and Appendix A). A peer review as defined in this report could be very useful in assessing the technical merit of the concept. For example, is the science sound? Is it likely to develop into a technology to meet the stated need? Are other technologies already available? Addressing these types of questions would require peer reviewers who are experts in the relevant scientific and technological disciplines—individuals independent of those promoting the concept.
The decision to continue to fund such a project at this stage would consider other nontechnical factors in addition to the technical criteria used in peer review. For example, if favorable answers are given to the technical questions, is the technology likely to be developed in the time frame required by implementation schedules, with the funding and resources likely to be available to the program?
Entrance into Engineering Development
This is an important step in the technology development process because it often leads to the commitment of large amounts of funding for engineering development of a project. Such a review would address the technical adequacy of the technology—that is, whether all the scientific work needed to proceed to technology development is in hand and whether the technology is likely to work as promised to meet the need addressed.
The decision to continue to support a project at this stage (e.g., Gate 4 of OST's TIDM; see Chapter 4 and Appendix A) also would consider nontechnical aspects, such as regulatory performance standards and timetables, public acceptability, and DOE needs. Input on these nontechnical factors could be provided to program managers by other types of reviews, even reviews conducted in parallel to the peer reviews.
Box 3.4 Peer Reviews in DOD's Strategic Environmental Research and Development Program
The Department of Defense's SERDP is the largest environmental research program within DOD (currently funded at $50 million to $60 million per year). SERDP is a statutory DOD-DOE-EPA science and technology program established to address the environmental needs of the DOD and congruent DOE needs. As such, SERDP responds to the specific, high-priority needs that are defined by the Deputy Under Secretary of Defense for Environmental Security. As a mission-oriented program, SERDP must ensure that the research it sponsors is both responsive to stated needs and of the highest scientific quality. SERDP is a DOD program, but it also addresses to some extent the strategic environmental needs of DOE and EPA, especially when there is some overlap with military needs. Research can be conducted by the uniformed services, DOE, EPA, and, starting in 1997, nongovernmental organizations (NGOs) on a competitive basis. SERDP employs peer reviews to evaluate proposals, ongoing projects, and its overall program.
SERDP uses a multistage process to select proposals solicited from both the federal agencies and the private sector, and peer reviews are a critical part of this process. The process begins with Statements of Needs (SONs) solicited from the uniformed services. These statements are prioritized and generalized by the internal SERDP Technology Thrust Area Working Groups (TTAWGs; see below) for the four thrust areas of SERDP (cleanup, compliance, conservation, and pollution prevention), and a call for proposals is issued, either to government laboratories or to the broad research community (NGOs). The initial screening of full proposals from NGOs is performed by SERDP staff (as described below), whereas the initial screening of full proposals on the federal side is performed by SERDP's federal partners (DOD, DOE, and EPA), with each partner limited in the number of full proposals that can be submitted. Peer review is used in the evaluation of all proposals (from both federal and NGO communities).
Proposals from NGOs are first evaluated by SERDP staff to ensure that the project addresses the SON and is appropriate for SERDP funding. Proposals are subjected to a mail peer review modeled after the system used by NSF. The peer review process is conducted by a support contractor under the control and supervision of SERDP staff. The contractor nominates potential peer reviewers, who must be approved by SERDP staff. Peer reviewers judge the proposals against a set of technical merit and personnel criteria and assign a numerical value from 1 to 4. Reviewers also are asked whether they would fund each proposal based upon their overall assessment of the project and its cost. This review results in a relative ranking of the proposals within each SON.
The proposals, along with peer review results, are then provided to the TTAWGs, which are responsible for recommending a program plan of projects to fund each year for each thrust area. TTAWGs are composed of scientific and technical personnel from DOD, DOE, and EPA, and are charged with evaluating the
NGO and federal proposals against a set of weighted evaluation criteria, which include technical merit, transition potential, quality of personnel, cost, and cooperative development (i.e., whether the cost is shared with other funding sponsors). In their review of proposals, TTAWG members consider peer review comments but are not constrained to follow the rankings provided by the peer review. TTAWGs are required to justify any recommendation that is not consistent with the peer reviewers' evaluation, however.
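The TTAWGs' weighted-criteria comparison can be sketched as follows. The criterion weights and the project scores below are invented for illustration; SERDP's actual weights and scoring scales are not given in this report.

```python
# Hypothetical sketch of a weighted-criteria ranking like the TTAWG review.
# Weights and scores are invented for illustration; the five criteria match
# those named in the text, but SERDP's actual weighting is not public here.

WEIGHTS = {
    "technical_merit": 0.35,
    "transition_potential": 0.25,
    "personnel": 0.15,
    "cost": 0.15,
    "cooperative_development": 0.10,
}

def composite_score(scores):
    """Weighted sum of criterion scores (each assumed on the 1-4 scale
    used in the SERDP peer review)."""
    return sum(WEIGHTS[c] * scores[c] for c in WEIGHTS)

def rank_projects(projects):
    """Return project IDs ordered from strongest to weakest composite."""
    return sorted(projects,
                  key=lambda pid: composite_score(projects[pid]),
                  reverse=True)

projects = {
    "P1": {"technical_merit": 4, "transition_potential": 2, "personnel": 3,
           "cost": 3, "cooperative_development": 1},
    "P2": {"technical_merit": 3, "transition_potential": 4, "personnel": 3,
           "cost": 2, "cooperative_development": 4},
}
print(rank_projects(projects))  # → ['P2', 'P1']
```

In this invented example, P2 outranks P1 despite a lower technical-merit score, illustrating why the text requires TTAWGs to justify any recommendation that departs from the peer reviewers' purely technical ranking.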
The recommended program plan, together with all of the results from peer reviews, is presented to the SERDP Technical Director and Executive Director for review and final selection. A final review of the SERDP program is performed by the SERDP Scientific Advisory Board (SAB).1 The SAB must approve all new projects and review all project renewals above $900,000 per year; it also conducts an assessment of the overall SERDP program. During their review, SAB members have access to all of the data from both the peer review and the TTAWG review. The SAB rejected projects at a rate of 19 percent in fiscal year 1997 (DOD, 1998).
Even though there is a broad knowledge base in the TTAWGs and the SERDP staff, the use of peer review to evaluate proposals and ongoing projects, as well as to assess the overall balance of the SERDP program, ensures that each SERDP project is evaluated by independent experts in the field. This is vital to maintaining the scientific and technical quality of SERDP. Peer review is utilized as a tool that aids in the proposal selection process as opposed to being a decision process unto itself.
Entrance into the Demonstration Stage
This step represents a decision on whether or not to proceed to full-scale, on-site demonstration of a technology (e.g., Gate 5 of OST's TIDM; see Chapter 4 and Appendix A). At this point the science should have been fully verified, and technical questions would have more to do with whether the engineering development test data are adequate to design a full-scale demonstration that will be safe and cost-effective.
Decisions to continue to support a project at this stage also would consider nontechnical factors similar to those discussed above for other project maturity evaluations. In particular, questions of public acceptability might be given high priority because a decision to proceed with a technology at this stage will result in a full-scale project operating at a contaminated site where public sensitivity is likely to be high. Thus, a program manager might solicit input from experts knowledgeable on risk as well as particular site stakeholders' perceptions of risk to supplement the results of a peer review. Such stakeholders would not be appropriate peer reviewers, however.
Entrance into the Implementation Stage
At this stage of development the technical soundness of the concept should have been proven, but the efficacy, cost-effectiveness, safety, and regulatory acceptability of the technology may have yet to be demonstrated. A predeployment peer review would examine the adequacy of the test data generated during the technology's development—for example, whether these test data show that the technology is safe and cost-effective.
Decisions to continue to support a project at this stage (e.g., Gate 6 of OST's TIDM; see Chapter 4 and Appendix A) also would involve nontechnical considerations, including regulatory compliance, intellectual property, liability, public acceptability, risk, and safety. A program manager could receive input from experts in these areas, and such input could take the form of separate reviews conducted in parallel or separate from peer reviews on technical matters.
Program Balance Evaluations
If a fully assessed set of needs is available, management might want to obtain an independent assessment of program balance, that is, whether the technology development program adequately addresses those needs, given the resources available. For such an evaluation, the review panel as a whole should
have technical expertise covering the entire scope of relevant technologies. Reviewers also should have knowledge of other existing technologies or of technologies in development nationally and internationally. To avoid conflicts of interest, the panel should be independent of the managers who constructed the existing program and of the PIs connected with the program.
In OST's case, this would be achieved by going outside OST and those it funds. If one purpose of such an evaluation were to uncover duplication or potential synergies with projects in other parts of DOE, it might be advisable to have some panelists who are familiar with all relevant DOE programs, perhaps DOE employees, or to have such persons present at the review session to advise panel members. The committee cautions that extreme care should be used in appointing DOE employees as panelists because of the potential for conflicts of interest (see Chapter 5). Three examples of the application of peer review to evaluate program balance are given in Boxes 3.5, 3.6, and 3.7.
"Needs" Evaluations
A peer review of program needs could assess, in OST's case for example, research and development needs to address environmental problems at contaminated sites. These problems are the fundamental drivers of the technology development program—that is, the technologies are developed to solve the problems caused by contamination. A needs review of the whole program might be conducted to examine the entire suite of needs that the program must address in order to begin to assess program balance. It can be argued that a needs review is not necessary because the personnel at the sites are closest to the problem and therefore require no review. However, this very closeness when coupled with funding pressures defines a situation in which peer review can be beneficial. An example of a needs evaluation used in industry is provided in Box 3.8.
For such a review, the technical expertise required would be that necessary to assess both the types of contamination and the peculiarities of specific sites, a very broad scope of expertise. In addition, panels would have to be constructed to ensure a fair and balanced assessment of all needs, that is, to avoid having members with an interest in giving a particular site inappropriate priority for technology development. For example, a member of the public from a community affected by a DOE site should not be a panel member because of the potential for conflict of interest or bias.
Box 3.5 Peer Review of Programs at National Institute of Standards and Technology Laboratories
Since 1959, NIST (and its predecessor agency, the National Bureau of Standards) has practiced external assessment of its activities, specifically by technical panels of the NRC's Board on Assessment of NIST Programs. The longevity of this practice and the NRC's ability to attract volunteers for service on its panels attest to its success and its value to NIST. Elements of similarity and difference between the NIST and DOE-OST situations should be noted.
NRC panels review the NIST Measurement and Standards Laboratories, each of which has responsibility for a discipline area (e.g., chemistry, physics, materials, information technology) related to NIST's overall mission in measurements and standards. NIST programs typically continue for decades although individual projects come and go within programs. Although NIST focuses on industry and commerce, research also relates to defense, health, environment, space, and science. The problems NIST addresses are usually technically narrower and better defined than those faced by DOE-OST. For example, NIST might well develop a standard measurement method to determine very low levels of impurities in semiconductor wafers, but not methods to lower the level of impurities. NIST's typical client is technically sophisticated, although this is not always the case. Finally, having no regulatory authority, NIST typically has a cooperative relationship with its clients. There are at least two relevant similarities between NIST and OST: (1) the problems appropriate to each agency's mission and needing attention are too great for the resources available, so priorities must be set; and (2) these problems span many fields of science and technology.
NIST meets the criteria defining peer review by asking the NRC to run its assessment system. Because the panels are constituted by and report through the NRC, they are independent and external. The NRC appoints panel members and controls the production of the official reports to NIST. The NRC is able to access the wide range of technical experts needed to review NIST programs. The National Academies of Sciences and Engineering provide a good range of contacts and certify expertise. A great many panel members are from industry, thus involving stakeholders and individuals with practical experience with the problems being addressed. The panels are "balanced" so that no industry or company dominates. Panel members serve overlapping three-year terms, with one renewal possible, to allow for continuity and follow-up on recommendations. This also allows for increasing understanding of the programs being reviewed. Dialogue between NIST officials and the NRC occurs throughout the review process to facilitate the process and maximize its usefulness to NIST.
Box 3.6 Peer Review of Programs of the U.S. Army Engineer Waterways Experiment Station
One way of supporting the Government Performance and Results Act (GPRA; Public Law 103-62) of 1993 is to conduct peer reviews of programs at a departmental or laboratory level (e.g., about 200 professional staff) and address the quality of staff, research, products, and facilities. This approach is used at the U.S. Army Engineer Waterways Experiment Station (WES; Conway et al., 1996, 1997).1 Over a two- to four-day period, a panel of three to five technical leaders reviews programs and biographical documents, receives technical presentations, views facilities, and interacts with staff. A 10- to 15-page report is prepared that describes the review procedure, generalizes findings, and—for each of the four review elements—lists strengths, areas for improvement (usually with suggestions), and a numerical rating from 1 to 5 (poor to excellent). The panel, with partial membership rotation, repeats the review on a one- to two-year cycle to assess progress in areas identified for improvement and to identify new areas to be addressed.
Each panelist in the WES review has demonstrated technical expertise and leadership at the international level. The panelists' expertise is complementary: for example, in the review of a large environmental laboratory, the panel consisted of an environmental engineer, a bioengineer or scientist, and an environmental biologist. Panelists make recommendations that have the potential for broad impact, are not too numerous, and largely can be addressed in the short term. Particular attention is paid to Congress's concern over the need for performance metrics, for detailed strategic plans and milestones, for coordination across agencies, and for impact in the short term (American Institute of Physics, 1997).
This model could be used by OST to review a group of similar projects under a focus area.
Box 3.7 Peer Reviews of Research Proposals and Programs in DOE's Office of Basic Energy Sciences
DOE's Office of Basic Energy Sciences (BES) routinely carries out peer reviews of both proposals and programs. BES peer reviews include the following:
Box 3.8 Needs Evaluation at Chiron Corporation
Chiron Corporation, headquartered in Emeryville, California, near San Francisco, is a biotechnology company that combines diagnostic, vaccine, and therapeutic strategies for controlling disease. Chiron participates in three global markets: (1) diagnostics, including immunodiagnostics, critical care diagnostics, and new quantitative probe tests; (2) pediatric and adult vaccines; and (3) therapeutics, with an emphasis on oncology and infectious diseases. Chiron has research programs under way in gene therapy, gene transfer, combinatorial chemistry, and recombinant proteins targeted toward oncology and cardiovascular and infectious diseases.
As a research-driven company, Chiron is especially sensitive to the thorough review of various parts of its research portfolio. Although the majority of the regular program reviews are performed internal to the corporation, a peer review process using external experts not associated with the company has been invoked at critical stages in several programs. One of the most extensive reviews was done prior to launching a major new program in the Chiron Diagnostics business unit.
The new program was conceptualized and developed internally over a period of almost a year. No investments in new products were made, but considerable effort was expended in market review, scientific assessment, and product planning. The undertaking was of considerable scope and called for a significant multiyear investment. As plans were formalized, additional internal resources were engaged; however, no major capital investments were made. The final stage in the initiation of this business thrust was a peer review session in which approximately ten outside experts were engaged for two days to review and critique the plans. The experts were also asked to make more extended comments on a number of relevant areas covered by the plans. The proceedings were recorded for further reference and review by Chiron staff not in attendance.
The result of this intense external review was a significant change in the direction and magnitude of the program. The expert reviewers were able to provide an alternative view of the cost and time required to mount a competitive program given the Chiron starting point. Likewise, they gave valuable input on the competitive landscape of the field and the likelihood of rapid acceptance of the products in clinical medicine. In this case, Chiron responded carefully to the external reviewers' recommendations and was able to adjust the deployment of financial and human resources into areas of greater short-term impact. Like OST, Chiron needed to address a specific problem under constraints (in this case, governed by the market) such as budgets and schedules. However, Chiron recognized the value of peer review, and before investing large amounts of funds, used peer review to improve its investment.