National Academies Press: OpenBook
Suggested Citation:"APPENDIXES." National Research Council. 1984. Toxicity Testing: Strategies to Determine Needs and Priorities. Washington, DC: The National Academies Press. doi: 10.17226/317.


APPENDIX A

REVIEW OF SELECTED PRIORITY-SETTING SYSTEMS

Initial examination of selected priority-setting schemes revealed that the multiplicity of approaches was more apparent than real. The appearance of dissimilarity arises more from differences in emphasis, or scope, than from differences in basic logic or strategy. Selected for detailed description here are schemes that were thought to make important contributions to the developing science or art of priority-setting. The choices in some cases were related to uniqueness in the treatment of exposure, of toxicity, or of the interaction between the two. A comprehensive list of schemes has been compiled (U.S. Environmental Protection Agency, 1980). It has been recommended that federal agencies adopt priority-setting systems (Administrative Conference of the United States, 1982).

The Toxic Substances Control Act-Interagency Testing Committee (TSCA-ITC) scheme (Nisbet, 1979) is of particular interest, because it deals with a large part of the universe with which the NTP is concerned. Equally important, it has had to face the test of continued use over several years, and it has been systematically reviewed (Nisbet, 1979). The schemes of Kornreich et al. (1979) and Ross and Lu (1980) are based on a systematic review of a substantial portion of the literature on priority-setting. The Food and Drug Administration (FDA) scheme (U.S. Department of Health and Human Services, 1982) is limited to one route of exposure, but otherwise is comprehensive in its approach. The scheme of Wilhelm (1981) is in large measure a response to what were perceived as deficiencies in the TSCA-ITC system. That of Astill et al. (1981) is designed to function with a sequential testing and feedback strategy. The ranking algorithm of Brown et al. (1980) is based on a simple mathematical model and is designed for multinational application. The proposed cyclic review procedure for FDA (U.S. Department of Health and Human Services, 1982) uses structure-activity considerations to establish initial "levels of concern," which are also found in the decision-tree approach of Cramer, Ford, and Hall (1978). Gori's scheme (1977) provides a ranking index based on exposure that is complementary to a second scheme that uses structure-activity analysis for assessing possible carcinogenic activity (Dehn and Helmes, 1974).

NATIONAL TOXICOLOGY PROGRAM

The NTP chemical nomination and selection process, as described in the NTP Fiscal 1983 Annual Plan (U.S. Department of Health and Human Services, 1983), is summarized below.

CHEMICAL NOMINATION

Member agencies of NTP and other sources (other federal agencies, state agencies, the public, labor, and industry) submit to NTP nominations of chemicals for various types of toxicity testing. Each nomination is expected to include the name of the chemical, the particular toxicity test(s) desired, the rationale for the nomination, and the available background data on production, use, exposure, environmental occurrence, and extent of toxicologic characterization. All nominations are considered, however, regardless of the depth of information submitted.

Nominations are referred to the NTP chemical selection coordinator for review, to determine which chemicals have been tested, are already on test or scheduled for test, or have been previously considered and rejected for testing by NTP or its predecessors. This may involve preliminary searches of on-line data bases and reference books. The nominations and background information are then forwarded to the chemical-review staff at the National Center for Toxicological Research (NCTR), who examine the available literature, assess the relevant data, and prepare draft executive summaries of the information. (Executive summaries are not prepared for chemicals nominated solely for mutagenicity testing.) Included in each draft executive summary are chemical identification, surveillance index (production, use, occurrence, and analysis), information on toxic effects, and a statement of the source of and reason for nomination.

EVALUATION OF NOMINATED CHEMICALS

The chemical-review staff sends the draft executive summaries to the Chemical Evaluation Committee (CEC), which is composed of representatives of the Consumer Product Safety Commission (CPSC), Environmental Protection Agency (EPA), FDA, Occupational Safety and Health Administration (OSHA), National Cancer Institute (NCI), National Institute of Environmental Health Sciences (NIEHS), National Institute for Occupational Safety and Health (NIOSH), NCTR, and NTP. Members are requested to search data bases peculiar to their agencies for further information on the nominated chemicals (and structurally related compounds), to improve the evaluation process.

The CEC evaluates the summaries and recommends the types of testing to be performed. Primary and secondary reviewers are also assigned to each chemical, after consideration of the nature of exposure, so that appropriate regulatory concerns will be addressed. At the CEC meetings, the primary reviewer for each chemical summarizes the data on that chemical and makes recommendations for testing. The secondary reviewer presents additional information, where available, and also discusses the testing of the compound. After discussion, the CEC votes on the recommended types of testing and assigns priorities for the testing.

PUBLIC COMMENT

A Federal Register notice is published, which lists the chemicals reviewed by the CEC and the recommended types of testing. The notice also solicits comments from interested parties and information on completed, current, and planned testing in the private sector. The list of chemicals is also published in the NTP Technical Bulletin, with a request for comments. These steps are taken to enable other individuals and groups to provide data useful to the chemical evaluation process.

PEER REVIEW

The revised executive summaries and public comments on the nominated chemicals are forwarded to the NTP Board of Scientific Counselors, which meets to evaluate the data and to make recommendations.

CHEMICAL SELECTION

The chemical-review staff then incorporates the board's ratings and pertinent public input into final executive summaries, which are submitted to the NTP Executive Committee. That committee decides whether to test, defer, or delete each of the nominated chemicals for the various types of testing. Its decisions are published in the NTP Technical Bulletin. The committee also recommends priorities for testing, test development, and test validation to NTP.

After Executive Committee action, the NTP Steering Committee refers the chemicals to one or more of the three organizational units participating in NTP: NIEHS, NIOSH, and NCTR. A chemical manager is then assigned to evaluate the data developed during the NTP chemical evaluation process and other information retrieved from detailed searches of the published literature and from industry. The manager presents a proposal to the Toxicology Design Committee (TDC) either to perform appropriate testing or to delete the chemical from consideration for testing.

The TDC, which consists of research scientists from NIEHS and NTP, assesses the proposal and either develops a final protocol for testing or recommends no further testing; the latter recommendation is based on technical difficulties in testing, budgetary reasons, or the existence of adequate outside testing. All chemicals selected through this process are then tested as time and resources permit. The results of testing are reviewed routinely, to determine whether further types of testing are appropriate, and candidates for additional testing are submitted to the NTP chemical nomination and selection process for evaluation and decision-making.

SEQUENTIAL TESTING FOR CHEMICAL RISK ASSESSMENT (ASTILL ET AL., 1981)

This scoring system was developed by the Eastman Kodak Company to determine the extent of toxicity testing required for production chemicals. Four categories of information are used to derive a total score, on the basis of which one of four testing levels is recommended. Available health and environmental data are compiled and rated independently, composite health-effects scores are computed, and the appropriate tests are selected and performed. Results of these tests are then used to revise the ratings. New scores are obtained, and the testing level is revised. This process is repeated until testing information is complete. Thus, the system is dynamic, in that it incorporates a feedback mechanism that allows for continuing review of the testing needs of a specific chemical. This system provides a basis for a multistage screening system.

Four categories of information are used: magnitude of human exposure, magnitude of environmental exposure, effects on human health, and effects on the environment. The two magnitude categories have four components each, and the two effects categories have three components each. The four components considered in the rating of the magnitude of human exposure are production volume, number of people exposed, hours per year exposed, and number of population types exposed. Scores for the four components are added to yield a value for the magnitude of exposure. The assessment of health effects considers the LD50, acute effects (reversible and irreversible), and chronic effects (reversible and irreversible).

Each of the 14 components of the four categories is scored from 1 to 3, with 3 indicating the most severe or hazardous score. The scores for the two human categories (health effects and magnitude of human exposure) are summed, as are the scores for the two environmental categories. The resulting scores range from 7 to 21 and are associated with specific testing levels, as follows:

    Health (or Environmental) Score    Testing Level
    7-9                                I
    10-13                              II
    14-17                              III
    18-21                              IV

The level of testing becomes increasingly specific and sophisticated with increasing score. Level I testing is based on the use of physicochemical evaluation and health screening, as well as acute-toxicity studies. Although it is not specifically stated, with respect to human data, Level I might include surveillance of morbidity, mortality, and fertility patterns of exposed human populations. Level II testing consists of toxicity tests that are intermediate between acute tests and subchronic feeding studies, whereas Level III testing includes subacute exposure studies. Long-term (or chronic) health effects are evaluated through Level IV testing.

The health-effects criteria are not very specific, but they are readily quantified in an objective and replicable manner. The health-effects criteria and ratings are as follows:

    Criterion            Finding         Rating
    LD50, mg/kg          >500            1
                         50-500          2
                         <50             3
    Immediate effects    None            1
                         Reversible      2
                         Irreversible    3
    Prolonged effects    None            1
                         Reversible      2
                         Irreversible    3

This system appears to be efficient, in that it uses a minimum of subjective input (expert opinion or judgment), although such judgment may be used in the review and rating of health effects. This system appears to be practical, in that it facilitates decision-making in an efficient and objective manner. Any compound can be evaluated; in the absence of available data, baseline information is compiled before any testing is done. The baseline information compiled consists of:

    · Quantities manufactured and disposed of.
    · Exposure estimates.
    · Product function and application.
    · Structure-activity correlation.
    · Literature search.
    · Cancer hazard evaluation.

Such baseline information may be sufficiently complete for hazard assessment, particularly if previously published toxicity studies are available.

This scheme has been evaluated by the authors with a wide range of industrial chemicals, although the specifics of evaluation are not provided.

A RANKING ALGORITHM FOR EEC WATER POLLUTANTS (BROWN ET AL., 1980)

The purpose of this scheme is to rank, for possible regulatory action, water pollutants as potential hazards to humans and to aquatic organisms. The scheme considers about 1,500 compounds used in countries of the European Economic Community and suspected of entering rivers. The algorithm is based on a simplified mathematical model relating production and use of a chemical to its occurrence in drinking water and in food of freshwater origin. Standard assumptions are made as to intake of fish and water; daily maximal and annual average intakes through ingestion are calculated.

The amount of a chemical estimated to reach the water is calculated by multiplying production by the fraction that reaches the water; the fraction is estimated on the basis of manufacturing practices and the chemical's use. A typical dilution volume of the chemical is estimated from its half-life in water and from river-flow data. Estimated concentrations are used to calculate human exposure from consuming drinking water and freshwater fish. A concentration factor is used to calculate ingestion from consumption of fish, assuming typical diets.

The list of 1,500 chemicals was reduced to about 1,400 when mercury and cadmium compounds were eliminated because they were already controlled by the EEC and persistent synthetic substances (mainly plastic materials) were eliminated because, although objectionable in water, they are not toxic. For the remaining 1,400 compounds, production and consumption data are obtained, and all those estimated to be produced at under 100 metric tons per year are eliminated. The remaining 426 compounds are then processed through a screening algorithm based on production, environmental half-life, and acute-toxicity factors.
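
The exposure calculation outlined above might be sketched as follows. The text does not give the model's equations or its standard intake values, so this is an assumed reading: concentration is taken as the released amount divided by the dilution volume, and all names, units, and example numbers are hypothetical.

```python
# Sketch of the production-to-intake chain, under assumed units and
# an assumed form for the concentration step (release / dilution volume).
from dataclasses import dataclass

MG_PER_TONNE = 1e9               # 1 metric ton = 1e9 mg
PRODUCTION_CUTOFF_TONNES = 100   # from the text: under 100 t/yr is eliminated


@dataclass
class Chemical:
    production_tonnes_per_year: float
    fraction_to_water: float                # estimated from manufacture and use
    dilution_volume_litres_per_year: float  # estimated from half-life and river flow
    fish_concentration_factor: float        # litres of water per kg of fish


def passes_production_cutoff(chem: Chemical) -> bool:
    """Keep only chemicals produced at 100 metric tons per year or more."""
    return chem.production_tonnes_per_year >= PRODUCTION_CUTOFF_TONNES


def water_concentration_mg_per_litre(chem: Chemical) -> float:
    """Amount reaching water (production x fraction) spread over the dilution volume."""
    released_mg = chem.production_tonnes_per_year * chem.fraction_to_water * MG_PER_TONNE
    return released_mg / chem.dilution_volume_litres_per_year


def daily_intake_mg(chem: Chemical, water_litres_per_day: float,
                    fish_kg_per_day: float) -> float:
    """Ingestion from drinking water plus ingestion from freshwater fish."""
    conc = water_concentration_mg_per_litre(chem)
    from_water = conc * water_litres_per_day
    from_fish = conc * chem.fish_concentration_factor * fish_kg_per_day
    return from_water + from_fish


# Made-up numbers, for illustration only.
chem = Chemical(1000, 0.01, 1e13, 100)
intake = daily_intake_mg(chem, water_litres_per_day=2.0, fish_kg_per_day=0.01)
```

With these illustrative inputs the estimated concentration is 0.001 mg/L, and the fish pathway contributes half as much as drinking water; the intake values are passed in as parameters precisely because the scheme's "standard assumptions" are not stated in the text.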

Some elements of toxicity testing for human health are applied in this scheme. The acute-mammalian-effect dose is represented by the lowest reported lethal oral dose for humans. If this information is not available, the lowest oral LD50 value for other mammalian species is used. If no oral LD50 value is available, the lowest LD50 value for the dermal or inhalation route is applied. If no LD50 values have been reported at all, the lowest lethal dose for the oral, dermal, or inhalation route is used. If no acute-lethality data are available, an estimate is devised on the basis of comparison with other compounds in the same chemical class. If a reasonable estimate cannot be made this way, the default entry "unknown" is used in the program.
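
The fallback order for the acute-mammalian-effect dose reads naturally as a chain of lookups. The sketch below is one possible rendering, with hypothetical argument names; it assumes each data source is supplied as a list of doses in the same units, and it represents the class-based estimate as an optional number supplied by the analyst.

```python
# Sketch of the data-preference hierarchy for the acute-effect dose.
# Argument names are hypothetical; the ordering follows the text.
from typing import Optional, Sequence, Union


def acute_effect_dose(
    human_oral_lethal: Sequence[float] = (),
    oral_ld50: Sequence[float] = (),
    dermal_or_inhalation_ld50: Sequence[float] = (),
    lethal_dose_any_route: Sequence[float] = (),
    class_estimate: Optional[float] = None,
) -> Union[float, str]:
    """Return the lowest dose from the most preferred source that has data."""
    preference_order = (
        human_oral_lethal,          # 1. lowest reported lethal oral dose, humans
        oral_ld50,                  # 2. lowest oral LD50, other mammals
        dermal_or_inhalation_ld50,  # 3. lowest dermal or inhalation LD50
        lethal_dose_any_route,      # 4. lowest lethal dose, any of the three routes
    )
    for doses in preference_order:
        if doses:
            return min(doses)
    if class_estimate is not None:
        return class_estimate       # 5. estimate from the same chemical class
    return "unknown"                # 6. default entry


# An oral LD50 is preferred over a lower dermal value.
print(acute_effect_dose(oral_ld50=[300.0, 120.0], dermal_or_inhalation_ld50=[50.0]))
# prints: 120.0
```

Note that a more preferred source wins even when a less preferred one reports a lower dose, which is how the text describes the hierarchy.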

Chronic mammalian effects are also used when available. If the data file indicates that carcinogenicity, mutagenicity, or teratogenicity information is available, it is factored into the algorithm. If a chemical exhibits all three effects, only one is entered, preferably carcinogenicity. The chronic-mammalian-effect dose is the lowest dose that caused the reported effect. ESTIMATION OF TOXIC HAZARD--A DECISION TREE APPROACH (CRAMER ET AL., 1978) This scheme ranks food chemicals in three classes of concern for toxicity testing on the basis of chemical structure and oral-toxicity data. It is applied to structurally defined organic and organometallic compounds. Polymers and inorganic compounds are excluded. By answering a series of questions about chemical structure, the operator of the system follows a decision tree until the chemical considered falls into Class I (low concern), Class II (moderate concern), or Class III (serious concern). In each class, chemicals are ranked by comparison witn no-observed-effect doses. The data on no-effect doses were derived from literature values based on short-term or chronic studies. Class I substances are those whose structures and toxicity data, when combined with low human exposure, suggest low priority for investigation. Class III substances are those whose structures and toxicity data would not permit presumptions of safety and which thus require the highest priority for investigation. Class II substances are intermediate between Classes I and III. High exposures to substances in any class would increase the priority for investigation or testing. The number of chemicals found to be in Class II is not large. The tabulation of compounds in classes, with the exception of compounds with no-effect exposures above 500 mg/kg of body weight per day, is restricted to toxicity tests in which the next higher feeding exposure above the no-effect exposure is no more than 5 times the no-effect exposure. 
It was the general intent of the authors that the most toxic substances in Class I (low concern) should have a no-effect exposure in animal tests at or above 50 mg/kg of body weight per day. This exposure, subjected to a safety factor of 100, corresponds to human exposure at approximately 25 mg/day. Use of this procedure requires knowledge of chemical structure and reasonably accurate estimates of human intake. The authors made it clear that chemical structure is to be used only as a guideline for testing decisions and that such use of structure-activity analysis is intended as a guide to the acquisition of data, not as a substitute for data.
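The arithmetic behind the 25 mg/day figure can be checked with a short Python sketch, taking the Class I no-effect exposure as 50 mg/kg of body weight per day. The 50-kg body weight is an assumption introduced here because it is the value that reproduces the stated figure; the text does not give the body weight used.

```python
def human_exposure_mg_per_day(no_effect_mg_per_kg_day,
                              safety_factor=100.0,
                              body_weight_kg=50.0):
    """Convert an animal no-effect exposure into a daily human exposure.

    body_weight_kg=50.0 is an assumed value (not stated in the text);
    safety_factor=100.0 is the factor quoted in the text.
    """
    return no_effect_mg_per_kg_day / safety_factor * body_weight_kg


# 50 mg/kg/day divided by 100 gives 0.5 mg/kg/day; times 50 kg gives 25 mg/day.
print(human_exposure_mg_per_day(50.0))  # 25.0
```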

AN AUTOMATIC PROCEDURE FOR ASSESSING POSSIBLE CARCINOGENIC ACTIVITY OF CHEMICALS PRIOR TO TESTING (DEHN AND HELMES, 1974)

This scheme uses structure-activity relationships to predict carcinogenesis. There is no exposure element. The corresponding exposure element has been described by Gori (1977). The procedure incorporates the collective knowledge of a panel of experts and attempts to automate the key features of that knowledge to select candidate compounds for carcinogenicity testing.

The basis of the procedure is an activity tree constructed so that more specific details of chemical structure (as related to carcinogenicity) are applied at each decision point in the tree. This subdivision of structures continues until an end group (called a node) containing compounds of closely related chemical structure is identified. An estimate is then made of the probability that the chemicals in a node are carcinogenic and of the relative potency of each. Reflecting the expertise of the panel, construction of the tree concentrates on the following groups of chemicals: naturally occurring substances; nitroso, hydrazino, and azo compounds; polycyclic aromatic hydrocarbons; aromatic amines; and inorganic compounds.

Although structure-activity relationships can be useful in setting priorities for carcinogenicity testing, the accuracy of analysis of such relationships in predicting carcinogenicity has not been verified. If the decision tree could be compared with test data generated since the scheme was completed, its utility could be better assessed. Exceptions within a given node (i.e., negative compounds within a carcinogenic chemical class) are extremely instructive and should serve as a cautionary guide when one attempts to apply analysis of structure-activity relationships in too broad a manner.

TOXICOLOGICAL PRINCIPLES FOR THE SAFETY OF DIRECT FOOD ADDITIVES AND COLOR ADDITIVES USED IN FOOD (U.S. DEPARTMENT OF HEALTH AND HUMAN SERVICES, 1982)

This scheme was developed to establish priorities (and extent) for toxicity testing of direct food additives. Chemicals are divided into three categories of suspicion based on structure-activity considerations by following a short decision tree. A suspicion category is combined with exposure information to define a level of concern (I, II, or III). Once the level of concern is determined, tests may be required. Results of tests already done are placed in three categories (well done; not well enough done, but usable to some degree as a "core" test; and unusable). On the basis of this further information, additional testing may be required.

Toxicity is not estimated quantitatively, so there is no quantitative assessment of uncertainty for it. There is judgmental consideration of uncertainty (specification error) in the evaluation of toxicity tests in the literature. There is a discussion of tests for each level of concern and for various combinations of concern and test information.

RANKING OF ENVIRONMENTAL CONTAMINANTS FOR BIOASSAY PRIORITY (GORI, 1977)

The purpose of this scheme is to establish, on the basis of exposure, a priority ranking for chemicals to be tested in a carcinogenicity bioassay. All chemicals in commerce are considered by the scheme. Total intake of a chemical by a given route is estimated for all members of a population group with similar exposures, and intake is then summed over population groups and sources of exposure. Intake by route is combined with probability of carcinogenicity and expected potency to produce a ranking index that, in theory, reflects the expected annual number of cancer cases.

The scheme depends on the quantitative prediction of carcinogenic activity from structure-activity comparisons (see Dehn and Helmes, 1974). This requires the identification of substructures, derived from known carcinogens, to which activity indexes can be attached--a process that requires expert opinion. A chemical of unknown carcinogenic potential is then inspected for such substructures, and an activity value is ascertained on the basis of their presence. Exposure assessment takes account of chemical production and use, but not disposal or discharges explicitly.

Although it may not be clear from the text, the scheme estimates an uncertainty factor or confidence range for every variable. One notes and keeps track of the route of exposure and maintains an "audit trail" to the information in the data base. Deriving an exposure estimate for a chemical might require up to a person-day of effort, on the average. Considerable subjective input is required.
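The structure of such an exposure-based ranking index can be illustrated with a short Python sketch. The summary above does not say how intake, probability of carcinogenicity, and potency are combined; the sketch assumes a simple product summed over routes, and every name, unit, and number in it is hypothetical.

```python
def total_intake_by_route(exposures):
    """Sum intake over population groups and sources, separately for each route.

    `exposures` is a hypothetical list of (route, people, intake_per_person)
    tuples, one per population group and source of exposure.
    """
    totals = {}
    for route, people, intake_per_person in exposures:
        totals[route] = totals.get(route, 0.0) + people * intake_per_person
    return totals


def ranking_index(exposures, p_carcinogenic, potency_by_route):
    """Combine intake by route with probability of carcinogenicity and potency.

    Assumption (not from the source): the index is the sum over routes of
    intake * probability * route-specific potency.
    """
    totals = total_intake_by_route(exposures)
    return sum(intake * p_carcinogenic * potency_by_route.get(route, 0.0)
               for route, intake in totals.items())


# Illustrative numbers only.
exposures = [("oral", 1000, 2.0), ("oral", 500, 4.0), ("inhalation", 100, 1.0)]
index = ranking_index(exposures, p_carcinogenic=0.5,
                      potency_by_route={"oral": 0.001, "inhalation": 0.01})
print(round(index, 6))  # 2.5
```

Ranking chemicals by an index of this kind only orders them relative to one another; the uncertainty ranges the scheme attaches to each variable would have to be carried through separately.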
PRIORITY-SETTING OF TOXIC SUBSTANCES FOR GUIDING MONITORING PROGRAMS (KORNREICH ET AL., 1979)

This system, prepared for the Office of Technology Assessment by Clement Associates, is designed to compile a priority list for selecting potentially toxic chemicals for monitoring in food.

The criteria used in developing 32 existing priority lists of toxic chemicals are examined, and criteria are developed for ranking chemicals on the basis of their likelihood of endangering human health through contamination of the food supply. Three preliminary lists of possible food contaminants (organic substances, inorganic substances, and radionuclides) are compiled. Data on each chemical on these lists are assembled and used to assign scores to each chemical for various factors. Scores for the factors are combined, and the combined scores are used for ranking the chemicals on the three lists.

Selection criteria include both exposure and toxicity factors. Weights are assigned to reflect the relative importance of each criterion and to allow the total score to be a measure of the overall propensity of a chemical to contaminate foods. The individual score for each factor is multiplied by the assigned weight, and the weighted scores are added. The total exposure score and the total biologic score are each adjusted to a maximum of 50 points and summed to allow for a possible total of 100 points.

This system is designed to use quantitative information, with considerable reliance on expert opinion for the assigning of scores. For toxicity factors, a score of 0 is assigned for negative results and for absence of data. No cost estimates are given for this system, which was intended for one-time, rather than repeated, use.

RANKING CHEMICALS FOR TESTING: A PRIORITY-SETTING EXERCISE UNDER THE TOXIC SUBSTANCES CONTROL ACT (NISBET, 1979)

This scoring system was developed to set priorities for testing chemicals under the authority of the Toxic Substances Control Act. The scheme is intended for application to chemicals in commerce that are not covered by other statutes. Drugs, cosmetics, food additives, and pesticides are excluded, unless they also have other uses. Also excluded are chemicals with an annual production volume of 1,000 lb or less.
The system is intended for chemicals already in commerce at the time of compilation of the TSCA Inventory, which now defines "old" chemicals for the purposes of the statute. Because the Inventory did not exist when the first testing recommendations were required by the statute, the system was originally applied to a list of chemicals derived from lists of chemicals of high production volume or previously reported toxicity. Thus, the initial "universe" of chemicals was limited to chemicals already identified as of potential concern or nominated for inclusion by Interagency Testing Committee (ITC) members or other experts. Of 24 priority lists reviewed, 19 were used as a basis for the initial compilation of compounds. Noncommercial chemicals were then eliminated.

Chemicals that were not on the U.S. ITC list were designated to be eliminated from the list, but were screened initially and were included if nominated by the expert panel. Later screening evaluated use and eliminated substances already regulated under some statute other than the TSCA. These initial screening steps resulted in a list of approximately 900 chemicals for scoring.

ITC divided the scoring process into two discrete phases--potential exposure and biologic effects. Screening and scoring of biologic effects were postponed until potential exposure was evaluated. The following factors were used in the first stage of exposure scoring:

· General population exposure--number of people exposed, frequency of exposure, exposure intensity, and penetrability.
· Quantity released into and persistence in the environment.
· Production volume.
· Occupational exposure.

Some 330 chemicals were then selected from the list for biologic scoring. The TSCA requires that ITC give priority to compounds that are known or thought to cause or contribute to cancer, gene mutations, or birth defects. Seven factors were selected for scoring on biologic activity:

· Carcinogenicity.
· Mutagenicity.
· Teratogenicity.
· Acute toxicity.
· Other toxic effects.
· Ecologic effects.*
· Bioaccumulation.

*Note that this scheme and its variants (Nisbet, 1979; Ross and Lu, 1980) are designed to set priorities among chemicals for potential effects on the environment, as well as on human health.

Because ITC seeks to identify chemicals that require testing, rather than simply scoring compounds for known biologic activity, it was decided that the biologic scoring system should have two independent components--a measure of known biologic activity and a measure of the need for further testing. These components provided the basis for the biologic scoring system, as follows:

Positive numerical score (1 to 3):

· Substance does not need further testing.
· The higher the number, the more positive the results.

Zero score:

· Negative test results.
· Biologically inactive compound.
· Low index of suspicion.

Negative numerical score (-1 to -3):

· Lack of data--substance should be tested further.
· The more negative the number, the greater the need for testing (as judged by other data on biologic activity or data on structural analogues).

Early in 1979, ITC sponsored a workshop to review the ITC system and to make recommendations for improvements. The proceedings of the workshop (Nisbet, 1979) include a number of papers on priority-setting systems and reports by 11 subgroups that reviewed different elements of the ITC scoring system and recommended changes in scoring methods for individual exposure and toxicity elements. The workshop did not propose a comprehensive alternative scheme and did not produce a synthesis of the recommendations of the subgroups.

CHEMICAL SCORING SYSTEM DEVELOPMENT (ROSS AND LU, 1980)

This draft scheme was designed to screen relatively large numbers of chemicals and to identify those with the greatest need for control or testing. The scheme considers subsets of the TSCA Inventory, including chemicals on which EPA expects to receive additional production- and exposure-related information under Section 8(a) of TSCA.

The scheme consists of several screening processes grouped into five components: biologic toxicity I, biologic toxicity II, environmental fate, production and release, and human exposure. There are several criteria for each component. Each criterion is assigned a numerical score from 0 or 1 to 9 or 10.

The screening system is applied to chemicals on the TSCA Inventory in two phases. The first phase screens chemicals into groups of low, moderate, and high concern on the basis of exposure characteristics (production volume, environmental fate, potential environmental release, and potential human exposure). For chemicals that have similar scores on these major exposure criteria, scores on a group of modifier criteria can be applied to determine which compounds have the greater exposure potential. These modifier criteria can receive a maximal score of 9 and are to be used only in case of ties in the scores on the primary exposure criteria.

The second phase separates chemicals into groups of low, moderate, or high concern on the basis of potential toxic effects. Chemicals that are identified as being of high concern in the first phase are to be considered first in the second phase. The biologic-effects criteria are divided into two categories: biologic toxicity I includes carcinogenicity, mutagenicity, embryotoxicity and fetotoxicity, and reproductive effects; biologic toxicity II includes all other criteria for biologic effects and contains effects on plants, bacteria, fungi, and aquatic organisms. The authors stated:

"Biological toxicity is divided into 2 components because the areas of health effects in the biological toxicity I component are of particular societal and regulatory agency interest and therefore warrant consideration separate from other aspects of toxicity. Another difference between the biological toxicity I and biological toxicity II components is that the scoring systems in the biological toxicity I component are not dose dependence [sic] but are based on expressions of confidence, whereas the scoring systems in the biological II component are either dose or concentration dependent."

In the carcinogenicity scoring process, a precursor is defined as "a chemical which in itself is not carcinogenic but which is responsible for the formation of a chemical which is carcinogenic, e.g., a metabolite." However, the precursor is assigned a score of 4, rather than a potentially higher one. This scoring process is strictly qualitative and does not deal with the potency of a carcinogen.
It appears that absence of data is considered to imply low priority; "no data but suspect" is given a score of 3; "no data but not considered suspect" is given a score equal to that for "no data available, no estimate made."

The mutagenicity scoring procedure considers the potential for genetic impairment at both the somatic cell and germinal cell levels. Like the carcinogenicity scoring procedure, it is strictly qualitative, and a suspect chemical on which no data are available will score low (2 or 3).

Several types of prenatal effects are combined under the broad terms of "embryotoxicity" and "fetotoxicity." Whether other reproductive effects are distinguished from true teratogenic action is unclear.

The chronic-toxicity scoring procedure has two notable components: first, it scores on the basis of quantitative dosage criteria; second, it scores on the basis of the severity of an effect. No guidelines are given to indicate what specific effects would be examined or called for. Again, suspect chemicals with no data get low scores.

The acute-toxicity scoring system considers lethal end points, but not functional impairment. Several opportunities for scoring are possible, because data from any route are considered. When several routes have been studied, the data that provided the highest score are used in the final priority-setting. Chemicals "suspected to have a score of 8 to 10" are assigned a score of 3 when there are no data to confirm the suspicion. Again, suspect chemicals with no data get low scores.

The first phase of the screening program uses the exposure component and subcomponent scores to screen and set testing priorities for chemicals on which additional biologic-effects data are needed. The actual priority-setting treats the data as a set of component scores (for either exposure or biologic effects) that are made up of combinations of subcomponents. Each component has a maximal score of 10. The ratio of the assigned score to the maximal score is displayed. If any subcomponent receives a score of 10, it is automatically placed in a rank of high concern. Otherwise, the accumulated subcomponent ratios within a component are assigned scores, and a hazard index is calculated. Subcomponent scores are added and form the numerator of a fraction whose denominator is the sum of possible scores for each of the subcomponents within the component. A hazard index is the expression of the ratios as a percentage. With the exception that a score of 10 in any subcomponent automatically places that chemical in a category of high concern, the hazard indexes for each component are to be used to place the chemicals in categories of high, moderate, or low concern.
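The hazard-index calculation just described can be sketched in Python as follows. The data layout is hypothetical, and the sketch deliberately stops short of assigning high, moderate, or low concern from the index, because the cutoff percentages are not given in this summary.

```python
def hazard_index(subcomponents):
    """Compute a component's hazard index from its subcomponent scores.

    `subcomponents` is a hypothetical list of (assigned_score, maximal_score)
    pairs. Returns (index_percent, automatic_high_concern).
    """
    assigned_total = sum(assigned for assigned, _ in subcomponents)
    possible_total = sum(maximal for _, maximal in subcomponents)
    index_percent = 100.0 * assigned_total / possible_total
    # A score of 10 in any subcomponent places the chemical in high concern
    # regardless of the computed index.
    automatic_high = any(assigned >= 10 for assigned, _ in subcomponents)
    return index_percent, automatic_high


print(hazard_index([(3, 10), (10, 10), (5, 10)]))  # (60.0, True)
```

Keeping the automatic high-concern flag separate from the percentage makes the override visible: a chemical with a modest index can still be ranked high because of a single maximal subcomponent score.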
SELECTING PRIORITIES FROM LARGE SETS OF ALTERNATIVES: THE CASE OF TOXIC SUBSTANCES REGULATION (WILHELM, 1981)

Although it is not explicitly stated, this scheme seems designed to rank the TSCA Inventory list of chemicals for further toxicity testing. Seventeen scores are developed per chemical. The author argued against using a single aggregation function for these scores. Instead, he suggested nine aggregation functions, each designed for a special purpose (picking out regulatory targets, establishing testing priorities by ranking chemicals on the basis of volume and suspicion of toxicity, possible environmental problems, possible occupational problems, and suspicion of toxicity based on chemical structure). These aggregation functions are defined in terms of inequality constraints on the summary scores.

A score for exposure potential is derived from a simply calculated function of production volume. Factors for exposure potential are production volume, number of chemical-plant sites, and estimated number of workers exposed. The data are to be read, and processing performed, by computer.

Indicators of suspicion are expressed as a series of 10 scores that are reduced to three summary scores. Each score refers to the number of lines on the RTECS file on an item of interest--total number of toxic-dose lines, number of reviews (one each line), number of toxic-dose lines that deal with teratogenic, carcinogenic, and mutagenic studies, etc. Further indicators are developed for closely related chemicals, and searches are made for toxic-element components for the chemical in question.

The summary scores appear to depend heavily on quantity of information, as contrasted with quality of information. For example, the human-toxicity score is 1 if there is one line in the RTECS file on human toxicity and 5 if there are five lines. Scoring by the number of lines in the RTECS file ignores both the nature and the quality of the published data. In defense of this approach, it is hard to imagine schemes capable of processing the 55,000 TSCA Inventory chemicals without severe simplifications. Examining the whole list of chemicals requires the use of simple indicators that almost inevitably treat some unequal things as equal.

Because of the simple and mechanical nature of the scheme, it might be most useful as part of a larger scheme. Its role would be to scan the entire universe of chemicals and to put those most in need of testing on a series of (relatively) short lists. Each list could be augmented or reduced by other methods.

The author believed expert judgment to be essential. Experts are to make decisions from the shorter lists generated by the aggregation functions, working on the summary of scores from the entire universe of chemicals. The scheme does not describe how the experts are to perform this role. The scheme is designed to use quantitative information.
Qualifications come at the level of expert judgment, once the lists are obtained, and at the level of discussion that motivates the particular scores and summaries. These qualifications would be more convincing if the scheme were placed in the context of a larger scheme of priority-setting that explained how expert judgments were to be used and how the short lists could be augmented by other means that might compensate for possible weaknesses due to the simplifications inherent in this scheme.

The principal virtue of this scheme is its moderate use of resources. It would be useful to have some estimates of what it would cost in time, money, and personnel to implement the scheme for all 55,000 chemicals. The scheme appears to be well designed for a narrow, but highly important, role in a larger priority-setting system.

REFERENCES

Administrative Conference of the United States. 1982. Federal regulation of cancer-causing chemicals (Recommendation No. 82-5). Fed. Reg. 47:30710-30715.

Astill, B. D., H. B. Lockhart, Jr., J. B. Moses, A. N. M. Nasr, R. L. Raleigh, and C. J. Terhaar. 1981. Sequential testing for chemical risk assessment. In R. Conway, Ed. Environmental Risk Analysis of Chemicals. Second International Congress of Toxicology, Brussels, 1978. New York: Van Nostrand-Reinhold Company. (in press)

Brown, S. L., R. L. Cofer, T. Eger, D. H. W. Liu, W. R. Mabey, K. Suttinger, and D. Tuse. 1980. A Ranking Algorithm for EEC Water Pollutants. CRESS Report No. 136. Menlo Park, Calif.: SRI International. 8 pp.

Cramer, G. M., R. A. Ford, and R. L. Hall. 1978. Estimation of toxic hazard--A decision tree approach. Food Cosmet. Toxicol. 16:255-276.

Dehn, R. L., and C. T. Helmes. 1974. An Automatic Procedure for Assessing Possible Carcinogenic Activity of Chemicals Prior to Testing. Menlo Park, Calif.: Stanford Research Institute. [128] pp.

Gori, G. B. 1977. Ranking of environmental contaminants for bioassay priority, pp. 99-111. In U. Mohr, D. Schmahl, L. Tomatis, and W. Davis, Eds. Air Pollution and Cancer in Man. IARC Scientific Publications No. 16. Lyon, France: International Agency for Research on Cancer.

Kornreich, M. R., I. C. T. Nisbet, R. Fensterheim, M. Beroza, M. Shah, D. Bradley, J. Turim, A. Pinkney, and D. Smith. 1979. Priority Setting of Toxic Substances for Guiding Monitoring Programs. Report to Office of Technology Assessment. Washington, D.C.: Clement Associates, Inc. [194] pp.

Nisbet, I. C. T. 1979. Ranking chemicals for testing: A priority-setting exercise under the Toxic Substances Control Act, pp. B-41--B-54. In Scoring Chemicals for Health and Ecological Effects Testing. TSCA-ITC Workshop. Rockville, Md.: Enviro Control, Inc.

Ross, R. H., and P. Lu. 1980. Chemical Scoring System Development. Draft report to U.S. Environmental Protection Agency, Office of Toxic Substances. Oak Ridge, Tenn.: U.S. Department of Energy, Oak Ridge National Laboratory. 121 pp.

U.S. Department of Health and Human Services. 1983. National Toxicology Program Fiscal Year 1983 Annual Plan. NTP-82-119. Washington, D.C.: U.S. Department of Health and Human Services, Public Health Service, National Toxicology Program.

U.S. Department of Health and Human Services, Food and Drug Administration, Bureau of Foods. 1982. Toxicological Principles for the Safety Assessment of Direct Food Additives and Color Additives Used in Food. Washington, D.C.: U.S. Department of Health and Human Services, Food and Drug Administration. 240 pp.

U.S. Environmental Protection Agency, Toxic Integration Information System. 1980. Chemical Selection Methods: An Annotated Bibliography. Washington, D.C.: U.S. Environmental Protection Agency.

Wilhelm, S. 1981. Selecting Priorities from Large Sets of Alternatives: The Case of Toxic Substances Regulation. Ph.D. dissertation. Providence, R.I.: Brown University, Department of Chemistry. 259 pp.

APPENDIX B

MATHEMATICAL MODELING OF THE PRIORITY-SETTING PROCESS AND RESULTING DECISION RULES

THE FORMAL MODEL

Mathematical models of the priority-setting process were constructed so that the process could be examined in a systematic way, including an examination of the following questions: Does one system have a lower misclassification cost than another? How is the system affected by changes in one or more of the design characteristics, such as prevalence rates, effectiveness of the tests, or cost of gathering information? Calculation of misclassification cost is particularly important, because it is the main criterion for selecting one priority-setting system over another (see Appendix E).

The entire collection of chemicals to be considered for priority-setting is divided into $N$ categories; categories are defined by ranges of toxicity and exposure. For example, if we were considering three degrees of toxicity of a specific type (low, medium, and high) and three degrees of exposure (low, medium, and high), there would be nine categories in all (low exposure, low toxicity; low exposure, medium toxicity; and so on). In general, there will be several end points, and the ranges may be divided more finely than into low, medium, and high; so there will be more than nine categories in all. The main limitation on defining a large number of categories is the availability of information on exposure and toxicity. A category is denoted $s_i$, and the entire collection of categories is denoted as the set

$$S = \{s_1, s_2, \ldots, s_N\}.$$

Priority-setting systems can be regarded as decision trees on which each node or decision point corresponds to a test. In this context, a test is not limited to a laboratory test, but includes other information-gathering activities and their interpretation. A single test is denoted $t_i$, and the entire collection of tests available to the priority-setting process is denoted as the set

$$T = \{t_1, t_2, \ldots, t_k\},$$

where $k$ is the total number of tests available. Each test is associated with a set of possible results

$$R_k = \{r_{k1}, r_{k2}, \ldots, r_{k,m(k)}\},$$

where $r_{ki}$ is the $i$th result of test $k$, and there are $m(k)$ possible results of test $k$.

A path $\pi_i$ through a priority-setting process is defined by a sequence of tests and test results. In the example shown in Figure B-1, path $\pi_i$ is the sequence of (test, result) pairs $(t_1 r_1)$, $(t_2 r_2)$, $(t_3 r_3)$, and $(t_4 r_4)$. The collection of possible paths is denoted $\Pi = \{\pi_i\}$.

[FIGURE B-1 Decision tree with example path $\pi_i$, consisting of $t_1 r_1$, $t_2 r_2$, $t_3 r_3$, and $t_4 r_4$.]

The performance of a test $t_k$ may be described by the conditional probabilities $P(r_{ki}|s_j)$, where $i = 1, \ldots, m(k)$ and $j = 1, \ldots, N$. Probability $P(r_{ki}|s_j)$ is the probability that test $k$ will have result $r_{ki}$ when the chemical tested belongs to category $s_j$. In other words, each test, $t_k$, is associated with an array of conditional probabilities, as shown in Table B-1.

TABLE B-1 Performance Characteristics of Test $t_k$ Expressed as Conditional Probabilities

$$
\begin{array}{l|cccc}
 & \multicolumn{4}{c}{\text{Category}} \\
\text{Test result} & s_1 & s_2 & \cdots & s_N \\
\hline
r_{k1} & P(r_{k1}|s_1) & P(r_{k1}|s_2) & \cdots & P(r_{k1}|s_N) \\
r_{k2} & P(r_{k2}|s_1) & P(r_{k2}|s_2) & \cdots & P(r_{k2}|s_N) \\
\vdots & \vdots & \vdots & & \vdots \\
r_{k,m(k)} & P(r_{k,m(k)}|s_1) & P(r_{k,m(k)}|s_2) & \cdots & P(r_{k,m(k)}|s_N)
\end{array}
$$
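Table B-1 can be held as a matrix with one row per test result and one column per category. The Python sketch below combines such a matrix with the fraction of chemicals in each category to give the probability of each result and, by Bayes' rule, the distribution over categories after each result. The function names and all numbers are illustrative assumptions; the appendix itself gives no numerical values here.

```python
def result_probabilities(likelihood, prior):
    """Probability of each test result for a randomly drawn chemical.

    likelihood[i][j] = P(result i | category j), as laid out in Table B-1.
    prior[j]         = fraction of chemicals in category j.
    """
    return [sum(l * p for l, p in zip(row, prior)) for row in likelihood]


def posttest_distributions(likelihood, prior):
    """Distribution over categories after each result, by Bayes' rule."""
    marginals = result_probabilities(likelihood, prior)
    posteriors = []
    for row, marginal in zip(likelihood, marginals):
        if marginal == 0:
            posteriors.append(None)  # this result can never occur
        else:
            posteriors.append([l * p / marginal for l, p in zip(row, prior)])
    return posteriors


# Hypothetical two-result test over three categories (low, medium, high).
prior = [0.7, 0.2, 0.1]
likelihood = [
    [0.1, 0.5, 0.9],  # P(positive | category)
    [0.9, 0.5, 0.1],  # P(negative | category)
]
for name, dist in zip(["positive", "negative"],
                      posttest_distributions(likelihood, prior)):
    print(name, [round(x, 3) for x in dist])
```

In this illustration a positive result shifts weight toward the medium and high categories and a negative result shifts it toward the low category, which is the sense in which an effective test changes the distribution.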

The fraction of all the chemicals in the universe that belongs to category $s_j$ is denoted $P(s_j)$. The probability that a chemical randomly drawn from the chemical universe will be of type $s_j$ is $P(s_j)$. The probability that a chemical chosen at random from the initial collection (the universe) will be put into the $i$th result category by the $k$th test is:

$$P(r_{ki}) = \sum_{j=1}^{N} P(r_{ki}|s_j)\,P(s_j).$$

A test $t_k$ transforms pretest distribution $P(s_j)$ into a set of posttest distributions $P(s_j|r_{ki})$, where $i = 1, \ldots, m(k)$. The improvement in our understanding about the chemicals is reflected in changes in the posttest distributions relative to the pretest distribution. If tests are effective, the posttest distributions are changed from the pretest distribution. Repeated tests will produce a proliferation of outcomes whose distributions should become narrower and narrower as testing proceeds. Eventually, if there are many tests and each is effective, the multitude of distributions would converge until they had a zero standard deviation. Every chemical would then be assigned to its true category. However, because of budget constraints, lack of information, and limitations of existing tests, an actual priority-setting system will not achieve perfect classification.

MEASURING PERFORMANCE

Different priority-setting processes will result in different kinds of misclassification of chemicals. One requirement for comparing the performance of priority-setting processes or recommending one over another is specification of the severity of misclassification. None of the designs of the priority-setting systems reviewed by the Committee on Priority Mechanisms included a means of measuring performance and choosing among systems.

To compare priority-setting systems, a concept of misclassification cost has been developed (Appendix E). At the end of the priority-setting and testing process, each chemical is classified into some category.
If a chemical is assigned to category $s_i$ when it belongs in category $s_j$, a misclassification cost (distinct from a monetary cost) is incurred. The magnitude of the misclassification cost depends on how much the assigned category differs from the actual category. For example, if a highly toxic chemical with high exposure is erroneously classified as having low exposure and low toxicity, the associated misclassification cost is greater than it would have been if the chemical had been classified as having low exposure and medium toxicity. $C(s_i|s_j)$ is the cost of classifying a chemical in category $s_i$ when it belongs in category $s_j$.

In principle, the above concept requires a penalty to be specified for each type and severity of misclassification. In practice, costs can be assigned for the extreme cases, and then costs can be assigned for the intermediate cases by interpolation. Also in principle, one type of true classification--$C(s_i|s_i)$--might be treated as more important than another type of true classification--$C(s_j|s_j)$--and different costs assigned to each. However, sufficient information to make use of this refinement appears not to be available, and in the current model all true classifications are treated as incurring zero cost.

BUDGET COST

A second requirement for comparability is definition of the budget cost of a priority-setting process. As stated above, a priority-setting process is characterized by a set of paths $\{\pi_i\}$, $i = 1, \ldots, M_\Pi$, where $M_\Pi$ is the number of paths. If the priority-setting scheme is regarded as one big test, the paths can be viewed as defining the $M_\Pi$ test outcomes. As in the case of a simple test, the priority-setting scheme is characterized by the conditional probabilities $P(\pi_i|s_j)$. If path $\pi_i$ is the sequence $(t_1 r_1), (t_2 r_2), \ldots, (t_n r_n)$ of tests and results, then

$$P(\pi_i|s_j) = P(t_1 r_1|s_j)\,P(t_2 r_2|s_j) \cdots P(t_n r_n|s_j).$$

The dollar testing cost, $C_{\pi_i}$, associated with path $\pi_i$ is simply the sum of the costs of the tests along the path, that is:

$$C_{\pi_i} = C_{t_1} + C_{t_2} + \cdots + C_{t_n},$$

where $C_{t_k}$ is the cost of test $t_k$. Rather than a single testing cost, the priority-setting scheme has a vector of expected costs, $E_j$, where $j = 1, \ldots, N$, because chemicals of different types follow the various paths with different probabilities:

$$E_j = \sum_{i=1}^{M_\Pi} P(\pi_i|s_j)\,C_{\pi_i} = \sum_{i=1}^{M_\Pi} P(\pi_i|s_j)\,(C_{t_1} + C_{t_2} + \cdots + C_{t_n}).$$

The total expected cost ($TCT_\Pi$) of process $\Pi$ per chemical is

$$TCT_\Pi = \sum_{j=1}^{N} P(s_j)\,E_j.$$
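The budget-cost bookkeeping above can be illustrated with a small Python sketch. The two-test process, the dollar costs, and the path probabilities are invented for illustration; only the structure (path cost as a sum of test costs, expected cost per category, and the prior-weighted total) follows the definitions in this section.

```python
def path_cost(tests_on_path, test_cost):
    """Dollar cost of one path: the sum of the costs of its tests."""
    return sum(test_cost[t] for t in tests_on_path)


def expected_cost_by_category(paths, path_probs, test_cost):
    """Expected testing cost for a chemical of each category.

    paths[i]         = list of test names along path i (hypothetical names).
    path_probs[i][j] = P(path i | category j); each column sums to 1.
    """
    costs = [path_cost(p, test_cost) for p in paths]
    n_categories = len(path_probs[0])
    return [sum(path_probs[i][j] * costs[i] for i in range(len(paths)))
            for j in range(n_categories)]


def total_expected_cost(expected, prior):
    """Total expected testing cost per chemical, weighted by category fractions."""
    return sum(p * e for p, e in zip(prior, expected))


# Hypothetical process: every chemical gets t1; some go on to t2.
test_cost = {"t1": 100.0, "t2": 1000.0}
paths = [["t1"], ["t1", "t2"]]
path_probs = [[0.9, 0.2],   # P(stop after t1 | low concern), (| high concern)
              [0.1, 0.8]]   # P(continue to t2 | low concern), (| high concern)
prior = [0.75, 0.25]

expected = expected_cost_by_category(paths, path_probs, test_cost)
print([round(e, 2) for e in expected])                 # [200.0, 900.0]
print(round(total_expected_cost(expected, prior), 2))  # 375.0
```

The per-category vector shows why a single cost figure is not enough: chemicals of high concern are far more likely to reach the expensive second test.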

MINIMIZING MISCLASSIFICATION COST

For an initial probability distribution $P(s_j)$, the probabilities associated with outcome categories of a process are given by $P(\pi_i|s_j)$, where $i = 1, \ldots, M_\Pi$ and $j = 1, \ldots, N$. The best final classification of chemicals in the outcome category is that which minimizes expected misclassification cost. The minimal misclassification cost for path $\pi_i$ is

$$CM_{\pi_i} = \sum_{j=1}^{N} P(\pi_i|s_j)\,P(s_j)\,C(s_h|s_j),$$

and $h$ is chosen to minimize the cost. The total expected misclassification cost for a priority-setting process is

$$TCM_\Pi = \sum_{i,j} CM_{\pi_i}\,P(\pi_i|s_j)\,P(s_j)$$

per chemical.

The foregoing shows how a priority-setting process acting on an initial collection of chemicals described by a probability distribution $P(s_j)$ incurs a misclassification cost $TCM_\Pi$ and a testing cost $TCT_\Pi$. The optimal priority-setting process minimizes $TCM_\Pi$ while satisfying the budget constraint $TCT_\Pi < B$, where $B$ is the budget per chemical.

THE PILOT MODEL

In designing a priority-setting process, many interacting factors must be considered. To sharpen the committee's judgment as to the interaction of these factors, a small-scale or pilot version of the formal model was quantified to permit sensitivity analysis. The main value of the pilot model is in providing a systematic framework for the many judgments on these factors and in making it possible to sort out important implications of the interactions.

The model is a bookkeeping device; it does not tell us what numbers should be entered on the books--that is a matter of judgment for toxicologists, epidemiologists, chemists, chemical engineers, and other scientists with empirical knowledge about exposure, toxicity, availability of data, and properties of tests. Nor does the model select the best priority-setting scheme--that selection is based on judgment of the predictive value of data elements and tests (and other judgments).
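The kind of bookkeeping the model performs can be illustrated with a toy Python calculation of misclassification cost. For each path, the sketch picks the final category that minimizes the cost weighted by the joint probability of the path and the true category, as in the per-path formula of this appendix. One interpretive assumption is made and should be read as such: the total is taken as the sum of those joint-weighted minima over paths, which is the usual expected-loss calculation. All numbers are invented.

```python
def best_classification(joint_row, cost):
    """Choose the final category for one path.

    joint_row[j] = P(path | category j) * P(category j).
    cost[h][j]   = cost of classifying as h when the true category is j.
    Returns (h, weighted_cost) for the h with the smallest weighted cost.
    """
    weighted = [sum(w * c for w, c in zip(joint_row, cost_row))
                for cost_row in cost]
    h = min(range(len(weighted)), key=weighted.__getitem__)
    return h, weighted[h]


def misclassification_summary(path_probs, prior, cost):
    """Return the chosen category per path and the summed minimal cost.

    Assumption (an interpretation, not a quotation of the source): the total
    is the sum over paths of the joint-weighted minimal costs.
    """
    assignments, total = [], 0.0
    for row in path_probs:
        joint = [p_path * p for p_path, p in zip(row, prior)]
        h, c = best_classification(joint, cost)
        assignments.append(h)
        total += c
    return assignments, total


# Two categories: 0 = low concern, 1 = high concern (hypothetical values).
prior = [0.75, 0.25]
path_probs = [[0.9, 0.2],   # P(path 0 | category)
              [0.1, 0.8]]   # P(path 1 | category)
cost = [[0.0, 10.0],        # classify low:  correct, or miss a high-concern chemical
        [1.0, 0.0]]         # classify high: false alarm, or correct
assignments, total = misclassification_summary(path_probs, prior, cost)
print(assignments, round(total, 3))  # [0, 1] 0.575
```

A process would then be compared with alternatives by pairing this figure with its expected testing cost and checking the latter against the budget per chemical.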

In setting priorities for testing chemicals, many judgments must be made. The pilot model reduces the number of judgments required to approximately 25 estimated probabilities related to three data elements for Stage 1, six data elements in Stage 2, and two data elements in Stage 3 (one dossier on exposure and one on toxicity). One short-term test (the Ames Salmonella/microsome test) and a bioassay could be recommended at the end of the priority-setting process. In the model's present form, the only human health effect being considered is cancer. On a mainframe computer, the pilot model takes about 1 min to run.

Questions that can be investigated with the aid of the quantified model include the following:

· What allocation of expenditures between gathering exposure data and gathering toxicity data minimizes costs of each stage of the process? For example, what proportions of the resources should be devoted to writing exposure and toxicity sections of a dossier?
· What effort is sufficient for writing a dossier? More generally, what amount of data is sufficient to gather in each stage of the priority-setting process?
· How many chemicals should be winnowed out of the process in each stage?
· What is the best allocation of resources between short-term tests and long-term tests? What allocation among the many possible candidate short-term tests minimizes testing costs?

Empirical investigation of these questions requires empirical judgments on the factors that are included in the pilot model. Some of these factors have already been estimated, and some are being assessed. The pilot model is available in the NAS open file.

COMPARISON WITH WEINSTEIN'S MODEL

The formal model described above and quantified in the pilot model is similar to the one developed by Weinstein (1979), in that both embody the value-of-information concept and the minimization of expected costs.
Four principal differences between Weinstein's approach and the approach used by the Committee on Priority Mechanisms should be pointed out.

· The committee's approach focused on the design of a priority-setting process, whereas Weinstein stressed the application of a given process to a list of chemicals. In Weinstein's model, priorities are based on minimizing cost for a given priority-setting process.

· Weinstein's model does not consider the cost of information acquisition incurred during the gathering of data elements and the performance of toxicity tests. For the committee's task, these costs were important design considerations; thus, they were included in the model. This difference has important implications for the design of the priority-setting process.
· The costs of misclassification were defined by Weinstein as the costs of expected final actions. This was in accord with the decision-theory approach developed by Raiffa (1968); however, it requires two conditions that do not apply to the committee's task of designing a priority-setting system. First, it must be possible to estimate, at low cost and with some accuracy, the costs of likely regulatory, market, court, and individual actions expected to result from each of the possible outcomes of the priority-setting process. Second, one must assume that regulatory and other action will be based on the minimization of expected cost.
· Weinstein's model was not developed to the point of being operational. The committee attempted to take the next step by doing two things. First, it expressed quantitatively the empirical information required to design the system. Second, it wrote a program for the pilot model to enable sensitivity analysis and further exploration of the practical limitations of the modeling approach.

RULES FOR CHOOSING CHEMICALS GENERATED BY THE MODEL

This example is designed to illustrate the rules produced by the optimization model for selecting chemicals. The example is illustrative, in that it depends on the accuracy of the estimates of the characteristics of the population of chemicals and of the accuracy of the tests, and the results of the model change as these estimates change. This model is for cancer testing, but it appears that the model can be used to design a priority-setting system for other health effects or groups of effects.
A computer program of the model is available from the National Academy of Sciences. Data elements used in the model have been developed by performing the following steps: choose information; define a small number of mutually exclusive outcomes that include all chemicals considered; arrange the outcomes in order of estimated degree of exposure or toxicity; estimate the probability of given degrees of toxicity of, or exposure to, chemicals with each outcome.

Three data elements are used in Stage 1--one for exposure and two for toxicity. The exposure data element is based on intended use and production. The 15 possible outcomes (the product of 5 use categories and 3 production ranges) have been grouped into 13 mutually exclusive outcomes (see Table 3).

The first toxicity data element is a crude application of structure-activity relationships based on membership in a chemical group associated with a human health effect (see Table 5). A nonstructurable chemical is considered to be slightly less likely than one randomly drawn from the universe to be a carcinogen. A structurable chemical not a member of a chemical group associated with a health effect is considered to be least likely to be a carcinogen. A chemical that is a member of a chemical group associated with some health effect other than cancer is considered slightly less likely than a randomly drawn chemical to be a carcinogen. Chemical groups associated with cancer are divided into groups of low, medium, and high suspicion.

The second toxicity data element used in Stage 1 is listing in RTECS. The outcomes and the estimated probabilities associated with them are shown in Table 7. A chemical not listed in RTECS at all is considered to be slightly less likely than a chemical randomly drawn from the universe to be a carcinogen. If a chemical is listed in RTECS, but without mention of CAR or MUT, it is considered a little more suspect, but still not as suspect as a randomly drawn chemical. If MUT is mentioned, but not CAR, suspicion increases substantially; if CAR is also mentioned, suspicion increases still more.

The data elements for Stages 2 and 3 are less precisely defined, and there is less information on which to base an estimate of accuracy than in the case of data elements for Stages 1 and 4. The reasoning used to estimate the accuracy for data elements in Stages 2 and 3 is as follows. The cost of the data elements is assumed to be about $30-300 per chemical for Stage 2 and $1,000-4,000 for Stage 3 (otherwise, the budget constraint chosen for the model cannot be met if thousands of chemicals are considered in Stage 2 and hundreds in Stage 3). We assume that accuracy of Stages 2 and 3 will be intermediate between that of Stages 1 and 4.
In terms of increasing accuracy, we can probably arrange the stages in the order 1, 2, 3, and 4. The procedure for estimating the accuracy of Stages 2 and 3 differs from that for Stages 1 and 4. For the latter two, expert judgment and quantitative information were used. For Stages 2 and 3, a sensitivity analysis determined the accuracy compatible with data searches costing $30-300 per chemical in Stage 2 and with writing dossiers costing $1,000-4,000 in Stage 3. (If the accuracy of Stages 2 and 3 is too high, the model of the priority-setting system will not specify short-term or long-term testing. If the accuracy is too low, the model will not specify using Stages 2 and 3, but will specify going directly from Stage 1 to testing in Stage 4.) The model is used to estimate how accurate Stages 2 and 3 must be for them to be used moderately. Once the priority-setting system is operating, data may be obtained to validate the estimated accuracy of all four stages.

The data elements in Stage 4 are a short-term test and a long-term test. The short-term test is a battery consisting of an Ames Salmonella/microsome test and a cell-transformation test, which is assumed to be slightly more accurate than the Salmonella/microsome test alone. The test accuracy is based on data reported by McCann and Ames, Purchase et al., and Lave et al. The long-term test is a rodent bioassay. Its accuracy has been analyzed by Tarone, Fears, and Chu, among others. The long-term test is expected to identify 95% of the noncarcinogens correctly as negative and 93% of carcinogens correctly as positive.

There are 6 possible outcomes of the exposure data element, 6 possible outcomes of the chemical-group toxicity data element, and 4 possible outcomes of the RTECS toxicity data element. Thus, there are (6 x 6 x 4 =) 144 possible outcomes of the combined data elements in Stage 1. Similar outcomes were combined to simplify calculations. Outcomes are "similar" if they lead to similar estimates of the 9 possible degrees of public-health concern (each formed from one of the 3 degrees of toxicity and one of the 3 degrees of exposure). After examination of the estimated means and standard deviations of the estimated probability distributions, the 144 possible outcomes of Stage 1 were grouped into 9 outcome categories; each of these 9 "branches" leads from Stage 1 to the later stages.

Branch 1 has the lowest standard deviation and the lowest mean; Branch 9 has the highest standard deviation and the highest mean. Most of the chemicals use Branch 1, because it has a low mean for public-health concern and most chemicals are considered to have low exposure and low toxicity. Chemicals in Branch 1 are removed from further consideration. Dossiers are not written in Stage 2; however, a limited search for exposure data is provided for in Stage 2. Fewer chemicals use Branches 2-9. These branches have higher means and higher standard deviations for public-health concern. Note that Branch 2 calls for the larger search for exposure data and the smaller search for toxicity data in Stage 2. Branch 2 provides for dossiers, except when the results of the search for toxicity data suggest low toxicity and the results of the search for exposure data suggest low or medium exposure.
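To make the stated accuracy of the long-term test concrete, the following Python sketch applies Bayes' rule to the two figures given above (93% of carcinogens positive, 95% of noncarcinogens negative) and also enumerates the 6 x 6 x 4 Stage 1 outcome combinations. The 5% prior is used only for illustration: it corresponds to the report's assumption that 95% of the chemicals considered are noncarcinogens, whereas chemicals that actually reach Stage 4 would be expected to have a higher prior. The function name and the integer outcome labels are assumptions of this sketch.

```python
from itertools import product

SENSITIVITY = 0.93  # carcinogens correctly identified as positive
SPECIFICITY = 0.95  # noncarcinogens correctly identified as negative


def posterior_carcinogen(prior, positive):
    """P(carcinogen | test result) by Bayes' rule for a single test."""
    if positive:
        like_c, like_n = SENSITIVITY, 1.0 - SPECIFICITY
    else:
        like_c, like_n = 1.0 - SENSITIVITY, SPECIFICITY
    joint_c = like_c * prior          # carcinogen and this result
    joint_n = like_n * (1.0 - prior)  # noncarcinogen and this result
    return joint_c / (joint_c + joint_n)


# Illustrative prior: a chemical drawn at random from the universe considered.
p_after_positive = posterior_carcinogen(0.05, positive=True)
p_after_negative = posterior_carcinogen(0.05, positive=False)
print(round(p_after_positive, 3), round(p_after_negative, 4))

# Combined Stage 1 outcomes: exposure (6) x chemical group (6) x RTECS (4).
stage1_outcomes = list(product(range(6), range(6), range(4)))
print(len(stage1_outcomes))
```

Under this illustrative prior, a positive bioassay raises the probability of carcinogenicity to just under one-half, which shows why the earlier stages are needed to concentrate suspect chemicals before expensive tests are run.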
Branches 3-9 all provide for dossiers (except Branch 3 when results of Stage 2 data searches suggest both low exposure and low toxicity), and Branches 7, 8, and 9 often lead to recommendations for a long-term test, depending on the results of the Stage 2 data searches and the assessments based on the dossiers.

On the basis of the assumption that 95% of the chemicals considered by the priority-setting system are noncarcinogens, 4% are moderate carcinogens, and 1% are highly potent carcinogens and on the basis of the estimated accuracy of the three selection stages (Stages 1-3) and the tests in Stage 4, the selection rules are expected to lead to recommendations for 62 Stage 4 long-term tests, 690 Stage 4 short-term tests, and 417 Stage 3 dossiers for each 10,000 chemicals.

To demonstrate how the selection rules are used, we assume that a chemical is a pesticide with unknown production, that it is a member of a chemical group associated with a low suspicion of cancer, and that it is listed in RTECS with no mention of CAR or MUT. We go to the breakdown for the Stage 1 exposure data element "Intended Use and Production"; the fourth entry includes pesticides with unknown production. That entry leads us to Stage 1 toxicity data element Part 4, where we seek a member of a chemical group with a low suspicion of cancer (the fourth entry). There, we find "in RTECS, no mention of MUT or CAR"; that leads to Branch 2 and completes Stage 1. Via Branch 2, we are led to Stage 2 information-gathering activities consisting of a larger search for exposure information and a limited search for toxicity information. Assume that analysis of the information gathered in Stage 2 leads to the assessment that the chemical has medium exposure and is highly toxic. This leads us to prepare a dossier in Stage 3. If an expert committee reviews the dossier and concurs in the assessment of high toxicity, then the decision rules recommend a long-term test in Stage 4.
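The walk through the selection rules just described amounts to a chain of table lookups. The Python sketch below encodes only the entries named in that worked example, as the prose states them; it is a partial, hypothetical encoding, not the full rule set, and the dictionary names and key spellings are assumptions of this sketch.

```python
# Partial lookup tables: only the entries used in the worked example.
# Keys and names are illustrative assumptions, not the report's own encoding.

# Stage 1 exposure data element: (intended use, production) -> toxicity Part.
EXPOSURE_TO_PART = {("pesticide", "unknown"): 4}

# Stage 1 toxicity data elements: (Part, chemical group, RTECS listing) -> Branch.
STAGE1_BRANCH = {
    (4, "low suspicion", "in RTECS, no mention of MUT or CAR"): 2,
}

# Stage 3 data gathering: (Branch, Stage 2 exposure, Stage 2 toxicity) -> action.
STAGE3_ACTION = {(2, "medium", "high"): "dossier"}

# Stage 4 recommendation as described in the prose of the worked example:
# (Branch, Stage 2 exposure, Stage 2 toxicity, Stage 3 assessment) -> test.
STAGE4_TEST = {(2, "medium", "high", "high toxicity"): "LT"}


def apply_rules(use, production, group, rtecs, exposure, toxicity, assessment):
    """Follow one chemical through Stages 1-4 using the partial tables."""
    part = EXPOSURE_TO_PART[(use, production)]
    branch = STAGE1_BRANCH[(part, group, rtecs)]
    action = STAGE3_ACTION[(branch, exposure, toxicity)]
    test = STAGE4_TEST[(branch, exposure, toxicity, assessment)]
    return part, branch, action, test


RESULT = apply_rules(
    "pesticide",
    "unknown",
    "low suspicion",
    "in RTECS, no mention of MUT or CAR",
    "medium",
    "high",
    "high toxicity",
)
print(RESULT)
```

A full implementation would populate these tables from the complete selection rules; any combination not present here raises a `KeyError`, which keeps the sketch from silently inventing rules the text does not state.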

SELECTION RULES FOR AN ILLUSTRATIVE PRIORITY-SETTING SYSTEM FOR CANCER

STAGE 1: EXPOSURE DATA ELEMENT

Intended Use and Production (based on Table 3) | Go to Stage 1 Toxicity Data Elements Part
General commerce (TSCA), other, or unclassified with production less than 10^4 lb/yr | 1
Cosmetic with production less than 10^4 lb/yr; or general commerce, other, or unclassified with production between 10^4 and 10^6 lb/yr | 2
Drug, food chemical, pesticide or unknown with production less than 10^4 lb/yr; or cosmetic, general commerce, other, or unclassified with unknown production | 3
Food chemical, drug, or pesticide with unknown production | 4
General commerce, other, or unclassified with production between 10^6 and 10^8 lb/yr; or cosmetic with production between 10^4 and 10^5 lb/yr | 5
Drug, food chemical, or pesticide with production equal to or greater than 10^4 lb/yr; cosmetic with production greater than 10^5 lb/yr; or general commerce or other with production greater than 10^8 lb/yr | 6

STAGE 1: TOXTCTTY DATA ELEMENTS Part 1 Go to Stages 2-4 via Branch Not a member of a chemical group (Table 5) not in RTECS e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e 1 1 in RTECS, no mention of MUT in RTECS, mention of MUT, but not CARe e ~ ~ ~ ~ ~ e in RTECS. mention Member of a chemical group, but not one associated with cancer not in RTECS. e ~ ~ ~ e ~ e ~ e ~ ~ e ~ e e e e e e e e e e e e e e e e e e e 1 in RTECS, no mention of MUT or CAR e e e e e ~ e e e e e in RTECS, mention of MUT, but not CAR e e e e e e e e in RTECS, mention of Not structurable r~ - ; ~ D,T1~C: 1 2 2 in RTECS, no mention of MUT or CAR e e e e e e e e e e e in RTECS, mention of MUT, but not CAR e e e e e e e e in RTECS, mention of ~ Member of a chemical group associated with cancer with "low suspicion" not in RTECS................................ in RTECS, no mention of MUT or CAR.......... in RTECS, mention of MUT, but not CAR in RTECS, mention of __ 2 2 l Member of a chemical group associated with cancer with "moderate suspicion" not in RTECS............................... in RTECS, no mention of MUT or CARe.~e in RTECS, mention of MUT, but not CAR...... in RTECS, mention of CAR ~4 Member of a chemical group associated with cancer with "high suspicion~ not in RTECSe.~e in RTECS, no mention of MUT or CAR.......... in RTECS ~ mention of MUT, but not CAR....... in RTECS, mention of _ _ 2 4 2 4 330

Part 2 Go to Stages 2-4 via Branch Not a member of a chemical group (Table 5) not in RTECS . eeeeeeeeeeeeeeeeeeeeeeeeaeeeeee 1 in RTECS, no mention of MUT or CAReea.~eee in RTECS, mention of MUT, but not CARe.ee.~e in RTECS. mention Member of a chemical group, but not one associated with cancer not in RTECS eea.~.e.eeaeeee.~.eeeeee 1 in RTECS, no mention of MUT or Mare in RTECS, mention of MUT, but not CARe.~.eee in RTECS, mention of CAR 2 2 Not structurable not in RTECS. eaeeeeeaeeeeeeeeeeeeeeeeeeeeeeee 1 in RTECS J no mention of MUT or CAReeeeeeeeeee in RTECS, mention of MUT, but not CAReaeeeeee in RTECS/ mention of CAReeeeeee..eeeee..eeeae Member of a chemical group associated with cancer with "low suspicion" not in RTECS eee~eeeeeeeeea.~eeeeeea.~e in RTECS, no mention of MUT or CAReaea~eeee. in RTECS, mention of MUT, but not CAR in RTECS, mention of And 2 2 3 3 l Member of a chemical group associated with cancer with "moderate suspicion" not in RTECS in RTECS, no mention of MUT or CAR e in RTECS, mention of MUT, but not CAR 6 in RTECS, mention of CAR 4 Member of a chemical group associated with cancer with thigh suspicion" not in RTECSeeeeee.~. in RTECS, no mention of MUT or CAR in RTECS, mention of MUT, but not CARIES in RTECS, mention of CAR ~4 2 6 331

Part 3 Go to Stages 2-4 via Branch Not a member of a chemical group (Table 5) not in RTECS . e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e 1 in RTECS, no mention of MUT or CAR e e e e e ~ ~ e e e in RTECS, mention of MUT, but not CAR. e e e e e e in RTECS. mention Member of a chemical group, but not one associated with cancer not in RTECS . e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e in RTECS, no mention of MUT or CAReee.eeeaee in RTECS, mention of MUT, but not CAReee in RTECS, mention of CAR. e e ~ ~ e e ~ e e ~ e e e e e e e e e Not structurable not in RTECSe e e e e e ~ e e ~ e ~ ~ e e e ~ e ~ ~ ~ ~ e e e e ~ e e ~ e e in RTECS, no mention of MUT or CAR. e ~ e e e e e e e in RTECS, mention of MUT, but not CAR e ~ ~ ~ ~ ~ e in RTECS, mention of CAR e e ~ e ~ e ~ ~ ~ ~ 2 2 2 1 1 4 Member of a chemical group associated with cancer with "low suspicion" not in RTECS. e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e 1 in RTECS, no mention of MUT or CAR e e e e ~ ~ e e ~ e in RTECS, mention of MUT, but not CAR e e ~ ~ e e 6 in RTECS, mention of CAR e e e e e e e e e e e e e e e e e ~ e e 4 Member of a chemical group associated with cancer with "moderate suspicion" not in RTECS................................ in RTECS, no mention of MUT or CAR.......... in RTECS, mention of MUT, but not CAR....... in RTECS, mention of CAR e ~ ~ ~ Member of a chemical group associated with cancer with "high suspicion" not in RTECSe e e ~ e e e e e e e e e e e e e e e e e e e e e e e e e e e e in RTECS, no mention of MUT or CAR e e e e e e e e e e in RTECS' mention of MUT, but not CAR.eeeeee in RTECS, mention of 2 6 4 2 2 6 332

Part 4 Go to Stages 2-4 via Branch Not a member of a chemica' group (Table 5) not in RTECS . e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e in RTECS, no mention of MUT or CAR e e e e e e e e e e in RTECS, mention of MUT, but not CAR e e e e e e e in RTECS J mention of CAR e e e e e e e e e e e e e e e e e e e e Member of a chemical group, but not one associated with cancer not in RTECSe e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e 1 in RTECS, no mention of MUT or CAR e e e e e e e e e e in RTECS, mention of MUT, but not CAR e e e e e e e in RTECS. mentic~n c~f 3 2 Not structurable not in RTECS e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e 1 in RTECS, no mention of MUT or CAEte e e e e e e e e e in RTECS, mention of MUT, but not CAR e e e e e ~ e in RTECS, mention of CAR e e e e e e e e e e e e e e e e e e e e Member of a chemical group associated with cancer with ~low suspicion" nnt~ i n 12~1~F~f~.C 4 4 in RTECS, no mention of MUT or CAR.......... in RTECS, mention of MUT, but not CAR . in RTECS I mention of CAR- . . e e ~ e 2 6 6 Meraber of a chemical group associated with cancer with "moderate suspicion" not in RTECS 1 in RTECS, no mention of MUT or CAB.......... in RTECS, mention of MUT, but not CAR....... in RTECS, mention of CAR.................... Mer~ber of a chemical group associated with cancer with "high suspicionn not in RTECS. e e e e e e e ~ e e ~ e e e e e e e e e e e e e e e e e e e e in RTECS, no mention of MUT or CAR e e e e e e e e e e in RTECS, mention of MUT, but not CAR e e e e e e e in RTECS ~ mention of CAR e e e e e e e e e e e e e e e e e e e e 333 2 6 6 2 7 6

Part 5 Go to Stages 2-4 via Branch Not a member of a chemical group (Table 5) not in RTECS ... e ~ ~ e ~ ~ e e e e e ~ e e e e e e e e e e e e e e e e 1 in RTECS, no mention of MUT or CARee. in RTECS, mention of MUT, but not CAR e a ~ e e e e in RTECS, mention of 4 Member of a chemical group, but not one associated with cancer not in RTECS. e ~ e e e e ~ e e e e e e e e e e e e e e e e ~ e e e e e e e in RTECS, no mention of MUT or CAR e e e e e e e ~ e in RTECS, mention of MUT, but not CARe e e e in RTECS, mention of 1 4 Not structurable not in RTECS e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e 1 in RTECS, no mention of MUT or CAR e e e ~ e e e e e e in RTECS, mention of MUT, but not CAR. e e ~ e e e in RTECS, mention of Member of a chemical group associated with cancer with "low suspicion" not in RTECS. e e e e e e e e e e e e e e e ~ e e e ~ e ~ e e ~ e ~ e e e ~ in RTECS, no mention of MUT or CAReeeeea.~e in RTECS, mention of MUT, but not CAR in RTECS, mention of CAR e e e e e ~ ~ e e e e ~ e e e e e e e e Member of a chemical group associated with cancer with "moderate suspicion" not in RTECS. e e e ~ e ~ ~ e e ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ e ~ ~ ~ ~ ~ e in RTECS, no mention of MUT or CARea~eeeea.e in RTECS, mention of MUT, but not CAR....... in RTECS, mention of CAR.. e e e a . ~ ~ . e e e a Member of a chemical group associated with cancer with "high suspicion" not in RTECSe ~ ~ e ~ ~ e e e e ~ ~ ~ e e in RTECS, no mention of MUT or CAR in RTECS ~ mention of MUT, but not CARe in RTECS, mention of CAR e e ~ ~ ~ ~ ~ ~ ~ ~ e e e e ~ e ~ e ~ e 334 1 2 8 6 2 8 7 2 8 8

Part 6 Go to Stages 2-4 via Branch Not a member of a chemical group (Table 5) not in RTECS . e e ~ e e e e e e e e e e e e e e e e e e e e e e e e e e e e 1 in MECS/ no mention of MUT or CAR e e e e e e e e e e e 1 in RTECS, mention of MUT, but not CARe e e e e e e e 4 in RTECS, mention of CAR e e e e e e e e e e e e e e e e e e e e e Member of a chemical group, but not one associated with cancer not in RTECS. e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e in RTECS, no mention of MUT or CAR e e e e e e e e e e e in RTECS, mention of MUT, but not CARe e e e e e e e in N7ECS. mention of 4 4 Not structurable not in RTECS e e e e e e e e e e e e e e e e e e e e e ~ e e e e e e e e e e e in RTECS, no mention of MUT or CAR e e e e e e e e e e e in RTECS, mention of MUT, but not CARe ~ ~ e e ~ e in RTECS, mention of 6 Member of a chemical group associated with cancer with ~low suspicion" not in RTECSe · e e e e e e e e e e e e e e e e e e e e e e e e e e e e in RTECS J no mention of MUT or CAR e e e e e e e e e e e in RTECS, mention of MUT, but not CAR in RTECS, mention of CAR e e e e e e e e e e e e e e e e e e e e e Member of a chemical group associated with cancer with ~moderate suspicion" in RTECS, no mention of MUT or CAR.......... in RTECS, mention of MUT, but not CAR....... in RTECS, mention of CAR ~ Member of a chemical group associated with cancer with "high suspicion~ not in RTECS.~e e e · · e e e · · e e e e · · e e e e e e · e e e e in RTECS' no mention of MUT or CAR e e e e · e e e e e in RTECS, mention of MUT, but not CAR e e e e e e e in RTECS, mention of 335 2 9 8 2 3 9 9 2 4 9 9

STAGES 2-4

Branch 1

Stage 2: Perform a limited search for exposure data. Do not consider chemical further.

Branch 2

Stage 2: Perform search for exposure data and limited search for toxicity data.

Stage 2 Assessment: Exposure | Stage 2 Assessment: Toxicity | Stage 3: Data Gathering | Assessments Based on Data Gathered in Stage 3 | Stage 4: Recommended Testing (a)
Low | Low | No dossier | None | ST
Low | Medium | Dossier | None | ST
Low | High | Dossier | High toxicity | LT
 |  |  | Otherwise | ST
Medium | Low | No dossier | None | ST
Medium | Medium or high | Dossier | High toxicity and high exposure | LT
 |  |  | Otherwise | ST
High | Low, medium, or high | Dossier | High exposure | LT

(a) ST = short-term test; LT = long-term test.

Branch 3

Stage 2: Perform search for exposure data and limited search for toxicity data.

Stage 2 Assessment: Exposure | Stage 2 Assessment: Toxicity | Stage 3: Data Gathering | Assessments Based on Data Gathered in Stage 3 | Stage 4: Recommended Testing (a)
Low | Low | No dossier | High toxicity and high exposure | LT
 |  |  | Otherwise | ST
Low | Medium | Dossier | High toxicity | LT
 |  |  | Otherwise | ST
Low | High | Dossier | High toxicity; or high exposure and medium toxicity | LT
 |  |  | Otherwise | ST
Medium | Low | Dossier | High toxicity; or high exposure and medium toxicity | LT
 |  |  | Otherwise | ST
Medium | Medium | Dossier | High exposure or high toxicity | LT
 |  |  | Otherwise | ST
Medium | High | Dossier | High exposure or high toxicity; or medium exposure and high toxicity | LT
 |  |  | Otherwise | ST
High | Low | Dossier | High exposure; or medium exposure and high toxicity | LT
 |  |  | Otherwise | ST
High | Medium | Dossier | High exposure or high toxicity; or medium exposure and medium toxicity | LT
 |  |  | Otherwise | ST
High | High | Dossier | Low exposure and low toxicity | ST
 |  |  | Anything but low exposure and low toxicity | LT

(a) ST = short-term test; LT = long-term test.

Branch 4

Stage 2: Perform search for exposure data and limited search for toxicity data.

Stage 2 Assessment: Exposure | Stage 2 Assessment: Toxicity | Stage 3: Data Gathering | Assessments Based on Data Gathered in Stage 3 | Stage 4: Recommended Testing (a)
Low | Low | Dossier | High toxicity and low or medium exposure | LT
 |  |  | Medium exposure and low or medium toxicity | ST
 |  |  | Low exposure and low or medium toxicity | None
Low | Medium | Dossier | High toxicity | LT
 |  |  | Medium toxicity; or low toxicity and medium exposure | ST
 |  |  | Low toxicity and low exposure | None
Low | High | Dossier | High toxicity | LT
 |  |  | Otherwise | ST
Medium | Low | Dossier | High toxicity | LT
 |  |  | Medium toxicity | ST
 |  |  | Low toxicity and low exposure | None
Medium | Medium or high | Dossier | High toxicity or high exposure | LT
 |  |  | Otherwise | ST
High | Low, medium, or high | Dossier | High toxicity or high exposure | LT
 |  |  | Otherwise | ST

(a) ST = short-term test; LT = long-term test.

Branch 5

Stage 2: Perform search for exposure data and limited search for toxicity data.

Stage 2 Assessment: Exposure | Stage 2 Assessment: Toxicity | Stage 3: Data Gathering | Assessments Based on Data Gathered in Stage 3 | Stage 4: Recommended Testing (a)
Low | Low | Dossier | High toxicity and low or medium exposure | LT
 |  |  | Medium toxicity and low or medium exposure | ST
 |  |  | Low exposure and low toxicity | None
Low | Medium | Dossier | High toxicity and low or medium exposure | LT
 |  |  | Otherwise | ST
Low | High | Dossier | High toxicity; or medium toxicity and medium or high exposure | LT
 |  |  | Otherwise | ST
Medium | Low | Dossier | High toxicity and low or high exposure | LT
 |  |  | Otherwise | ST
Medium | Medium | Dossier | High toxicity and low or high exposure | LT
 |  |  | Otherwise | ST
Medium | High | Dossier | High exposure or high toxicity; or medium exposure and medium toxicity | LT
 |  |  | Otherwise | ST
High | Low | Dossier | High or medium exposure or high toxicity | LT
 |  |  | Otherwise | ST
High | Medium | Dossier | High exposure or high toxicity; or medium toxicity and low or medium exposure | LT
 |  |  | Otherwise | ST
High | High | Dossier | High or medium toxicity | LT
 |  |  | Otherwise | ST

(a) ST = short-term test; LT = long-term test.

Branch 6

Stage 2: Perform search for exposure data and limited search for toxicity data.

Stage 2 Assessment: Exposure | Stage 2 Assessment: Toxicity | Stage 3: Data Gathering | Assessments Based on Data Gathered in Stage 3 | Stage 4: Recommended Testing (a)
Low | Low, medium, or high | Dossier | High toxicity | LT
 |  |  | Otherwise | ST
Medium | Low, medium, or high | Dossier | High exposure; or high toxicity and low exposure | LT
 |  |  | Otherwise | ST
High | Low | Dossier | High toxicity and medium or high exposure; or low toxicity and high exposure | LT
 |  |  | Otherwise | ST
High | Medium | Dossier | High toxicity; or high exposure and low toxicity | LT
 |  |  | Otherwise | ST
High | High | Dossier | High toxicity and medium or high exposure; or medium toxicity and medium exposure | LT
 |  |  | Otherwise | ST

(a) ST = short-term test; LT = long-term test.

Branch 7

Stage 2: Perform search for exposure data and limited search for toxicity data.

Stage 2 Assessment: Exposure | Stage 2 Assessment: Toxicity | Stage 3: Data Gathering | Assessments Based on Data Gathered in Stage 3 | Stage 4: Recommended Testing (a)
Low | Low | Dossier | High toxicity | LT
 |  |  | Otherwise | ST
Low | Medium | Dossier | High toxicity or high exposure | LT
 |  |  | Otherwise | ST
Low | High | Dossier | High toxicity or high exposure | LT
 |  |  | Otherwise | ST
Medium | Low | Dossier | High exposure; or high toxicity and low exposure | LT
 |  |  | Otherwise | ST
Medium | Medium | Dossier | High toxicity or high exposure | LT
 |  |  | Otherwise | ST
Medium | High | Dossier | Low or high exposure and high toxicity; or medium exposure and medium or high toxicity | LT
 |  |  | Medium exposure and low toxicity | ST
High | High, medium, or low | Dossier | Low or high exposure and high toxicity; or medium exposure and medium or high toxicity | LT
 |  |  | Medium exposure and low toxicity | ST

(a) ST = short-term test; LT = long-term test.

Branch 8

Stage 2: Perform search for exposure data and limited search for toxicity data.

Stage 2 Assessment: Exposure | Stage 2 Assessment: Toxicity | Stage 3: Data Gathering | Assessments Based on Data Gathered in Stage 3 | Stage 4: Recommended Testing (a)
Low | Low | Dossier | High exposure; or high toxicity and low exposure | LT
 |  |  | Otherwise | ST
Low or medium | Medium | Dossier | High exposure or high toxicity | LT
 |  |  | Otherwise | ST
Low | High | Dossier | High toxicity; or high exposure; or medium toxicity and medium exposure | LT
 |  |  | Otherwise | ST
Medium | Low | Dossier | High exposure; or medium exposure and high toxicity | LT
 |  |  | Otherwise | ST
Medium | High | Dossier | Medium or high toxicity or high exposure | LT
 |  |  | Otherwise | ST
High | Low | Dossier | Medium or high toxicity or high exposure | LT
 |  |  | Otherwise | ST
High | Medium | Dossier | Medium or high toxicity; or low toxicity and medium or high exposure | LT
 |  |  | Low toxicity and low exposure | 
High | High | Dossier | All assessments | LT

(a) ST = short-term test; LT = long-term test.

Branch 9

Stage 2: Perform search for exposure data and limited search for toxicity data.

Stage 2 Assessment: Exposure | Stage 2 Assessment: Toxicity | Stage 3: Data Gathering | Assessments Based on Data Gathered in Stage 3 | Stage 4: Recommended Testing (a)
Low | Low | Dossier | High exposure; or high toxicity and medium exposure | LT
 |  |  | Otherwise | ST
Low | Medium | Dossier | High toxicity or high exposure; or medium toxicity and medium exposure | LT
 |  |  | Otherwise | ST
Low | High | Dossier | High or medium toxicity; or high exposure | LT
 |  |  | Low toxicity and medium or low exposure | ST
Medium | Low | Dossier | High toxicity or high exposure | LT
 |  |  | Otherwise | ST
Medium | Medium | Dossier | High or medium exposure or high toxicity | LT
 |  |  | Otherwise | ST
Medium | High | Dossier | Low toxicity and low exposure | ST
 |  |  | Otherwise | LT
High | Low | Dossier | Low or medium exposure; or high exposure and high toxicity | LT
 |  |  | Otherwise | ST
High | Medium | Dossier | Low toxicity and low exposure | ST
 |  |  | Otherwise | LT
High | High | Dossier | All assessments | LT

(a) ST = short-term test; LT = long-term test.

REFERENCES

Raiffa, H. 1968. Decision Analysis: Introductory Lectures on Choices under Uncertainty. Reading, Mass.: Addison-Wesley. 309 pp.

Weinstein, M.C. 1979. Decision making for toxic substances control: Cost-effective information for the control of environmental carcinogens. Public Policy 27:333-383.

APPENDIX C

EXPERT JUDGMENT AND THE TREATMENT OF UNCERTAINTY

Priority-setting systems inescapably involve uncertainty and the exercise of judgment. The data on which such systems operate--the toxic properties of candidate chemicals and the circumstances in which humans are exposed to them--can be known only within a range of uncertainty. A method is needed for taking this uncertainty into account and for registering how the acquisition of more knowledge will affect it.

Even with perfect knowledge of the toxicity of all chemicals, it would be necessary to rank the various forms of toxicity by reference to agreed standards of severity, which are based on judgments. Scientific comparisons of forms of toxicity are, however, only part of an overall evaluation of the impact of these toxicities on society. The potential severity of various health effects may be greatly increased or reduced by factors that affect the rate, characteristics, and consequences of exposure and actual incidence. These factors include the genetic and demographic characteristics of populations; patterns of housing and transportation; social organization, in general and in relation to work; the organization and adequacy of public-health measures, health care, and social welfare (both private--including familial--and public); and the amenability of these and other relevant factors to changes made to reduce or mitigate exposure, incidence, and their consequences. Thus, the valuation of the social impacts of toxicities requires a broad array of judgments.

Finally, the assignment of priorities for testing is, in itself, an allocation of scarce resources that may materially affect the interests of social groups competing for those resources. Such allocations involve issues of social justice and, in a democracy, politics. Their resolution requires additional kinds of judgment concerning matters of procedure, as well as substance.
In its first report, the Committee on Priority Mechanisms recommended that the priority-setting process to be developed for NTP be explicit about the use of uncertainty and judgment, provide a rationale and means for taking them into account, and address their use in a deliberate manner that is documented and articulated, but also preserve a record for later review and evaluation (National Research Council, 1981).

PROBABILITY JUDGMENTS

Existing priority-setting systems fail to offer a persuasive treatment of the uncertainty with which the health hazards of chemicals are evaluated (National Research Council, 1981). Often, information is summarized in a single number representing a best guess at toxicity or

exposure. However accurate such point estimates may be, they are relatively uninformative in the setting of priorities in the face of two key questions: How much uncertainty surrounds the estimate? What are the opportunities for reducing that uncertainty through testing? Moreover, existing schemes do not characterize the potential of the tests for reducing uncertainty. Without that characterization, one has no basis for arguing that a particular test offers the best value for money spent--nor can the policy-makers who must rely on the results of testing know the likely false-negative and false-positive rates for the data at which they are looking.

Where uncertainty is considered, the treatment is often unsatisfactory. For example, in one system examined by this committee, default values for missing data are combined with estimates, without attention to the differing degrees of uncertainty about each. In another system, "strong" evidence is scored with positive numbers and "weak" evidence with negative numbers, without explanation of how the two types of information should be treated in the priority-setting process. Such treatment of uncertainty not only obscures the logic of a scheme, but also precludes any systematic test of its predictive accuracy.

The committee has attempted to develop a scheme that will deal with uncertainty more adequately. In its current configuration, the scheme requires experts in exposure and toxicity to judge a number of probabilities for simple events, such as a finding that a chemical selected from a particular group will prove carcinogenic or a finding that a chemical that appeared carcinogenic in a particular test is actually not carcinogenic. These judgments can be regarded as estimates of prevalence and of false-positive rates. This innovation makes explicit some judgments that scientists ordinarily make implicitly when selecting tests. This allows the proposed scheme to provide a number of attractive features.
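The way in which judged prevalence and error rates combine can be illustrated with Bayes' rule. The following is a minimal sketch only; the function name and the numerical judgments (a prevalence of 0.10, a false-positive rate of 0.20, and a false-negative rate of 0.10) are hypothetical and are not taken from the committee's scheme.

```python
def posterior_toxic(prevalence, false_positive_rate, false_negative_rate, test_positive):
    """Probability that a chemical is toxic after one test result (Bayes' rule)."""
    if test_positive:
        p_result_given_toxic = 1.0 - false_negative_rate
        p_result_given_safe = false_positive_rate
    else:
        p_result_given_toxic = false_negative_rate
        p_result_given_safe = 1.0 - false_positive_rate
    joint_toxic = prevalence * p_result_given_toxic
    joint_safe = (1.0 - prevalence) * p_result_given_safe
    return joint_toxic / (joint_toxic + joint_safe)


# Hypothetical expert judgments, chosen only for illustration.
after_positive = posterior_toxic(0.10, 0.20, 0.10, test_positive=True)
after_negative = posterior_toxic(0.10, 0.20, 0.10, test_positive=False)
print(f"P(toxic | positive test) = {after_positive:.3f}")
print(f"P(toxic | negative test) = {after_negative:.3f}")
```

Under these assumed values, a positive result raises the probability of toxicity from 0.10 to about one-third, and a negative result lowers it to between 0.01 and 0.02. The spread between the two posterior values is one simple indication of how much a test can reduce uncertainty.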
When supplemented with even crude estimates of the costs of tests and of different kinds of errors, these probability judgments provide the basis of value-of-information analyses, which allow one to identify in a logically defensible manner which chemicals to test and which tests to use. Although the identification of chemicals and tests depends on probability judgments, as well as cost estimates, the values used are open to public scrutiny, and objective analysis of the process is facilitated. The use of a mathematical model enables investigators to analyze the sensitivity of the data elements used and of the values attached to them. When supplemented with test data, probability judgments can be made explicit, thereby enabling investigators to evaluate the performance of the entire scheme.

Probability judgments are constrained primarily by the requirement that the "judges" have substantive expertise, but not necessarily in making probability assessments per se. It must be possible to explain

probability assessment procedures to these experts simply, effectively, and unintrusively.

USE OF PROBABILITIES IN DECISION-MAKING

The studies of Ramsey, von Neumann and Morgenstern, Wald, and others have provoked rapid development in the theory of decision-making over the last 30 years (see reviews by Howard, 1975; Mishan, 1976). Their work has shown that coherent decision-making methods for situations involving uncertainty require an explicit treatment of the uncertainty. Failing to provide such treatment leads to schemes that are confusing, in that untested and often false values are implicitly assigned to uncertainties (Fischhoff et al., 1981).

On a practical level, decision theorists have developed a sophisticated repertoire of frameworks for characterizing decision-making situations and aids for resolving them. Decision analysis, value-of-information analysis, probabilistic risk assessment, and probabilistic information-processing systems are some of the procedures that have been developed. Where appropriate, each procedure involves the use of judgmentally assessed probabilities.

Probably the greatest successes with these procedures have been achieved in management and business administration. Other applications of explicit probability judgments include meteorology (Murphy, 1973), atmospheric-pollution studies (Moreau, 1980), and nuclear-power projects (U.S. Nuclear Regulatory Commission, 1975). These examples indicate that others in both government and business have found judgmental probabilities to be useful and practical.

Fields in which such probabilities are used in tasks akin to value-of-information analysis include petroleum geology and clinical diagnosis. One of the earliest practical uses of decision analysis was in the allocation of resources for oil exploration. Reliance was placed on expert geologists to assess the probabilities of various outcomes of different possible test drills (Raiffa, 1968).
To improve the allocation of scarce medical resources, several projects have been undertaken to study the ability of physicians to predict the outcomes and value of various diagnostic procedures. In radiology, for example, such projects have resulted in recommendations that some procedures be discontinued (Christensen-Szalanski et al., 1982).

ELICITATION OF PROBABILITY JUDGMENTS

As the preceding applications suggest, it has proved possible to obtain explicit probability judgments from a wide variety of persons dealing with diverse topics. It is reasonable to ask the price of these demonstrations and what profits were realized.

The evidence from both laboratory experiments and field applications (Lichtenstein et al., 1982) indicates that, once people have accepted the idea of providing probability judgments, instruction in the actual procedures is straightforward (Howell and Burnett, 1978). Some response modes require no explanation at all. People know what it means to say that "the probability of rain tomorrow is 0.30" or "the odds that this drilling site will not yield gas or oil above 8,000 feet are 4:1." For these situations, there has been only a negligible improvement in responses after the meaning of probabilities was explained.

With other, less familiar response modes, some direct instruction may be necessary (Lichtenstein and Newman, 1967). For example, when expressing incomplete knowledge about a continuous quantity, one may assess a cumulative probability distribution (e.g., there is a 0.05 chance that the rate is less than 0.003, a 0.10 chance that it is less than 0.005, etc.) or use the fractile method described by Raiffa (1968) (e.g., there is a 0.05 chance that the rate is less than 0.003, and a 0.05 chance that the rate is greater than 0.012, and the underlying distribution is log-normal). The former method has been used in studies of atmospheric oxidation rates, the latter in probabilistic risk assessments for nuclear-power stations.

An occasional source of difficulty is confusion about the event in question. The recent EPA effort to use probabilistic methods to set ambient-air quality standards, although thoughtfully conceived, was hampered by the poorly defined and unfamiliar units whose likelihood experts were required to judge. For example, one key question asked scientists to estimate the probability that severe health effects would result under specified conditions. This required the scientists to assign some value to a term (severe health effects) that should be defined by policy-makers (Moreau, 1980).
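The fractile example given above (a 0.05 chance that the rate is below 0.003, a 0.05 chance that it is above 0.012, and a log-normal distribution) is sufficient to determine the whole distribution. The following sketch shows one way of recovering it; the function names are illustrative, and the calculation assumes that the two fractiles are symmetric in probability.

```python
import math
from statistics import NormalDist


def lognormal_from_fractiles(low, high, tail=0.05):
    """Return (mu, sigma) of ln(rate), given P(rate < low) = P(rate > high) = tail."""
    z = NormalDist().inv_cdf(1.0 - tail)  # standard normal deviate, about 1.645
    mu = (math.log(low) + math.log(high)) / 2.0
    sigma = (math.log(high) - math.log(low)) / (2.0 * z)
    return mu, sigma


mu, sigma = lognormal_from_fractiles(0.003, 0.012)
median = math.exp(mu)  # the median of a log-normal is exp(mu)


def prob_rate_below(x):
    """Cumulative probability that the rate is less than x."""
    return NormalDist(mu, sigma).cdf(math.log(x))


print(f"median rate = {median:.4f}")
print(f"P(rate < 0.005) = {prob_rate_below(0.005):.3f}")
```

The median is the geometric mean of the two fractiles, 0.006. Any other cumulative probability, such as the chance that the rate is below 0.005, then follows from the fitted distribution and need not be elicited separately.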
The design of assessment procedures requires the exercise of good sense. It cannot be assumed that the asking of a question ensures an articulate response. Part of the art of analysis is the ability to divide a complex problem into portions encompassing subjects on which people have the required expertise. The recommended scheme asks for rather straightforward judgments about data that are familiar to respondents.

A potential danger is that good probability assessment will be achieved at the expense of attention to the scientific issues being judged. One might imagine respondents being fascinated by the newness of probability judgments or eager to please the investigator with their assessments. Here, too, good sense may be sufficient to allay most fears. Indeed, if an expert in probabilities is playing an active role as the questioner in the proceedings, he or she should be evaluating the respondents' performance with a scoring rule that rewards them for

responding with their true beliefs (Murphy, 1973). Evidence suggests that the procedures for organizing one's knowledge in a manner that will produce the best appraisal of what is known are the same procedures that produce the best probability assessment (i.e., a careful review of all that is known with an emphasis on evidence that seems to contradict the dominant opinion). Better substantive judgment and better probability assessment should go hand in hand.

This discussion is based on the assumption that experts in pertinent substantive matters are willing to participate in the assessment procedure. The product of the elicitation is likely to suffer if the respondents do not have confidence in the procedure. There are a number of reasons for hesitancy. Some resistance comes from mistrust, not of probability assessments, but of the analytic schemes in which they are embedded. Analysts have, at times, promised more than they can deliver, namely, a value-free, definitive solution to a difficult social problem. Scientists may also be unwilling to engage in a procedure based on an unfamiliar concept to which their experience cannot be applied easily. Addressing this reluctance requires a spirit of compromise. The assessor needs to be willing to make an attempt.

Finally, scientists may simply resist being explicit about their uncertainties. A common ploy of analysts in such situations is to ask different people for the numerical equivalent of verbal expressions of uncertainty and then to show the range of responses elicited. If, for example, the probabilities associated with "Further fighting in Cyprus is quite likely this year" vary from 0.25 to 0.80, then one can argue that the use of verbal expressions constitutes a barrier to communication. One hopes that scientists would also accept such demonstrations as persuasive and would be less likely to seek the shelter of vagueness when expressing their uncertainty.
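A scoring rule that rewards respondents for reporting their true beliefs, as mentioned above, can be illustrated with the quadratic (Brier) score. The sketch below is an assumption made for illustration: the text does not specify which rule a questioner would use, and the belief of 0.30 is hypothetical.

```python
def expected_brier_penalty(reported, belief):
    """Expected quadratic penalty if the event truly occurs with probability `belief`."""
    penalty_if_event = (1.0 - reported) ** 2
    penalty_if_no_event = reported ** 2
    return belief * penalty_if_event + (1.0 - belief) * penalty_if_no_event


belief = 0.30  # hypothetical true belief of the respondent
candidate_reports = [i / 100 for i in range(101)]
best_report = min(candidate_reports, key=lambda r: expected_brier_penalty(r, belief))
print(f"report that minimizes the expected penalty: {best_report:.2f}")
```

Under this rule the respondent's expected penalty is smallest when the reported probability equals the belief actually held, so neither exaggeration nor hedging is rewarded.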
QUALITY OF PROBABILITY JUDGMENTS

A variety of judges, topics, instructions, and response modes have been used in several hundred empirical studies of the quality of probability assessments (Lichtenstein et al., 1982). Two major conclusions resulted from these studies:

· Expressed confidence is correlated with justified confidence. Events that are judged to be more probable are, in fact, more likely to occur. People typically know more when they express great certainty than when they express uncertainty.

· Expressed confidence is not an infallible guide to the absolute extent of people's knowledge. Although confidence is sensitive to knowledge, it is not sensitive enough. The most commonly observed

overall "bias" is a tendency toward overconfidence, except when judgments are very easy (Fischhoff et al., 1977).

These conclusions seem to be independent of such factors as the instructions used, the amount of practice in making such judgments, the stakes riding on good performance, familiarity with the subject in question (except insofar as it reduces uncertainty to nil), and the response mode used (provided that it has been properly explained).

IMPROVING THE QUALITY OF PROBABILITY JUDGMENTS

Two procedures have been found to improve the quality of probability judgments: providing personalized feedback after a round of practice questions (Lichtenstein and Fischhoff, 1980) and requiring assessors to be explicit about the set of reasons supporting and (especially) contradicting their beliefs (Koriat et al., 1980).

A number of theoretically based procedures have been developed for aggregating the probability assessments of a set of judges (Lindley et al., 1979). These procedures require one to make some assumptions about the quality of the probabilities that the different judges provide. Empirical evidence suggests that people who express more confidence tend to know more, other things being equal. The identity and power of those "other things" are not altogether clear. One factor that obviously needs to be considered is how well the respondents are motivated toward candor.

To determine whether the assessment that can be expected in specific situations is "good enough," it is necessary to make a comparative analysis of the consequences associated with potential biases and those associated with forgoing the probability assessment altogether. In setting priorities for testing chemicals for health effects, the utility of estimating explicit probabilities is likely to be substantial. They will lead to better testing decisions, and they will facilitate evaluation of the scheme's efficacy.
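Feedback of the kind described above requires a comparison of stated probabilities with observed outcomes. The following sketch tabulates that comparison and computes a simple overconfidence index (mean stated probability minus the proportion of events that occurred). The practice data and function names are invented for illustration and do not come from the studies cited.

```python
from collections import defaultdict


def calibration_table(judgments):
    """Map each stated probability to (number of judgments, observed proportion true)."""
    outcomes_by_stated = defaultdict(list)
    for stated, occurred in judgments:
        outcomes_by_stated[stated].append(1 if occurred else 0)
    return {
        stated: (len(outcomes), sum(outcomes) / len(outcomes))
        for stated, outcomes in sorted(outcomes_by_stated.items())
    }


def overconfidence(judgments):
    """Mean stated probability minus observed proportion true (positive = overconfident)."""
    mean_stated = sum(stated for stated, _ in judgments) / len(judgments)
    proportion_true = sum(1 for _, occurred in judgments if occurred) / len(judgments)
    return mean_stated - proportion_true


# Hypothetical practice round: 10 judgments stated at 0.9 (7 true), 10 at 0.6 (5 true).
practice = ([(0.9, True)] * 7 + [(0.9, False)] * 3
            + [(0.6, True)] * 5 + [(0.6, False)] * 5)
table = calibration_table(practice)
bias = overconfidence(practice)
print(table)
print(f"overconfidence index = {bias:.2f}")
```

In this invented round, events given a probability of 0.9 occurred more often than those given 0.6, but less often than stated. That is the pattern of sensitive but insufficiently sensitive confidence described in the preceding section.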
If included from the beginning of the process, even rudimentary assessments will facilitate the inclusion of more sophisticated assessment procedures as they become available (and if evaluation shows the need for them). A modest amount of modeling might also provide some indications of the effect of biases on estimates of probabilities. Many estimates will prove to be only slightly affected by even fairly large biases; greater effects may be caused by errors originating in other sources.

Instruction in probability assessment is straightforward and can be given in a short time. Care is needed to explain explicitly the particular method used (e.g., probabilities, odds, or fractiles). Performance of an expert may be improved considerably by providing a little training, including practice questions whose answers are compared with known values.

The consequences of inaccurate probability estimates depend on the testing decision associated with them. Many testing decisions are likely to be unaffected by errors in those estimates.
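The last point can be examined with a small value-of-information calculation. The sketch below is illustrative only: it assumes a single test, two possible actions (control the chemical or leave it uncontrolled), and hypothetical costs in arbitrary units. None of the figures or names is taken from the committee's scheme.

```python
def best_action_loss(p_toxic, cost_missed_hazard, cost_needless_control):
    """Expected loss of the better of two actions at belief p_toxic."""
    loss_if_uncontrolled = p_toxic * cost_missed_hazard
    loss_if_controlled = (1.0 - p_toxic) * cost_needless_control
    return min(loss_if_uncontrolled, loss_if_controlled)


def value_of_test(prior, fp_rate, fn_rate, cost_missed_hazard, cost_needless_control):
    """Expected reduction in loss from seeing the test result before acting."""
    p_positive = prior * (1.0 - fn_rate) + (1.0 - prior) * fp_rate
    post_positive = prior * (1.0 - fn_rate) / p_positive
    post_negative = prior * fn_rate / (1.0 - p_positive)
    loss_with_test = (
        p_positive * best_action_loss(post_positive, cost_missed_hazard, cost_needless_control)
        + (1.0 - p_positive) * best_action_loss(post_negative, cost_missed_hazard, cost_needless_control)
    )
    loss_without_test = best_action_loss(prior, cost_missed_hazard, cost_needless_control)
    return loss_without_test - loss_with_test


TEST_COST = 1.0  # hypothetical, in the same arbitrary units as the error costs

# The judged prior of 0.10 is compared with priors biased by a factor of two either way.
worth_testing = {
    prior: value_of_test(prior, 0.20, 0.10, 100.0, 10.0) > TEST_COST
    for prior in (0.05, 0.10, 0.20)
}
print(worth_testing)
```

Under these assumed figures the decision to test is the same whether the prior is 0.05, 0.10, or 0.20, which illustrates how a testing decision can be unaffected by a sizable error in the probability estimate. Other cost structures could, of course, produce decisions that are more sensitive.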

REFERENCES

Christensen-Szalanski, J., P. Diehr, and R. Wood. 1982. Phased trial of a proven algorithm at a new primary care clinic. J. Public Health 21:16-22.
Fischhoff, B. 1980. Clinical decision analysis. Oper. Res. 28:28-43.
Fischhoff, B., P. Slovic, and S. Lichtenstein. 1977. Knowing with certainty: The appropriateness of extreme confidence. J. Exp. Psychol. Hum. Percept. Perform. 3:552-564.
Fischhoff, B., S. Lichtenstein, P. Slovic, S. L. Derby, and R. L. Keeney. 1981. Acceptable Risk. New York: Cambridge University Press. 185 pp.
Howard, R. 1975. Social decision analysis. Proc. Inst. Electr. Electron. Eng. 63:359-371.
Howell, W. C., and S. A. Burnett. 1978. Uncertainty measurement: A cognitive taxonomy. Organ. Behav. Hum. Perform. 22:45-68.
Koriat, A., S. Lichtenstein, and B. Fischhoff. 1980. Reasons for confidence. J. Exp. Psychol. [Hum. Learn.] 6:107-118.
Lichtenstein, S., and B. Fischhoff. 1980. Training for calibration. Organ. Behav. Hum. Perform. 26:149-171.
Lichtenstein, S., and J. R. Newman. 1967. Empirical scaling of common verbal phrases associated with numerical probabilities. Psychon. Sci. 9:563-564.
Lichtenstein, S., B. Fischhoff, and L. D. Phillips. 1982. Calibration of probabilities: The state of the art to 1980. Pp. 306-334 in D. Kahneman, P. Slovic, and A. Tversky, Eds. Judgment under Uncertainty: Heuristics and Biases. New York: Cambridge University Press.
Lindley, D. V., A. Tversky, and R. V. Brown. 1979. On the reconciliation of probability assessments. J. R. Stat. Soc., Ser. A 142:146-180.
Mishan, E. J. 1976. Cost-Benefit Analysis. New York: Praeger. 478 pp.
Moreau, D. H. 1980. Quantitative Risk Assessment of Non-carcinogenic Ambient Air Quality Standards: A Discussion of Conceptual Approaches, Input Information, and Output Measures. Research Triangle Park, N.C.: U.S. Environmental Protection Agency, Office of Air Quality Planning and Standards. 35 pp.

Murphy, A. H. 1973. A new vector partition of the probability score. J. Appl. Meteorol. 12:595-600.
National Research Council, Steering Committee on Identification of Toxic and Potentially Toxic Chemicals for Consideration by the National Toxicology Program. 1981. Strategies to Determine Needs and Priorities for Toxicity Testing. Vol. 1. Design. Washington, D.C.: National Academy Press. 143 pp.
Raiffa, H. 1968. Decision Analysis: Introductory Lectures on Choices under Uncertainty. Reading, Mass.: Addison-Wesley. 309 pp.
U.S. Nuclear Regulatory Commission. 1975. Reactor Safety Study. (WASH-1400). Washington, D.C.: U.S. Nuclear Regulatory Commission.

APPENDIX D

THE ANALYSIS OF STRUCTURE-ACTIVITY RELATIONSHIPS IN SELECTING POTENTIALLY TOXIC COMPOUNDS FOR TESTING

Information on many chemicals now in commercial use is sparse or lacking, except for their structural formulas and a few physical constants. Because essentially nothing may be known about their toxicity to various forms of life, it would be highly desirable to use our knowledge of the relationships between chemical structure and biologic activity--i.e., structure-activity relationships (SARs)--to formulate algorithms for selecting the potentially most toxic compounds for testing. This would make it possible to predict a specific kind of toxicity simply from the chemical structure of a substance and a few physical properties.

It is difficult to use SARs to predict the biologic activity of two types of compounds:

· Congeners with a common pharmacophoric or other biologic function presumed to have the same mechanisms of action.

· Heterogeneous compounds that produce qualitatively the same biologic response, but do not have a common function or a common mechanism of action.

During the last 20 years, considerable progress has been made in predicting the activity of congeners by using the techniques of quantitative SAR analysis (Martin, 1978). Few published studies have been devoted to theoretical solutions to the problem of predicting biologic activity of heterogeneous compounds, which is of interest when using SARs to set priorities. In one empirical approach, results on a large number of organic compounds tested in a standard system (e.g., the Ames Salmonella/microsome test and tests to determine LD50) are used to derive a correlation equation that relates chemical features and physical properties of the molecules to the observed biologic response.

Free and Wilson (1964) developed the first general approach to computerized SARs, which was based on chemical structure.
Using regression analysis, they assigned values de novo to various substituents (molecular fragments that could be used to predict the biologic activity of untested compounds). Their effort was developed further by Cramer et al. (1974), Hodes et al. (1977), and Tinker (1981). In effect, these models postulate that

BA = aA + bB + cC + . . . + nN, (1)

where BA is biologic activity, and the "best" values of the coefficients a, b, c, . . ., n are selected by a computer program to weight the contributions of the chemical's structural features or molecular fragments, A, B, C, . . ., N. Equation 1 can often be used to correlate activity and structure when the biologic activity is produced by the same mechanism for all compounds in the data set, even though the mechanisms of action at the molecular level are not known. For unknown reasons, the correlation sometimes fails badly, even for small sets of closely related molecules.

A heterogeneous set of compounds may cause a specific type of toxicity through a variety of mechanisms at the molecular level. For example, DNA can be damaged in many ways to initiate cancer. To express this, Equation 1 should be rewritten as:

BA_i = a'A + b'B + c'C + . . . + n'N, (2)

where each BA_i has a different mechanism of action and, hence, most likely a different SAR. Thus, there are an unknown number of variables on both sides of the equation. The meaning of such an equation is not clear.

Two approaches have been used to define the molecular fragments used in the algorithm. On the basis of the work of Free and Wilson (1964), Enslein and Craig (1981) use molecular fragments developed by chemists from accumulated experience about the chemical reactivity of clusters of atoms within a molecule. This method of defining molecular fragments allows them to use hundreds, instead of thousands, of fragments for correlation, thereby greatly reducing the redundancy in the independent variables. Hodes (1981) uses arbitrarily defined molecular fragments, such as three contiguous atoms excluding hydrogen, generated by a computer program. For a large data set (results of testing 1,000 compounds), Hodes starts with more than 10,000 variables (fragment constants) and eliminates redundancy as much as possible.
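The selection of "best" coefficients for Equation 1 can be sketched as an ordinary least-squares fit on indicator variables that record which fragments each compound contains. The example below is a minimal illustration with invented data for three fragments; it is not the procedure of any of the authors cited, and the function names are hypothetical.

```python
def fit_additive_weights(fragment_rows, activities):
    """Least-squares coefficients for Equation 1, solved through the normal equations."""
    n = len(fragment_rows[0])
    # Augmented matrix [X^T X | X^T y].
    aug = [
        [sum(row[i] * row[j] for row in fragment_rows) for j in range(n)]
        + [sum(row[i] * y for row, y in zip(fragment_rows, activities))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(aug[k][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        if abs(aug[col][col]) < 1e-12:
            raise ValueError("fragments are redundant; coefficients are not unique")
        for k in range(n):
            if k != col:
                factor = aug[k][col] / aug[col][col]
                aug[k] = [a - factor * b for a, b in zip(aug[k], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def predict_activity(weights, fragments):
    """Sum the fragment contributions for one compound (Equation 1)."""
    return sum(w * x for w, x in zip(weights, fragments))


# Invented training set: columns mark presence (1) or absence (0) of fragments A, B, C.
training_fragments = [[1, 0, 0], [0, 1, 0], [1, 1, 0], [0, 1, 1], [1, 0, 1], [1, 1, 1]]
training_activity = [1.0, -0.5, 0.5, 1.5, 3.0, 2.5]

weights = fit_additive_weights(training_fragments, training_activity)
untested = predict_activity(weights, [0, 0, 1])  # compound containing only fragment C
print([round(w, 3) for w in weights], round(untested, 3))
```

Because the invented activities are exactly additive, the fit recovers the generating coefficients and predicts the untested compound without error. Real data would not behave so well, and the error raised for redundant fragments indicates why redundancy among the independent variables has to be reduced before such a fit is meaningful.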
The final molecular fragments selected bear almost no relation to those conventionally used by chemists. The advantage of this approach is that combinations of atoms not considered important by chemists are not overlooked.

The accuracy of an algorithm is determined by using it to predict the activity of a portion of the data set. (Some of the data points are set aside and not used in developing the algorithm.) The data of Enslein and Craig suffer in comparison with those of Hodes. Their data were taken from the literature and, hence, are of variable quality, whereas those of Hodes were derived from tests conducted by the National Cancer Institute, which made a concerted effort to achieve uniformity.

NONADDITIVITY

The approaches used by Hodes et al. (1977), Tinker (1981), Enslein and Craig (1981), Cramer et al. (1974), and Free and Wilson (1964) are based on Equations 1 and 2, which assume additivity--the biologic activity of a

molecule is assumed to be the sum of the contributions of its parts. That is, a given molecular fragment, such as a -CH2CH2CH3 group, makes a constant contribution, either positive or negative, to biologic activity; and other fragments make constant contributions; so one can add the weighting factors of the fragments of an untested compound to estimate its activity. These weighting factors are average values that work best for the set of data under consideration. Such additivity does not apply to a wide range of activity or variety of fragments.

One of the most carefully studied and important properties related to the biologic activity of organic compounds is their hydrophobicity. The most common way of describing that is to measure how a compound distributes between water and a nonpolar solvent, such as octanol. The partition coefficient, P, is the ratio of the concentrations in the two phases, and log P is used as a numerical scale of hydrophobicity. The relationship between log B (biologic activity) and log P is roughly parabolic, if a wide enough range of log P is considered. Activity in a given set of congeners (other factors remaining constant) increases as log P increases until a characteristic point (log P0) is reached, at which point activity decreases with further increase in log P until it is inevitably lost. This can be understood as follows: As log P approaches negative infinity, a compound becomes so water-soluble that it cannot cross a lipid barrier and therefore remains localized in the first aqueous compartment with which it comes into contact. Likewise, as log P approaches positive infinity, the compound becomes so lipid-soluble that it remains in the first lipid compartment it encounters. Actually, activity falls to zero far short of infinity. For example, carcinogenic aromatic hydrocarbons, as a class, have one of the highest log P0 values; however, these compounds become inactive when log P reaches approximately 8 (Hansch and Fujita, 1964).
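The consequence of a parabolic dependence on log P for fragment contributions can be shown numerically. The sketch below assumes a relationship of the form log B = -a(log P)^2 + b(log P) + c with invented coefficients, and a fragment that raises log P by a fixed 0.5 unit; all of these values are hypothetical and chosen only to make the arithmetic transparent.

```python
# Hypothetical coefficients of log B = -A_COEF * (log P)**2 + B_COEF * (log P) + C_COEF
A_COEF, B_COEF, C_COEF = 0.25, 2.0, 0.0


def log_activity(log_p):
    """Parabolic dependence of log B on log P."""
    return -A_COEF * log_p ** 2 + B_COEF * log_p + C_COEF


def optimum_log_p():
    """log P0, the vertex of the parabola, where activity is greatest."""
    return B_COEF / (2.0 * A_COEF)


def fragment_effect(parent_log_p, fragment_increment=0.5):
    """Change in log B when a fragment adds `fragment_increment` to log P."""
    return log_activity(parent_log_p + fragment_increment) - log_activity(parent_log_p)


print(f"log P0 = {optimum_log_p():.1f}")
print(f"effect on a hydrophilic parent (log P = 1): {fragment_effect(1.0):+.4f}")
print(f"effect on a lipophilic parent (log P = 6): {fragment_effect(6.0):+.4f}")
```

With these assumed coefficients the same fragment increases activity when the parent compound lies below log P0 and decreases it when the parent lies above, so no single constant can represent its contribution over the whole range.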
The nonlinear relationship between log P and activity eliminates the possibility of an additive relationship between hydrophobicity and bioactivity.

Another important effect of fragments is their relative attraction for or release of electrons, which determines the electron distribution in molecules. The relative electron density around functional groups can have an enormous effect on the reactivity of the groups. Some reactions are favored by high electron density; some, by low density. For example, Hansch et al. (1980) demonstrated a linear relationship (over a wide range) between the electron-withdrawing effect of substituents and the mutagenicity of cis-platinum ammines; the stronger the electron withdrawal, the more potent the ammine was in the Ames Salmonella/microsome test. In an earlier study, Venger et al. (1979) found exactly the opposite effect in triazenes in the same test: strong electron withdrawal by substituents linearly reduced mutagenicity. The two classes of compounds induce mutagenesis in different ways; hence, one cannot assign a single number to a three-atom fragment, such as a nitro group, to account for its activity in both these classes of compounds in the test. If one had a set of compounds of which half were triazenes and half cis-platinum ammines, the average contribution of the nitro group

might be close to zero; however, if the set were composed only of compounds behaving like the cis-platinum ammines, a nitro group would increase mutagenicity in the Ames Salmonella/microsome test by almost 3 orders of magnitude. This illustrates a most serious problem in trying to formulate an algorithm for estimating toxicity of large, miscellaneous sets of compounds. Each set will be composed of subgroups of congeners, each of which acts via its own mechanism. Twenty years of work in quantitative SAR analysis has shown that the SAR will generally be different for each mechanism of toxicity.

The examples provided above illustrate problems of additivity with hydrophobic and electronic effects in bioactive compounds. Additional problems are encountered when steric factors are considered. For example, an isopropyl group on a benzene ring para to the active group might increase activity, whereas the same group ortho to the active group could easily destroy it. Even more dramatic is the difference in activity of stereoisomers that contain the same fragments, but in such arrangements that the compounds are mirror images of each other. One often finds huge differences in activity between such compounds; however, it has been shown that one can even do quantitative SAR work with stereoisomers when the mechanism of action is uniform throughout the set (Hansch et al., 1977). The concept of lock-and-key fit of biologically active compounds and bioreceptors is still valid. The only change in recent times is to view them as being more flexible.

The problems encountered when using chemical structures to predict biologic activity are similar to those found when translating a language by computer. Errors are made in translation, because the context in which a word is used is not considered by the computer. Assigning a toxicity weight to the hydroxyl (OH) group illustrates the necessity of considering the group within the context of the molecule.
The parent compound in this series is water (HOH)--the least toxic compound known. Simple alcohols (ROH) containing 1-10 carbon atoms are toxic to all forms of life from enzymes to humans. The toxicity of alcohols containing more than 10 carbon atoms, depending on the test system, decreases rapidly with increasing chain length, and simple alcohols having more than 14 carbon atoms are not toxic in the usual tests. Lipophilic alcohols are highly toxic, but, in the context of superoptimal hydrophobicity or hydrophilicity, the OH is nontoxic. If OH is assigned a weighting factor for toxicity on the basis of experience with a number of hydrophobic alcohols, the factor is useless when applied to the OH group in sugars, starches, or some antibiotics. If one uses a large and evenly balanced set of data to form a correlation equation, the weighting factors of each of the fragments may approach zero as the size of the data set increases.
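The tendency of a fragment's weighting factor to approach zero in a balanced, heterogeneous data set can be demonstrated with a small calculation. In the sketch below, a fragment is assumed to raise activity by 3 log units in one subgroup of congeners and to lower it by 3 log units in another; the activity values are invented, and the pooled weight is estimated simply as the difference in mean activity between compounds with and without the fragment.

```python
def mean(values):
    return sum(values) / len(values)


def pooled_fragment_weight(compounds):
    """Mean log activity with the fragment minus mean log activity without it."""
    with_fragment = [activity for has_fragment, activity in compounds if has_fragment]
    without_fragment = [activity for has_fragment, activity in compounds if not has_fragment]
    return mean(with_fragment) - mean(without_fragment)


# Invented (has_fragment, log activity) pairs for two mechanistic subgroups.
enhancing_subgroup = [(False, 1.0), (True, 4.0), (False, 2.0), (True, 5.0)]
suppressing_subgroup = [(False, 4.0), (True, 1.0), (False, 5.0), (True, 2.0)]

weight_enhancing = pooled_fragment_weight(enhancing_subgroup)
weight_suppressing = pooled_fragment_weight(suppressing_subgroup)
weight_mixed = pooled_fragment_weight(enhancing_subgroup + suppressing_subgroup)
print(weight_enhancing, weight_suppressing, weight_mixed)
```

Within each subgroup the fragment has a large and consistent effect, but the weight estimated from the pooled set is zero. A global correlation equation built on such a set would therefore assign the fragment no importance, even though it is decisive within each mechanism.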

The examples described above illustrate several difficulties with the additivity principle and the limitation inherent in this approach when it is used to develop a global algorithm for correlating a specific biologic activity with chemical structure. Hence, it is reasonable to inquire about the effectiveness of these models and their verification.

Hodes (1981) has published results of a critical evaluation of this type of SAR. He used an algorithm developed from test data on leukemia to estimate the relative probability of antitumor activity of 988 compounds that had not yet been tested against P388 mouse lymphocytic leukemia cells. In the same study, an experienced chemist also classified the 988 compounds and selected 298 as likely to have antileukemia activity, 14 as having novel structures, and 676 as likely to be inactive. In later National Cancer Institute tests for antileukemia activity, 26 compounds were found to be active; 33 were presumed to be active, but to require further testing; 10 were judged to have potential activity; and 27 were found to be too toxic to test. The chemist's selected 298 contained 11 of the 26 active substances, 8 of the 33 presumed to be active, 5 of the 10 with doubtful activity, and 19 of the 27 toxic compounds. The 298 compounds ranked most active by Hodes's algorithm contained 13 of the 26 active substances, 10 of the 33 presumed to be active, 4 of the 10 doubtful chemicals, and 14 of the 27 toxic compounds. Hodes and the chemist each identified about one-third of the active compounds. Hodes indicated that this is better than chance, but not by much.

CHEMICAL CLASSES RELATED TO HEALTH EFFECTS IN HUMANS

Given the paucity of toxicity data on most chemicals in commerce, it is impossible to formulate general correlation equations that relate chemical structure to biologic activity. The best we can do is make guesses about toxicity that are little more than expressions of suspicion.
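The figures reported above from the evaluation by Hodes (1981) can be reduced to simple proportions. The sketch below uses only the counts quoted in the text; the grouping of the three activity categories into a single total, and the use of the fraction of compounds selected as the chance benchmark, are assumptions made for this illustration.

```python
compounds_tested = 988
compounds_selected = 298  # size of each selection (chemist and algorithm)

found_in_testing = {"active": 26, "presumed active": 33, "potential activity": 10}
chemist_hits = {"active": 11, "presumed active": 8, "potential activity": 5}
algorithm_hits = {"active": 13, "presumed active": 10, "potential activity": 4}


def recovered_fraction(hits):
    """Share of all active or possibly active compounds captured by a selection."""
    return sum(hits.values()) / sum(found_in_testing.values())


# A random selection of 298 of the 988 compounds would be expected to capture this share.
chance_fraction = compounds_selected / compounds_tested

print(f"expected by chance: {chance_fraction:.2f}")
print(f"chemist:            {recovered_fraction(chemist_hits):.2f}")
print(f"algorithm:          {recovered_fraction(algorithm_hits):.2f}")
```

On this basis both selections recover somewhat more than the share expected from a random selection of the same size, which is consistent with the statement that the performance was better than chance, but not by much.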
An organized approach to such guessing might be the development of a matrix of structural classes and health effects in humans. The list of structural elements could be kept arbitrarily low (fewer than 100), and the toxicity classes could be made as large as necessary. Expert toxicologists could be asked to make some kind of simple estimate of the end points--i.e., health effects in humans associated with the structural class--and an estimate of their confidence in the association. These "guesses" would be based on a combination of experience and intuition. Given enough estimates, one should obtain an indication of which compounds may be toxic. However, would such a selection system be better than having the same "guesses" made directly without an attempt to classify the structural features?

A choice must be made between using a computer to sort chemicals into broad structural classes and using experienced persons to rate the potential toxicity of chemicals directly. Using structural classes, toxicologists must make judgments about only a few classes. The alternative approach requires experienced persons to make judgments

about tens of thousands of compounds. The advantage of judging the compounds directly is that functional groups can be judged within the context of the entire molecule. Moreover, the experienced persons would be expected to make more accurate judgments than would be required if the broad structural classes were used; however, the use of these experts may be more expensive.

Computer sorting into structural classes is only a crude screening device. The effectiveness of the screen depends on the number of classes and on the accuracy of the computer searching procedures. An increase in the number of classes increases accuracy and search costs. If the number of classes were increased sufficiently, one would have a model similar to that developed by Enslein and Craig.

Sorting by structural classes can be expected to fail in some cases. If the number of classes is kept low, many chemicals will not be included in a class, and the procedure will not yield an estimate. Complicated molecules may be placed in two or more classes, so that interpretation of the results is not straightforward. Furthermore, sorting by the use of these classes will produce many false-positives, because many members of a class have little toxicity.

SAR modeling seems to be a necessity, in that current testing resources do not permit testing all the compounds in commerce. Some method of generalizing toxicity data is required, and current SAR models are a modest beginning. Development of such models has suffered from the lack of an adequate data base and from the theoretical difficulties outlined above. There appear to be few alternative methods for making inexpensive judgments about the toxicity of tens of thousands of compounds on which there is little information other than chemical structure.

Future progress in building SAR models depends on the acquisition of data on a carefully selected group of organic compounds.
This group should contain a few representative compounds from each of the important classes of industrial chemicals, and there should be some diversity among their physical properties. In addition, a useful data base would contain information on a few subsets of 20-30 compounds each, all appearing to have the same toxic function, which has been modified by structural changes and which can be correlated quantitatively through the use of SARs. Such correlations could probably be made and would probably provide information that would be useful in developing a global algorithm.

REFERENCES

Craig, P. N., and K. Enslein. 1980. Application of structure-activity studies to develop models for estimation of toxicity, pp. 411-419. In D. B. Walters, Ed. Safe Handling of Chemical Carcinogens, Mutagens, Teratogens and Highly Toxic Substances. Vol. 2. Ann Arbor, Mich.: Ann Arbor Science Publishers, Inc.

Cramer, R. D., III, G. Redl, and C. E. Berkoff. 1974. Substructural analysis. A novel approach to the problem of drug design. J. Med. Chem. 17:533-535.

Enslein, K., and P. Craig. 1981. Structure-activity in hazard assessment, pp. 389-420. In J. Saxena and F. Fisher, Eds. Hazard Assessment of Chemicals. Vol. 1. Current Developments. Academic Press, New York.

Enslein, K., and P. Craig. Carcinogens: A statistical structure activity model. J. Toxicol. Environ. Health. (in press)

Free, S. M., and J. W. Wilson. 1964. A mathematical contribution to structure-activity studies. J. Med. Chem. 7:395-399.

Hansch, C., and T. Fujita. 1964. ρ-σ-π Analysis. A method for the correlation of biological activity and chemical structure. J. Am. Chem. Soc. 86:1616-1626.

Hansch, C., C. Grieco, C. Silipo, and A. Vittoria. 1977. Quantitative structure-activity relationship of chymotrypsin-ligand interactions. J. Med. Chem. 20:1420-1435.

Hansch, C., B. H. Venger, and A. Panthananickal. 1980. Mutagenicity of substituted (o-phenylenediamine) platinum dichloride in the Ames test. A quantitative structure-activity analysis. J. Med. Chem. 23:459-461.

Hodes, L. 1981. Computer-aided selection of compounds for antitumor screening: Validation of a statistical-heuristic method. J. Chem. Inf. Comput. Sci. 21:128-132.

Hodes, L., G. F. Hazard, R. T. Geran, and S. Richman. 1977. A statistical-heuristic method for automated selection of drugs for screening. J. Med. Chem. 20:469-475.

Martin, Y. C. 1978. Quantitative Drug Design: A Critical Introduction. Marcel Dekker, Inc., New York. 425 pp.

Tinker, J. 1981. Relating mutagenicity to chemical structure. J. Chem. Inf. Comput. Sci. 21:3-7.

Venger, B. H., C. Hansch, G. J. Hatheway, and Y. U. Amrein. 1979. Ames test of 1-(X-phenyl)-3,3-dialkyltriazenes. A quantitative structure-activity study. J. Med. Chem. 22:473-476.

APPENDIX E

COSTS OF MISCLASSIFICATION

This appendix describes a plausible approach, based on economic theory, to the assignment of costs to errors in classification. The concepts of net social benefit and health costs are introduced and related to production. For this discussion, a chemical is assumed to be regulated on the basis of classification of its hazard. The social costs of regulatory errors caused by misclassification of the hazard are calculated from health costs and social benefits.

Supply and demand curves for a chemical are shown in Figure E-1. If the chemical is unregulated, it is produced at the market-determined level q_m, where marginal supply cost equals the price supported by demand. The net social benefit (excluding health costs) is defined as the area between the supply and demand curves up to the point of production. When the chemical is regulated to production q_r, the net benefit (excluding health costs) is b(q_r); more generally, the net benefit, b(q), is a function of production, q. Note that unregulated production and marketing are carried to the point where b(q) becomes flat (Figure E-2). This is also the point where the supply and demand curves cross, at q_m. Note also that the curve in Figure E-2 is concave downward; that is an important property of the typical structure of supply and demand relationships.

If health cost is assumed to be proportional to the product of exposure and toxicity, and exposure is assumed to be proportional to production, it is possible to express health cost as a function of the product of production and toxicity. Therefore, health cost is a linear function of production, c(q) = tq, with a slope, t, related to toxicity. Toxicity does not vary with production, but its magnitude is uncertain; the implications of this uncertainty will be considered later. According to the criterion of net benefit, which includes health costs, optimal regulatory control leads to the production, q_r, that maximizes b(q) - c(q), or b(q) - tq.
Figure E-3 shows social benefits, b(q), and health costs, c(q), plotted as functions of production, q. Optimal regulatory control leads to the production, q_r, that maximizes the vertical distance, y, between the curves b(q) and c(q). Inasmuch as c(q) is a straight line with slope t, this distance is maximized when the slope of b(q) is equal to t.

In the simplest case, only toxicity is uncertain. Exposure and production are assumed to be known, and exposure is assumed proportional to production. The supply and demand for the chemical in question are known (hence the net benefits). The harm of the chemical is the product of its (known) exposure and (unknown) toxicity. The correct classification of the chemical into a category of harm or hazard is uncertain.
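The optimality condition described above can be written out as a short derivation. The first three lines use only the definitions in the text. The special case that follows is an illustrative assumption and is not taken from the report: linear demand and supply curves, with hypothetical parameters a, s, β, and σ, which make b(q) quadratic and give q_r in closed form.

```latex
% Net social benefit including health costs, with c(q) = t q:
\begin{align*}
W(q)    &= b(q) - c(q) = b(q) - t\,q, \\
W'(q_r) &= b'(q_r) - t = 0 \quad\Longrightarrow\quad b'(q_r) = t, \\
W''(q)  &= b''(q) < 0 \quad \text{(} b \text{ is concave downward, so } q_r \text{ is a maximum).}
\end{align*}
% Illustrative special case (assumed): linear demand p_D(q) = a - \beta q
% and linear supply p_S(q) = s + \sigma q, with a > s and \beta, \sigma > 0.
\begin{align*}
b(q) &= \int_0^{q} \bigl[p_D(u) - p_S(u)\bigr]\,du
      = (a - s)\,q - \tfrac{1}{2}(\beta + \sigma)\,q^{2}, \\
q_m  &= \frac{a - s}{\beta + \sigma} \quad \text{(where } b'(q_m) = 0\text{)}, \\
q_r  &= \frac{a - s - t}{\beta + \sigma} = q_m - \frac{t}{\beta + \sigma},
        \qquad 0 \le t \le a - s .
\end{align*}
% For t > a - s the maximum is at the boundary q_r = 0, that is, a ban.
```

Under this assumed form, t = 0 returns the market outcome q_m, and larger values of t move the optimum toward zero production.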

FIGURE E-1 Variation in supply and demand caused by changes in production, q. q_m, market-determined production; b(q_r), net benefit of regulation of production. (Horizontal axis: production, q, with q_r and q_m marked.)

FIGURE E-2 Variation in net social benefit, b(q), with production, q. q_m, market-determined production. (Horizontal axis: production, with q_m marked.)

FIGURE E-3 Health costs, c(q), and benefits, b(q). (Horizontal axis: production, with q_r and q_m marked; the vertical distances referred to in the text as w, x, y, and z are labeled.)

Errors in assessing toxicity are assumed to lead to errors in regulation. Define t̂ as the best estimate of toxic potency. Suppose a chemical is classified as nontoxic (t̂ = 0) when in fact it is toxic (t > 0). If society acts on this erroneous information, it does not regulate the chemical. Production then is set by the market at q_m, with a net benefit (excluding health costs) of b(q_m). The net social benefit is net economic benefit minus health costs. This is shown as the vertical distance x in Figure E-3. With perfect knowledge, the optimal control is q_r, with a total net social benefit, y, in Figure E-3.

The cost of a regulatory false-negative is the difference in benefit between what society obtains with erroneous information and what it could obtain with perfect information. In Figure E-3, the cost of a regulatory false-negative is the benefit that could have been obtained with regulation (y) minus the benefit actually obtained. The benefit actually obtained is negative (-x), because c(q_m) is greater than b(q_m), so that the cost of the false-negative is x + y.

In the opposite error, suppose we classify the chemical as toxic (t̂ > 0) when in fact it is nontoxic (t = 0). If society acts on this erroneous information, it regulates to a production of q_r. It thus obtains a net benefit of y + w (Figure E-3) when it could have obtained a benefit of z, if it had known that the chemical was nontoxic and did not warrant regulation. The cost of a regulatory false-positive is z - y - w (the difference between what it could have obtained with perfect information and what it actually obtained with erroneous information).

Clearly, the costs of the two types of mistakes usually are not equal. The cost of a mistake depends on how much the estimated toxicity, t̂, departs from the true toxicity, t. The cost of a regulatory error, R(t̂, t), is a function of the estimated toxicity and the true toxicity. If t̂ is greater than t, the mistake is a false-positive; and if t̂ is less than t, it is a false-negative.
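The definition of R(t̂, t) above can be sketched numerically. The block below is a minimal illustration, assuming a quadratic benefit curve b(q) = q(2 - q) with q_m = 1 (concave, and flat at q_m); that functional form and all function names are hypothetical choices made for the example, not taken from the report.

```python
def benefit(q):
    """Net benefit excluding health costs, b(q).

    Assumed form: b(q) = q * (2 - q), concave with b'(1) = 0, so q_m = 1
    and b(q_m) = 1 (one 'unit' of benefit).
    """
    return q * (2.0 - q)


def optimal_production(t):
    """Production q in [0, 1] that maximizes b(q) - t*q.

    The interior optimum solves b'(q) = 2 - 2q = t; values outside [0, 1]
    are clipped (q = 1 is no regulation, q = 0 is a ban).
    """
    return min(1.0, max(0.0, 1.0 - t / 2.0))


def net_social_benefit(q, t):
    """Benefit minus health cost c(q) = t*q at production q."""
    return benefit(q) - t * q


def error_cost(t_hat, t_true):
    """R(t_hat, t_true): best attainable net social benefit minus the net
    social benefit obtained when regulation is based on t_hat."""
    q_best = optimal_production(t_true)
    q_used = optimal_production(t_hat)
    return net_social_benefit(q_best, t_true) - net_social_benefit(q_used, t_true)


if __name__ == "__main__":
    print("correct classification:", error_cost(0.5, 0.5))
    print("false-negative (t_hat=0, t=4):", error_cost(0.0, 4.0))
    print("false-positive (t_hat=4, t=0):", error_cost(4.0, 0.0))
```

In this sketch the false-positive cost can never exceed b(q_m), because the worst outcome is an unnecessary ban, whereas the false-negative cost grows with the true toxicity. That mirrors the asymmetry discussed in the text, although the specific numbers depend entirely on the assumed curve.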
It is easy to check that when t̂ = t, R(t̂, t) = 0. Also, R(t̂, t) is nonnegative.

As measured by TD50, carcinogenic potency of chemicals may range over 7 orders of magnitude. Three categories of carcinogenic potency are defined: low (noncarcinogenic), with a TD50 above 10^8 µg/kg per day; medium, with a TD50 between 10^8 and 10^4 µg/kg per day; and high, with a TD50 below 10^4 µg/kg per day. A carcinogen of medium potency, such as chloroform, has an associated health cost as represented by line c_2(q) in Figure E-4. Chloroform is subject to some control, so there is some reduction in use with special care in handling, which lowers the amount of exposure for a given level of production. Thus, Figure E-4 shows a regulated production, q_r, that is about half the market-determined production, q_m, because at q_r the largest vertical distance occurs between b(q) and c_2(q). The health cost as a function of exposure for a highly potent carcinogen is shown by c_3(q), which has a slope 1,000 times that of c_2(q). The health cost for a noncarcinogen, c_1(q), is virtually the horizontal axis; its slope is 0.0001 that of c_2(q).
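The three potency categories and their relative health-cost slopes can be captured in a small lookup. This is a sketch only: the function and constant names are hypothetical, and the treatment of values exactly at the 10^4 and 10^8 boundaries (assigned to "medium" here) is an assumption, since the text does not say which category includes them.

```python
# Category boundaries from the text, in micrograms per kg body weight per day.
LOW_POTENCY_THRESHOLD = 1e8   # TD50 above this: low (noncarcinogenic)
HIGH_POTENCY_THRESHOLD = 1e4  # TD50 below this: high

# Health-cost slopes relative to the medium category, c_2(q), as stated
# in the text: c_1 is 0.0001 times c_2, and c_3 is 1,000 times c_2.
RELATIVE_SLOPE = {"low": 1e-4, "medium": 1.0, "high": 1e3}


def potency_category(td50_ug_per_kg_day):
    """Classify carcinogenic potency from a TD50 value.

    A lower TD50 means a more potent carcinogen. Boundary values are
    placed in 'medium' by assumption.
    """
    if td50_ug_per_kg_day <= 0:
        raise ValueError("TD50 must be positive")
    if td50_ug_per_kg_day > LOW_POTENCY_THRESHOLD:
        return "low"
    if td50_ug_per_kg_day < HIGH_POTENCY_THRESHOLD:
        return "high"
    return "medium"


if __name__ == "__main__":
    for td50 in (5e9, 2e6, 3e2):
        category = potency_category(td50)
        print(td50, category, RELATIVE_SLOPE[category])
```

The ratio between the largest and smallest slopes is 10^7, which matches the 7 orders of magnitude over which potency is said to range.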

The structure of relative health costs shown above will be used to consider various costs of regulatory mistakes, R(t̂, t). First, consider the case of a chemical classified as of medium potency (t̂_2) when it is actually nontoxic (t_1). The error cost is the benefit that society could have achieved minus the benefit actually achieved, assuming that society acts on its best available information, t̂_2. As shown in Figure E-4, society, acting on the true information, would not regulate and hence would obtain benefit b(q_m). However, acting on the best available information, society regulates to a production of q_r, with benefit b(q_r). The cost of this regulatory false-positive is

b(q_m) - b(q_r).

Now consider the case of a chemical classified as nontoxic (t̂_1) when it is actually of medium potency (t_2). With perfect information, society would regulate to a production of q_r, with benefit b(q_r) - c_2(q_r). With available information, society does not regulate, with benefit b(q_m) and health cost c_2(q_m). The cost of this regulatory false-negative is

b(q_r) - c_2(q_r) - [b(q_m) - c_2(q_m)].

In the case of a chemical classified as nontoxic (t̂_1) when it is actually highly potent (t_3), with correct information and optimal regulation society would ban the chemical entirely (see slopes in Figure E-4), with zero benefit. With actual information, society does not control the chemical at all, with benefit b(q_m) - c_3(q_m), where c_3(q_m) is the health cost with no control for a highly potent carcinogenic chemical. Note that the net benefit is negative and large, because c_3(q_m) is much larger than b(q_m). The cost of this regulatory false-negative is

[0 - b(q_m)] - [0 - c_3(q_m)] = c_3(q_m) - b(q_m).

Suppose we classify a chemical as highly potent (t̂_3) when it is actually nontoxic (t_1). Acting on the erroneous information, we ban the chemical, with zero benefit. With correct information, there would be no regulation.
The cost of this false-positive is

b(q_m) - 0 = b(q_m).

If we classify a chemical as highly potent (t̂_3) when it is actually of medium potency (t_2), the cost of the false-positive is

b(q_r) - c_2(q_r) - 0 = b(q_r) - c_2(q_r).

The above cases are shown together in Table E-1. When the toxicity of a chemical is classified correctly (t̂_i = t_i), there is no regulatory mistake; therefore, the regulatory error, R(t̂, t), is zero.

Returning to Figure E-4, we can attempt to assess some relative magnitudes for the costs of regulatory mistakes. The benefit, b(q_m), may be considered as a unit. Assume that this unit is about $1 million.

FIGURE E-4 Health costs, c(q), and benefits, b(q). (Curves shown: b(q), c_1(q), c_2(q), and c_3(q). Horizontal axis: production, q, with q_r and q_m marked.)

For example, if 70,000 chemicals are associated with about $140 billion of GNP, an "average" chemical has about $2 million in production volume. However, sales volume is a very rough indicator of consumer plus producer surplus, which is the definition of benefits used for b(q_m). For "typical" supply and demand, most of the benefits to society are attained at half the market-determined production volume. Thus, if c_2(q_m) is about 10% of b(q_m), we might expect q_r to be roughly half q_m, as is shown in Figure E-4. If the slope of c_1(q) is 0.0001 the slope of c_2(q), it will be virtually zero, or the horizontal axis in Figure E-4. Thus, with low toxicity, the best regulation is virtually none. If c_3(q) has a slope 1,000 times that of c_2(q), c_3(q_m) will be about 100 times b(q_m). Because c_3(q) is so steep, for a highly carcinogenic chemical (like dioxin) the appropriate control is a ban, as shown in Figure E-4. Another action is to reduce exposure greatly, which lowers the slope of c_3(q). This will be considered later. The discussion here considers only the simple case in which assumed exposure and production volume are proportional. With this assumption, c_3(q_m) is about 100b(q_m). Applying these values to Table E-1 provides the results shown in Table E-2.

We have assumed that regulatory actions are based on best current classifications of chemicals. Clearly, this is a simplification. Some chemicals currently classified as medium carcinogens are receiving little regulatory attention (instead of being controlled to about half their unregulated exposure). And some chemicals currently classified as highly toxic (such as dioxin) also are not being completely regulated. Looking at Table E-2, we might conclude that, if a chemical is classified as nontoxic, it is highly unlikely that there will be regulatory action or much scientific inquiry into its inherent toxicity.
Thus, the false-negative costs are "real" and likely to be felt for some time, until the chemical is reclassified as toxic. (The upper right portion of Table E-2 includes the false-negatives; the lower left portion includes the false-positives.)

If a chemical is classified as either of medium or high potency, it is likely to receive some further attention. The likely action, if it is classified as of high potency, is further testing, as long as the chemical is not already well tested. Testing costs could run to about $1 million, which is not far from the 1 unit already assigned to the cost of regulatory error if toxicity of a chemical is estimated as high when it is actually a low-potency carcinogen or a noncarcinogen. It is estimated that the comprehensive testing cost for a chemical classified as high is not much different from b(q_m). If that is true, comprehensive testing for "average" chemicals would cost roughly the same as the revenues obtained from them in the first few years. These comparisons do not address the differences between company profits, market revenue, and the consumers' plus producers' surplus.
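The magnitudes used in this discussion can be checked with a few lines of arithmetic. The sketch below uses only figures stated in the text (70,000 chemicals, $140 billion, c_2(q_m) at 10% of b(q_m), and a slope for c_3 that is 1,000 times that of c_2); the variable names are hypothetical.

```python
# Size of an "average" chemical, from the figures quoted in the text.
GNP_FROM_CHEMICALS = 140e9   # dollars
NUMBER_OF_CHEMICALS = 70_000
average_volume = GNP_FROM_CHEMICALS / NUMBER_OF_CHEMICALS  # dollars per chemical

# Work in units of b(q_m), the benefit at market-determined production.
B_QM = 1.0
C2_QM = 0.10 * B_QM                 # medium potency: health cost is 10% of benefit
HIGH_TO_MEDIUM_SLOPE_RATIO = 1_000  # c_3 is 1,000 times as steep as c_2
C3_QM = HIGH_TO_MEDIUM_SLOPE_RATIO * C2_QM

# Classified nontoxic, actually highly potent: c_3(q_m) - b(q_m).
worst_false_negative = C3_QM - B_QM
# Classified highly potent, actually nontoxic: the ban forgoes b(q_m).
worst_false_positive = B_QM - 0.0

if __name__ == "__main__":
    print(f"average production volume: ${average_volume:,.0f}")
    print(f"c_3(q_m) in units of b(q_m): {C3_QM:g}")
    print(f"largest false-negative cost: {worst_false_negative:g}")
    print(f"largest false-positive cost: {worst_false_positive:g}")
```

These two corner values are the only entries of Table E-2 that follow from the stated ratios alone; the other entries also depend on b(q_r), which the text gives only qualitatively.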

TABLE E-1 Cost of Regulatory Error Due to Disagreement Between Estimated Toxicity, t̂, and True Toxicity, t

Estimated    Cost, by true toxicity
toxicity     t_1                t_2                                        t_3
t̂_1          0                  b(q_r) - c_2(q_r) - [b(q_m) - c_2(q_m)]    c_3(q_m) - b(q_m)
t̂_2          b(q_m) - b(q_r)    0                                          c_3(q_r) - b(q_r)
t̂_3          b(q_m)             b(q_r) - c_2(q_r)                          0

TABLE E-2 Cost of Regulatory Error in Terms of Social Benefit per Average Chemical under Market Conditions, b(q_m)

Estimated    Cost, in units of b(q_m), by true toxicity
toxicity     t_1      t_2      t_3
t̂_1          0        0.05     99
t̂_2          0.10     0        49
t̂_3          1.00     0.92     0

It is probable that further scrutiny would follow a classification of a chemical as a medium carcinogen; the assignment of 49 units for the cost of regulatory error may overstate the cost of this misclassification. With further scrutiny, there is some chance that the chemical in question would be correctly classified, with eventually a lower cost to society. The cost of this scrutiny in further testing might be $100,000-500,000 (less than if the chemical had been classified as high). Thus, if the chemical is truly nontoxic, but is misclassified as medium, the cost of the mistake might be a little more than the 0.1 unit ($100,000) suggested in Table E-2. Thus, we modify the costs of a regulatory error when a chemical is classified as a medium carcinogen, as shown in Table E-3.

Thus far, exposure has been assumed to be known and to be proportional to production. Exposure has been estimated to range over about 6 orders of magnitude. Furthermore, exposure often is fairly independent of production. Taking into account a range of 6 orders of magnitude in exposure and a range of 7 orders of magnitude in toxic potency, a hazard index of the product of toxicity and exposure would range over 13 orders of magnitude. At the same time, the link between regulatory action and hazard classification becomes more complicated and more diffuse.

A more complicated example of misclassification occurs when a chemical is correctly classified as having medium toxicity, but erroneously classified as having high exposure when in fact it has low exposure. Is the chemical likely to be regulated on the basis of this misclassification? Hardly. Before regulation, its exposure is likely to be studied more carefully and (it is hoped) sufficiently well to correct the misclassification of exposure. Because the exposure category is 3 orders of magnitude wide, this discovery is fairly likely.
Thus, the main cost of classifying exposure too highly is the extra cost (ultimately found to be unnecessary) of learning that the chemical belongs in a lower exposure category.

Now consider a chemical correctly classified as high in toxicity, but erroneously classified as having low exposure when it actually has high exposure. Would such a chemical receive no further attention, on the basis of its erroneous exposure classification? Once a chemical is classified as having high toxic potency, it would receive further attention, even if currently classified as having low exposure. With the further attention, the erroneous classification of exposure would have a good chance of being corrected. The main cost of this misclassification is not the cost of research on exposure, which is warranted in this case, but the cost arising from the chance that the error in exposure would not be corrected through investigation.

Similar reasoning applies in the case of a chemical correctly classified as being medium in toxicity, but misclassified as to exposure. Too high a classification of exposure leads to unnecessary research on exposure, whereas too low a classification leads to a chance that research will not correct the error. Chemicals classified as having medium toxicity are assumed to receive further attention.

TABLE E-3 Cost of Regulatory Error in Terms of Social Benefit per Average Chemical under Market Conditions, b(q_m)

Estimated    Cost, in units of b(q_m), by true toxicity
toxicity     t_1      t_2      t_3
t̂_1          0        0.05     99
t̂_2          0.4      0        25
t̂_3          1.0      0.95     0

Further attention is less likely to be devoted to these chemicals than in the previous case, because less attention will probably be devoted to a chemical classified as having medium toxic potency than to one classified as having high toxic potency. Taking these ideas into account, we might arrive at a table of regulatory error, R(t̂, t), as shown in Table E-4. The underlined numbers correspond to medium exposure and correspond to the entries of Table E-3, with some modification.

The obvious thing to note in Table E-4 is the large range in the costs of false-negatives, which appear above and to the right of the diagonal whose values are zeroes. This range is due to the range of 13 orders of magnitude in hazard. The range of regulatory false-positives is about 7 orders of magnitude, which is less than the range in hazard because of the way false-positive and false-negative costs are defined, in comparison with benefit, b(q_m). However, the range is still enormous. Moreover, the cost of the largest false-negative (99,999) is enormous, compared with the cost of the largest false-positive (0.8), and probably much larger than most people would think realistic.

What might lead to overstating these ranges? Consideration of the answers to this question leads to examination of two assumptions: exposure is proportional to production, and a classification of medium toxicity leads to some control. These assumptions may decrease the differences between false-negatives and false-positives.

The "average" chemical is assumed to have market-determined production, q_m, with a net benefit, b(q_m). Consider what happens when q_m also ranges over 6 orders of magnitude and, with it, net benefit. In the simplest case, exposure is proportional to production, and regulation follows directly from classification. The matrix of error costs is similar to Tables E-2 and E-3, where the variation in b(q_m) and q_m, from high to low production, scales everything up and down.
An overestimate of exposure is also a mistake in overestimating b(q_m), and these mistakes cancel out, as long as costs of mistakes are expressed in terms of b(q_m). In another simple case, benefit, b(q_m), and production, q_m, are fixed and exposure still varies over 6 orders of magnitude. This case is shown in Table E-4. A more realistic case has costs of regulatory error intermediate between the values for the extreme cases shown in Table E-4.

The main reason for the high relative cost of the largest false-negative in Table E-2 (99) is the assumption that there should be some regulatory control for medium carcinogens. The control considered was a restriction of production to about 50% of that determined by the market. With the typical concavity of the benefit curve (which follows from typical supply-and-demand schedules) and the range of 3 orders of magnitude in toxic potency, this led to the asymmetric structure of Table E-2.

TABLE E-4 Expected Cost of Regulatory Error, as Modified by Expected Reaction to Misclassification (a)

Estimated         True classification
classification    e1t1    e2t1    e3t1    e1t2    e2t2    e3t2    e1t3    e2t3    e3t3
e1t1              0       0       0.05    0       0.05    99      0.05    99      99,999
e2t1              0       0       0.05    0       0.05    99      0.05    99      99,999
e3t1              0       0       0       0       0.05    99      0.05    99      99,999
e1t2              0.7     0.7     0.7     0       0.4     40      0.5     40      40,000
e2t2              0.7     0.7     0.7     0.2     0       10      0.5     10      10,000
e3t2              0.7     0.7     0.7     0.2     0.1     0       0.5     5       5,000
e1t3              0.8     0.8     0.8     0.4     0.4     0.2     0       2       200
e2t3              0.8     0.8     0.8     0.4     0.4     0.2     0.1     0       100
e3t3              0.8     0.8     0.8     0.4     0.4     0.2     0.1     0.1     0

(a) e = exposure class; t = toxicity class. Subscripts: 1 = low or non-; 2 = medium; 3 = high. Entries in the rows and columns for medium exposure (e2), underlined in the original, correspond to entries of Table E-3, with some modification.

But there is another, probably more realistic, scenario in which chemicals classified as of medium toxicity may be subjected to some control. For the same production, more careful handling and use might halve exposure at a management cost that is low relative to the benefit. To have some control for medium toxicity, it is not necessary to posit a c_2(q) for which c_2(q_m) is about 10% of b(q_m) (see Figure E-4). Suppose c_2(q_m) is only 1% of b(q_m), instead of 10%. This assumption implies very little restriction in production, but it could imply large reductions in exposure per unit of production by restricting use and adopting stringent handling requirements. With this change, we can revise Table E-2 to Table E-5. Again, consider the cost of various cases of misclassification:

· R(t̂_1, t_3). Because high toxicity is 1,000 times medium toxicity, c_3(q_m) is only 10 times b(q_m). Classifying a chemical as nontoxic when it has a high toxicity leads to a false-negative regulatory cost of 10b(q_m) - b(q_m) = 9b(q_m) (upper right corner of Table E-5, in terms of benefit from production set by market conditions).

· R(t̂_1, t_2). If a chemical is classified as nontoxic when it is actually medium, it would have been possible to halve the exposure at a small cost (say, 0.1%) relative to b(q_m). The cost of regulatory error is [b(q_m) - c_2(q_r) - 0.001b(q_m)] - [b(q_m) - c_2(q_m)], where c_2(q_r) here denotes the health cost with halved exposure. Because c_2(q_r) is half c_2(q_m), the cost of this false-negative is 0.004 unit of b(q_m).

· R(t̂_2, t_3). If society acts directly on a chemical classified as medium when it is actually high, the benefit is b(q_m) - c_3(q_r) - 0.001b(q_m), or b(q_m) - 5b(q_m) - 0.001b(q_m). With correct information, the chemical would be banned, with zero benefit. So the cost of this false-negative is 0 - (-3.999) = 3.999.

· R(t̂_2, t_1). If a chemical is classified as medium when it is actually nontoxic, the benefit is b(q_m) - 0.001b(q_m); it could have been b(q_m). The cost of this false-positive is 0.001.

· R(t̂_3, t_2). If a chemical is classified as of high toxicity when it is actually medium, it is banned, with zero benefit, whereas we could have halved the exposure, at production q_m, with a benefit of b(q_m) - c_2(q_r) - 0.001b(q_m) = (1 - 0.005 - 0.001)b(q_m) = 0.994b(q_m). The cost of this false-positive is 0.994b(q_m).

· R(t̂_3, t_1). If a chemical is classified as highly toxic when it is actually nontoxic, it is banned, with zero benefit, whereas we could have had the benefit, b(q_m), with no health cost. Therefore, the cost of this false-positive is b(q_m) - 0 = b(q_m).
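The six cases above can be reproduced with a short calculation. The sketch below works in units of b(q_m) and assumes, following the text, that a chemical classified as nontoxic is left alone, one classified as medium has its exposure halved at a handling cost of 0.1% of b(q_m), and one classified as high is banned. The names used (`ACTION_FOR_CLASS`, `net_benefit`, `error_cost`) are hypothetical. Evaluated directly, the case of a chemical classified as medium when it is actually high comes to 4.001 rather than the 3.999 quoted in the text; the other five cases match.

```python
B_QM = 1.0                      # benefit at market production (the unit of account)
HANDLING_COST = 0.001 * B_QM    # cost of halving exposure through careful handling
EXPOSURE_FACTOR = 0.5           # exposure remaining after careful handling

# Health cost at full exposure and production q_m, by true toxicity class.
# Medium is 1% of b(q_m); high is 1,000 times medium.
HEALTH_COST_AT_QM = {"t1": 0.0, "t2": 0.01 * B_QM, "t3": 10.0 * B_QM}

# Regulatory action assumed to follow from each classification.
ACTION_FOR_CLASS = {"t1": "none", "t2": "handle", "t3": "ban"}


def net_benefit(action, true_class):
    """Net social benefit of an action, given the true toxicity class."""
    health = HEALTH_COST_AT_QM[true_class]
    if action == "ban":
        return 0.0
    if action == "handle":
        return B_QM - EXPOSURE_FACTOR * health - HANDLING_COST
    return B_QM - health  # no regulation


def error_cost(estimated_class, true_class):
    """Benefit under the action matching the true class, minus the benefit
    under the action actually taken on the estimated class."""
    best = net_benefit(ACTION_FOR_CLASS[true_class], true_class)
    actual = net_benefit(ACTION_FOR_CLASS[estimated_class], true_class)
    return best - actual


if __name__ == "__main__":
    classes = ("t1", "t2", "t3")
    for est in classes:
        row = [round(error_cost(est, true), 3) for true in classes]
        print(est, row)
```

Rows are estimated classes and columns are true classes, so the false-negatives appear in the upper right and the false-positives in the lower left.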

TABLE E-5 Cost of Regulatory Error Due to Misclassification of Toxicity in Terms of Social Benefit under Market Conditions, b(q_m)

Estimated    Cost, in units of b(q_m), by true toxicity
toxicity     t_1       t_2       t_3
t̂_1          0         0.004     9
t̂_2          0.001     0         3.999
t̂_3          1         0.994     0

Table E-5 was calculated on the assumption that exposure (unless separately controlled) is proportional to production and b(q_m) is proportional to q_m. Now we can take into account the idea that exposure and production are not always proportional. Suppose that for high-exposure chemicals the proportionality between exposure and production is twice that for medium-exposure chemicals, and suppose further that for medium-exposure chemicals the proportionality between exposure and production is twice that for low-exposure chemicals. In other words, high-production chemicals tend to cost less per pound, and the assumption is that the market benefit per pound of production volume is half that for medium-production chemicals and one-fourth that for low-production chemicals. This assumption leads to Table E-6. The false-negative section (upper right portion) is revised, but the false-positive portion is unchanged. Most of the false-positive costs are associated with research to develop a chemical before it is regulated, and these costs are limited to 1 unit, the total benefit forgone when the chemical is erroneously banned.

It is difficult to assess regulatory costs resulting from misclassification of a chemical with regard to exposure or toxicity. But, although the above analysis can be refined, some features of the structure of error costs are discernible. It seems reasonable for the costs in the upper right portion to be asymmetrically larger than those in the lower left portion, perhaps as much as 20 times as large. This indicates that the social cost of underregulating a chemical is much greater than that of overregulation.

Exposure classification is clearly important, but it does not play a role entirely symmetric with that of toxicity classification, as the simple concept, "hazard equals exposure times toxicity," might suggest.
The reason for the asymmetry is that exposure tends to be proportional to the benefit of a chemical, whereas toxic potency is an inherent property of a chemical and does not vary with production, benefit, or exposure. An implication of this difference is that, for a given health effect, information on toxic potency is permanent or changes only as science improves, whereas correct information on exposure is more dynamic and changes as production responds to market changes and technologic advances.

TABLE E-6 Regulatory Error Due to Misclassification of Exposure and Toxicity (a)

Estimated         True classification
classification    e1t1    e2t1    e3t1    e1t2      e2t2     e3t2     e1t3           e2t3    e3t3
e1t1              0       0       0       0.002     0.004    0.008    4.5            9       18
e2t1              0       0       0       0.002     0.004    0.008    4.5            9       18
e3t1              0       0       0       0.0015    0.003    0.006    4.5            8       16
e1t2              0.7     0.7     0.7     0         0.01     0.04     [illegible]    4       8
e2t2              0.7     0.7     0.7     0.2       0        0.01     2              4       8
e3t2              0.7     0.7     0.7     0.2       0.1      0        1.8            3       6
e1t3              0.8     0.8     0.8     0.4       0.4      0.2      0              0.01    0.05
e2t3              0.8     0.8     0.8     0.4       0.4      0.2      0.1            0       0.01
e3t3              0.8     0.8     0.8     0.4       0.4      0.2      0.1            0.1     0

(a) e = exposure class; t = toxicity class. Subscripts: 1 = low or non-; 2 = moderate; 3 = high.

APPENDIX F

DIFFERENCES BETWEEN PART 1 AND PART 2

The stated objectives of the Committee on Sampling Strategies and the Committee on Toxicity Data Elements differed from those of the Committee on Priority Mechanisms. Accordingly, the concepts and practices used by the committees to achieve the two objectives were different:

· The study included an examination of the extent of toxicity testing of chemicals to which humans are exposed and methods of selecting chemicals for testing. The committees were thus assigned different aspects of the same problem: evaluating toxicity testing for chemicals of concern to NTP. The Committees on Toxicity Data Elements and Sampling Strategies devised a method to assess the current state of toxicity information used to determine health hazard and to estimate additional needs for toxicity testing. The Committee on Priority Mechanisms developed a design approach for priority-setting systems and applied that approach to a demonstration system for the select universe defined by the Committee on Toxicity Data Elements.

· The Committee on Toxicity Data Elements identified a list of test types that served as a basis for examining the adequacy of past toxicity testing and estimating current testing needs. The committees acknowledge that under a variety of conditions it would not be necessary for all such tests to be done. For example, although it is useful to have information on chronic toxicity for a specific substance, the presence of positive data from a well-conducted subchronic study might obviate the development of any further information on chronic exposure. A need would still remain to establish a mechanism for deciding which tests and which substances should be examined and which ones would be given a higher priority. The Committee on Priority Mechanisms examined this issue.
· The select universe defined by the Committee on Toxicity Data Elements was fixed once the sample was taken so that, after sample analysis, useful estimates for the universe could be made. The universe of substances that the Committee on Priority Mechanisms considered, however, by definition is constantly expanding as more substances with a potential for human exposure are identified.

· Working documents developed and used by the Committee on Toxicity Data Elements were designed to assess the status and quality of toxicity-testing information. The dossier concept adopted by the Committee on Priority Mechanisms is intended to provide an assessment of exposure and toxicity. The approach of the Committee on Sampling Strategies and the Committee on Toxicity Data Elements results in an estimation of toxicity-testing frequency, adequacy, and needs based on a retrospective analysis of existing information. The approach of the Committee on Priority Mechanisms results in a priority-setting framework that could be useful in determining which chemicals to test and which tests would yield the most informative data.

· The select universe of chemicals used in this study contained, by design, substances with a potential for human exposure. A sample was drawn from this select universe by the Committee on Sampling Strategies for use by the Committee on Toxicity Data Elements in its determination of toxicity-testing needs. Although the degree of potential human exposure was used in the determination of testing needs, it was not a determinant in selecting the sample of substances. In contrast, the Committee on Priority Mechanisms suggests procedures that use information on the degree of potential human exposure early in the chemical selection process.

· For each substance in its sample, the Committee on Toxicity Data Elements searched comprehensively for and nonselectively used all information that might assist it in determining the testing needs for that substance. The Committee on Priority Mechanisms was selective in applying information to each of the various stages of its priority-setting system.

· Analysis of information was approached differently in each activity. The purpose of the Committee on Toxicity Data Elements was to determine the type and quality of available data, rather than review existing assessments of toxicity. These determinations were based on the expert judgment of committee members. The Committee on Priority Mechanisms devised an approach to making assessments of public-health concern as part of a system to select chemicals for testing. This approach explicitly provides for estimating the degree of uncertainty in the assessment.

· Finally, the Committee on Toxicity Data Elements and the Committee on Sampling Strategies examined available information to determine whether there is enough to conduct at least partial health-hazard assessments. The Committee on Priority Mechanisms provides a framework for using the information to conduct such assessments.


Prepared at the request of the National Toxicology Program, this landmark report reveals that many chemicals used in pesticides, cosmetics, drugs, food, and commerce have not been sufficiently tested to allow a complete determination of their potential hazards. Given the vast number of chemical substances to which humans are exposed, the authors use a model to show how research priorities for toxicity testing can be set.
