2
Objectives, Principles, and Methods
Advances in biomedical research and clinical innovations have greatly expanded the array of medical interventions available to prevent or manage disease or injury. Keeping track of these advances and the scientific evidence about their potential benefits and harms has become increasingly difficult for busy clinicians and for payers and policymakers who want to cover beneficial care while limiting payments for ineffective or harmful services. One result has been increased demand for more systematic evaluations of the benefits and harms of health services. Another result has been a demand for improvements in the methods for conducting and reporting clinical research. Under the labels of technology assessment and evidence-based medicine, researchers, caregivers, payers, policymakers, and others have been seeking agreement on criteria, procedures, and techniques for evaluating evidence and reaching valid and credible conclusions about what works and what does not work in medical care.
This chapter reviews the committee’s principles for reaching conclusions and its analytic strategy for assessing evidence and estimating coverage costs. The chapters and appendixes on the different services assessed by the committee provide more specific details.
OBJECTIVES AND PRINCIPLES
Given its charge, the committee’s primary objective was to provide analyses that could help Congress make decisions about Medicare coverage for skin cancer screening, medically necessary dental services, and the elimination of the
three-year time limit on the coverage of immunosuppressive drugs for transplant recipients. The committee also intended that its findings and conclusions should be credible to practicing clinicians, patients, and the public. Several principles guided the committee’s work within the limits of existing evidence, time, and resources:
- Findings and conclusions should be consistent with available knowledge; apparent departures from the evidence should be explained.
- Health outcomes meaningful to patients or consumers—not only changes in physiological measures—should be emphasized in assessments. Meaningfulness relates to the kinds of benefits and harms identified, the magnitude of the effect of an intervention on an outcome, and the preferences of individuals about different outcomes.
- The quality, strength, and limits of the evidence for findings and conclusions should be assessed and described. Evidence about effectiveness (results in usual clinical practice) as well as efficacy (results under controlled research conditions) should be considered.
- The role of expert judgment and experience in assessing evidence and making judgments about the effectiveness of services should be identified.
- Key analytic choices—such as the specification of the health care intervention, the identification of target populations, and the selection of data and methods for cost analyses—should be explained.
- The limitations of analytic methods should be described. In this report, for example, a notable limitation is a cost estimation strategy that (consistent with the committee’s charge) focused on costs to the Medicare program rather than costs or cost-effectiveness from a societal perspective.
The committee’s task was not to craft statements that were precise and detailed enough to serve as legislative or regulatory language or clinical practice guidelines. (See Eddy et al. [1992] and IOM [1990a] for discussions of principles and criteria for development of practice guidelines.) While acknowledging their importance, the committee also did not examine the full range of ethical, economic, cultural, political, and other issues relevant to decisions about Medicare coverage policies or other options for achieving health goals.
Criteria and Trade-Offs
For each intervention examined, the committee found it helpful to consider a version of the “evidence pyramid” that Figure 2–1 depicts for a generic health care intervention. In this pyramid, each lower tier represents a condition to be met before the next-higher tier is considered. This generic pyramid has been modified to fit the special characteristics of the interventions examined in the next three chapters.
In brief, someone applying the criteria depicted above must first establish that a health problem exists. Because this report considers a public program, Medicare, the problem should affect Medicare beneficiaries. The next question is whether anything can be done about the problem, that is, whether effective treatment is available. Further, because treatment can be effective but still have significant side effects or harms, the balance of benefits relative to harms must be favorable.
Figure 2–2 modifies the evidence pyramid to illustrate how similar criteria could be applied to coverage decisions. Consistent with the discussion in Chapter 1, Chapters 3, 4, and 5 each discuss why the effectiveness of coverage in achieving desired health goals cannot be assumed.
In practice, the application of the criteria represented in the evidence pyramids involves other trade-offs besides the weighing of benefits against harms. For example, if a health problem affects many people, the benefits of an intervention are great, and the risks of the intervention are minimal, then weaker evidence may be tolerated in assessing options for patients (USPSTF, 1996). In contrast, if the condition is uncommon, the health risks of the intervention are significant, and the benefits are modest, then stronger evidence is usually required before an intervention is recommended. Some argue that preventive services should face stricter scrutiny than treatment services because rather than responding to sick people who need medical care, they invite healthy people to receive care.
Furthermore, even for an intervention that meets all the criteria in Figures 2–1 and 2–2, the extent of the benefits relative to the cost to Medicare, a health plan, or society generally would still have to be considered. In addition, the decision to implement an intervention would need to take into account various practical and cultural issues such as whether groups most at risk are likely to seek or be otherwise identified for care.
As noted throughout this report, the committee developed explicit estimates only of costs to Medicare, not costs to patients, families, or others. It did not
generate formal cost-effectiveness analyses for each of the interventions considered. For example, the analysis of eliminating the three-year limit on Medicare coverage of immunosuppressive drugs for transplant patients does not compare the estimated cost per life year gained from eliminating the limit with a similar estimate for extending coverage to outpatient antihypertensive drugs.
ANALYTIC STRATEGY
With the assistance of its consultants, the committee employed an analytic strategy that included several steps: (1) defining the intervention, population, and outcomes; (2) identifying and assessing the research literature; (3) linking the evidence to conclusions; (4) estimating costs to Medicare of extending coverage; and (5) considering benefits and costs together. These steps, in general, follow a set of broadly accepted methods for identifying and making use of the best available evidence.1
Defining the Intervention, Population, and Outcomes
Both the topics to be examined by the Institute of Medicine (IOM) and the population of interest (Medicare beneficiaries) were determined by the request from Congress. For each topic, one important step in the committee’s analysis was to define more fully and explicitly the intervention to be assessed. For example, as explained more fully in Chapter 3 and Appendix B, screening is defined as involving only people without symptoms. This definition thus excludes a skin examination conducted by a physician during a visit sought by a patient concerned about recent growth of a mole or other physical change. It likewise excludes a physician’s incidental discovery and further investigation of a suspicious mole during a visit for some other purpose.
In addition to identifying the tests, procedures, treatments, or other elements that characterize the intervention to be assessed, analysts must also specify the target population and the possible outcomes of the intervention. The target population for this report was generally those age 65 or over, who constitute the substantial majority of Medicare beneficiaries. The evidence reviewed was not, however, restricted to this age group. Clinical studies have sometimes excluded older patients, included too few for meaningful analysis, or not reported results by age. In the case of transplant-related interventions, the relevant population also includes a significant proportion of younger people who have qualified for Medicare by virtue of disability or diagnosis of permanent kidney failure. Clinical studies of transplant patients generally do not describe their Medicare status.
As discussed above, the committee was especially interested in health outcomes that would be directly meaningful to patients or consumers, including mortality, morbidity, and health-related quality of life. In identifying such outcomes for assessment, analysts need to consider possible harms as well as benefits. Although some interventions have little potential for harm, others have the potential to do considerable harm. Chapters 3, 4, and 5 consider benefits and harms relevant to the services and conditions being examined.
To reach conclusions specific enough to guide clinicians and policymakers, analysts also have to assess information about additional elements of an intervention—in particular, how frequently a screening test should be used. Evidence on which to base recommendations about frequency is scarce.
The committee and its consultants often found various tabular and graphic tools useful in analyzing the quite different kinds of clinical problems and interventions examined here.2 These tools helped the committee and consultants to
(1) identify missing or ambiguous aspects of the definition of the intervention, target population, outcomes, or costs to Medicare of covering the intervention; (2) clarify underlying assumptions or expectations about the causal pathway linking the intervention and outcomes; (3) identify uncertainties related to different links in the causal pathway that might temper the interpretation of evidence and the formulation of conclusions; (4) guide the literature search for direct and indirect evidence; and (5) understand the assumptions that underlie conflicts in analytic strategies and conclusions. Where tables and graphics are useful in presenting information, explanations, and conclusions, they are included in the report text or the background papers. Had the data available to the committee been more extensive and solid, some of these tools would have been employed further to guide mathematical modeling of the relationships between interventions and outcomes.
Identifying and Assessing the Scientific Literature
Given the definition of the intervention (and sometimes its redefinition in light of additional information and discussion), the next step was to identify available evidence about its effectiveness. The literature search strategies (including search terms, criteria for inclusion or exclusion in the analysis, databases consulted) are described in more detail in Appendix B for skin cancer screening and Appendix C for dental services. (Appendix D on immunosuppressive therapy for transplant patients was intended more as an overview than as a full and systematic evaluation of the literature.)
For the literature that met the criteria for further assessment, the next questions concerned the quality, relevance, and consistency of the evidence. Some studies employ stronger research designs that allow more confidence in their findings than studies using weaker designs. Ideally, analysts would locate evidence directly relating the intervention to the outcomes of interest, for example, multiyear, properly randomized, controlled trials that followed people over age 65 who had been screened or not screened for skin cancer and then reported consistent findings. Often, however, analysts must rely on chains of indirect evidence, for example, one set of studies of the stage of cancer identified during screening versus “usual” care and another set relating the stage of cancer to health outcomes. Analysts also may find that results of different studies are contradictory and cannot be explained by obvious differences in study methods or populations.
In general, the assessment of evidence here follows that of the U.S. Preventive Services Task Force (USPSTF [1996], adapted from the Canadian Task Force on the Periodic Health Examination [CTFPHE, 1979]). Unlike the USPSTF, the committee did not rate the quality of the evidence numerically but,
rather, described the types of evidence available (e.g., multicenter randomized clinical trial, small case-control studies) for each topic examined.3
If multiple studies are available, they will often differ sufficiently in their focus, methods, and results that overall conclusions are not obvious. The technique of meta-analysis is sometimes employed to synthesize the results of such studies, although experts still debate techniques for conducting and interpreting these analyses (Bailar, 1997; Blettner et al., 1999; Lau et al., 1997; Moher and Pham, 1999; Mulrow and Oxman, 1997; Sutton et al., 1998). The evidence identified in the course of this study did not warrant formal meta-analyses. Instead, the background papers included in Appendixes B, C, and D generally present tables describing relevant studies and their results.
Linking Evidence to Conclusions
For some interventions, the evidence will be sufficient in quality, relevance, clarity, and consistency to justify positive or negative conclusions about an intervention, at least under certain circumstances. For many interventions, however, analysts may find little or no direct evidence of efficacy or effectiveness, and useful indirect evidence may also be very limited. Even if analysts identify potentially relevant studies, they may be inconclusive, conflicting, or poorly designed. At this point, an analysis may essentially stop with the conclusion that there is insufficient evidence to justify a positive or negative conclusion about either clinical practice or coverage.4
Alternatively, the assessment process may tap professional expertise and experience to see whether a consensus can be reached about what clinical practice or insurance coverage should be in the absence of adequate evidence. Processes for reaching these kinds of consensus-based conclusions range from informal and implicit to formal and explicit (e.g., see IOM, 1985, 1990b,c, 1995b).
As methods for systematically reviewing and reporting research have developed, so have ways of describing the strength of conclusions or recommendations about an intervention. One approach used by the USPSTF (also adapted from the Canadian Task Force) takes into account the quality of the evidence, the direction and importance of reported effects (both benefits and harms), and the burden of disease associated with the condition in question. Again, the committee did not assign explicit ratings to its conclusions but rather described the strength or sufficiency of the evidence to support conclusions about the services it investigated.
Another way of summarizing the strength of a recommendation has been proposed for use by those developing clinical practice policies or guidelines (Eddy et al., 1992). It relies partly on the strength of the evidence base and partly on the degree of understanding and agreement about the outcomes associated with an intervention. This approach reserves the term standard for statements for which the health and economic consequences are reasonably well understood and people are virtually unanimous about the desirability or undesirability of the intervention. A guideline is a statement for which outcomes are reasonably well understood and are preferred (or not preferred) by a solid but not unanimous majority. If outcomes are not known or if preferences are unknown, indifferent, or split, then an intervention may be described as an option without being recommended. Although this scheme does not directly apply to coverage policies, it is nonetheless a useful way to think about the strength of the case for coverage changes.
For the interventions and outcomes examined here, the committee found little or no systematic evidence about either individual or societal preferences for different outcomes. As a result, the committee had to rely on its own experience and expertise in suggesting how people might value different outcomes for themselves or others. For example, the committee judged that the scarring produced by most biopsies prompted by false positive results on skin cancer screening examinations was not likely to be viewed by most people as an important risk of screening, whereas the disfigurement that might result from late diagnosis and surgical treatment of squamous cell carcinoma was likely to be viewed as important. As explained in Chapter 3, the evidence did not warrant further steps such as efforts to assign utilities or numerical weights for the value of different outcomes.
Estimating Costs to Medicare of Extending Coverage
A next analytic step was to estimate the costs to Medicare of covering the interventions analyzed in this report. At the outset, the committee decided that it
would present cost estimates for each intervention even if analysis suggested that the evidence did not support the extension of coverage. The rationale is, first, that the charge to the committee called for estimates of costs and, second, that the estimates might be useful if Congress continued to consider extending coverage despite the weakness of the evidence base for coverage.
As explained in Appendix E, the method for estimating Medicare costs generally followed the generic approach of the Congressional Budget Office (CBO), as inferred from past CBO cost estimates and from discussions with CBO staff. This decision reflected the committee’s wish to provide Congress with estimates based on familiar procedures. Unlike cost-effectiveness analyses intended to inform broad public policy decisions, the CBO approach does not take a societal perspective, nor does it recognize costs to beneficiaries, families, or others affected by coverage policies. Other differences are that the estimates do not discount future costs to present value,5 and they consider future benefits only in the form of any direct cost offsets (e.g., avoided hospitalizations but not avoided absences from work) projected to result from covering a service.
Although specific procedures, assumptions, and data sources vary for each service examined as explained in later chapters and Appendix E, the basics of the committee’s approach to estimating costs to Medicare are as follows. The estimates:
- cover the five-year period from 2000 to 2004;
- apply assumptions about the numbers of beneficiaries experiencing the intervention (including initial and referral visits), complications, and other relevant events based on the epidemiological and other literature and guidance from the committee and consultants;
- specify the type and number of physician visits or other services and procedures (e.g., biopsies) that constitute the intervention based on Health Care Financing Administration (HCFA) data, research literature, and guidance from the committee and consultants;
- adjust future costs for inflation but do not discount them to present value;
- subtract the amounts beneficiaries would pay in coinsurance (generally 20 percent of the Medicare-approved payment); and
- subtract the proportion of the total cost increase that would be transferred to beneficiaries through higher Part B premiums, which are set at 25 percent of Part B spending for elderly Medicare beneficiaries and which flow to the Part B Trust Fund.
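The arithmetic implied by these steps can be sketched in a few lines of code. This is an illustrative sketch only: the 20 percent coinsurance rate and 25 percent premium offset come from the list above, while the dollar figures and inflation rate below are hypothetical placeholders, not committee estimates.

```python
# Sketch of the committee's net-cost arithmetic: Medicare pays the
# approved amount less 20% beneficiary coinsurance, and 25% of the
# remaining Part B cost increase flows back to the Part B Trust Fund
# through higher beneficiary premiums. Future costs are inflated but,
# unlike in present-value analyses, not discounted.

COINSURANCE_RATE = 0.20  # beneficiary share of the Medicare-approved payment
PREMIUM_OFFSET = 0.25    # share of Part B spending recovered via premiums

def net_medicare_cost(gross_costs_by_year, inflation_rate):
    """Return total net cost to Medicare over the projection period.

    gross_costs_by_year: approved payments in base-year dollars, one per year.
    inflation_rate: annual rate used to inflate (not discount) future costs.
    """
    total = 0.0
    for year, gross in enumerate(gross_costs_by_year):
        inflated = gross * (1 + inflation_rate) ** year
        after_coinsurance = inflated * (1 - COINSURANCE_RATE)
        after_premiums = after_coinsurance * (1 - PREMIUM_OFFSET)
        total += after_premiums
    return total

# Hypothetical example: $100 million per year (base-year dollars) for a
# five-year period, with 3% annual inflation.
gross = [100e6] * 5
print(round(net_medicare_cost(gross, 0.03) / 1e6, 1))  # net cost in $ millions
```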
The committee’s estimates of Medicare costs are based on a series of assumptions, some of which have supporting evidence or data but others of which are best guesses based on committee judgment in the absence of such information. For each condition or service, the estimates are intended to suggest the order of magnitude of the costs to Medicare of extending coverage, but they could be considerably higher or lower than what Medicare might actually spend were coverage policies changed. The tables in Appendix E allow readers to vary some of the committee’s assumptions and calculate alternative estimates.
Both Chapter 1 and Appendix E note that the rules now governing Congress generally require that decisions to increase federal government spending in one area be offset with reduced spending in other areas or increases in tax or other revenues. The committee did not explicitly factor these budget rules into its conclusions. Nonetheless, it was aware that, for example, higher net spending for skin cancer screening or dental services would probably have to be matched by increased taxes or by spending reductions elsewhere.
Coverage determinations by HCFA do not entail such explicit “neutrality” criteria. Thus, the services examined in this report—which require decisions by Congress—face a higher hurdle to achieve coverage than do services that fit within already established coverage categories.
The committee was not asked to estimate costs to the federal-state Medicaid program that might be added or reduced if Medicare extended coverage to the services examined in this report. For example, if the three-year limit on Medicare coverage of immunosuppressive drugs were eliminated, federal and state Medicaid costs should decrease because that program would be spending less for these drugs for beneficiaries who were eligible for both Medicare and Medicaid. In this case, the net cost to the federal budget of extending coverage would be less than the cost to Medicare.
Considering Health Outcomes and Costs Together
The possible combinations of overall health and cost outcomes can be set out in simplified terms as shown in Table 2–1. In this table, the rows labeled “better,” “same,” and “worse” refer to the overall health outcome of using the intervention compared to not using it. Similarly, the columns labeled “lower,” “same,” and “higher” describe the net cost to Medicare of covering the intervention relative to the cost of not covering it (i.e., the status quo). The pluses in the table’s cells indicate support for a positive decision about an intervention, the minuses indicate support for a negative decision, and the question marks indicate more mixed situations.

TABLE 2–1 Expanding Coverage to a New Intervention: Possible Outcomes and Directions for Decisionmakers

                                      Cost to Medicare for the Intervention
Health Outcome of the Intervention    Lower    Same    Higher
Better                                ++       +       ?
Same                                  +        ?       −
Worse                                 ?        −       −−

SOURCE: Adapted from Pauker and Col, 1999.
Thus, the combination of better outcomes and lower costs (upper left corner of Table 2–1) points toward a positive coverage decision whereas the combination of worse outcomes and higher costs (lower right corner) points to a negative decision. The diagonal row of question marks indicates the less clear-cut decision situations, for example, the not uncommon circumstance that an intervention produces better results but at a higher cost. Cost pressures can focus attention on options that might produce worse outcomes but reduce Medicare costs.
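The grid in Table 2–1 amounts to a simple lookup from two judgments (health outcome, net cost to Medicare) to a directional signal. A minimal sketch, using ASCII hyphens in place of the table's minus signs:

```python
# Decision grid from Table 2-1: health outcome (better/same/worse)
# crossed with net Medicare cost (lower/same/higher).
# "++" and "+" favor coverage, "-" and "--" weigh against it,
# and "?" marks the mixed, less clear-cut situations.

DECISION_GRID = {
    ("better", "lower"): "++", ("better", "same"): "+",  ("better", "higher"): "?",
    ("same",   "lower"): "+",  ("same",   "same"): "?",  ("same",   "higher"): "-",
    ("worse",  "lower"): "?",  ("worse",  "same"): "-",  ("worse",  "higher"): "--",
}

def coverage_signal(health_outcome: str, cost_to_medicare: str) -> str:
    """Look up the table's directional signal for a coverage decision."""
    return DECISION_GRID[(health_outcome.lower(), cost_to_medicare.lower())]

print(coverage_signal("better", "lower"))   # strongest case for coverage: ++
print(coverage_signal("better", "higher"))  # the common trade-off case: ?
```

The question marks along the diagonal are where the hard work lies; the code only restates the table, not how such trade-offs should be resolved.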
A more formal and comprehensive way of considering outcomes and costs together is cost-effectiveness analysis. Cost-effectiveness analyses relate the estimated costs of an intervention to its expected outcomes.6 They also allow comparisons of different interventions to be made in similar units. For example, the cost per year of life gained from implementing an effective screening test can be compared to the results for other screening tests or other interventions already covered by Medicare. Although such comparisons provide some context for assessing the projected consequences of different interventions, they do not in themselves indicate what is a “reasonable” cost-effectiveness ratio. Some have suggested the use of $100,000 per life year gained as a dividing point (Laupacis et al., 1992), whereas others have cautioned against using such a criterion (Siegel et al., 1996).
Increasingly, cost-effectiveness analyses incorporate measures that reflect an intervention’s effect on both the quantity of life achieved (reduced mortality) and the health-related quality of that life (e.g., see Gold et al., 1996; IOM, 1998). Such measures include quality-adjusted life years (QALYs), disability-adjusted life years (DALYs), and years of healthy life (YHLs).
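As a purely hypothetical illustration of how such a ratio is computed (the committee did not perform these analyses, and the figures below are invented), an incremental cost-effectiveness ratio divides the added cost of one intervention over another by the added benefit:

```python
# Hypothetical incremental cost-effectiveness ratio (ICER): extra
# dollars spent per extra quality-adjusted life year (QALY) gained.
# All numbers below are invented for illustration.

def icer(cost_new, cost_old, qalys_new, qalys_old):
    """Incremental cost-effectiveness ratio: extra dollars per extra QALY."""
    return (cost_new - cost_old) / (qalys_new - qalys_old)

# Suppose a new intervention costs $45,000 versus $20,000 for usual care
# and yields 5.0 versus 4.5 QALYs per patient.
ratio = icer(45_000, 20_000, 5.0, 4.5)
print(f"${ratio:,.0f} per QALY gained")  # $50,000 per QALY gained

# Comparison against the $100,000-per-life-year dividing point some have
# suggested (Laupacis et al., 1992), a benchmark others caution against.
print(ratio <= 100_000)  # True
```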
The additional dimensions captured in assessments of health-related quality of life could be particularly useful in evaluating services for the Medicare population in which chronic disease is so prevalent. For example, two interventions might be equally effective in extending survival, but they might differ in the extent to which the extra years of life were lived with or without pain or serious limitations in physical or mental functioning. A number of methods and tools have been developed to assess health-related quality of life including methods for assessing people’s preferences for different health states (e.g., a year of life lived in severe pain versus nine months lived pain free).
Although formal cost-effectiveness analyses are useful in trying to understand the “value for money” of particular interventions, the committee’s charge called only for estimates of the costs to Medicare of extending coverage. Even if the committee had gone further, it would have encountered difficulties given the limited evidence of effectiveness and the lack of quality-of-life or patient preference data for the interventions examined. Studies have compared health-related quality of life for patients on renal dialysis with posttransplant patients taking immunosuppressive drugs, but the committee did not find comparable data on the other conditions considered here. Nonetheless, the approach used here—estimating only the costs to Medicare—provides an incomplete picture of the value for money of covering a service.