The heart of the committee’s assignment, as described in the Statement of Task (see Box 1-1), was to “develop a taxonomy of [clinical prevention] evidence gaps.” This chapter describes the taxonomy that the committee developed and offers examples of how that taxonomy can be applied to describe evidence gaps in clinical prevention research. In particular, it describes the taxonomy as it relates to recommendations and I statements issued by the U.S. Preventive Services Task Force (USPSTF) in its efforts to characterize what is known about the effectiveness of various preventive services in particular populations.
The need for a taxonomy of research gaps in preventive medicine is best understood in terms of the current situation facing the USPSTF. As described in Chapter 2, the 16-member USPSTF reviews the scientific evidence related to various clinical preventive services and then draws evidence-based conclusions about the benefits of these services. The USPSTF offers recommendations for screenings, preventive medications, and behavioral counseling, each accompanied by a letter grade indicating the magnitude and degree of certainty of the net benefit of the service based on the evidence. In the case of an A recommendation, for instance, the USPSTF recommends the preventive service and has concluded that “there is high certainty that the net benefit is substantial” (USPSTF, 2018c). A B recommendation also indicates that the USPSTF recommends the service, but that there is either “high certainty that the net benefit is moderate or there is moderate certainty that the net benefit is moderate to substantial.” A C recommendation indicates that the USPSTF recommends that clinicians selectively offer a service based on their “professional judgment and patient preferences” as “there is at least moderate certainty the net benefit is small.” A D recommendation indicates that the USPSTF discourages a particular preventive service, as there is “moderate to high certainty that the service has no net benefit or that the harms outweigh the benefits.”
Some services may also receive an I statement, which is a non-recommendation, and indicates that the USPSTF could not find sufficient evidence to make a recommendation either for or against the service. I statements indicate that the USPSTF has concluded that “the current evidence [regarding the service] is insufficient to assess the balance of benefits and harms of the service. Evidence is lacking, of poor quality, or conflicting, and the balance of benefits and harms cannot be determined” (USPSTF, 2018c).
As of September 14, 2021, there are 86 published topics that have at least one letter grade recommendation or I statement. Out of 131 individual letter grade recommendations or I statements issued across the 86 topics, 53 are I statements (encompassing 46 topics) (see Appendix A). Some I statements cover an entire preventive topic, such as screening the general population for celiac disease, where the USPSTF has concluded that there is not enough evidence to make a recommendation one way or the other (USPSTF, 2017). In other cases, the I statement relates to a particular population, as with screening for chlamydia and gonorrhea: the USPSTF issued a B grade recommendation for screening women but an I statement regarding screening men (USPSTF, 2021b).
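To put these counts in proportion, the short calculation below derives the share of I statements among all issued statements and the share of topics that carry at least one I statement. It is only a worked arithmetic sketch using the counts quoted above; the variable names are illustrative and do not come from the report.

```python
# Counts quoted in the text (as of September 14, 2021).
total_topics = 86
total_statements = 131
i_statements = 53
topics_with_i_statement = 46

# Share of all letter grades and I statements that are I statements.
share_of_statements = i_statements / total_statements
# Share of published topics that include at least one I statement.
share_of_topics = topics_with_i_statement / total_topics

print(f"I statements among all statements: {share_of_statements:.1%}")
print(f"Topics with at least one I statement: {share_of_topics:.1%}")
```

On these figures, roughly two in five statements are I statements, and slightly more than half of the published topics include at least one.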
The U.S. Department of Health and Human Services originally developed the USPSTF to issue clinical practice guidelines based on the existing medical literature, with little to no focus on assessing evidence gaps. In the decades since, Agency for Healthcare Research and Quality (AHRQ) staff and the USPSTF have increasingly prioritized identification of evidence gaps in each recommendation statement with the goal of catalyzing clinical prevention research on those topics. Additionally, the Office of Disease Prevention (ODP) liaises with National Institutes of Health (NIH) institutes and centers to make evidence gaps identified by the USPSTF known across NIH. However, the potential synergy between the USPSTF recommendations and the clinical prevention research community remains unrealized: recommendation updates are published about every 5 years, but a recent paper by Klabunde et al. (2021) observes that, on average, 8.4 years elapse between an I statement and a definitive recommendation being published. This is the state of affairs that the USPSTF, AHRQ, and NIH would like to change, and it is the issue that this report is intended to address.
Closing the evidence gaps in I statements would allow the USPSTF to give guidance to clinicians in areas where uncertainty now reigns and thus improve health outcomes for many different groups of people. In addition, any topic with a letter grade recommendation may also have evidence gaps, and it would be valuable to identify those gaps as well in order to strengthen the evidence base for clinical preventive services more broadly.
Because the USPSTF neither conducts original research nor funds research conducted by others, it cannot directly address these evidence gaps. However, it can recast the evidence gaps sections of its recommendation statements to improve the likelihood that clinical prevention researchers will fill in the gaps. As NIH wrote in the request for proposal to the National Academies of Sciences, Engineering, and Medicine that resulted in this study, “More rigorous or systematic categorization of evidence gaps through application of a taxonomy may help stakeholders better understand and use the evidence gaps, assess their relative importance, identify innovative methods to help fill the evidence gaps, and accelerate research that addresses evidence gaps.” This is the purpose of the taxonomy described in this chapter.
As noted in that request for proposal, the taxonomy is intended to assist a number of different groups of stakeholders, including the following:
- Organizations that develop preventive services recommendations, such as the USPSTF, to more clearly communicate evidence gaps.
- Funding agencies, including NIH, to understand, assess the relative importance of, and use evidence gaps from preventive services recommendations in order to ultimately support research to fill the gaps.
- Members of the research community to focus their research on high-priority areas that have evidence gaps.
- All stakeholders to bridge the translation of evidence gaps into funding opportunities and other initiatives that ultimately use innovative methods to close evidence gaps related to clinical preventive services.
Ultimately, the main contribution of the taxonomy is to provide a structured approach to describing evidence gaps related to preventive services so that recommendations and I statements can be thorough and consistent in how they list and describe those gaps. The taxonomy serves as a road map, providing a systematic, step-by-step approach to the characterization of evidence gaps to supplement or replace the more subjective and less systematic descriptions that are currently found in USPSTF recommendations and I statements. Standardizing the process of describing evidence gaps should help increase the consistency and thoroughness of the USPSTF’s efforts and make the recommendation statements clearer for researchers who are interested in helping to fill in evidence gaps.
To provide a clear and thorough explanation of the taxonomy the committee developed for use by the USPSTF, this chapter begins with a description of the approach that the committee took in creating the taxonomy, including an overview of taxonomies in general and a discussion of why the committee chose the particular type of taxonomy it did. Next is a section on the broader framework that the committee developed in which the taxonomy is applied. As will be explained in more detail in that section, the committee sees the taxonomy as something that is applied not in isolation but as part of a larger effort to characterize, understand, and prioritize evidence gaps and develop a research agenda for addressing those gaps. Thus the development of a taxonomy is just one step in this larger task.
With this framework having been described, the next section goes into detail about using the taxonomy to characterize evidence gaps, while the following section discusses how those evidence gaps might be prioritized and a research agenda created that will fill in those gaps according to the prioritization. A final section offers illustrations of how the taxonomy might be applied to specific I statements.
The committee developed and honed its taxonomy and accompanying workflow over many months of research and discussion. One of the first steps was to carry out a literature search on taxonomies to develop a clear idea of what a taxonomy is and how it is used. The committee also worked to understand exactly what was being asked of them in the statement of task; this part of the process involved having conversations with the study’s sponsors as well as soliciting input from outside experts. In addition, the committee convened four public meetings to discuss the various issues and problems related to the USPSTF recommendation statements as well as to hear suggestions for the best ways to address these issues.
Taxonomies come in various forms, but they all share certain characteristics. The general purpose of any taxonomy, for instance, is to bring order to a collection of information by organizing and systematizing it with a fixed vocabulary and clearly defined structure. A well-designed taxonomy makes it easier to see the big picture and to think about and work with the mass of information that it has imposed order on. Thus, the committee investigated many types of taxonomies to decide which would be most appropriate for the task of organizing and characterizing evidence gaps concerning clinical preventive services.
A taxonomy refers to “any controlled vocabulary of terms for a subject area domain or a specific purpose. The terms may or may not be arranged in a hierarchy, and they may or may not have even more complex relationships between each other” (Hedden, 2016, p. xxiv). The most well-known modern type of taxonomy is the Linnaean taxonomy used to classify living organisms. It has a tree-like branching structure with kingdoms broken into phyla, which are divided into classes, which are subdivided into orders, which are even further subdivided into families, genera, and species. In such hierarchical taxonomies, each item—other than those at the very top or bottom levels—has a broader “parent” category to which it belongs and contains one or more “child” classifications. These taxonomies are common and are used to organize not only living organisms but also such things as books in a library or pages in a website.
However, not all collections of information can be organized in such a hierarchical taxonomy. Information specialists have devised a variety of other approaches to organizing information, which can be characterized in three broad types of knowledge organization systems:
- Term lists (authority files, glossaries, dictionaries);
- Classifications and categories (subject headings, hierarchical taxonomies, categorization schemes); and
- Relationship lists (thesauri, semantic networks, and ontologies) (Hedden, 2016, p. 2).
Traditionally, many of these knowledge organization systems were not referred to as “taxonomies”—that term was strictly for the hierarchical taxonomy and often specifically for the organization of living organisms—but since the 1990s “taxonomy” has come to be used by information specialists to describe knowledge organization systems in general. Additionally, taxonomy documents are living documents, which should be updated regularly to incorporate new concepts and elements as needed.
In learning about taxonomies and thinking about which would be most appropriate to apply to evidence gaps in preventive services, the committee decided that the traditional hierarchical taxonomy was not an appropriate system for identifying and categorizing evidence gaps in clinical prevention, because a single gap may need to be described with multiple nested terms. Instead, the committee selected a faceted taxonomy, a type of polyhierarchical taxonomy. Because of the structure of this faceted taxonomy, evidence gaps may be localized informally while the USPSTF develops a recommendation statement, and specified later with the nested terms within the facets. The important feature of the taxonomy is that it provides a systematic method and language for identifying and describing the various gaps in evidence relating to clinical preventive services.
Adapting the U.S. Preventive Services Task Force Analytic Framework
In its search for a suitable taxonomy, the committee examined various potential choices. It looked, for instance, at the PICOTS (population, intervention, comparator, outcome, timing, setting) framework but decided that while this is an effective approach to characterizing an individual research study, it was not a particularly suitable framework for characterizing evidence gaps. As it worked through the different possibilities, the committee decided that the taxonomy should be straightforward; researchers with no particular expertise in information sciences should be able to understand, apply, and utilize it to identify evidence gaps.
Ultimately, after considering various approaches, the committee decided that the analytic framework used by the USPSTF to assess the evidence base and issue a recommendation would also serve to guide its analysis of evidence gaps. To supplement the evidence gap categories related to the analytic framework, the committee added two other facets of the taxonomy: “foundational issues” and “dissemination and implementation” (D&I). Partly in recognition of the USPSTF’s commitment to promoting health equity (Davidson et al., 2020, 2021; Doubeni et al., 2021), the committee has made an effort to highlight how the taxonomy and workflow can contribute to the promotion of health equity.
A USPSTF analysis using the analytic framework begins with the specification of a population at risk of a preventable disease or condition, and examines the evidence for links between particular actions (screening, prevention efforts, treatment) and particular outcomes (early detection, decreased morbidity or mortality, adverse effects, etc.). Each line in the figure between an action and an outcome is associated with a question of interest, for example,
- Does screening lead to an early detection of the target condition?
- Does screening lead to adverse effects?
- Do early detection and treatment lead to reduced morbidity or mortality?
- Do early detection and treatment lead to an intermediate outcome that may in turn be related to a decrease in morbidity or mortality?
- Do early detection and treatment lead to adverse effects?
It is important to note that the USPSTF typically relies on two “generic” analytic frameworks, one for screening and one for behavioral interventions (see Figures 4-1 and 4-2), which outline the overall approach to developing key questions about linkages. See Figures 4-3 and 4-4 for more complex frameworks developed by the USPSTF to address specific topics (e.g., preventing dental caries among children younger than age 5 and recommending preexposure prophylaxis [PrEP]). When the USPSTF examines the evidence concerning the effectiveness of a preventive service, it uses these questions to guide its analysis and to frame the reporting of its results.
The committee decided that these main components of the analytic framework (risk assessment, early detection, intermediate outcomes, effectiveness, and harms) could serve as ideal building blocks of the taxonomy, allowing the USPSTF to identify those evidence gaps that are most relevant to its evaluations. However, the committee determined that a taxonomy consisting only of elements from the analytic framework would be incomplete; thus it added other components to fill it out.
Evidence gap categories related to the foundational issues facet are often already recognized by the USPSTF. For example, the USPSTF has issued an I statement for screening for glaucoma. The USPSTF (2013) notes, “The natural history of glaucoma, particularly the role of IOP [intraocular pressure] and its relationship to optic nerve damage, visual field defects, visual impairment, and blindness, is poorly understood.” While other evidence gaps address questions directly related to risk assessment, benefits, harms, intermediate outcomes, and health outcomes, some address more foundational elements of a disease (e.g., its natural history). Many foundational evidence gaps are coupled with analytic framework gaps in the USPSTF recommendations and I statements, and not all foundational evidence gaps will need to be addressed before analytic gaps are addressed. However, foundational gaps are distinct from the analytic framework gaps because they do not directly address linkages between actions and outcomes.
Another category of evidence gaps not assessed by the USPSTF analytic framework involves the D&I of a preventive service once it has been recommended. The USPSTF does not routinely consider gaps related to D&I, but the committee agreed it was crucial to include them as part of a taxonomy of clinical prevention evidence gaps. Knowing how a preventive service affects morbidity and mortality is necessary but insufficient for effective clinical prevention practice: preventive services must also be feasible, scalable, and sustainable for clinicians to implement and for patients to adhere to.
As described in more detail in the following sections, the committee decided on a taxonomy consisting of two sections: (1) three different facets of prevention-related evidence gaps: foundational issues, analytic framework gaps, and D&I gaps; and (2) two facets for use in outlining a research agenda. With this general structure, the committee assessed specific facets by applying them to existing I statements and evaluating how they were able to characterize the evidence gaps described in those statements. This process was carried out on approximately a dozen different I statements, with the taxonomy being fine-tuned according to the lessons learned from each exercise.
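The faceted structure described here can be pictured as a record that carries tags under several facets at once, rather than occupying a single slot in a tree. The following is a minimal sketch, assuming a simple in-memory representation; the class name, method names, and the example gap are hypothetical and are not part of the committee's taxonomy. Only the three facet names are taken from the text.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# The three evidence-gap facets named in the text.
GAP_FACETS = (
    "foundational issues",
    "analytic framework",
    "dissemination and implementation",
)


@dataclass
class EvidenceGap:
    """One evidence gap, tagged independently under each applicable facet."""

    description: str
    # Maps a facet name to the categories chosen within that facet.
    tags: Dict[str, List[str]] = field(default_factory=dict)

    def tag(self, facet: str, category: str) -> None:
        """Attach a category under a facet, rejecting unknown facets."""
        if facet not in GAP_FACETS:
            raise ValueError(f"unknown facet: {facet!r}")
        self.tags.setdefault(facet, []).append(category)

    def facets(self) -> List[str]:
        """Return the facets this gap is tagged under, in taxonomy order."""
        return [f for f in GAP_FACETS if f in self.tags]


# Hypothetical gap, invented for illustration only.
gap = EvidenceGap("Screening tool not validated in older adults")
gap.tag("foundational issues", "psychometric properties")
gap.tag("analytic framework", "risk assessment and health equity considerations")
print(gap.facets())
```

The point of the sketch is that one gap can sit under more than one facet, which a strictly hierarchical taxonomy with a single parent per item would not allow.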
In addition to developing the taxonomy, the committee thought it was also important to offer some direction for how the taxonomy could be used by the USPSTF and other clinical practice guideline developers. As described in the next section, the committee members developed a workflow to outline the identification of evidence gaps and development of a research agenda.
The committee agreed early on that it would be crucial to describe both its taxonomy and the way that taxonomy should be applied to maximize its utility. The committee did this by developing a workflow graphic to visualize the appropriate steps in using the taxonomy to characterize evidence gaps, prioritize those gaps, and describe the type of research necessary to close the gaps (see Figure 4-5 for the overall workflow). This section describes the workflow, which can be viewed as a companion guide on how to use the taxonomy, along with its purpose and how it may be used.
The first step is to review the evidence concerning a particular preventive service in order to develop a recommendation statement. For the USPSTF, this step would follow receipt of the draft evidence review from the Evidence-based Practice Center with which it has contracted for the review. However, the workflow does not assume that its only users will be members of the USPSTF, and any interested party may carry out the review of evidence. In some cases, it will be necessary to gather additional information from stakeholders to outline issues around D&I.
Once the review of evidence has been completed, the next step is to use the taxonomy (as described in the next section) to identify the gaps that exist in the evidence. These gaps will be placed in one of three facets of the taxonomy: foundational, analytic framework, and D&I. Most evidence gaps that would lead to an I statement will be foundational and/or analytic framework gaps; that is, those evidence gaps that have traditionally been considered by the USPSTF. However, the committee recognizes that gaps in D&I evidence are also important to address to improve health outcomes related to clinical preventive services. Thus, although it was not mentioned in the statement of task, the committee chose to make D&I part of its taxonomy.
Once the three types of evidence gaps have been identified and characterized with the taxonomy, the next step in the workflow is to develop a research agenda. The first step in this process is to prioritize the various gaps. This is done using prioritization criteria, as discussed in the next section. The goal is to get a clear idea of which evidence gaps make sense to address first, so that finite resources for prevention research can be put to optimal use.
With a list of prioritized evidence gaps, the final step is to outline the key research studies needed to address the high-priority evidence gaps. The necessary studies are specified according to a set of characteristics that are an extension of the PICOTS framework.
Not every interested party will necessarily carry out every step in the workflow. Some users may choose to focus on the traditional USPSTF concerns and apply only the first two parts of the taxonomy (covering foundational issues and analytic framework issues) and not address D&I. Others, such as funders, might focus on evidence gaps identified by the USPSTF or another guideline-making group and prioritize the gaps differently than the USPSTF might, developing a research agenda that reflects the highest-priority research needs based on their own assessment. The committee envisions that the results of the prioritization step may vary depending on the relative weights different groups or individuals place on the criteria evaluated, which in turn could result in different research needs receiving the highest priority among different stakeholders. For example, some funders may prioritize promoting health equity and addressing health disparities in clinical prevention, thus focusing on research needs involving populations that experience health disparities. Ultimately, the taxonomy and workflow are intended to give stakeholders invested in clinical prevention research a shared language to advance the field.
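The way different weights can reorder the same set of gaps can be illustrated with a simple weighted sum. This is a sketch only: the criteria names, scores, and weights below are hypothetical placeholders, since the prioritization criteria themselves are discussed elsewhere in the chapter, and a weighted sum is just one of several ways such criteria could be combined.

```python
# Hypothetical criteria scores (0-5 scale) for two hypothetical evidence gaps.
gaps = {
    "gap A": {"burden": 5, "equity": 2, "feasibility": 4},
    "gap B": {"burden": 3, "equity": 5, "feasibility": 2},
}


def rank(gaps, weights):
    """Return gap names ordered from highest to lowest weighted score."""
    def score(name):
        return sum(weights[c] * gaps[name][c] for c in weights)
    return sorted(gaps, key=score, reverse=True)


# Two hypothetical stakeholders applying different weights to the same criteria.
equal_weights = {"burden": 1, "equity": 1, "feasibility": 1}
equity_focused = {"burden": 1, "equity": 3, "feasibility": 1}

print(rank(gaps, equal_weights))
print(rank(gaps, equity_focused))
```

With equal weights, gap A scores 11 against 10 for gap B and ranks first; with equity weighted three times as heavily, gap B scores 20 against 15 and moves to the top. The underlying evidence gaps are unchanged, and only the stakeholder's emphasis differs.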
As already described, in developing its taxonomy for classifying evidence gaps the committee chose to modify the analytic framework used by the USPSTF to describe evidence gaps. This section covers the first three facets of the taxonomy: foundational issues, analytic framework gaps, and D&I gaps. The result is an approach that has significant overlap with the USPSTF methodology in assessing linkages between preventive services and health outcomes but expands its considerations of evidence gaps beyond those directly related to the analytic framework.
The taxonomy provides a systematic way of describing evidence gaps with a clear list of categories and a controlled vocabulary for consistency across evidence reviews. Furthermore, by offering a list of the different classes of evidence gaps that may play a role in analyzing a preventive service, the taxonomy and the workflow offer a road map for researchers and funders. The formal listing of categories also makes it easier to compare the evidence gaps among different preventive services and, ultimately, to prioritize the gaps and specify the research needed to address them (see Box 4-1 for a comprehensive view of the five facets of the evidence gaps taxonomy).
While many evidence gaps noted in the USPSTF recommendations refer to linkages in the analytic framework, some refer to foundational issues. Foundational issues include gaps in a basic understanding of disease processes, psychometric properties of a service, and whether any preventive service exists. Accordingly, the committee created a “foundational issues” facet. Not every I statement will have foundational issues to address, but in some cases they will be important as prerequisites or complements to addressing analytic framework questions.
The third facet of the evidence gaps taxonomy is D&I, which covers an aspect of preventive services that the USPSTF traditionally does not address in recommendation statements. However, identifying and understanding evidence gaps related to D&I are important aspects of improving preventive services and should be included in any framework aimed at developing a research agenda for such services.
The existence of some foundational evidence gaps may mean guideline developers are not able to continue assessing a preventive service. Foundational evidence gaps seem likely to affect the magnitude of certainty regarding harms and benefits if they are to affect a recommendation or I statement at all. However, the USPSTF recommendations and I statements included enough reference to the following categories that the committee felt it necessary to identify a separate taxonomy facet. The committee developed the following categories for foundational evidence gaps related to clinical preventive services (see Figure 4-6 for the corresponding workflow section).
Condition Definition and Nomenclature
It is vital that researchers and clinicians working in a particular area agree on the meanings of the terms they use to describe the condition, disease, screening, intervention, results, and so forth. Having a precise definition of the disease or condition under consideration is particularly important, as it helps ensure that study findings can be compared, contrasted, and aggregated. This is not to say that variations in a diagnosis create a gap, but rather that variations in diagnostic criteria can affect the validity of research findings that combine the results of studies using different criteria. Unfortunately, there can be variation in definitions across a field for a variety of reasons. For example, different medical specialties have varying definitions of hypertension, and different researchers have different definitions of myalgic encephalomyelitis/chronic fatigue syndrome. In addition, some diagnoses, like attention deficit/hyperactivity disorder, have changed over time, making it difficult to compare studies. In some cases, it can be difficult to know if two reports are discussing the same condition. In other cases, it may be difficult to conduct evidence reviews when not all terminology is known. Such inconsistencies can limit the conclusions that can be drawn from the literature and must be resolved in order for the USPSTF to comprehensively assess the benefits and harms of a preventive service. This category encompasses gaps in the evidence base regarding nomenclature and the definition of diseases or conditions.
Are there important details about the causes, development, and results of the disease that remain unclear? If so, it can make it difficult to determine the most effective methods of screening, for instance, or to accurately gauge the effects of an intervention. This category includes unresolved issues related to the disease’s pathophysiology, etiology, natural history, and comorbidities.
“Preventive services” refers to screening, counseling, and preventive medication, three strategies of primary and secondary preventive care. There can be foundational issues related to each of these three approaches. It is not always clear how best to screen for a particular disease. For example, the recommendation for colorectal cancer screenings refers to multiple types of screening tests. To be recommended by the USPSTF, a screening test should be consistent and well defined. Foundational evidence gaps regarding a preventive service may include what devices or technology are required to administer it, how frequently it should be administered, or whether it raises any methodological issues (e.g., does this test require a special diet?).
Development of Standards
Standards play a vital role in medical science as an agreed-upon set of rules for how data are collected, procedures are conducted, decisions are made, and so forth. Most research protocols offer detailed standards on how to perform a screening. For example, for occult blood screening of colorectal cancer, research protocols detail dietary restrictions, timing of collection, and so on. Inconsistent research standards between studies can affect the validity of research findings that combine results of studies using different standards, thus creating a gap. In that sense, the need for clear standards is similar to the need for standardized nomenclature and disease definitions—both help provide a solid foundation for everything else. In some cases, what seems to be conflicting evidence regarding a particular preventive service or intervention may actually be an indication of a lack of standards in the performance of the service or in the measurement of the outcome, and it may be necessary to clear up these inconsistencies before a clear conclusion can be drawn about the effectiveness of the service.
“Psychometric properties” refers to reliability and validity of a measurement tool (Asunta et al., 2019). Once a tool is identified and its use has been standardized, the tool itself will also require assessment as to its sensitivity, specificity, predictive value, reliability, and other factors. While there is an argument that these issues belong in the Analytic Framework, the committee feels these are foundational issues that affect test performance, which will be demonstrated (or not) in the Analytic Framework. Without evidence of rigorous evaluation, the USPSTF is unlikely to weigh in on recommending a service. For example, the USPSTF reviewed “risk assessment instruments used to identify children for whom preventive interventions might be indicated and found limited and inconsistent evidence on the validity and reliability of these tools,” which served as a contributing factor to the topic’s I statement (USPSTF, 2018b).
There may be foundational evidence gaps that do not clearly fit into any of the previous categories. The committee envisions that those who manage the taxonomy will update it as appropriate. Because of this, it is probable that subject matter not currently in the taxonomy but that is consistently added would eventually be developed into its own category within its respective facet. This applies to all facets of the taxonomy.
Analytic Framework Gaps
The committee first identified evidence gaps that can be observed in the context of the USPSTF’s generic analytic framework. Most of those gaps are associated with links between an action (i.e., screening, behavioral interventions, preventive medication) and an outcome (e.g., early detection of a condition, adverse effects associated with screening or treatment, reduced morbidity and mortality). These questions include “Does screening lead to early detection?” and “Does screening lead to adverse effects?” Because the analytic framework anchors the USPSTF’s assessment of a preventive service, most evidence gaps identified by the USPSTF are gaps in these actions, outcomes, or the linkages between them. The committee classified these evidence gaps according to the outcomes shown in the USPSTF’s analytic frameworks: Risk Assessment and Health Equity Considerations, Early Detection, Intermediate Outcomes, Effectiveness, and Harms. Those five categories form the first level of the committee’s analytic framework taxonomy, with various subcategories included within each main category (see Figure 4-7 for the corresponding workflow section).
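Because the five first-level categories act as a controlled vocabulary, any label applied to an analytic framework gap should come from that fixed list. The sketch below shows one way such a check might look; the enum, the function name, and the case-insensitive matching rule are assumptions made for illustration, while the five category names are those listed above.

```python
from enum import Enum


class AnalyticFrameworkCategory(Enum):
    """First-level categories of the analytic framework facet."""

    RISK_ASSESSMENT = "Risk Assessment and Health Equity Considerations"
    EARLY_DETECTION = "Early Detection"
    INTERMEDIATE_OUTCOMES = "Intermediate Outcomes"
    EFFECTIVENESS = "Effectiveness"
    HARMS = "Harms"


def classify(label: str) -> AnalyticFrameworkCategory:
    """Map a free-text label to a category, rejecting off-vocabulary terms."""
    normalized = label.strip().lower()
    for category in AnalyticFrameworkCategory:
        if category.value.lower() == normalized:
            return category
    raise ValueError(f"{label!r} is not in the controlled vocabulary")


print(classify("harms").value)
print(classify(" Early Detection ").value)
```

Rejecting terms outside the list, rather than silently accepting them, is what keeps descriptions comparable across evidence reviews.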
Risk Assessment and Health Equity Considerations
In the typical analytic framework produced by the USPSTF as exemplified by Figures 4-1 and 4-2, risk assessment serves as the beginning of the process in which a clinician offers or does not offer a preventive service to a patient. In some cases, the USPSTF recommends that clinicians offer a preventive service to the entire adult population at risk of acquiring a disease, such as screening for depression in adults, screening for hypertension, or screening for unhealthy drug use (USPSTF, 2016b, 2020b, 2021a).
However, most of the topics assessed by the USPSTF require a clinician to perform some kind of risk assessment as to whether the risk or burden of disease may be higher for an individual based on a number of risk factors. For example, a clinician may offer a nonpregnant patient a combination of behavioral counseling and pharmacotherapy after they confirm that the patient uses tobacco (USPSTF, 2021e). A clinician would also assess behavioral risk factors, asking about sexual activity and injection drug use before suggesting a patient take PrEP to prevent acquisition of HIV (USPSTF, 2019).
Many evidence gaps identified by the USPSTF in published recommendation or I statements include the lack of information about the way a disease or a preventive service affects specific subpopulations. Another example is the recommendation for women aged 50–74 years to have a biennial mammogram (B grade). “All women,” “women with dense breasts,” and “women aged 75 years or older” each have I statements. In its notes on evidence gaps, the USPSTF observes:
Direct evidence about any differential effectiveness of breast cancer screening is lacking for important subgroups of women, such as African American women, who are at increased risk for dying of breast cancer, and older women, for whom balancing the potential benefits and harms of screening may become increasingly challenging with advancing age. (USPSTF, 2016a)
The committee developed a list of factors that could identify individuals and groups for whom evidence is lacking or insufficient regarding any aspect of the linkages between actions and outcomes in clinical prevention. The committee also grouped most risk factors into two categories, “behavioral and sociodemographic” and “biological and clinical.” A list of important risk factors in each category can be found in Box 4-1, though the committee recognizes that the list is not comprehensive, and AHRQ, NIH, the USPSTF, or other stakeholders may expand it as needed.
Behavioral and sociodemographic risk factors
This category is a large and complex set of risk factors that encompasses many of the risk characteristics that, with sufficient research, can guide application of preventive services. Some factors are easy to measure and verify (e.g., age or comorbidities), but many others depend on self-report of personal characteristics and social exposures. Many risk factors in this category include complex individual behaviors, such as substance use, exercise, diet, and the behavioral manifestations of mental illnesses, and patients may be hesitant to disclose some social circumstances. The complex issues of race, ethnicity, culture, and inequality are also contained here, including factors that may be proxies for other characteristics. Furthermore, many behavioral and sociodemographic characteristics have genetic, metabolic, and other biological underpinnings, and the pathways from genotype to phenotype are often not fully understood (Lappalainen and MacArthur, 2021) or malleable.
Sociodemographic risk factors may also serve as a surrogate for social determinants of health, which are the “social and economic conditions” and “fundamental drivers of those conditions” in which communities live, rather than the immediate needs of any one individual (Castrucci and Auerbach, 2019; CSDH, 2008; Davidson et al., 2020, 2021).
Biological and clinical risk factors
Many risk factors that underpin preventive interventions are explicitly “biological” and are often assessed primarily by measuring genetic, metabolic, physiological, or other biomarkers, such as polygenic risk scores, blood pressure, and plasma total cholesterol levels or imaging outcomes. However, despite many successes, all biomarkers have measurement issues that may challenge risk assessment and limit application. Also, importantly, many persons who are otherwise generally suitable for a preventive intervention may have comorbid conditions of varying import or severity that may substantially alter the evidence base for that suitability, such as diseases that modify metabolic risk (e.g., diabetes) or personal behaviors (e.g., obsessive-compulsive disorder or eating disorders). Various treatments of comorbid conditions may also change target disease detection or progression (e.g., radiation-induced myocarditis from cancer treatment or weight gain from anti-psychotic drugs).
“Early detection” and the related evidence gaps refer to any problem in disease screening in which a disease or condition that appears to be earlier in its natural course can be detected, but when the condition is followed into the future, the untoward clinical outcomes are the same or worse, even when considering the total amount of time the disease has been in place. A useful example is that some colon tumors may be detected that seem to be in earlier stages (i.e., they are smaller and less dysplastic) but still may have the same disease and mortality outcomes over a common follow-up interval. This has often been referred to as “lead-time bias” or “length-time bias” (MacDonald et al., 2011). Here, research studies must better establish whether “early detection” and management of a disease actually favorably alter the longer-term clinical outcomes, to justify the preventive service.
The intermediate outcomes category comprises evidence gaps in “pathologic, physiologic, psychologic, social, or behavioral measures related to a preventive intervention” that are “not [health outcomes] in and of themselves” (Wolff et al., 2018, p. S5). For example, one intermediate outcome is a change in blood pressure. Because of the close pathophysiologic link between blood pressure and morbidity and mortality outcomes, an evidence base linking regular blood pressure screenings (preventive service) with lower blood pressure (intermediate outcome) may be strongly considered by the USPSTF in the absence of sufficient evidence linking blood pressure screenings directly with morbidity and mortality outcomes. While the USPSTF “exercises caution when making a recommendation that depends in large part on the evidence linking [intermediate outcomes] and [health outcomes],” addressing gaps in intermediate outcomes may still help strengthen the evidence base for or against a preventive service (Wolff et al., 2018, p. S7).
The committee envisions two main subcategories in the intermediate outcomes category, although (as with the other categories) those using the taxonomy will likely find others that fit in this category as well. The following are two subcategories of evidence gaps:
- The effects of treatment, including behavioral interventions (i.e., does the treatment or behavioral intervention affect intermediate outcomes?)
- Association with morbidity and mortality (i.e., are the intermediate outcomes linked with health outcomes?)
In the blood pressure example, the first category would include questions about whether a particular treatment—such as a low sodium diet or the use of an ACE (angiotensin-converting enzyme) inhibitor—leads to lower blood pressure, while the second category would contain questions such as whether lowering an individual’s blood pressure as a result of treatment leads to a longer life expectancy or lowers the risk of stroke.
The effectiveness category involves questions relating to whether an early preventive service (i.e., screening, behavioral counseling, or medication) is effective either in decreasing morbidity or mortality or, in the absence of such data, in improving intermediate outcomes (Wolff et al., 2018). Evidence gaps in this category will involve situations in which there are insufficient data to conclude that a preventive service is effective.
The harms category includes evidence gaps related to whether the preventive intervention or the treatment leads to adverse effects. As with the intermediate outcomes category, there are two main subcategories:
- Harms associated with the screening or assessment
- Harms associated with the treatment
Evidence gaps in this category will involve situations in which there are insufficient data to conclude whether a preventive intervention or a treatment leads to harms. In some cases, the USPSTF or other guideline developers may note that there are evidence gaps regarding harms of a preventive service, but ultimately not prioritize research addressing them.
Dissemination and Implementation Gaps
Historically, the USPSTF has not reviewed issues of D&I in its recommendation statements, focusing instead on whether sufficient evidence exists to conclude that a particular preventive service is effective in reducing harms. If this were the sole focus, the taxonomy would be complete with just the two previous facets: the foundational taxonomy and the analytic framework taxonomy. However, the committee chose to approach the taxonomy from a more holistic point of view, recognizing that there is more to the decision of which preventive services to encourage than simply determining which ones have the most supporting evidence. Because dissemination and implementation each have multiple definitions, different guideline developers may choose to assess the dissemination of their own guidelines, the implementation of the preventive services, or a combination of the two. The committee notes that in a recent publication, USPSTF members and AHRQ staff wrote the following:
Although the use of implementation research is not within the scope of the USPSTF’s deliberations, it notes that there are critical questions about how to best implement recommended clinical preventive services in primary care practices. Additional implementation and translational research will increase the value of the USPSTF’s work and would be helpful in its deliberations. (Mabry-Hernandez et al., 2018, p. S98)
The authors do not elaborate on how such information would be helpful.
In particular, the success of a preventive service depends not only on its effectiveness in a research setting but also on how well it can be disseminated and implemented. If an intervention is difficult to disseminate or impractical to implement, it may not be a good choice as a preventive service. Consider, for example, computed tomography (CT) angiography, which can create detailed scans of blood vessels. While it is certainly a useful diagnostic tool, it is expensive, and thus there are questions concerning how widely CT angiography is likely to be used in preventive services, even in cases where it may help decrease morbidity or mortality in a group of at-risk individuals. Additionally, implementation is a major factor in terms of health equity and access.
To categorize evidence gaps related to D&I, the committee chose an existing framework developed to design and evaluate health promotion or disease prevention programs, referred to as 4S/PIPE (Äikäs et al., 2017, 2019, 2020; Pronk, 2003). The 4S component refers to four elements of designing a health promotion program: size, scope, scalability, and sustainability. By contrast, PIPE is an impact metric with four different components: penetration, implementation, participation, and effectiveness. Feedback loops are included in the model that connect 4S to PIPE in an effort to ensure that insights and learnings from the PIPE component continue to inform design features that are part of the 4S component. Together they provide a breakdown of important components of the design, dissemination, and implementation of a program, so that evidence gaps related to D&I can generally be placed into one of the eight slots defined by 4S/PIPE (see Figure 4-8 for the corresponding workflow section).
“Size” refers to the magnitude, extent, relative aggregate amount or number, or dose of the program or intervention that has an impact on the user, thereby creating the desired effect (i.e., effect size). The size of an intervention can refer to a variety of aspects, depending on the details of the intervention, but it always refers to the magnitude, amount, or dose of whatever is being done or the amount of change that is generated. Thus, for a particular implementation of a preventive intervention, size might refer to the number of individuals or patients who would be targeted, the number of clinicians or health care workers who would be involved, the intervention dose or amount of change effort that is presented as part of the intervention, or the amount of change that is generated at an individual patient level or the group level of participating individuals. A size-related D&I evidence gap might be an uncertainty in the number of practitioners who would be competent to implement an intervention, the number of individuals who should be targeted, or the amount of change anticipated in a certain subpopulation. From an equity perspective, research results that provide an estimate of the effect size that may be anticipated when a certain preventive intervention or screening is implemented may not be applicable to specific subpopulations. Biases that providers or researchers may have in screenings or counseling processes can contribute to an evidence-generation problem that manifests itself in the effect size of the intervention. At the same time, anticipated effect sizes may not be achieved when implementation approaches for certain subgroups are not commensurate with their needs, thereby generating unequal impacts across various subpopulations.
Scope of Services, Including Cost
“Scope” refers to the range or extent of the set of services that are considered the components of an intervention. For example, some interventions may require more time, require a number of clinicians, or involve specialists. Some interventions are designed to include outreach to the community, whereas others are limited to only those individuals who are patients of a specific clinic or care system, and therefore the scope of programmatic aspects and services will differ. Another important consideration is that the scope of program services is directly related to program costs. An understanding of program costs would allow a provider, vendor, or payer to estimate resource needs in order to implement interventions with fidelity and thus inform resource allocation decisions. Without a clearly defined scope of services, an organization will not be able to assess program costs and, as a result, will be uncertain about the program’s viability, scalability, and sustainability. From an equity perspective, scope of services is directly related to, for example, culturally specific outreach or inclusion of strategies that are sensitive and specific to subpopulations of interest.
“Scalability” refers to how easily an intervention can be spread from an initial small number of practitioners to the much larger number required to make the intervention available to an entire population. Scalability is dependent upon multiple factors, including, but not limited to, the willingness of members of the target population to participate, the per-unit costs of the intervention in the context of total resource availability, the effective use of all available media to recruit and engage individuals, and the partnership potential with other stakeholders supporting similar goals. In the case of novel interventions, there may be a lack of key insights or critical evidence concerning this scalability, creating an evidence gap regarding the D&I of that preventive service. From an equity perspective, scalability of a preventive intervention may be hindered by data from available research results that are limited to observations representative of a single population (e.g., only white Caucasian males), whereas gaps exist among specific target populations (e.g., non-white female populations).
“Sustainability” refers to how easy it will be to maintain a program once it has reached the full necessary extent to cover a population. This in turn will depend on such factors as whether there is institutional buy-in and whether there are sources of funding that can be relied on over the long term. More specifically, it refers to the long-term, ongoing support for the program in relation to an accepted value proposition that balances allocated resources (e.g., time, money, people, or other available means) against generated revenues or benefits and includes the confirmation of long-term program support through adequate proof of performance. As with scalability, it may be difficult to judge the likely sustainability of a novel intervention that is significantly different from existing interventions. From an equity perspective, preventive interventions that are sustained over time should generate outcomes that are equally distributed across various subpopulations. Evaluations of programs implemented in the “real world” are critical to provide such insights, and such evaluations may be designed using the PIPE Impact Metric yet to be discussed. Evaluation outcomes in this regard are to be fed back into the 4S design cycle in order to improve program design and generate improvements in outcomes.
Penetration of the Program into the Intended Audience
A measure of how widely disseminated and implemented an intervention is, “penetration” refers to the percentage of the target audience that is reached by the intervention, that is, the percentage that is made aware of the intervention and to whom it is available. For new interventions with novel characteristics it may be difficult to gauge the likely penetration; hence the need for studies of penetration, and the existence of evidence gaps when such studies have not been carried out. From an equity perspective, outreach to subpopulations needs to be coordinated in a way that resonates with such audiences, be culturally appropriate, and use media that ensure all members of the audience may be reached.
Implementation of the Scope of Services
Implementation is a measure of the extent to which an intervention is being carried out in accordance with the standardized set of guidelines developed for that intervention or the specific work plan associated with the service. The interpretation of the X rays or CT images used in screening can be complex, for instance, with great variation in the interpretation among practitioners; such variation could blunt the effectiveness of a preventive service that would otherwise be quite valuable. Alternatively, outreach efforts related to mammography screening may use a variety of strategies in order to achieve a certain performance target; not achieving the target may be related to incomplete implementation of the work plans specific to such strategies. Thus, evidence gaps concerning the implementation of an intervention could make it difficult to judge how effectively the intervention could be applied across an entire population.
From an equity perspective, it is critical that strategies designed to close disparity gaps are fully implemented according to plan. Failure to do so will undoubtedly generate suboptimal outcomes.
Participation in the Program
Participation refers to the uptake of a preventive service by individuals to whom the service has been offered. While penetration is concerned with the percentage of the target population that is reached by the intervention, participation examines that part of the population and asks how many of them actually do take advantage of the opportunity. Together, penetration and participation indicate the percentage of the total target audience who have the intervention available and take advantage of it. Participation depends on a wide variety of factors, including how well practitioners explain the purpose of the intervention and its value. Participation rates are also likely to vary by population characteristics—race and socioeconomic position, for instance. While there are broad patterns visible in the data on participation rates that make it possible to guess what the participation in a new type of intervention is likely to be, there will always be a great deal of uncertainty concerning the actual rates until the relevant studies have been carried out. Thus, evidence gaps are likely for this category for any new type of preventive service. From an equity perspective, participation in preventive interventions may be closely related to the degree of outreach and attention provided to the target audience. For example, despite awareness of the availability of infectious disease vaccines, subpopulations may be hesitant to engage and participate for various reasons, including lack of trust in the health care system. For such specific subpopulations additional strategies to overcome vaccine hesitancy are needed to close the gaps in vaccine rates.
Effectiveness of the Program
The last category, effectiveness, examines the portion of a target population that has participated in an intervention and asks in how many of those the intervention was effective in terms of reducing morbidity or mortality or according to some other desired endpoint. Measuring effectiveness can be challenging, particularly since it can be difficult to know on an individual level what the outcome would have been without the intervention and because many of the desired endpoints, such as mortality and various morbidities, may not become apparent for years or even decades after the intervention. Many studies of effectiveness may depend on measures of intermediate outcomes. Depending on the intervention and the desired outcome, there may be a number of types of evidence gaps related to its effectiveness. Effectiveness may be considered as the proportion of “successes” generated: how many people among those who participated achieved a successful outcome? From an equity perspective, the proportion of successful interventions among the members of the subpopulation can be directly compared to other subpopulations, may be considered in the context of the 4Ss of program design, and may inform potential changes to be made in future implementation efforts.
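The four PIPE components described above can be read as successive proportions of a target population. The following Python sketch is illustrative only: the class and function names, the counts, and the choice to combine the four coefficients by multiplication are assumptions made for this example rather than anything specified by the committee.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class PipeCounts:
    """Hypothetical counts for one program in one target population."""
    target_population: int   # everyone the program is meant for
    reached: int             # made aware of the program and able to access it
    participated: int        # reached individuals who took up the service
    succeeded: int           # participants who achieved the desired endpoint
    implementation: float    # share of the work plan carried out, 0.0 to 1.0


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def pipe_coefficients(c: PipeCounts) -> dict[str, float]:
    """Compute the four PIPE coefficients as proportions between 0 and 1."""
    return {
        "penetration": _ratio(c.reached, c.target_population),
        "implementation": c.implementation,
        "participation": _ratio(c.participated, c.reached),
        "effectiveness": _ratio(c.succeeded, c.participated),
    }


def overall_impact(c: PipeCounts) -> float:
    """Multiply the four coefficients (an assumed way to combine them)."""
    result = 1.0
    for value in pipe_coefficients(c).values():
        result *= value
    return result


# Made-up numbers, for illustration only.
example = PipeCounts(target_population=10_000, reached=8_000,
                     participated=4_000, succeeded=1_000, implementation=0.5)
coefficients = pipe_coefficients(example)
```

Computing the same coefficients separately for each subpopulation would show at which stage (reach, uptake, or outcome) an equity gap opens, which is the kind of comparison the equity discussion above calls for.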
The committee emphasizes that the D&I evidence gap taxonomy described here is not intended as a finished product but rather as a starting point for a taxonomy that is more closely tailored to the evidence gaps most likely to appear related to the D&I of preventive interventions. The committee settled on 4S/PIPE as a well-established framework that has clear application to the D&I of preventive services because it wished to offer a concrete example that would demonstrate the value of including D&I in the overall taxonomy, but the committee members believe that more suitable terms might be developed by others.
The committee recognizes that, unlike the case with the foundational and analytic framework facets, the D&I facet requires the USPSTF to cover new ground, as it has generally not included any dissemination or implementation issues in its statements, except under “Considerations for Practice.” However, the committee members believe that the USPSTF members are likely to have experience in the area that will allow them to identify various evidence gaps related to D&I, particularly if they use the committee’s taxonomy as a guide. To the extent that the USPSTF members do not have the capacity or expertise to identify these gaps, they might contract with outside experts to conduct evidence reviews and do so (see Chapter 5 for more detail). However it is done, the committee believes it is important that the USPSTF and other groups invested in improving clinical prevention research identify evidence gaps related to the D&I of the preventive services. This is particularly true when it comes to the potential for D&I to address gaps in health equity.
While the committee divided the various issues related to recommending preventive services into three facets (foundational, analytic framework, and D&I) for the purpose of creating a taxonomy of these issues, it recognizes that in the real world the division is not nearly as neat as the taxonomy might seem to indicate. In reality, there is a great deal of overlap and interaction among these three broad categories. Foundational gaps in knowledge about the pathophysiology of a disease are likely to result in various gaps in knowledge related to the true effectiveness of screening, for instance, and advances in foundational knowledge could lead to modifications in screening or even completely new techniques. Indeed, the close interplay between foundational issues and analytic framework issues means that there is a significant gray area between the two where it is not always clear whether an issue should be considered foundational or part of the analytic framework. There can be similar interplay between foundational issues and D&I or between the analytic framework and D&I.
Thus, it is important to keep in mind when categorizing the various evidence gaps related to a preventive intervention that various crosscutting issues can arise, and each of these must be dealt with on a case-by-case basis as there is no easy way to account for them in the taxonomy as defined.
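One way to accommodate crosscutting issues when recording gaps is to let a single gap carry labels from more than one facet. The Python sketch below assumes such a record. The five analytic framework categories and the eight 4S/PIPE slots are taken from the text, while the class names, the free-text foundational label, and the example gap are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum


class AnalyticCategory(Enum):
    RISK_ASSESSMENT_AND_HEALTH_EQUITY = "Risk Assessment and Health Equity Considerations"
    EARLY_DETECTION = "Early Detection"
    INTERMEDIATE_OUTCOMES = "Intermediate Outcomes"
    EFFECTIVENESS = "Effectiveness"
    HARMS = "Harms"


class DandISlot(Enum):
    SIZE = "Size"
    SCOPE = "Scope"
    SCALABILITY = "Scalability"
    SUSTAINABILITY = "Sustainability"
    PENETRATION = "Penetration"
    IMPLEMENTATION = "Implementation"
    PARTICIPATION = "Participation"
    EFFECTIVENESS = "Effectiveness"


@dataclass
class EvidenceGap:
    """One evidence gap, tagged under any number of facets."""
    description: str
    # Foundational labels are kept as free text in this sketch.
    foundational: list[str] = field(default_factory=list)
    analytic: list[AnalyticCategory] = field(default_factory=list)
    d_and_i: list[DandISlot] = field(default_factory=list)

    def facets(self) -> list[str]:
        """Names of the facets under which this gap has at least one label."""
        names = []
        if self.foundational:
            names.append("foundational")
        if self.analytic:
            names.append("analytic framework")
        if self.d_and_i:
            names.append("D&I")
        return names

    def is_crosscutting(self) -> bool:
        """True when the gap is labeled under more than one facet."""
        return len(self.facets()) > 1


# Hypothetical example based on the pathophysiology case mentioned above.
gap = EvidenceGap(
    description="Unclear pathophysiology limits knowledge of screening effectiveness",
    foundational=["pathophysiology"],
    analytic=[AnalyticCategory.EFFECTIVENESS],
)
```

Allowing several labels per gap records the overlap explicitly, but deciding which labels apply still has to be done case by case.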
Once a set of evidence gaps has been classified according to the elements of the taxonomy, the next step in the process is to develop a research agenda to address those gaps. As envisioned by the committee, putting together such a research agenda will involve two steps: prioritizing the various evidence gaps and then specifying the type of research that will be required to fill each gap—or at least each of the high-priority evidence gaps. These two steps use two respective facets that complement the rest of the taxonomy of evidence gaps in clinical research. The end result of this process will be a collection of prioritized research needs, with studies outlined to address each research need.
Prioritizing the Evidence Gaps
One of the key purposes of the taxonomy is to make it possible to characterize the various evidence gaps in a consistent way that allows researchers and policy makers to compare the evidence gaps and to prioritize them. This prioritization facet (along with the study specification facet, described next) differs from the first three facets of the taxonomy because it involves value judgments. The facets that categorize evidence gaps serve as the traditional elements of a taxonomy, for systematic categorization. For example, for a topic like screening for autism spectrum disorder (ASD) among asymptomatic children aged 18–30 months, it would be expected that multiple parties with different interests would consistently identify evidence gaps within the analytic framework facet. However, researchers and funders may have different areas of interest in addressing any or all of those gaps and thus may prioritize them differently. The prioritization facet offers consistent language on which to base those judgments (see Figure 4-9 for the corresponding workflow section).
As envisioned by the committee, the prioritization would be used mainly to order the research needs within the context of a particular preventive service or topic (e.g., the research needed to remove the I designation from the recommendation statement on using electrocardiograms to screen for atrial fibrillation). In theory, it would be possible to apply a similar process to prioritize among different recommendation statements—to decide, say, which I statements should be addressed first. Prioritizing evidence gaps should follow clinical logic as well; a topic may remain an I statement because the body of evidence required to recommend against the service (D grade) may be too large to be feasible.
After a great deal of discussion, the committee settled on a list of categories that should be taken into consideration when prioritizing research into evidence gaps related to preventive interventions. The list, which is not exhaustive, contains the following areas to consider:
Population impact assesses the magnitude of net benefit if the I statement under consideration could be transformed into a recommendation. Priority should be given to filling those evidence gaps that will make a larger difference in people’s lives or that will affect significantly more people, rather than to filling those gaps where the effect will be smaller or where fewer people will be affected.
How important is a particular evidence gap to updating an I statement to a recommendation or one letter grade recommendation to another? If only one or two gaps are responsible for the finding of insufficient evidence, it makes sense to tackle them first and then work on other gaps later to strengthen the recommendation and fill in additional details.
Existing health disparities are difficult to address in clinical prevention (AHRQ, 2018). These inequities arise from a number of complicated sociodemographic factors outside the scope of clinical practice, including inequitable distribution of resources, due to systems such as classism and racism that affect the social determinants of health. As a result, certain populations of individuals (particularly members of marginalized communities, such as racial and ethnic minorities) are less likely to receive high-quality services from the nation’s health care system. It is vital that any additional medical research or development not add to this disadvantage and, if possible, that it work to make the health care system more equitable. Thus, researchers might give priority to those evidence gaps that, if addressed, could advance health equity. For example, in some cases, there has been uncertainty as to whether a preventive intervention is more or less effective for a particular population or racial group, because existing research has not included said population; as a matter of equity, research that addresses that uncertainty can be prioritized.
Relative importance reflects the potential benefit the service could provide relative to other threats to health that the patient population experiences. For example, screening a specific age group for a condition may continue to receive I statements despite updates because that age group has a high prevalence of other conditions or diseases. There are often competing risks to health in patient populations, and some may be deemed more important than others by the USPSTF, researchers, and funders.
Some evidence gaps may be more urgent to address than others, particularly if they are foundational issues that affect the analytic framework evidence gaps. The effects of the interventions recommended by the USPSTF often play out over many years or even decades, and in these cases it may not make a major difference whether an evidence gap is addressed next year or 5 years from now. But depending on the burden of disease, some evidence gaps may be more pressing and thus more highly prioritized.
Some preventive services can be adopted more quickly and more widely than others, perhaps because of the nature of the intervention or of the population in which the intervention will take place.
Value, Including Economic Considerations
Stakeholders will also consider the value of the research that may fill an evidence gap. This question may be challenging because the value of the research depends in part on the value of the preventive service itself, and different approaches to estimating this value are possible. How does one value, for instance, research about a preventive service that is primarily intended to improve quality of life, compared with research about a preventive service that is primarily intended to increase life expectancy? There are various approaches to doing such valuations, and they can serve as the basis of estimating the cost-effectiveness of research aimed at closing different evidence gaps. When possible, priority should be given to the research that is most cost-effective.
Research studies that require unusually large study populations or expensive, time-consuming treatments or state-of-the-art measurements will be more difficult to carry out than those that require smaller populations, less expensive and less labor-intensive treatments, and standard measurement techniques. The more feasible a study is, the faster and easier it will be to complete and the more likely it is to be completed successfully. When prioritizing studies to address evidence gaps, priority should be given to those that are more feasible.
These prioritization considerations are not meant to be comprehensive. Instead they are intended to, first, bring attention to the importance of prioritizing research aimed at the various research gaps identified with the taxonomy and, second, to serve as a starting point for better lists that are more closely aligned with the goals of whatever organizations are doing the prioritization. Furthermore, as noted earlier, prioritization inevitably involves value judgments, so while the committee has come up with the foregoing list of considerations it finds to be most important, others will inevitably have their own opinions concerning what should be on the list. Of course, not every factor will have the same weight in determining a prioritization, and the weights used will vary from user to user and even sometimes from case to case. Some may value equity above all other considerations, for instance, while others may believe cost-effectiveness or population impact to be most important. The key here is to recognize that priorities must be set, to decide in a transparent way how priorities will be set, and then use that prioritization as a guide to which evidence gaps will be addressed first, second, and so forth.
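The idea that different users weight the same considerations differently can be made concrete with a small sketch. The following is illustrative only and is not a method proposed in this report: the gap names, the ratings (on a 0 to 1 scale), and the weights are all hypothetical, and a weighted average is just one simple way to combine ratings.

```python
from typing import Dict, List, Tuple

Ratings = Dict[str, float]


def priority_score(ratings: Ratings, weights: Dict[str, float]) -> float:
    """Weighted average of the ratings; considerations with no rating count as 0."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(w * ratings.get(name, 0.0) for name, w in weights.items())
    return weighted_sum / total_weight


def rank_gaps(gaps: Dict[str, Ratings], weights: Dict[str, float]) -> List[Tuple[str, float]]:
    """Return (gap, score) pairs ordered from highest to lowest priority."""
    scored = [(name, priority_score(r, weights)) for name, r in gaps.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical ratings for two unnamed evidence gaps (illustrative values only).
gaps = {
    "gap_A": {"urgency": 1.0, "value": 0.5, "feasibility": 0.5,
              "equity": 0.0, "population_impact": 0.5},
    "gap_B": {"urgency": 0.0, "value": 0.5, "feasibility": 0.5,
              "equity": 1.0, "population_impact": 0.5},
}

# Two hypothetical users: one values equity most, the other urgency.
equity_first = {"urgency": 1, "value": 1, "feasibility": 1,
                "equity": 4, "population_impact": 1}
urgency_first = {"urgency": 4, "value": 1, "feasibility": 1,
                 "equity": 1, "population_impact": 1}

for label, weights in (("equity first", equity_first), ("urgency first", urgency_first)):
    print(label, rank_gaps(gaps, weights))
```

With identical ratings, the two sets of weights put the gaps in opposite order, which is why the choice of weights needs to be stated transparently.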
The final step in developing a research agenda, after developing a taxonomy of evidence gaps and prioritizing those gaps, is to decide on the details of the studies that will be required to fill in those gaps. This is not a novel process; indeed, it is one familiar to many investigators and funders. The approach to study specification is based on the PICOTS framework: population, intervention, comparison, outcome, timeframe, and setting (FDA, n.d.). The committee added three more considerations important in specifying research aimed at filling in evidence gaps: aggregability, design considerations, and potential funders and funding mechanisms (see Figure 4-10 for the corresponding workflow section). For many of the studies, effectiveness × risk and effectiveness × patient factors interactions will be important to assess. In designing such studies, suspected interactions (e.g., effectiveness differences) should be hypothesized a priori, and purposive sampling may be necessary. Alternatively, sometimes the evidence gap will be within a subgroup, and that subgroup should perhaps be considered the entire study population.
Each of those considerations is described briefly below.
The proposed study design should explicitly define the population that will be the subject of the study. In determining the population, researchers should consider its relationship to the general affected population as well as how various individual factors may affect outcomes and lead to outcome differences across different individuals, particularly in the context of addressing health disparities and promoting health equity. The researchers should also explicitly define the selection criteria they will use and take into account any biases that may arise because of how the patients are selected or because of differences in patient attrition (FDA, n.d.).
The investigators should define all of the components of the intervention and should make note of any factors that could influence the effectiveness or safety of the study, including such things as other treatments that took place before, during, or after the study and any specialized training of the provider of the intervention (FDA, n.d.).
Concerning the comparison group, researchers should specify whether they are using a placebo or an active control for comparison. In the case of an active comparator it is best to use a comparator similar to what is currently being used in treatment of the disease or condition and to avoid any comparator that performs poorly in specific subpopulations (FDA, n.d.). It should also be ensured that the comparison group is as close as possible to receiving “usual care.”
In determining which safety and effectiveness outcomes to use, researchers should give preference to those that matter most to patients and that best predict successful results over the long term. Standard outcome measures should be recommended and used wherever possible. These measures may be defined based on measures used in prior studies, or may be selected based on accuracy or comprehensiveness in evaluating the important outcomes. Thoughtful selection of outcome measures will enhance the likelihood that each study can be compared or aggregated with past and future studies. Any surrogate outcomes should be clinically relevant. Researchers should determine ahead of time what outcome measures and analyses will be used, and the findings should be reported in terms of those predetermined measures and analyses, with any post hoc analyses clearly noted (FDA, n.d.).
Timeframe for Follow-Up
Researchers should determine the duration of treatment ahead of time as well as the timing of any follow-up. Ideally, both short-term and long-term outcomes will be included in the study (FDA, n.d.).
Possible settings include the offices of primary care or specialty providers, hospital inpatient clinics, and long-term care facilities, including nursing homes. The researchers should specify in which setting(s) the intervention will take place and explain why the study’s setting is relevant to real-world use (FDA, n.d.).
Aggregability refers to how well a new study will combine with previous studies to get closer to a definitive answer to a question about preventive services. In some cases of insufficient evidence, the results of a new study can be combined with those of previous studies, resulting in aggregate data that are sufficient for the USPSTF or other bodies to issue a recommendation for a service. Aggregability should be considered when designing a study to address an evidence gap, and such considerations should be explicitly set forth in descriptions of the proposed study. However, aggregation should be done with care if the research findings are not considered robust.
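One common way to combine a new study with earlier ones is fixed-effect, inverse-variance pooling. The sketch below uses that technique as an assumption; it is not a method specified by the committee or the USPSTF, and the effect estimates and standard errors are invented for illustration. It assumes every study reports its effect on the same scale (for example, a log risk ratio).

```python
import math
from typing import List, Tuple

Estimate = Tuple[float, float]  # (effect, standard error), all on the same scale


def pool_fixed_effect(estimates: List[Estimate]) -> Estimate:
    """Inverse-variance weighted mean of the effects and its standard error."""
    if not estimates:
        raise ValueError("at least one estimate is required")
    if any(se <= 0 for _, se in estimates):
        raise ValueError("standard errors must be positive")
    weights = [1.0 / se ** 2 for _, se in estimates]
    total = sum(weights)
    pooled = sum(w * effect for w, (effect, _) in zip(weights, estimates)) / total
    return pooled, math.sqrt(1.0 / total)


def ci95(effect: float, se: float) -> Tuple[float, float]:
    """Approximate 95% confidence interval under a normal approximation."""
    return effect - 1.96 * se, effect + 1.96 * se


# Hypothetical log risk ratios: two small earlier studies and one larger new study.
prior = [(-0.2, 0.2), (-0.2, 0.2)]
new_study = (-0.2, 0.1)

before = ci95(*pool_fixed_effect(prior))
after = ci95(*pool_fixed_effect(prior + [new_study]))
print("before:", before)
print("after: ", after)
```

In this invented example, the interval from the earlier studies alone spans zero, while the interval that includes the new study does not. A fixed-effect model assumes the studies estimate a single common effect, so it fits poorly when populations, interventions, or outcome measures differ, which is one reason the specification elements above matter for aggregability.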
Design considerations for addressing research needs may involve selecting a randomized controlled trial, observational study design, or an appropriate novel research method (see Appendix B for more on novel research methods). Other factors for consideration may include anticipated internal and external validity and extrinsic factors like time urgency or logistical barriers. Other considerations may include high per-patient costs; sample size and power; and the possibility of heterogeneous effects on different subpopulations within the study.
Potential Funders and Mechanisms
Finally, the description of a study should discuss where funding may come from, including potential funders and funding mechanisms. By combining these prioritization and specification steps, researchers can develop a systematic approach to addressing relevant gaps in the evidence base for preventive interventions. The goal, of course, is to address those gaps with well-designed, high-quality studies that have the greatest chance of closing the gaps and leading to new recommendations. AHRQ offers useful guidance on what such studies should look like. Specifically, in guidance on using the PICOTS framework to strengthen evidence gathered in clinical trials, AHRQ concluded that “trials that provide high strength of evidence” do the following:
- Study patients who are likely to be offered the intervention in everyday practice
- Examine clinical strategies and complexities that are more likely to be replicated in practice
- Measure the most relevant set of benefits and harms
- Have low risk of bias
- Have adequate power to address subgroups
- Directly compare interventions
- Include all important intended and unintended effects including adherence and tolerability (FDA, n.d.)
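The nine specification elements described in this section (the six PICOTS elements plus aggregability, design considerations, and funders) can be kept together as a simple structured record, which makes it easy to see which elements of a draft study description are still unspecified. The sketch below is only one possible representation; the class name, field names, and example text are assumptions made for illustration.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class StudySpecification:
    """One free-text field per specification element; empty means not yet specified."""
    # PICOTS elements
    population: str = ""
    intervention: str = ""
    comparison: str = ""
    outcome: str = ""
    timeframe: str = ""
    setting: str = ""
    # Additional considerations added by the committee
    aggregability: str = ""
    design_considerations: str = ""
    funders_and_mechanisms: str = ""


def missing_elements(spec: StudySpecification) -> List[str]:
    """Names of the elements that are still blank, in declaration order."""
    return [f.name for f in fields(spec) if not getattr(spec, f.name).strip()]


# Hypothetical partial draft (illustrative text only).
draft = StudySpecification(
    population="asymptomatic adults, with explicit selection criteria",
    intervention="screening, with all components defined",
    comparison="usual care",
)

print(missing_elements(draft))
```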
To illustrate how the taxonomy can be applied to make the USPSTF recommendations more systematic and consistent, the committee prepared several examples based on existing I statements. In reading over these examples, it is important to keep in mind that they are based on what was contained in the recommendation statements, supplemented by additional considerations suggested by committee members, and thus they are likely to be incomplete and perhaps contain some inaccuracies. In short, they are not meant to be taken literally as conclusions about the various evidence gaps but instead to serve as illustrations of how the USPSTF might proceed if it adopted the committee’s taxonomy.
The committee did not map out research agendas to address the evidence gaps related to the I statement in large part because the results of the prioritization are likely to be different for each stakeholder who goes through the process. However, once evidence gaps have been identified, the committee believes that it should be relatively straightforward to prioritize the relevant evidence gaps and design experiments to fill in those gaps. In addition, actors outside the USPSTF will take those steps, and the main purpose of this report is to offer a process to identify evidence gaps for the USPSTF, so the examples offered in this section focus on the part of the framework that is directly applicable to the USPSTF’s work.
Screening for Cognitive Impairment in Older Adults
The USPSTF has concluded, “Current evidence is insufficient to assess the balance of benefits and harms of screening for cognitive impairment in older adults” (USPSTF, 2020a). The following statements from the research needs and gaps section expand on what is needed to transform this I statement into a recommendation. (The letters following each are intended to make it easier to see how the statements from the USPSTF research needs and gaps section map onto statements in the committee’s taxonomy.) See Box 4-2 for an example of the original evidence gaps described by the USPSTF for cognitive impairment screening, and see Tables 4-1 through 4-3 for an example of those gaps mapped to the taxonomy (identified with the same letters as Box 4-2), along with additional suggestions of evidence gaps from this committee. A program screening for cognitive impairment has two components: the screening and the management of those patients (and caregivers) who screen positive. Management might include more formal testing and both pharmacologic and nonpharmacologic interventions. The evidence shows that screening can detect cognitive impairment, but there may still be D&I issues related to screening. The recommendation is an I statement because of insufficient evidence on the benefits of treatment.
|Condition Definition and Nomenclature||The body of evidence on screening and interventions for cognitive impairment would benefit from more consistent definitions … to allow comparisons across trials, especially from trials with longer-term follow-up. (C)|
|Disease Processes||Studies are needed to look for risk factors with better predictive validity for cognitive impairment. If these are found, further studies would be needed to see if identifying and treating these individuals can positively affect outcomes.|
|Preventive Services||Studies are needed on ways to improve the detection of those with early cognitive impairment who will respond to treatment.|
|Development of Standards||The body of evidence on screening and interventions for cognitive impairment would benefit from more … reporting of outcomes to allow comparisons across trials, especially from trials with longer-term follow-up. (D)|
NOTE: Text in Roman font indicates recommendations issued by the USPSTF; text in italic indicates examples of evidence gaps suggested by this committee.
|Risk Assessment and Health Equity Considerations|
|Effects of Treatment on Intermediate Outcomes|
|Link Between Intermediate Outcomes and Morbidity or Mortality|
|Effects of Screening on Reduced Morbidity or Mortality||Studies are needed of the effects of caregiver or patient–caregiver dyad interventions on delay or prevention of institutionalization (E), and the effects of delay in institutionalization on caregivers (F). More research is needed on the effect of screening (A) and early detection (B) of cognitive impairment (mild cognitive impairment and mild to moderate dementia) on important patient, caregiver, and societal outcomes, including decision making, advance planning, and caregiver outcomes.|
|Effects of Treatment on Morbidity or Mortality||Research is needed on treatments that clearly affect the long-term clinical course of cognitive impairment (G).|
|Harms Associated with Screening||It is also important that studies on screening for cognitive impairment report harms and reasons for attrition of trial participants (H).|
|Harms Associated with Treatment||It is also important that studies on interventions for cognitive impairment report harms and reasons for attrition of trial participants (H).|
|Scope of Services, Including Cost||Identify potential problems in health system delivering full range of medications and behavioral interventions.|
|Penetration of Program into the Intended Audience||Identify degree to which clinicians will add or have added screening to practice.|
|Implementation of the Scope of Services||Identify issues in implementing a complex, multi-modal intervention in clinical settings.|
|Participation in the Program||Identify barriers to patient participation in screening and likelihood of opting for or against treatment after a positive screen.|
|Effectiveness of the Program|
NOTE: For illustration in the D&I example, the committee chose to assume effectiveness of treatment.
Screening with Electrocardiograms for Atrial Fibrillation
In 2018, the USPSTF concluded, “The current evidence is insufficient to assess the balance of benefits and harms of screening for atrial fibrillation with electrocardiography” (ECG) (USPSTF, 2018a). A new draft recommendation statement for late 2021 is available for public comment and includes the following evidence gaps (letters in parentheses were added by the committee) (USPSTF, 2021d):
- Randomized trials enrolling asymptomatic persons that directly compare screening to usual care (A) and that assess both health outcomes (B) and harms (C) are needed to understand the balance of benefits and harms of screening for AF [atrial fibrillation]. It is important that screening trials enroll sufficient participants of both sexes and diverse racial/ethnic groups to enable assessment of whether the detection of AF (D) or the benefits (E) or harms (F) of screening vary in different population groups.
- More research is needed on how to best optimize the accuracy of screening tests or strategies for AF (G).
- Understanding the stroke risk associated with subclinical AF (H), how that risk varies with duration or burden of AF (I), and the potential benefit of anticoagulation therapy among persons with subclinical AF is an important research need (J).
The committee transformed this paragraph into a taxonomy of evidence gaps as indicated below. As before, the committee also included some additional suggested gaps (in italic), which they determined by examining each element of the taxonomy and asking whether there might be additional gaps overlooked by the USPSTF (see Tables 4-4 through 4-6).
One of the clear benefits of using this taxonomy is the way it provides a visual overview of the evidence gaps. Unlike the paragraphs on evidence gaps in the USPSTF’s recommendation statement, which contain much of the same information, this structured taxonomy shows at a glance that there are a couple of foundational issues to address along with a large number of evidence gaps concerning the clinical application of the preventive services (i.e., the analytic framework). Thus, the taxonomy provides a “big picture” of the evidence gaps, which can then be examined more closely for the specifics of each gap.
Finally, as noted in an earlier example, the structure of the taxonomy encourages a systematic approach to thinking about the evidence gaps that makes it possible to identify gaps that may have otherwise gone unremarked. In this case, the committee suggested a few additional evidence gaps identified using the taxonomy.
|Condition Definition and Nomenclature|
|Preventive Services||The effectiveness of newer technologies capable of assessing pulse and heart rhythm as potential screening strategies should be evaluated (E).|
|Development of Standards||Understanding how to best optimize the accuracy of ECG interpretation (D).|
|Risk Assessment and Health Equity Considerations||Studies are needed as to whether subpopulations exist that are at greater risk or for whom the preventive service may be more effective.|
|Early Detection||Direct comparison of screening and early detection with usual care and early detection is needed (A).|
|Effects of Treatment on Intermediate Outcomes|
|Link Between Intermediate Outcomes and Morbidity or Mortality||Research is needed to understand the stroke risk associated with brief episodes of subclinical atrial fibrillation (F).|
|Effects of Screening on Reduced Morbidity or Mortality||Randomized trials enrolling asymptomatic persons that assess health outcomes and harms are needed to understand the balance of benefits and harms of screening for atrial fibrillation (B).|
|Effects of Treatment on Morbidity or Mortality||Understanding the potential benefits of anticoagulation therapy when risk is significant is an important research need (G). The research should address the issue of whether benefits are similar across various populations.|
|Harms Associated with Screening||Randomized trials enrolling asymptomatic persons that assess harms are needed to understand the balance of benefits and harms of screening for atrial fibrillation (C).|
|Harms Associated with Treatment|
NOTE: Text in Roman font indicates recommendations issued by the USPSTF; text in italic indicates examples of evidence gaps suggested by this committee.
|Size||Research is needed to assess frequency, duration, and degree of exertion required by patients who receive an ECG, compared with pulse palpation and compared with usual care.|
|Scope of Services, Including Cost||Research is needed to assess the accessibility of screening with ECGs in adults. Research is also needed to assess the costs of ECG compared with other interventions.|
|Scalability||At which venues can ECG be deployed to screen for atrial fibrillation in order to reach a larger proportion of adults?|
|Sustainability||Identify the barriers to population-wide implementation.|
|Penetration of the Program into the Intended Audience||Research is needed to assess if all individuals 65 and older have access to care for an ECG screening. Identify if there are equity issues caused by barriers to access to screening.|
|Implementation of the Scope of Services||Identify barriers to implementation in order to reach all adults in a given community.|
|Participation in the Program||Identify reasons as to why not all adults participate in the screening.|
|Effectiveness of the Program||Research is needed to assess if the rate of case finding is significantly improved over other preventive services, such as pulse palpation.|
AHRQ (Agency for Healthcare Research and Quality). 2018. Achieving health equity in preventive services: Systematic evidence review. Effective Healthcare Program. https://effective-healthcare.ahrq.gov/products/health-equity-preventive/protocol (accessed August 31, 2021).
Äikäs, A. H., N. P. Pronk, M. H. Hirvensalo, and P. Absetz. 2017. Does implementation follow design? A case study of a workplace health promotion program using the 4-S program design and the PIPE impact metric evaluation models. Journal of Occupational and Environmental Medicine 59(8):752–760.
Äikäs, A. H., P. Absetz, M. H. Hirvensalo, and N. P. Pronk. 2019. What can you achieve in 8 years? A case study on participation, effectiveness, and overall impact of a comprehensive workplace health promotion program. Journal of Occupational and Environmental Medicine 61(12):964–977.
Äikäs, A., P. Absetz, M. Hirvensalo, and N. Pronk. 2020. Eight-year health risks trend analysis of a comprehensive workplace health promotion program. International Journal of Environmental Research and Public Health 17(24):9426.
Asunta, P., H. Viholainen, T. Ahonen, and P. Rintala. 2019. Psychometric properties of observational tools for identifying motor difficulties: A systematic review. BMC Pediatrics 19(1):322.
Castrucci, B. C., and J. Auerbach. 2019. Meeting individual social needs falls short of addressing social determinants of health. Health Affairs Blog. https://www.healthaffairs.org/do/10.1377/hblog20190115.234942/full (accessed October 21, 2021).
Chou, R., A. Cantor, B. Zakher, J. P. Mitchell, and M. Pappas. 2014. Prevention of dental caries in children younger than 5 years old: Systematic review to update the U.S. Preventive Services Task Force recommendation. Evidence Synthesis, No. 104. Rockville, MD: Agency for Healthcare Research and Quality. https://www.ncbi.nlm.nih.gov/books/NBK202092/figure/ch2.f1 (accessed November 2, 2021).
Chou, R., C. Evans, A. Hoverman, C. Sun, T. Dana, C. Bougatsos, S. Grusing, and P. T. Korthuis. 2019. Pre-exposure prophylaxis for the prevention of HIV infection: A systematic review for the U.S. Preventive Services Task Force. Evidence Synthesis, No. 178. Rockville, MD: Agency for Healthcare Research and Quality. https://doi.org/10.1001/jama.2019.2591 (accessed November 2, 2021).
CSDH (Committee on Social Determinants of Health). 2008. Closing the gap in a generation: Health equity through action on the social determinants of health. Geneva, Switzerland: World Health Organization. https://www.who.int/social_determinants/final_report/csdh_finalreport_2008.pdf (accessed October 21, 2021).
Davidson, K. W., A. R. Kemper, C. A. Doubeni, C. Tseng, M. A. Simon, M. Kubik, S. J. Curry, J. Mills, A. H. Krist, Q. Ngo-Metzger, and A. Borsky. 2020. Developing primary care-based recommendations for social determinants of health: Methods of the U.S. Preventive Services Task Force. Annals of Internal Medicine 173:461–467.
Davidson, K. W., A. H. Krist, C. Tseng, M. Simon, C. A. Doubeni, A. R. Kemper, M. Kubik, Q. Ngo-Metzger, J. Mills, and A. Borsky. 2021. Incorporation of social risk in US Preventive Services Task Force recommendations and identification of key challenges for primary care. JAMA 326(14):1410–1415.
Doubeni, C. A., M. Simon, and A. H. Krist. 2021. Addressing systemic racism through clinical preventive service recommendations from the US Preventive Services Task Force. JAMA 325(7):627–628.
FDA (U.S. Food and Drug Administration). n.d. Using the PICOTS framework to strengthen evidence gathered in clinical trials—guidance from the AHRQ’s Evidence-Based Practice Centers program. https://www.fda.gov/media/109448/download (accessed July 14, 2021).
Harris, R. P., M. Helfand, S. H. Woolf, K. N. Lohr, C. D. Mulrow, S. M. Teutsch, and D. Atkins. 2001. Current methods of US preventive services task force: A review of process. American Journal of Preventive Medicine 20(3):21–35.
Hedden, H. 2016. The accidental taxonomist. 2nd ed. Medford, NJ: Information Today.
Klabunde, C. N., E. M. Ellis, J. Villani, E. Neilson, K. Schwartz, E. A. Vogt, and Q. Ngo-Metzger. 2021. Characteristics of scientific evidence informing changed U.S. Preventive Services Task Force insufficient evidence statements. American Journal of Preventive Medicine. https://www.sciencedirect.com/science/article/abs/pii/S0749379721004578 (accessed November 29, 2021).
Lappalainen, T., and D. G. MacArthur. 2021. From variant to function in human disease genetics. Science 373(6562):1464–1468. https://www.science.org/doi/full/10.1126/science.abi8207 (accessed November 2, 2021).
Mabry-Hernandez, I. R., S. J. Curry, W. R. Phillips, F. A. García, K. W. Davidson, J. W. Epling, Q. Ngo-Metzger, and A. S. Bierman. 2018. U.S. Preventive Services Task Force priorities for prevention research. American Journal of Preventive Medicine 54(1 Suppl 1):S95–S103.
MacDonald, A. J., H. McEwan, M. McCabe, and A. Macdonald. 2011. Age at death of patients with colorectal cancer and the effect of lead-time bias on survival in elective vs emergency surgery. Colorectal Disease 13(5):519–525.
Pronk, N. P. 2003. Designing and evaluating health promotion programs: Simple rules for a complex issue. Disease Management and Health Outcomes 11(3):149–157.
USPSTF (U.S. Preventive Services Task Force). 2013. Glaucoma: Screening. https://www.uspreventiveservicestaskforce.org/uspstf/recommendation/glaucoma-screening (accessed October 21, 2021).
USPSTF. 2016a. Breast cancer: Screening. https://www.uspreventiveservicestaskforce.org/uspstf/recommendation/breast-cancer-screening (accessed August 17, 2021).
USPSTF. 2016b. Depression in adults: Screening. https://www.uspreventiveservicestaskforce.org/uspstf/recommendation/depression-in-adults-screening (accessed October 26, 2021).
USPSTF. 2017. Celiac disease: Screening. https://www.uspreventiveservicestaskforce.org/uspstf/recommendation/celiac-disease-screening (accessed July 14, 2021).
USPSTF. 2018a. Atrial fibrillation: Screening with electrocardiography. https://www.uspreventiveservicestaskforce.org/uspstf/draft-recommendation/screening-atrial-fibrillation (accessed August 17, 2021).
USPSTF. 2018b. Child maltreatment: Interventions. https://www.uspreventiveservicestaskforce.org/uspstf/recommendation/child-maltreatment-primary-care-interventions (accessed November 2, 2021).
USPSTF. 2018c. Grade definitions. https://www.uspreventiveservicestaskforce.org/uspstf/about-uspstf/methods-and-processes/grade-definitions (accessed November 2, 2021).
USPSTF. 2019. Prevention of human immunodeficiency virus (HIV) infection: Preexposure prophylaxis. https://www.uspreventiveservicestaskforce.org/uspstf/recommendation/prevention-of-human-immunodeficiency-virus-hiv-infection-pre-exposure-prophylaxis#bootstrap-panel--6 (accessed October 19, 2021).
USPSTF. 2020a. Screening for cognitive impairment in older adults. JAMA 323(8):757–763.
USPSTF. 2020b. Unhealthy drug use: Screening. https://www.uspreventiveservicestaskforce.org/uspstf/recommendation/drug-use-illicit-screening (accessed August 17, 2021).
USPSTF. 2021a. Hypertension in adults: Screening. https://www.uspreventiveservicestaskforce.org/uspstf/recommendation/hypertension-in-adults-screening (accessed October 26, 2021).
USPSTF. 2021b. Chlamydia and gonorrhea: Screening. https://www.uspreventiveservicestaskforce.org/uspstf/recommendation/chlamydia-and-gonorrhea-screening (accessed November 2, 2021).
USPSTF. 2021c. Procedure manual section 3: Topic workplan development. https://www.uspreventiveservicestaskforce.org/uspstf/about-uspstf/methods-and-processes/procedure-manual/procedure-manual-section-3-topic-work-plan-development (accessed October 25, 2021).
USPSTF. 2021d. Screening for atrial fibrillation: Draft recommendation statement. https://www.uspreventiveservicestaskforce.org/uspstf/draft-update-summary/screening-atrial-fibrillation (accessed November 2, 2021).
USPSTF. 2021e. Tobacco smoking cessation in adults, including pregnant persons: Interventions. https://www.uspreventiveservicestaskforce.org/uspstf/recommendation/tobacco-use-in-adults-and-pregnant-women-counseling-and-interventions (accessed August 17, 2021).
Wolff, T. A., A. H. Krist, M. LeFevre, D. E. Jonas, R. P. Harris, A. Siu, D. K. Owens, M. W. Gillman, M. H. Ebell, J. Herzstein, R. Chou, E. Whitlock, and K. Bibbins-Domingo. 2018. Update on the methods of the US Preventive Services Task Force: Linking intermediate outcomes and health outcomes in prevention. American Journal of Preventive Medicine 54(1S1):S4–S10. https://doi.org/10.1016/j.amepre.2017.08.032 (accessed November 2, 2021).