Currently Skimming:


Pages 113-146

The Chapter Skim interface presents what we've algorithmically identified as the most significant single chunk of text within every page in the chapter.


From page 113...
... Old-fashioned thinking even in modern science. Cold Spring Harbor Perspectives in Biology 6:a021238.
From page 115...
... In other situations, when descent-associated population descriptors are advisable or needed for methodological reasons, this chapter gives guidance on which approaches to consider and why. In formulating these recommendations, the committee recognizes that there exists a large amount of legacy data in which study participants have already been classified on the basis of population descriptors (Khan et al., 2022; Wallace et al., 2020)
From page 116...
... to the labels used in an analysis. The primary focus of this chapter is on the first two, namely the conceptual approaches and specific language that enable appropriate and accurate use of population descriptors in genomics research.
From page 117...
... CONCLUSION AND RECOMMENDATIONS
Conclusion 5-1. In employing population descriptors and assigning group labels in genetics studies, researchers tend to rely on existing and commonly used population classifications, often with unclear justification for their choices.
From page 118...
... TOOLS FOR SELECTING AND USING POPULATION DESCRIPTORS IN GENETICS AND GENOMICS RESEARCH
The table below and decision tree in Appendix D suggest which descent-associated population descriptors are most appropriate as analytical tools for each of the seven genetics study types outlined in this report. Note that each descriptor represents a particular concept of difference across populations.
From page 119...
... Genetic ancestry: the paths through an individual's family tree by which they have inherited DNA from specific ancestors. Genetic ancestry can be thought of in terms of lines extending upwards in a family tree from an individual through their genetic ancestors (see Figure 2-1)
From page 120...
... For example, when matching the background allele frequencies of cases to controls, there is a need to identify a set of individuals who are genetically similar, but not to rely on inferences about their genetic ancestry. Likewise, identifying individuals who are genetically similar to each other or to a reference panel is usually sufficient to delimit a subset of participants for genome-wide association studies (GWAS). Although the distinction between genetic ancestry and genetic similarity may be subtle, it is nonetheless important to enable moving beyond fundamental misconceptions about population descriptors, particularly race and typological thinking.
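As a rough illustration of delimiting participants by genetic similarity to a reference panel, the sketch below uses mean identity-by-state (IBS) sharing as the similarity measure. The excerpt does not prescribe a measure, so the choice of IBS, the function names, the threshold, and the toy genotypes are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence


def ibs_similarity(a: Sequence[int], b: Sequence[int]) -> float:
    """Mean identity-by-state sharing between two genotype vectors.

    Genotypes are coded as alternate-allele counts (0, 1, or 2) per site.
    Each site contributes 1 - |a - b| / 2, so the result lies in [0, 1].
    """
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("genotype vectors must be non-empty and of equal length")
    return sum(1 - abs(x - y) / 2 for x, y in zip(a, b)) / len(a)


def similar_to_panel(
    participants: Dict[str, Sequence[int]],
    panel: List[Sequence[int]],
    threshold: float,
) -> List[str]:
    """Return IDs of participants whose mean IBS to the panel meets the threshold.

    The threshold is a hypothetical study-specific choice, not a recommended value.
    """
    kept = []
    for pid, genotype in participants.items():
        mean_sim = sum(ibs_similarity(genotype, ref) for ref in panel) / len(panel)
        if mean_sim >= threshold:
            kept.append(pid)
    return kept


# Toy data (made up): four sites, two reference individuals, two participants.
panel = [[0, 1, 2, 0], [0, 1, 1, 0]]
participants = {"p1": [0, 1, 2, 0], "p2": [2, 0, 0, 2]}
selected = similar_to_panel(participants, panel, threshold=0.8)
```

Note that the selection step makes no claim about anyone's ancestry. It only reports which participants are similar to the panel under the stated measure and threshold.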
From page 121...
... For example, a project may incorporate both geography and ethnicity simultaneously to distinguish, say, Kurds in Iraq from Kurds in Turkey. In some contexts, descent-associated population descriptors are used not as indicators of shared genetic ancestry but as proxies for shared environmental exposures (see "The Importance of Environmental Factors in Genetics and Genomics Research" in Chapter 2)
From page 122...
... The committee recognizes, however, that researchers may wish -- or be obligated -- to use population descriptors for other research-related activities, notably for constructing and/or describing samples of individuals whose genetic material is to be analyzed. In the interests of equity, justice, or the diversification of human genetic data and knowledge about it, researchers may choose to use race and/or ethnicity in order to identify individuals to be included in their studies (Oni-Orisan et al., 2021)
From page 123...
... [Table excerpt; cell layout lost in extraction] measure; at fine scale, other variables may be useful | 2: Trait Prediction - Mendelian Traits | No population descriptors may be ...
From page 124...
... Best Practice 1: To enable identification of additional cases, rather than using genetic ancestry or ethnicity, researchers should use categories based on kinship (e.g., recent genealogical ancestors), identity-by-descent information, or fine-scaled geographical or genetic similarity data.
From page 125...
... When the modifier alleles are known, however, sequencing the individual will provide much more accurate individual information than will population descriptors.
Best Practice 3: Where the genetic basis of a trait is known, researchers should focus on characterizing the individual's alleles rather than use population descriptors as an unreliable proxy for the genomic background.
From page 126...
... researchers should aim to directly collect information about as many potentially relevant environmental factors as possible.
Best Practice 5: When including population descriptors for phenotype prediction of Mendelian traits, researchers should be explicit about whether the aim is to study genetic or environmental effects or both, and whether these can be disentangled given the study design.
From page 127...
... Best Practice 6: When mapping variants that contribute to complex traits, the goal is to conduct the study in a set of individuals that are genetically more similar, rather than to infer ancestry per se. Therefore, researchers should characterize their study participants in terms of their genetic similarity to one another or to a reference panel, with a specified similarity measure (Coop, 2022)
From page 128...
... For related reasons, the practice of performing genetic prediction after stratifying by a population descriptor can increase predictive power because it implicitly captures both genetic similarity and shared environmental exposures.
From page 129...
... Considerations Common to Gene Discovery and Prediction for Complex and Polygenic Traits
The committee recognizes that after delimiting study participants based on genetic similarity to a reference panel, researchers may want to refer to the set of study participants with a label based on ethnicity (e.g., Yoruba)
From page 130...
... Best Practice 10: Where the goal is to control for environmental effects that are correlated with genomic background effects, researchers should, if possible, replace or, at least, augment the use of population descriptors with more reliable and precise measures of individual environmental effects. Whenever labels remain, researchers should be explicit about their reasons for using them.
From page 131...
... as a tool to better understand underlying mechanisms. When specific candidate loci or salient environmental factors are unknown, a common approach has been to use population descriptors, and in particular ancestry group labels, as a proxy for differences in allele frequencies across the genome and potentially environmental exposures.
From page 132...
... Regardless, researchers should be explicit about their intent in using population descriptors, including whether the aim is to study genetic or environmental effects or both, and whether these can be teased apart given the study design.
Study Type 6: Studies of Health Disparities with Genomic Data
Health disparities studies often compare groups of individuals identified by different descent-associated population descriptors (e.g., by OMB racial and ethnic categories)
From page 133...
... • Health Disparities Study Type 1: The sole goal is to study the role of one or multiple genetic variants on observed or possible health disparities between groups.
Best Practice 13: In this type of study, what is needed is to consider the effects of the focal variant of interest among individuals with similar allele frequencies, so genetic similarity is the relevant descriptor to use, and racial and ethnic labels should not be used.
From page 134...
... In that case, descent-associated population descriptors may not be necessary, such as when the goal is to estimate the time to the most recent common genetic ancestor of modern humans at a locus (Mallick et al., 2016)
From page 135...
... In genetics studies of human evolutionary history, social or geographic population descriptors are often used to describe genetic ancestry groups inferred based on genetic similarity (e.g., labels may be based on shared characteristics of participants such as language spoken, self-identified ethnicity, or location sampled) in order to shed light on population history.
From page 136...
... Decision Tree for the Use of Population Descriptors
To aid a researcher contemplating a specific genetics or genomics study, the committee believes that a decision tree to systematically decide which descent-associated population descriptors to consider using and which to
From page 137...
... Harmonization of population descriptors, specifically, would allow greater interoperability among data sets in human genomics research.
From page 138...
... For example, for geographic population descriptors:
• Identification of specific geographic labeling scheme (e.g., based on sample location, birthplace)
• If relevant, set of geographic entities with associated shape files defining boundary of the entity or latitude/longitude specifying representative locations
• Per individual either:
  • Point based:
    - Latitude
    - Longitude
    - Estimated mean square error in units of kilometers
    - Provenance: Self-reported, ascribed externally, other
  • Geographic entity based:
    - Entity value
    - Provenance: Self-reported, ascribed externally, other

Upholding the principle of transparency and adhering to Recommendations 6, 7, and 8 inherently support harmonization through the application of consistent definitions of population descriptors and transparent communication of methods.
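The per-individual geographic fields listed above could be captured in a small record type such as the following sketch. The class and field names are hypothetical and do not come from the report. The sketch also assumes that each individual carries exactly one of the two forms (point based or entity based), which is one reading of "either".

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Provenance(Enum):
    SELF_REPORTED = "self-reported"
    ASCRIBED_EXTERNALLY = "ascribed externally"
    OTHER = "other"


@dataclass(frozen=True)
class PointLocation:
    latitude: float               # decimal degrees, -90 to 90
    longitude: float              # decimal degrees, -180 to 180
    mean_square_error_km: float   # estimated error, in the units named in the text
    provenance: Provenance

    def __post_init__(self) -> None:
        if not -90.0 <= self.latitude <= 90.0:
            raise ValueError("latitude out of range")
        if not -180.0 <= self.longitude <= 180.0:
            raise ValueError("longitude out of range")
        if self.mean_square_error_km < 0:
            raise ValueError("error estimate must be non-negative")


@dataclass(frozen=True)
class EntityLocation:
    entity_value: str             # identifier of a geographic entity in the study's set
    provenance: Provenance


@dataclass(frozen=True)
class GeographicDescriptor:
    labeling_scheme: str          # e.g. "sample location" or "birthplace"
    point: Optional[PointLocation] = None
    entity: Optional[EntityLocation] = None

    def __post_init__(self) -> None:
        # Assumption: exactly one of the two forms is recorded per individual.
        if (self.point is None) == (self.entity is None):
            raise ValueError("provide exactly one of point or entity")


# Hypothetical record for one participant.
record = GeographicDescriptor(
    labeling_scheme="birthplace",
    point=PointLocation(-12.05, -77.04, 25.0, Provenance.SELF_REPORTED),
)
```

Recording provenance alongside every value keeps self-reported and externally ascribed locations distinguishable when data sets are later combined.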
From page 139...
... In the context of genetics studies, genetic similarity to specific reference sets could have advantages for promoting harmonization. While a broader sampling of human genetic diversity is needed, current candidates for specific reference sets include, for example, data from the 1000 Genomes Project, the Human Genome Diversity Project, and the Simons Genome Diversity Project (1000 Genomes Project Consortium et al., 2015; Bergström et al., 2020; Cann et al., 2002; Mallick et al., 2016)
From page 140...
... For admixed individuals themselves, a harmonious approach using the language of genetic similarity would be to refer to the best approximating reference group; for example, "1KG-PEL-like" and "1KG-PUR-like" are two among many possible genetic similarity descriptors of Latino populations, with PEL = Peruvian in Lima, Peru, and PUR = Puerto Rican in Puerto Rico. While potentially difficult for novices to read, the use of abbreviations for precision and conciseness is in fact a key aspect of scientific language in many fields (e.g., chemistry and the abbreviations for the elements, though the committee notes the analogy is not exact as there are no fundamental elements with regard to genetic ancestry)
From page 141...
... Standardization for such genetic similarity procedures may be feasible, and would be fruitful to develop, especially as a fuller representation of human genetic variation is sampled by ongoing studies. Nonetheless, the "abbreviation plus -like" approach would be less vague than the current widespread use of such terms as European genetic ancestry and African genetic ancestry, where both the reference populations and the methods to ascribe an affiliation to European or African sources are unclear and make implicit assumptions about the time frame of interest.
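To make the "-like" labeling concrete, here is a minimal sketch that names a sample after its best approximating reference group. The excerpt notes that such procedures are not yet standardized, so the nearest-centroid rule, the function names, and the two-dimensional coordinates (stand-ins for something like principal-component scores) are purely illustrative assumptions.

```python
import math
from typing import Dict, List, Sequence


def centroid(points: List[Sequence[float]]) -> List[float]:
    """Coordinate-wise mean of a non-empty list of equal-length points."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def similarity_label(
    sample: Sequence[float],
    reference_groups: Dict[str, List[Sequence[float]]],
    panel_prefix: str = "1KG",
) -> str:
    """Label a sample after the reference group with the nearest centroid.

    The label records the reference panel and the group code, so a reader
    can tell which reference set the comparison was made against.
    """
    best_code = min(
        reference_groups,
        key=lambda code: math.dist(sample, centroid(reference_groups[code])),
    )
    return f"{panel_prefix}-{best_code}-like"


# Made-up coordinates for two reference groups named in the text.
reference_groups = {
    "PEL": [[0.9, 0.1], [1.1, -0.1]],
    "PUR": [[-1.0, 0.5], [-0.8, 0.7]],
}
label = similarity_label([0.8, 0.0], reference_groups)
```

Because the label encodes both the panel and the group, changing the reference set or the similarity rule would change the label, which keeps those methodological choices visible rather than implicit.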
From page 142...
... 2023. Genetic similarity versus genetic ancestry groups as sample descriptors in human genetics.
From page 143...
... Giannakopoulou, O., K
From page 144...
... 2007. Race, skin color and genetic ancestry: Implications for biomedical research on health disparities.
From page 145...
... 2019. Clinical use of current polygenic risk scores may exacerbate health disparities: A systematic literature review.
From page 146...
... American Journal of Human Genetics 104(1)


This material may be derived from roughly machine-read images, and so is provided only to facilitate research.