National Academies Press: OpenBook

Using Population Descriptors in Genetics and Genomics Research: A New Framework for an Evolving Field (2023)

Chapter: 1 Population Descriptors in Human Genetics Research: Genesis, Evolution, and Challenges

Suggested Citation:"1 Population Descriptors in Human Genetics Research: Genesis, Evolution, and Challenges." National Academies of Sciences, Engineering, and Medicine. 2023. Using Population Descriptors in Genetics and Genomics Research: A New Framework for an Evolving Field. Washington, DC: The National Academies Press. doi: 10.17226/26902.

1

Population Descriptors in Human Genetics Research: Genesis, Evolution, and Challenges

THE STUDY OF HUMAN GENETIC VARIATION

Our social conceptions of race and ethnicity do not match the underlying biological and genetic variation within our species, and we should never confuse the things that were created for the purposes of oppressing people with the nature of that biological and genetic variation.

—Joseph Graves Jr., testimony to the committee
in a public session on April 4, 2022

Genetics is the study of heredity, specifically the mechanisms by which traits or characteristics, known as phenotypes, are transmitted from one generation to the next (King et al., 2014). It is a long-standing observation that no two members of a species, except identical twins or clones, have identical features (Strickberger, 1985), spurring the development of a science that sought to understand how individual traits vary, how this variation is generated, and how it is transmitted to the next generation. This raises the question of how different members of a species can share individual traits, for example, a particular eye color. What is the biological basis of this sharing and its transmission, and is this biological basis the same or different across members of the same species? Since the rediscovery of gene transmission rules in 1900, debate has persisted over whether such differences and commonalities arise from genes, environments, or both and, when genes have an effect, whether it stems from one or many genes (Provine, 1971; Provine and Russell, 1986). More recently, epigenetic and stochastic variation,1 in addition to genetic variation, have been described as further causes of phenotypic variation (Panzeri and Pospisilik, 2018).

Human genetics, since its origin in 1900 with the discovery of interindividual differences in blood transfusions by Karl Landsteiner, has been exceptional among the genetic and genomic sciences in that it focuses on existing groups of individuals to examine heredity rather than only on the offspring of controlled crosses, as is possible in other species. Although the study of trait transmission in human families is widespread and has been successful for rare conditions that follow Mendelian inheritance patterns, family studies are uncommon for common continuous (metrical) phenotypes, whose inheritance patterns are complex or non-Mendelian (NIH, 2007). A more efficient and generalizable study paradigm has, therefore, been to compare and contrast groups of individuals with and without a specific trait feature, such as persons with hypertension versus persons with normal blood pressure. Specifically, what is compared are the frequency differences of a specific genetic variant, this variant being one of at least two forms (alleles) of a gene (Manolio et al., 2009).
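
The group comparison described above can be illustrated with a minimal sketch. It assumes a single variant with two alleles and uses a Pearson chi-square statistic on a 2x2 table of allele counts, which is one common choice and not necessarily the test used in any cited study. The class name, function name, and all counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AlleleCounts:
    """Counts of the two alleles of one variant observed in one group."""
    a_major: int
    a_minor: int

    @property
    def total(self) -> int:
        return self.a_major + self.a_minor

    @property
    def minor_freq(self) -> float:
        """Frequency of the minor allele in this group."""
        return self.a_minor / self.total


def chi_square_2x2(cases: AlleleCounts, controls: AlleleCounts) -> float:
    """Pearson chi-square statistic (1 degree of freedom, no continuity correction)."""
    a, b = cases.a_minor, cases.a_major
    c, d = controls.a_minor, controls.a_major
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero")
    return n * (a * d - b * c) ** 2 / denom


# Hypothetical allele counts (two alleles per person); not real data.
cases = AlleleCounts(a_major=700, a_minor=300)      # e.g., persons with hypertension
controls = AlleleCounts(a_major=800, a_minor=200)   # e.g., persons with normal blood pressure
stat = chi_square_2x2(cases, controls)
print(cases.minor_freq, controls.minor_freq, round(stat, 2))
```

A large statistic indicates only that the allele frequency differs between the two groups; by itself it says nothing about whether the variant is a biological cause of the trait difference.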

Over the past two decades, technological advances have enabled the identification and comparison of genomic sites (base pairs) across the whole genome,2 both within and outside genes. Regardless of where they are sampled in the world, two human genomes differ at approximately 1 in 1,000 genomic sites on average or a total of 3 million positions (Sachidanandam et al., 2001). While the vast majority of non-ancestral alleles are rare (e.g., found at frequencies of below 1 percent in population samples), most of the variants that differ between two genomes are common and often found in multiple regions of the world (Biddanda et al., 2020; Rosenberg, 2021). The frequency of a variant depends on when it arose, the demographic history of humans who carried it, and whether it affects fitness.

Across the globe, geneticists have catalogued tens of millions of such variants (1000 Genomes Project Consortium, 2015). Most of the common genetic variants existing across human populations arose as early humans evolved within Africa and then migrated across Africa and the rest of the world (Chakravarti, 2014). This variation is a shared human legacy shaping, and in rare circumstances determining, human traits. Studying these variants in, say, hypertensives versus normotensives can identify variants that are correlated with this trait difference. It takes substantially greater effort to demonstrate whether the detected variants are themselves biological causes of that trait difference or are simply markers that are correlated with the shared history or environment of the individuals studied.

___________________

1 Epigenetic variation arises from chemical modification of DNA in body cells (soma) that can modify the functions of genes; because these modifications are not permanent DNA changes, they are not transmitted to the next generation. Stochastic variation is alteration of gene function from random processes in cells that are neither genetic nor epigenetic (Angers et al., 2020).

2 The totality of an individual’s DNA is known as their genome.

The human population is very young in evolutionary time; when humans are grouped by geographic origin, between-group differences are substantially smaller than within-group differences. Two other historical aspects need to be considered. First, although humans have migrated into new ecologies ever since spreading within and beyond Africa, there has been extensive ancient and recent movement and mixing of peoples both within and across continents, which has affected global patterns of genetic variation (Cavalli-Sforza et al., 1996; Chakravarti, 2014). Second, many humans also have residual ancestry from long-extinct hominids such as the Neanderthals and Denisovans; its extent varies across the globe (Pääbo, 2014).
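
The comparison of between-group and within-group differences can be made concrete with a small sketch. It assumes a single biallelic locus, equally weighted groups, and Wright's fixation index (F_ST) as the summary measure; F_ST is a standard population-genetic statistic but is not named in the text above, and the function names and allele frequencies here are hypothetical.

```python
from typing import Sequence


def heterozygosity(p: float) -> float:
    """Expected heterozygosity 2p(1 - p) at a biallelic locus with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fst(freqs: Sequence[float]) -> float:
    """F_ST = (H_T - H_S) / H_T for equally weighted groups at one locus.

    H_T is the heterozygosity expected from the pooled allele frequency;
    H_S is the average heterozygosity within the groups.
    """
    if not freqs:
        raise ValueError("at least one group is required")
    p_bar = sum(freqs) / len(freqs)
    h_total = heterozygosity(p_bar)
    if h_total == 0:
        return 0.0  # the locus does not vary at all
    h_within = sum(heterozygosity(p) for p in freqs) / len(freqs)
    return (h_total - h_within) / h_total


# Hypothetical allele frequencies in three groups; not real data.
print(round(fst([0.4, 0.5, 0.6]), 4))  # a small share of variation lies between groups
print(fst([0.0, 1.0]))                 # extreme case: all variation lies between groups
```

Values near 0 mean that almost all variation is found within groups; values near 1 mean that the groups carry different alleles.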

Human genetic variation is the result of many forces—historical, social, and biological—and cannot be represented by any single variable. Additionally, science is not the only, and sometimes not even the major, source of human origin stories. Each human culture, adapting to its lands and environments over time, has developed its own narrative of its emergence, stories that are rich, powerful, and deeply meaningful to it. The question today is, with all of this knowledge within reach, how should genetics studies of human phenotypes be designed and conducted?

The existence of genetic variation across geographic space does not mean that it is clustered in the distinct groups that notions of race presume. To be sure, if group boundaries on humans are imposed across the globe, thus inventing 2 or 3 or 20 “races,” average differences in allele frequencies between geographically distant groupings will be discerned. The existence of such genetic differences in the aggregate, however, is not proof that the boundaries applied were natural, objective, or otherwise genetically meaningful in the first place. Too often, statistical findings of genetic differences between groups are misinterpreted as groupings determined by significant biological/genotypic characteristics as opposed to simply reflecting widespread social presumptions about who is similar to whom based on shared physical/phenotypic characteristics. So, how should individuals and populations be described in genetics and genomics studies? To answer this question, it is crucial to reflect on what such studies aim to accomplish in the first place.

WHAT IS A STUDY USING GENETIC INFORMATION TRYING TO ACCOMPLISH?

Genetic information is assessed directly or indirectly from the genome and can be defined narrowly or broadly. Narrowly defined genetic information is based on data from direct measurements of DNA, RNA, proteins, or epigenetic signatures such as DNA methylation, whereas broadly defined genetic information refers to phenotype information that includes indirect assessments of function (e.g., peripheral blood count) or form (e.g., observable traits such as eye or hair color) influenced by the genome. The advent of the Human Genome Project (HGP) now enables studies of all genes simultaneously using sequence variants across the entire genome. Genetics research typically studies the role of a variant, gene, or small number of genes in an outcome of interest, whereas approaches that interrogate the DNA sequence or epigenetic signatures across the entire genome are known as genomics studies. Both genetics and genomics studies are today common in biomedical research on humans.

Researchers use human genetic information to address a wide variety of questions about history and evolution; the development and function of cells, tissues, and organs; the biology of the human genome; and the risks and mechanisms underlying rare conditions,3 common and rare diseases,4 and heritable traits (e.g., height, blood glucose). Genetics and genomics studies are conducted by scientists from a broad range of disciplines (e.g., human and medical geneticists, physicians in various medical specialties, genetic epidemiologists, forensic scientists, evolutionary biologists, biostatisticians, demographers, anthropologists, other social scientists) with different experiences, expertise, and biases. Genetic information is increasingly easy and inexpensive to produce, and tools to analyze genetic information have become widely available and straightforward to use.

Expectations of researchers and the lay public about discoveries made by genetics studies have changed substantially over time. For decades, discovery that a condition or trait had a genetic basis, or more recently, the identification of the specific genetic basis of a condition or trait (e.g., the gene underlying a Mendelian condition such as cystic fibrosis) satisfied both the scientific community and public. However, over the past 10 years, there has been a growing expectation that genetics studies deliver information that can be used for improving health (e.g., accurately estimating the risk of a common disease or accelerating the development of novel treatment approaches and therapeutics) or precisely answering questions about population origins, migration patterns, or the effect of past environmental factors as forces of natural selection. Moreover, information from genetics, genomics, and sociogenomics studies is being used in new ways for financial, political, or legal gain (Bliss, 2020; Roberts, 2011).

___________________

3 In the United States, the Orphan Drug Act defines a rare disease or condition as one that affects less than 200,000 people (21 C.F.R. §316.20(b)); many rare conditions are so-called Mendelian conditions, which means that changes in a single gene are necessary and sufficient to cause the condition (Chial, 2008).

4 Common, beyond frequency, refers to conditions that are variously called polygenic (many genes), multifactorial (many causes), or complex, the latter implying that both genes and environment are causal factors. Examples of such traits are cardiovascular disease, diabetes, and obesity.

Genetics has proven to be a powerful paradigm in medicine, from explaining individual differences in medical outcome (e.g., ABO blood types for blood transfusion and the human leukocyte antigen (HLA) types for organ transplant compatibility), to explaining disease pathogenesis (e.g., in persons with rare Mendelian conditions, such as Marfan syndrome), to identifying therapeutic targets from knowledge of the genes involved (e.g., PCSK9 inhibitors for reducing serum cholesterol). The field has also transformed researchers’ knowledge of where and how modern humans arose and migrated across the globe.

Yet, genetics also has substantial limitations. Virtually all conditions and traits are the result of both genetic and environmental factors as well as stochastic or nondeterministic influences. The effect of these nongenetic factors varies across different conditions and traits with some conditions strongly influenced (e.g., susceptibility to infectious disease, obesity, cardiovascular disease) and others only weakly so (e.g., achondroplasia, fragile-X syndrome, Huntington’s disease). The effect of nongenetic factors falls between these extremes for most genetic conditions, and the degree to which nongenetic factors influence a condition or trait is itself influenced by the genetic architecture of the condition (e.g., the type, number, and strength of the genetic variants involved), risk genotype(s) (e.g., the variants in an individual’s personal genome), and the effect of genetic modifiers (e.g., other genetic variants with indirect influence on the principal genes). Identifying nongenetic factors that influence a genetic condition or trait is challenging, and for most conditions they, therefore, remain unknown. Moreover, without careful study design, the effects of environmental and genetic factors can often be conflated.

It should be further noted that although genetic variation can be critical to identifying disease mechanisms and interindividual trait differences, human biological processes are universal. For example, everywhere in the world, the same ocular biology and neural pathways underlie human vision (Chakraborty et al., 2020). The ingestion of lead produces the same biochemical effects in human bodies whether they are in Alaska or Zambia (Fu and Xi, 2020). Vaccines for coronavirus disease 2019 (COVID-19) work via the same immunological mechanisms in Peru or Poland (Sadarangani et al., 2021). While environmental factors as well as inherited local genetic variants may influence these processes, their physiological mechanisms are essentially the same. In other words, genetic variation is used to identify fundamental mechanisms that are biologically universal among humans—often even relevant to other species, including ones used as model systems (e.g., mice)—to understand human biology and medicine.

CLASSIFICATION OF GENOMICS STUDY TYPES

There is no one kind of genetics or genomics study; thus, it is helpful to consider the various classes of such studies, some of which have a long history of use while others are more recent. Such a categorization is also practical because each study type, with its different questions in mind, recruits study participants differently, and therefore may require tailored guidance to researchers on how to improve the use of population descriptors; in other words, there is no “one-size-fits-all” solution. The committee considers seven such archetypal studies, which are by no means an exhaustive list but serve to illustrate the different usages of population descriptors and highlight some of the considerations that should come into play in choosing a classification scheme for a study:

  1. Gene discovery for Mendelian traits: studies aimed at identifying the genetic basis (e.g., pathogenic variant) underlying Mendelian disorders or traits.
  2. Prediction for Mendelian traits: approaches that rely on the presence of a specific genotype to predict risk for or incidence of a Mendelian disease or specific outcome, as done in research settings or the clinical context of prenatal or newborn screening or presymptomatic testing.
    • Examples are newborn screening for phenylketonuria (PKU), sickle cell disease, and others (Watson et al., 2022) or analysis of BRCA1/2 mutation-associated tumors (e.g., Shah et al., 2022).
  3. Gene discovery for complex and polygenic traits: studies aiming to identify genetic variants associated with quantitative traits or complex disease risk, as done in genome-wide association studies (GWAS).
  4. Prediction for complex and polygenic traits: studies that aim to make probabilistic predictions about individual disease risk or traits based on genomic data.
    • Such studies often use “polygenic scores” (also called polygenic risk scores or polygenic indexes; e.g., Khera et al., 2018).
  5. Elucidation of molecular, cellular, or physiological mechanisms: studies using related or unrelated participants or cell lines derived from their biological tissues to understand molecular, cellular, or physiological mechanisms.
  6. Studies of health disparities with genomic data: elucidation of the role of genetic and environmental effects in how social disadvantage leads to health disparities.
  7. Studies of human evolutionary history: inferences about human evolutionary history using samples of related or unrelated participants.
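
The polygenic scores mentioned in the list above are, in their simplest form, weighted sums of allele counts. The sketch below assumes an additive score in which each variant's effect-allele dosage (0, 1, or 2 copies) is multiplied by a per-variant weight; the variant identifiers, weights, and function name are hypothetical, and real scores involve further steps (variant selection, weight estimation, calibration) that are not shown.

```python
from typing import Mapping


def polygenic_score(dosages: Mapping[str, int], weights: Mapping[str, float]) -> float:
    """Sum of effect-allele dosages weighted by per-variant effect sizes."""
    score = 0.0
    for variant, weight in weights.items():
        dosage = dosages.get(variant)
        if dosage is None:
            continue  # variant not genotyped for this person; skipped in this sketch
        if dosage not in (0, 1, 2):
            raise ValueError(f"dosage for {variant} must be 0, 1, or 2")
        score += weight * dosage
    return score


# Hypothetical weights and genotypes; not taken from any published score.
weights = {"var_1": 0.10, "var_2": -0.05, "var_3": 0.20}
person = {"var_1": 2, "var_2": 1, "var_3": 0}
score = polygenic_score(person, weights)
print(round(score, 2))
```

Because the weights are estimated from a particular study sample, the resulting number is a probabilistic summary and not a deterministic prediction for an individual.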

A series of population descriptors that could be tailored to specific types of genetics studies will be examined in Chapter 2, and best practices for the use of population descriptors will be discussed in Chapter 5 for each of the seven study types.5

___________________

5 The discussion of other types of genetics and genomics studies, such as those in forensics and genealogy reconstructions, is not a part of this study (see statement of task in Box 1-2).

FEATURES OF HUMAN GENOME VARIATION

By 2001, when the draft sequence of the human genome was reported (Lander et al., 2001; Venter et al., 2001), the tools developed to sequence the human genome and the resulting data were already transforming how genetics research could be done and enabling unprecedented characterization of patterns of human genetic variation (Aach et al., 2001; Birney et al., 2001; Lander et al., 2001). The sequence of the first reference genome was quickly followed by a number of efforts to characterize human genetic diversity such as the International Haplotype Map Project (HapMap), Genome Aggregation Database (gnomAD), and the 1000 Genomes Project (1000G). These efforts, and the subsequent debates over the sampling and applicability of a limited number of reference populations, led to grappling with the use of population descriptors, specifically race, ethnicity, and ancestry. These projects would confirm the high levels of genetic similarity among humans across the globe and the poor correspondence between racialized groups and the distribution of human genetic variation (Lewontin, 1972). In brief, scientists’ current understanding of the distribution of human genetic variation and its evolutionary origins is that

  • Anatomically modern humans arose somewhere in the African continent approximately 300,000 years ago (Hublin et al., 2017). Their descendants expanded across much of the rest of the world within the past 100,000 years, giving rise to all modern humans today (Mallick et al., 2016; Nielsen et al., 2017).
  • Mating between members of human groups occurred repeatedly throughout evolution, from interbreeding that occurred with archaic forms of humans (e.g., Neanderthal and Denisova) (Narasimhan et al., 2019; Pääbo, 2014), to gene flow between various human groups throughout the world (e.g., Gomez et al., 2014; Reich, 2018).
  • Allele frequencies over time and space diverge gradually, owing to random fluctuations (known as genetic drift) and changes caused by natural selection, and are made more similar by gene flow (Novembre and Di Rienzo, 2009). As a result of the relatively recent common origin of modern humans and the repeated mixing of groups, the alleles carried by people living all over the globe show little differentiation:
    • Levels of genetic diversity in humans are low compared to those of many other species: pairs of chromosomes differ only at approximately 1 in 1,000 sites in humans (Leffler et al., 2012), in contrast to 1 in 100 sites in the fruit fly Drosophila melanogaster and 3 in 1,000 sites in the chimpanzee (Leffler et al., 2012).
    • Alleles that are common in one population are typically shared across multiple populations, as they tend to be older. Variants that are rare in a population tend to be recent and are usually found much more locally—for example if very rare, only among close relatives (Biddanda et al., 2020).
  • Human allele frequencies tend to vary continuously with geographic distance (isolation by distance), with slightly larger differences seen across long-term inhibitors of migration such as oceans or mountains (Rosenberg, 2021). These geographic boundaries do not correspond to racial groupings.
  • Even when differences at any given locus are subtle, information from many loci can be aggregated to make each human genome recognizably unique and to assess an individual’s genetic similarity to others (e.g., Figure 2, Novembre and Peter, 2016). This similarity measure is often paired with geographic or other labels from genetically similar individuals in order to assign the individual to a single or multiple groupings (e.g., a method might assign a single geographic population designation or model an individual as a mixture of different “ancestry clusters”; see Chapter 2).
  • In some regions of the genome, allele frequencies also vary geographically because a variant contributes to adaptation to past or present local environments (Novembre and Di Rienzo, 2009). Where selection on an individual locus was strong and sustained over hundreds of generations, these allele frequency differences can be larger than is typical in the genome. In humans, there are very few cases where one allele is present at very high frequency across a broad-scale geographic region but not shared elsewhere in the world, except at loci such as those that contribute to infectious disease susceptibility (e.g., the Duffy null allele at the Duffy gene) (Hamblin and Di Rienzo, 2000).
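
The idea in the list above that information from many loci can be aggregated to assess genetic similarity between individuals can be sketched as follows. The example assumes genotypes coded as counts of one allele (0, 1, or 2) at each locus and uses mean identity-by-state sharing as the similarity measure; this is one simple choice, not the method of the cited studies, and the function name and genotype vectors are hypothetical.

```python
from typing import Sequence


def ibs_similarity(g1: Sequence[int], g2: Sequence[int]) -> float:
    """Mean identity-by-state sharing between two individuals.

    At each locus the pair shares 2 - |g1 - g2| alleles (0, 1, or 2);
    the result is the shared fraction averaged over all loci.
    """
    if len(g1) != len(g2) or len(g1) == 0:
        raise ValueError("genotype vectors must be non-empty and of equal length")
    shared = sum(2 - abs(a - b) for a, b in zip(g1, g2))
    return shared / (2 * len(g1))


# Hypothetical genotypes at four loci; real analyses use many thousands of loci.
person_a = [0, 0, 2, 1]
person_b = [2, 0, 1, 1]
print(ibs_similarity(person_a, person_b))
print(ibs_similarity(person_a, person_a))
```

A similarity computed this way is a continuous quantity between pairs of individuals; it does not by itself define group boundaries.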

POPULATION CLASSIFICATION SCHEMES IN GENETICS AND GENOMICS RESEARCH

The Origins of Describing Individuals and Populations in Human Genetics

Human genetics research was propelled by the discovery of interindividual differences in blood transfusions by Karl Landsteiner in the early 1900s, and his subsequent demonstration that the bloods of humans can be classified into what we now call the A, B, AB, and O groups (Landsteiner, 1961). Importantly, as early as 1901–1903, he had also suggested that the characteristics that determine blood groups were inherited (Nobel Prize Outreach AB, 2022). Shortly after, in 1910–1911, Emil von Dungern and Ludwik Hirschfeld showed, using families of the teaching staff of Heidelberg University, that Landsteiner’s normal human serological features were inherited in a Mendelian pattern (von Dungern and Hirschfeld, 1962), thus making ABO the first known common human genetic trait (Bugert and Klüter, 2012).

In 1919, Ludwik and Hanka Hirschfeld recorded the ABO blood types in more than 8,000 soldiers and refugees on the Macedonian front during World War I (Hirschfeld and Hirschfeld, 1919). These studies were the first to demonstrate differences in the frequencies of blood group alleles (variants), in mostly unrelated individuals, across different populations whom the authors refer to as “races” (Hirschfeld and Hirschfeld, 1919). The population descriptors used were highly varied and included a mix of continental, geographic, and religious labels (e.g., Europeans, Indians, and Jews). This study on a very large sample of heterogeneous individuals, which also showed geographic patterns of east–west and north–south blood group allele frequency variation, became highly influential in anthropology and human genetics by suggesting widespread allele frequency differences in human populations (Hirschfeld and Hirschfeld, 1919). By 1977, Arthur E. Mourant, a British hematologist and geneticist, had updated his compilation titled The Distribution of the Human Blood Groups to include genotype and allele frequency data from hundreds of thousands of samples collected across the globe (Mourant, 1977). These samples were also identified by a dizzying array of terms meant to signify their origin.

The choices of population descriptors used by twentieth-century scientists were consistent with a long-standing European and U.S. belief that human beings are naturally divided into biologically distinguishable races (Gossett, 1997; Hammonds and Herzig, 2009; Keel, 2018; Painter, 2010). The categorization of human beings into races was integral to settler colonialism and slavery, and simultaneously became foundational to scientific thinking in the United States (Frederickson, 2002; Higginbotham et al., forthcoming; Roberts, 2011; Smedley and Smedley, 2012; TallBear, 2013). For example, prominent nineteenth-century scientists such as Harvard biologist Louis Agassiz and Samuel Morton, president of the Academy of Natural Sciences in Philadelphia, promoted the white supremacist view that human beings were divided into unequal racial groups that descended from separate origins (Gould, 1996). These ideas continued to influence U.S. science after the Civil War and persisted through the eugenics era in the twentieth century into the twenty-first century (Graves, 2001; Reardon, 2009; Roberts, 2011; Zuberi, 2003). Some of the harmful scientific and societal practices of the eugenics era are described in a recently released statement and report6 from the American Society of Human Genetics, acknowledging and apologizing for the involvement of some of its early leaders in the American eugenics movement.

Classifying people by race has been essential to institutional racism and tightly interwoven into political, economic, legal, scientific, and social practices in the United States. Race was “baked into” the very first instruments of governance in the United States, from its first census to its first law governing who could become a citizen (both in 1790). Under the Jim Crow regime, which extended from the end of Reconstruction to the Supreme Court’s 1954 Brown v. Board of Education decision and the passage of federal civil rights legislation, many states maintained rigid racial classification systems to help enforce de jure segregation (Dorr, 2008; Pascoe, 2009). The civil rights movement of the 1950s through the 1970s also shaped the scientific use of race, ethnicity, and ancestry as descriptors. New federal legislation, including the Civil Rights Act of 1964, the Voting Rights Act of 1965, the Fair Housing Act of 1968, and the Home Mortgage Disclosure Act of 1975, required many federal agencies to monitor discrimination and to do so meant classifying people, typically into racial and ethnic categories. In 1977, the Office of Management and Budget (OMB) issued Statistical Directive 15–Race and Ethnic Standards for Federal Statistics and Administrative Reporting to standardize federal agencies’ recordkeeping, collection, and presentation of data on race and ethnicity (equated with Hispanic origin), including its use on the census (OMB, 1977, 1997).

___________________

6 See https://www.ashg.org/wp-content/uploads/2023/01/Facing_Our_History-Building_an_Equitable_Future_Final_Report_January_2023.pdf (accessed January 25, 2023).

OMB Directive 15 has had widespread effects because its racial and ethnic categories have been used widely across government and the private sector, including by many scientific researchers in genetics and genomics (Kahn, 2006; Nobles, 2000). This is in part because the NIH and other federal research agencies require OMB-based racial and ethnic information collection in funding proposals and applications for purposes of inclusion (Epstein, 2007). Although OMB Directive 15 is clear that race is a social—and not a biological—classification, this categorization is frequently applied as if it were biological. Thus, the institutional demand for biomedical research to become more inclusive has led to many U.S.-based genetics and genomics research projects collecting OMB ethnic and racial category-based information on study participants, including measurement of biological differences between these groups (Epstein, 2007).

Race and racism also continue to figure in genomics research because many scientists hold the view that race is a biological category or that race is a useful proxy for human biological variation. Scientists not only learn biological concepts of race in their professional training but also, like the rest of U.S. society, are exposed from the earliest ages to racial concepts and practices (Morning, 2011). Racial taxonomy becomes a familiar way of seeing and describing the world, one that is taken for granted and presumed to be “natural” and objective (Hirschfeld, 1996; Hirschfeld and Gelman, 1997; Obasogie, 2010; Van Ausdale and Feagin, 1996). This framework has made its way unnoticed into the design and execution of scientific research. For example, a study by Fujimura and Rajagopalan (2011) highlights how despite the development of new technologies focused on genetic similarity that would preclude the need for pre-labeling populations, the use of terminology such as “ancestry” or “shared ancestry” in genetic analyses can lead to slippage toward racial concepts. In some cases, certain tools and practices currently used in genomics research can blur the differences between ancestry and race (Fullwiley, 2008). In their study of geneticists’ population labeling practices, Panofsky and Bliss (2017) found “persistent and indiscriminate blending of classification schemes” that has made the definition of population in genetics “more ambiguous rather than standardized over time” (p. 59). The outsized influence of U.S.-based research on scientific practice worldwide, moreover, means that Americans’ widespread exposure to racial thought, discourse, and institutions is transmitted to scientists around the globe.

A complete history of the use of population descriptors in human genetics and the early and persistent use of race in science is beyond the scope of this report and outside of the committee’s statement of task. The brief summary provided here is meant only to emphasize several important points. First, early studies, like those by the Hirschfelds, used population descriptors of many categories—racial, continental, ethnic, religious, and more—in ways that imply an interchangeableness among them when none may have existed. Second, the biological concept of race in humans was created to support settler colonialism and slavery, and has always been entangled with racist institutions, policies, and practices. The use of race as a population descriptor in scientific research therefore has caused incalculable confusion and harm. Third, the federal requirement to use OMB categories in many contexts perpetuates the institutional racism, confusion, and harm caused by false concepts of race as a biological grouping. Fourth, racist concepts of race that are deeply embedded in science and U.S. society more broadly continue to affect scientific thinking and research. Scientists must critically examine the underlying assumptions about race—and human commonality and difference—that shape their research studies. For a more complete history of population descriptors in genetics, and for a deeper understanding of the history of the race concept and the intersections of race, science, and society, see the list of references in Box 1-1.

Local and Global Contexts

The conceptualization of “American” as an equivalent to being from the United States has led to the use of derived terms such as African American, European American, and Native American. This terminology has been adopted by the genetics community and applied in many population genomics studies (Bryc et al., 2015; Kidd et al., 2012; Price et al., 2007; Ruiz-Linares et al., 2014; Williamson et al., 2007). However, outside the United States these terms do not have the same context and may imply different meanings, as the adjective American has a geographic reach across the North and South American continents, meaning the Americas, rather than a national one (the United States). It is important to move away from U.S.-centric definitions when working with global populations and to be aware of the historical use of alternative descriptors, in order to reach a consensus that embraces diversity while describing populations accurately for scientific purposes.

Population group classifications are context specific and vary globally. For example, consider the classifications used in ongoing studies from three different countries: the United Kingdom (UK) Biobank, the South African HAALSI study, and the Brazilian BIPMed study (Table 1-1). All population descriptors vary with each study and are not interchangeable. The descriptors are context specific for those regions, and some involve language groups, country of origin, background, or geographic region.

The UK Biobank (UKB) is a biomedical database of genetic and health information from 500,000 participants living in the United Kingdom.7 The data in the UKB are globally accessible to approved researchers undertaking studies related to health and disease. Thus far, there are over 30,000 global registrations (80 percent from non-UK investigators) (UK Biobank, 2022) and over 5,000 scientific papers published (Conroy et al., 2022). The population descriptors used in the UKB include labels such as white, mixed, and so on, as outlined in Table 1-1.

The Health and Aging in Africa: A Longitudinal Study of an INDEPTH Community in South Africa (HAALSI) study includes a community-based cohort of 5,059 men and women 40 years old or older.8 Study data were collected around the following areas: cognition and dementia, cardiometabolic disease, human immunodeficiency virus (HIV) and treatment, public policies and health, and multimorbidity. While no population descriptor labels are used in the study, data on country of origin were collected (Gómez-Olivé et al., 2018), and in the second wave of the survey, questions related to the languages the participants spoke were included (Berkman, 2020).

The Brazilian Initiative on Precision Medicine (BIPMed) is an initiative of five research, innovation, and dissemination centers funded by the São Paulo Research Foundation (FAPESP) (Rocha et al., 2020). The five centers share data to create BIPMed, which provides genomic and phenotypic information to the global research community. BIPMed investigates the distribution of rare and common variants within two BIPMed data sets including the Brazilian population from the metropolitan area of São Paulo. The Brazilian population structure derives from African, European, and Native American populations (de Moura et al., 2015; Mychaleckyj et al., 2017). In the BIPMed study, however, the team decided to use the geographic regions where individuals were born as population descriptors because this was more relevant for their study; they noted that two regions were not well represented in the data (Rocha et al., 2020).

Challenges with Legacy Data and Harmonization

In an effort to establish uniformity in the use of population descriptors across the globe, several international organizations including the United Nations (UN) and the European Commission have issued recommendations for their member states’ census or other data collection efforts related to race and/or ethnicity (Farkas, 2017; UN, 2017). The UN, for example, includes guidance on data collection for ethnic and/or national groups, one of which is to consult with groups that will be categorized. The guidance also notes the diversity of categories and terminology across countries and states that “no internationally accepted criteria are possible” as a result (UN, 2017).

___________________

7 For more information on the UK Biobank see https://www.ukbiobank.ac.uk (accessed November 3, 2022).

8 For more information on HAALSI see https://haalsi.org/data (accessed November 3, 2022).

TABLE 1-1 Comparison of Classification Schemes Used in Three Studies Using Genetics from Three Distinct Global Contexts^a

| UK Biobank | HAALSI | BIPMed |
|---|---|---|
| White: British; Irish; Any other white background | Native Language: Shangaan; English; Afrikaans; Zulu; Xhosa; Portuguese; Other | Geographic Regions in Brazil where participants were born: North; Northeast; Centre West; Southeast; South; Unknown |
| Mixed: White and black Caribbean; White and black African; White and Asian; Any other mixed background | | |
| Asian or Asian British: Indian; Pakistani; Bangladeshi; Any other Asian background | | |
| Black or black British: Caribbean; African; Any other black background | | |
| Chinese | | |
| Other ethnic group | | |
| Do not know | | |
| Prefer not to answer | | |

^a A more extensive, yet still not exhaustive, list of international programs and the population descriptors they use can be found in Appendix C.

NOTE: BIPMed = Brazilian Initiative on Precision Medicine; HAALSI = Health and Aging in Africa: A Longitudinal Study of an INDEPTH Community in South Africa; UK = United Kingdom.

SOURCES: https://www.ukbiobank.ac.uk; Berkman, 2020; Rocha et al., 2020.

Researchers have noted the challenges of harmonization across countries. A study of 138 national censuses conducted around the world in the 1995–2004 period found that 63 percent included some kind of descent-associated question, including those on “race,” “population,” “tribe,” and “caste” (Morning, 2008). In other words, ethnoracial items were far from universal on censuses worldwide. Even nations that did count their populations by ethnicity or race used a wide-ranging set of categories, such as “Kankanaey” in the Philippines or “Rotuman” in Fiji, that did not necessarily overlap with labels used elsewhere. In short, geographic variation in the descent-associated groups that are salient, as well as in the underlying classificatory concepts, practices, and norms that are valued, implies that a single, universal standardization is likely infeasible. In addition, any attempt to impose a standard global framework of population descriptors runs the risk of being detrimental or viewed unfavorably in many locales (Bourdieu and Wacquant, 1999; Onishi and Méheut, 2021; Wimmer, 2015).

Despite the global variation in these systems, in recent years, there has been a growing need in genomics to analyze multiple data sets across studies to increase statistical power and to make cross-study comparisons. However, heterogeneity among studies in their design, recruitment methods, population descriptors, and measurements makes it difficult to easily compare and combine the data and metadata from multiple studies. Challenges of data harmonization include how to deal with missing data or how to compare or aggregate data and metadata in which similar but nonidentical terms are used.

The goal of harmonizing population descriptors is to bring disparate classification systems into greater alignment for specific research goals. Even within a single country, many studies have different recruitment processes and reasons for their selection of population descriptors. The differences lie not only in the specific labels used but also in the underlying concepts represented. In addition, when harmonizing population descriptor data, it is challenging to reconcile cross-study differences in scale, resolution, or descriptors used, or to work with studies that use the same term but define it differently. Existing legacy data often pose additional complications; for example, because some legacy data sets were collected before standards for data sharing were established, there may be uncertainty around whether these data meet current ethical or scientific standards.

As there is no universal system of descriptors, tools and strategies are needed to harmonize them—that is, to reduce heterogeneity—when looking at data across studies. Data harmonization strives to aggregate data from multiple cohorts and/or biobanks to a degree that is scientifically adequate yet acknowledges the heterogeneity among the data sets. There are two main harmonization methods: prospective and retrospective harmonization. Prospective approaches establish standard procedures prior to data collection, making aggregation and comparison considerably easier. One such approach is using common data elements, also called CDEs, which are standardized pieces of information collected as part of a study. However, prospective methods are not always feasible, especially when using existing data sets. Thus, other investigators use retrospective methods to integrate data sets after collection.
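One way to picture retrospective harmonization is a small crosswalk applied after data collection. The Python sketch below is illustrative only: the `CROSSWALK` table, the study names, the variant spellings, and the `harmonize` helper are all hypothetical and are not drawn from any study discussed in this chapter. It aligns similar but nonidentical labels for a single concept (region of birth), keeps each original label next to the harmonized one, and leaves unrecognized labels unmapped rather than guessing.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical crosswalk: normalized variant spelling -> agreed label.
# Every key and value here is an assumption made for illustration.
CROSSWALK = {
    "centre west": "Centre West",
    "center-west": "Centre West",
    "northeast": "Northeast",
    "north east": "Northeast",
    "southeast": "Southeast",
}


@dataclass(frozen=True)
class HarmonizedLabel:
    study: str                  # which data set the record came from
    original: str               # label exactly as collected
    harmonized: Optional[str]   # agreed label, or None if no mapping exists


def harmonize(study: str, label: str) -> HarmonizedLabel:
    """Look up a label after normalizing case and whitespace only."""
    key = " ".join(label.lower().split())
    return HarmonizedLabel(study, label, CROSSWALK.get(key))


records = [
    harmonize("study_a", "Centre West"),
    harmonize("study_b", "center-west"),
    harmonize("study_b", "Midwest"),  # no documented mapping
]

# Unmapped labels are surfaced for review instead of being forced into a category.
unmapped = [r for r in records if r.harmonized is None]
```

Keeping the original label in every record means the crosswalk can be revised later without returning to the source data, and the list of unmapped labels makes the remaining heterogeneity explicit.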

ATTEMPTS TO ADDRESS THE USE OF RACE, ETHNICITY, AND ANCESTRY IN THE GENOMIC ERA

Advances in the measurement of human genetic variation and subsequent debates over the sampling and applicability of reference populations have led many in the research ecosystem to grapple with the use of population descriptors, especially race, ethnicity, and ancestry. For more than 20 years, numerous articles have been published and workshops held to discuss these implications, including calls for “a new vocabulary of human genetic variation” (Sankar and Cho, 2002) and the establishment of guidelines for using racial, ethnic, and ancestral categories in human genetics research (Bonham et al., 2018; Caulfield et al., 2009; Flanagin et al., 2021; Khan et al., 2022; NIMHD, 2017; Takezawa et al., 2014; Yudell et al., 2020). Yet, two decades later, use of these descent-associated population descriptors in genetics research remains largely unchanged and controversial.

One impetus for the urgency twenty years ago arose from the rapid technological advancements that made possible whole genome analyses of genetic variation on large numbers of samples. This raised the concern that, without thoughtful guidance, classical and stereotypical views of race and ethnicity would be exacerbated by genome analyses. In 2002, Sankar and Cho published an article on the use of race as a research variable in the study of human genetic variation (Sankar and Cho, 2002). They argued that researchers need to be more thoughtful, deliberate, and precise when designing a study, analyzing the results, and reporting the findings. The authors close their article with an appeal to researchers:

It is imperative for the research community to acknowledge that the maps used in research are not the only maps used to describe the terrain they study and that careful use of language is necessary to avoid misunderstanding (Sankar and Cho, 2002, p. 1338).

Other studies have focused on why it is difficult to effect change. For example, Caulfield et al. (2009) underscore how researchers work within structures that have been defined by the complex history of race and institutional racism. The obstacles they highlighted were the requirement to use federal directives like the OMB Directive 15 categories of race and ethnicity for reporting; the media’s tendency to simplify scientific findings and use race and ethnicity as proxies without explaining how the social categories relate to the research design and results; and the qualities of race, its fluidity, ambiguity, and contingency, which make it difficult to define neatly (Caulfield et al., 2009).

The appropriate use of population descriptors in genomics research is a global issue, not one limited to the United States (Mir et al., 2013). Following a series of workshops held in 2011 and 2012 in Japan, attendees noted that continental labels, such as European, African, and Asian, are tremendously broad, and that among Japanese researchers at the workshop, there was no consensus on what populations should be called Asian. The authors also pointed out, as have others, that when samples are given continental labels but are drawn from limited and specific groups, and there is no attempt to account for the “significant diversity within each region,” then the findings may not generalize to the larger group (Takezawa et al., 2014). They closed with recommendations that echo many of those from other researchers around the world, among them

  • Respect cultural preferences in labeling processes, and use names that reflect ethnic or cultural backgrounds as much as possible.
  • Use categories that are more specific to avoid misinterpretation of results as emphasizing “racial” categories.
  • Underscore that genetic and trait differences among populations do not reflect discrete differences but rather frequency or probability.
  • Develop a clear summary of research findings to aid journalists in reporting appropriate population descriptors.

In 2016, the National Human Genome Research Institute and National Institute on Minority Health and Health Disparities (NIMHD) hosted a workshop on the use of race and ethnicity data in biomedical and clinical research and how the data are and should be applied to research on minority health and health disparities (Bonham et al., 2018; NIMHD, 2017). A partial summary of the workshop’s themes and recommendations includes

  • Collect data across multiple dimensions, including self-identified race and ethnicity, race and ethnicity description by others, how individuals perceive others to view their race and ethnicity, self-identified ancestry, and genetic ancestry.
  • Update OMB categories, including disaggregating South Asian from other Asian, adding categories to describe individuals from the Middle East/North Africa, adding a category for individuals native to the United States, including an option for multiracial description, adding parent and grandparent self-identified race and ethnicity, including variables to capture sociodemographic data, and updating questions that capture information related to historical racial narratives.
  • Educate the public on the purpose of, and misconceptions about, data generated from race-associated biomedical genomics research and distinguish genetic ancestry data from sociopolitical or culturally based racial self-identification. Consider ways to improve clinician and medical student education in human population genetics.
  • Work to improve the accessibility and comparability of race and ethnicity data via the standardization of analysis, tagging, and data reporting; harmonization of methods for data collection, analysis, and reporting; communication of community-based research incorporating race and ethnicity data with study participants; and collaborative efforts to standardize race and ethnicity descriptors in electronic health records (NIMHD, 2017).

In concluding their 2018 paper, Bonham et al. (2018) noted:

Genomic knowledge has not changed the need to move beyond the misuse of social categories of race and ethnicity as a proxy for genomic variation. The challenge that scientists and medical journal editors must address is how to report human genomic variation without inappropriately describing racial and ethnic groups as discrete population groups (p. 1534).

The National Heart, Lung, and Blood Institute Trans-Omics for Precision Medicine (TOPMed) program collects and analyzes whole-genome sequencing and other -omics data (e.g., RNA, proteins, metabolites) with a wide range of basic and clinical data on heart, lung, blood, and sleep disorders. The program has over 180,000 participants, of whom 60 percent are of non-European descent.9 TOPMed researchers have recently provided recommendations on using and reporting population descriptors for race, ethnicity, and ancestry in genomics research, including ones that acknowledge the expanding global nature of genomics research and the current focus in the United States on reckoning with racism (Khan et al., 2022):

  • Avoid using U.S. racial categories to describe study participants not in the United States.
  • Retain detailed population data, if possible, rather than lumping individuals in broader categories early on in the process.
  • Understand the potential benefits and harms of analyzing populations before deciding whether to conduct or how to conduct the study.
  • Recognize the interdisciplinary work already being done on health care disparities when using these as a justification for genomics research.
  • Follow community preference and study-specific reporting guidelines when describing study populations.
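The recommendation to retain detailed population data rather than lumping early can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the participant records are hypothetical, the `tabulate` helper is invented for this example, and the partial `BROADER` lookup simply follows the nesting shown in the UK Biobank column of Table 1-1. It is not a description of how TOPMed or the UK Biobank process their data.

```python
from collections import Counter

# Partial lookup from detailed label to broader heading, following the
# UK Biobank column of Table 1-1. Used only at analysis time.
BROADER = {
    "British": "White",
    "Irish": "White",
    "Indian": "Asian or Asian British",
    "Pakistani": "Asian or Asian British",
    "Bangladeshi": "Asian or Asian British",
}

# Hypothetical records: the detailed descriptor is what gets stored.
participants = [
    {"id": "p1", "descriptor": "Indian"},
    {"id": "p2", "descriptor": "Pakistani"},
    {"id": "p3", "descriptor": "Irish"},
    {"id": "p4", "descriptor": "Prefer not to answer"},
]


def tabulate(records, grouping=None):
    """Count by detailed descriptor, or by broader heading if a grouping is given.

    Labels absent from the grouping are counted under their own name.
    """
    if grouping is None:
        return Counter(r["descriptor"] for r in records)
    return Counter(grouping.get(r["descriptor"], r["descriptor"]) for r in records)


detailed = tabulate(participants)        # full resolution
broad = tabulate(participants, BROADER)  # aggregated only for this analysis
```

Because aggregation happens in `tabulate` and not at storage time, the stored records keep their full resolution, and a different grouping (or none) can be chosen for each analysis.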

___________________

9 For more information on TOPMed see https://topmed.nhlbi.nih.gov/ (accessed December 9, 2022).

Despite these and many other efforts, there has been little significant change in the confusing and damaging uses of race, ethnicity, and ancestry as population descriptors in genetics and genomics. In particular, scientists continue to debate whether race is a useful proxy for unmeasured biological differences in human beings—a debate that is fueled by deeply embedded, and often unexamined, biological concepts of race (Nelson et al., 2019; Wagner et al., 2017). Furthermore, scientists are part of a research enterprise whose members (e.g., journal editors, funders, research institutions) to date have failed to effectively coordinate their efforts in developing and implementing transformative policies and practices. Success requires a collective will to confront and resolve the inevitable challenges, change current ways of thinking and doing, and enrich science and society. The committee suggests a path forward in this report.

WHY IS THIS STUDY IMPORTANT? WHY ANOTHER STUDY? WHY NOW?

While this history of prior attempts to address population descriptors may create some skepticism about the usefulness of another report aiming to create best practices for this complex area, there are several reasons that this is a particularly opportune and important moment to offer concrete guidance to the research community.

Research using human genetic data has grown exponentially over the last decade. Once largely the province of geneticists, the use of genetic information is now widespread across biomedical research and requires new thinking by all researchers. In addition to a general appreciation of the importance of genetic variation in human disease and health, and the reduction in the cost of and widespread access to genomic technologies, this growth has been driven in part by major investments in large-scale studies, many of which have genomic sequence data available. With this growth, genetics research is now conducted by a wide range of investigators, many of whom have a limited understanding of the rationale for and use of population descriptors in human genetics, particularly its history. This both exacerbates the risk of misuse of such descriptors and creates an important opportunity to implement substantive changes. Projects such as NIH’s All of Us Research Program, the Million Veteran Program, and many others will further democratize access to genomic data for clinical research and accelerate this transformation. While some early genetics research included groups of individuals that have relatively high genetic and environmental similarity (e.g., inhabitants of Iceland, Amish residents of the United States) or conducted pedigree studies (Francomano et al., 2003), recent large-scale efforts are enrolling more cosmopolitan and more diverse populations (Morales et al., 2018; Zhou et al., 2022), raising more questions about how best to represent their diversity in the study data. Clear guidance about the use of population descriptors is therefore urgently needed before the mistakes of the past are baked into this new era of genetics research.

With this growth in genetics research has come the development of more advanced methods of understanding and describing population structure and variation, as well as a growing clarity about the contribution of such methods to elucidating the relationship between genetic variation and human traits and health outcomes. Methods to assess genetic similarity and infer genetic ancestry have been developed as have nongenetic approaches, such as geospatial mapping of study participants to states/provinces, cities, and neighborhoods. These advances have been accompanied by the growing recognition of the importance of social and physical environmental factors in health generally, and in modifying the relationship between genotype and disease more specifically (All of Us Research Program Investigators, 2019; Davidson et al., 2022). The importance of these factors has led to new efforts to develop and implement environmental measures in many fields, including in genetics.

These advances have not been accompanied, however, by new approaches to the use of population descriptors in genetics and genomics research. In the absence of a strong and widely disseminated conceptual framework to guide the use of population descriptors, researchers often assume that the only issue is one of finding the “correct” nomenclature for the groups whose data they analyze. This report aims to break new ground by distinguishing the fundamental conceptual decisions that genetics researchers must grapple with explicitly when they employ population descriptors from the choices of terminology they face. In other words, the committee emphasizes that scientists must get the descent-associated concepts right (that is, have a clear understanding of what these descriptors represent and a rigorous rationale for using them) before selecting the appropriate group categories and labels to work with. Without a deliberate, reasoned, and transparent deployment of population descriptors, human genetics and genomics studies are likely to fall into the same trap as in the past: unwarranted typological thinking that reinforces long-standing prejudices about the characteristics of descent-associated groups.

Since 2020, the U.S. scientific community has become more attentive to the urgency of addressing racism and the lack of diversity in science and has acknowledged that little progress has been made in making science accessible and relevant to a more diverse citizenry (Yudell et al., 2020). Research universities embarked on efforts to address diversity, equity, and inclusion in their scientific and educational programs. The social construct of race, the role of intersectionality, and the fundamental effect of racism on all aspects of science and medicine have become parts of faculty trainings at many institutions (Dupree and Boykin, 2021; Holdren et al., 2022; Kossek et al., 2022).

Journal editors recognized the problems of using racial labels in research studies, with growing calls for eliminating the use of high-risk proxy measures (Flanagin et al., 2021; Nature Human Behaviour, 2022). The call to remove race from clinical prediction models, like glomerular filtration rate, spread rapidly because of the attention to the danger of false assumptions about innate racial differences and resulting harms to patients (Vyas et al., 2020). Recognition by the U.S. biomedical research community of the need to address the complex and important issue of population descriptors in genetics research has never been greater.

This Report’s Audience

Given the charge, the committee notes that the primary audience for the report is researchers who use genomic data. However, the committee recognizes that many of the recommendations and concepts presented in the report will be beneficial to the broader biomedical and social science research communities. One of the foundational tenets of the report is the need for all researchers to be intentional about which population descriptors they choose and how they use and describe descriptors in their research. Furthermore, research is increasingly multidisciplinary; thus, the recommendations in this report could be useful for investigators interested in using biological data that may not necessarily have a genetic component. The chapters of the report reflect the complexity of the task the committee was charged with and the report’s diverse audience. Chapter 5 includes a somewhat technical discussion on how to select appropriate population descriptors for genetics research, and there, the primary audience is genetics and genomics researchers. Chapters 3 and 4 focus on guiding principles to support trustworthy research and requisites for change that could facilitate implementation of the recommendations in the report. The committee notes that these two chapters are intended for a more general audience. Finally, to achieve lasting change, the recommended actions in the report will need support from a broad and multidisciplinary group of relevant parties. Chapter 6 includes recommendations for implementation and highlights the roles that study participants; funders of genetics and genomics research; professional societies and research journals; journalists, media, and researchers; and research institutions can play in conjunction with researchers to operationalize the report’s recommendations.

WHAT IS THE GOAL OF THIS REPORT?

Given this background, NIH asked the committee to review and assess the existing methodologies, benefits, and challenges in using race, ethnicity, and other population descriptors in genomics research (see Box 1-2 for the full statement of task). Fourteen different institutes, programs, and offices within NIH sponsored and funded the study. The statement of task focuses on understanding the current use of population descriptors in genomics research; examining best practices in the use of race, ethnicity, and genetic ancestry as population descriptors; and identifying how best practices in the use of population descriptors could be widely adopted within the biomedical and scientific communities to strengthen genetics and genomics research. The statement of task identifies four areas that are beyond the scope of this consensus study: examining the use of race and ethnicity in clinical care; examining racism in science and genomics; examining the use of race and ethnicity in biomedical research generally (e.g., beyond genetics and genomics research); and providing policy recommendations to NIH and government agencies. To accomplish the task, the National Academies convened a committee of 17 members representing diverse areas of expertise, including human genetics; clinical genetics; population genetics; statistical and computational genetics and genomics; historical, ethical, legal, and social implications research; sociology and anthropology; and demography and population statistics (see Appendix E for the committee biographical sketches).

During the committee’s first open meeting, NIH delivered the charge to the committee and clarified information related to the statement of task and the project scope (see Appendix A for the public session agendas). NIH specified that while it would be outside the scope of the committee’s work to develop recommendations for the four areas listed in the statement of task as being beyond the scope of this study, discussion and awareness around these topics are necessary to formulate thoughtful recommendations. NIH also clarified that while examining the use of race and ethnicity in clinical care is outside the scope of the committee’s work, clinical research using genomic data would be within the scope of the report. Furthermore, representatives said that discussing issues such as the effects of systemic racism in the field of genomics could be a useful context for addressing study design recommendations. NIH also reiterated that the recommendations and best practices identified by the committee over the course of the study would be beneficial for the broader scientific and genomics research communities (as opposed to government agencies) and that the committee should have this audience in mind. NIH acknowledged that the consideration and use of population descriptors is quickly evolving in the scientific community and indicated that it would be useful to identify a framework and principles for considering race, ethnicity, and other population descriptors in genomics research.

The statement of task emphasizes the use of appropriate and valid population descriptors in genomics research. The potential benefits and harms of past and current population descriptors used in genomics research are discussed at length in this report (see Box 2-1 for key definitions). The committee is mindful that the use of population descriptors including race, ethnicity, and genetic ancestry in genomics research is currently nonstandardized and is influenced by factors such as government categories and journal reporting guidelines. Categories of race and ethnicity, as constructs of social identity and culture, have had long-standing historical implications for individuals in the United States and globally, to the marginalization of some and the benefit of others. Genomics research takes place within this context, and social identity may vary from one research participant to the next. The committee is also mindful that additional variation in the use of population descriptors occurs in research studies outside of the United States, and best practices for genomics research might need to be applied differently there. See Appendix A for details of the study methods.

REFERENCES

1000 Genomes Project Consortium. 2015. A global reference for human genetic variation. Nature 526:68-74.

Aach, J., M. L. Bulyk, G. M. Church, J. Comander, A. Derti, and J. Shendure. 2001. Computational comparison of two draft sequences of the human genome. Nature 409(6822):856-859.

All of Us Research Program Investigators. 2019. The “All of Us” Research Program. New England Journal of Medicine 381(7):668-676.

Anderson, M. J. 1988. The American census: A social history. New Haven and London: Yale University Press.

Angers, B., M. Perez, T. Menicucci, and C. Leung. 2020. Sources of epigenetic variation and their applications in natural populations. Evolutionary Applications 13(6):1262-1278.

Belsky, D. W., A. Caspi, L. Arseneault, D. L. Corcoran, B. W. Domingue, K. M. Harris, R. M. Houts, J. S. Mill, T. E. Moffitt, J. Prinz, K. Sugden, J. Wertz, B. Williams, and C. L. Odgers. 2019. Genetics and the geography of health, behaviour and attainment. Nature Human Behaviour 3(6):576-586.

Berkman, L. 2020. Health and Aging in Africa: A Longitudinal Study of an INDEPTH Community in South Africa [HAALSI]: Agincourt, South Africa, 2015-2019. Inter-university Consortium for Political and Social Research, November 11. https://doi.org/10.3886/ICPSR36633.v3.

Biddanda, A., D. P. Rice, and J. Novembre. 2020. A variant-centric perspective on geographic patterns of human allele frequency variation. eLife 9.

Birney, E., A. Bateman, M. E. Clamp, and T. J. Hubbard. 2001. Mining the draft human genome. Nature 409(6822):827-828.

Bliss, C. 2020. Conceptualizing race in the genomic age. The Hastings Center Report 50 Suppl 1:S15-S22.

Bonham, V. L., E. D. Green, and E. J. Pérez-Stable. 2018. Examining how race, ethnicity, and ancestry data are used in biomedical research. JAMA 320(15):1533-1534.

Bourdieu, P., and L. Wacquant. 1999. On the cunning of imperialist reason. Theory, Culture & Society 16(1):41-58.

Bryc, K., E. Y. Durand, J. M. Macpherson, D. Reich, and J. L. Mountain. 2015. The genetic ancestry of African Americans, Latinos, and European Americans across the United States. American Journal of Human Genetics 96(1):37-53.

Bugert, P., and H. Klüter. 2012. 100 years after von Dungern & Hirschfeld: Kinship investigation from blood groups to SNPs. Transfusion Medicine and Hemotherapy 39(3):161-162.

Byron, G. L. 2002. Symbolic blackness and ethnic difference in early Christian literature. London: Routledge.

Carter, J. K. 2008. Race: A theological account. New York: Oxford University Press.

Caulfield, T., S. M. Fullerton, S. E. Ali-Khan, L. Arbour, E. G. Burchard, R. S. Cooper, B.-J. Hardy, S. Harry, R. Hyde-Lay, J. Kahn, R. Kittles, B. A. Koenig, S. S.-J. Lee, M. Malinowski, V. Ravitsky, P. Sankar, S. W. Sherer, B. Séguin, D. Shickle, G. Suarez-Kurtz, and A. S. Daar. 2009. Race and ancestry in biomedical research: Exploring the challenges. Genome Medicine 1(1):8.

Cavalli-Sforza, L. L., P. Menozzi, and A. Piazza. 1996. The history and geography of human genes. Princeton, NJ: Princeton University Press.

Chakraborty, R., S. A. Read, and S. J. Vincent. 2020. Understanding myopia: Pathogenesis and mechanisms. In Updates on myopia: A clinical perspective, edited by M. Ang and T. Y. Wong. Singapore: Springer. Pp. 65-94.

Chakravarti, A. 2014. Perspectives on human variation through the lens of diversity and race. Cold Spring Harbor Perspectives in Biology 7(a023358).

Chial, H. 2008. Mendelian genetics: Patterns of inheritance and single-gene disorders. Nature Education 1(1):63.

Chong, J. X., K. J. Buckingham, S. N. Jhangiani, C. Boehm, N. Sobreira, J. D. Smith, T. M. Harrell, M. J. McMillin, W. Wiszniewski, T. Gambin, Z. H. Coban Akdemir, K. Doheny, A. F. Scott, D. Avramopoulos, A. Chakravarti, J. Hoover-Fong, D. Mathews, P. D. Witmer, H. Ling, K. Hetrick, L. Watkins, K. E. Patterson, F. Reinier, E. Blue, D. Muzny, M. Kircher, K. Bilguvar, F. Lopez-Giraldez, V. R. Sutton, H. K. Tabor, S. M. Leal, M. Gunel, S. Mane, R. A. Gibbs, E. Boerwinkle, A. Hamosh, J. Shendure, J. R. Lupski, R. P. Lifton, D. Valle, D. A. Nickerson, Centers for Mendelian Genomics, and M. J. Bamshad. 2015. The genetic basis of Mendelian phenotypes: Discoveries, challenges, and opportunities. American Journal of Human Genetics 97(2):199-215.

Conroy, M. C., B. Lacey, J. Bešević, W. Omiyale, Q. Feng, M. Effingham, J. Sellers, S. Sheard, M. Pancholi, G. Gregory, J. Busby, R. Collins, and N. E. Allen. 2023. UK Biobank: A globally important resource for cancer research. British Journal of Cancer 128:519-527.

Daniloski, Z., T. X. Jordan, H.-H. Wessels, D. A. Hoagland, S. Kasela, M. Legut, S. Maniatis, E. P. Mimitou, L. Lu, E. Geller, O. Danziger, B. R. Rosenberg, H. Phatnani, P. Smibert, T. Lappalainen, B. R. tenOever, and N. E. Sanjana. 2021. Identification of required host factors for SARS-CoV-2 infection in human cells. Cell 184(1):92-105.e116.

Davidson, J., R. Vashisht, and A. J. Butte. 2022. From genes to geography, from cells to community, from biomolecules to behaviors: The importance of social determinants of health. Biomolecules 12(10).

De Moura, R. R., A. V. C. Coelho, V. d. Q. Balbino, S. Crovella, and L. A. C. Brandão. 2015. Meta-analysis of Brazilian genetic admixture and comparison with other Latin America countries. American Journal of Human Biology 27(5):674-680.

Dorr, G. M. 2008. Segregation’s science: Eugenics and society in Virginia. Charlottesville, VA: University of Virginia Press.

Dupree, C. H., and C. M. Boykin. 2021. Racial inequality in academia: Systemic origins, modern challenges, and policy recommendations. Policy Insights from the Behavioral and Brain Sciences 8(1):11-18.

Enattah, N. S., T. Sahi, E. Savilahti, J. D. Terwilliger, L. Peltonen, and I. Järvelä. 2002. Identification of a variant associated with adult-type hypolactasia. Nature Genetics 30(2):233-237.

Epstein, S. 2007. Inclusion: The politics of difference in medical research. Chicago, IL: University of Chicago Press.

Fan, S., D. E. Kelly, M. H. Beltrame, M. E. B. Hansen, S. Mallick, A. Ranciaro, J. Hirbo, S. Thompson, W. Beggs, T. Nyambo, S. A. Omar, D. W. Meskel, G. Belay, A. Froment, N. Patterson, D. Reich, and S. A. Tishkoff. 2019. African evolutionary history inferred from whole genome sequence data of 44 indigenous African populations. Genome Biology 20(1).

Farkas, L. 2017. Data collection in the field of ethnicity. Brussels: European Commission.

Flanagin, A., T. Frey, and S. L. Christiansen. 2021. Updated guidance on the reporting of race and ethnicity in medical and science journals. JAMA 326(7):621.

Francomano, C. A., V. A. McKusick, and L. G. Biesecker. 2003. Medical genetic studies in the Amish: Historical perspective. American Journal of Medical Genetics 121C(1):1-4.

Fredrickson, G. M. 2002. Racism: A short history. Princeton and Oxford: Princeton University Press.

Fu, Z., and S. Xi. 2020. The effects of heavy metals on human metabolism. Toxicology Mechanisms and Methods 30(3):167-176.

Fujimura, J. H., and R. Rajagopalan. 2011. Different differences: The use of ‘genetic ancestry’ versus race in biomedical human genetic research. Social Studies of Science 41(1):5-30.

Fullwiley, D. 2008. The biologistical construction of race: ‘admixture’ technology and the new genetic medicine. Social Studies of Science 38(5):695-735.

Gomez, F., J. Hirbo, and S. A. Tishkoff. 2014. Genetic variation and adaptation in Africa: Implications for human evolution and disease. Cold Spring Harbor Perspectives in Biology 6(7):a008524.

Gómez-Olivé, F. X., L. Montana, R. G. Wagner, C. W. Kabudula, J. K. Rohr, K. Kahn, T. Bärnighausen, M. Collinson, D. Canning, T. Gaziano, J. A. Salomon, C. F. Payne, A. Wade, S. M. Tollman, and L. Berkman. 2018. Cohort profile: Health and Ageing in Africa: A Longitudinal Study of an Indepth community in South Africa (HAALSI). International Journal of Epidemiology 47(3):689-690j.

Goodman, A. H., Y. T. Moses, and J. L. Jones. 2019. Race: Are we so different? UK: Wiley-Blackwell.

Gossett, T. F. 1997. Race: The history of an idea in America. New York: Oxford University Press.

Gould, S. J. 1996. The mismeasure of man. New York: W.W. Norton.

Graves, J. L., Jr. 2001. The emperor’s new clothes: Biological theories of race at the millennium. New Brunswick, NJ: Rutgers University Press.

Hamblin, M. T., and A. Di Rienzo. 2000. Detection of the signature of natural selection in humans: Evidence from the Duffy blood group locus. American Journal of Human Genetics 66(5):1669-1679.

Hammonds, E. M., and R. M. Herzig (eds). 2009. The nature of difference: Sciences of race in the United States from Jefferson to genomics. Cambridge, MA: MIT Press.

Hannaford, I. 1996. Race: The history of an idea in the West. Washington, DC: Woodrow Wilson Center Press.

Higginbotham, E., N. R. Powe, G. Barabino, E. Fuentes-Afflick, W. L. Harris, D. S. Massy, E. J. Perez-Stable, R. Pettigrew, P. Pierre, N. Risch, and C. Rotimi. Forthcoming. The use of race in health, science & society: Origins, concepts, implications, alternatives and the path forward. NAM Perspectives. Discussion Paper, National Academy of Medicine, Washington, DC.

Hirschfeld, L., and H. Hirschfeld. 1919. Serological differences between the blood of different races: The result of researches on the Macedonian front. Lancet 194(5016):675-679.

Hirschfeld, L. A. 1996. Race in the making: Cognition, culture, and the child’s construction of human kinds. Cambridge, MA: MIT Press.

Hirschfeld, L. A., and S. A. Gelman. 1997. What young children think about the relationship between language variation and social difference. Cognitive Development 12(2):213-238.

Holdren, S., Y. Iwai, N. R. Lenze, A. B. Weil, and A. M. Randolph. 2022. A novel narrative medicine approach to DEI training for medical school faculty. Teaching and Learning in Medicine:1-10.

Hublin, J.-J., A. Ben-Ncer, S. E. Bailey, S. E. Freidline, S. Neubauer, M. M. Skinner, I. Bergmann, A. Le Cabec, S. Benazzi, K. Harvati, and P. Gunz. 2017. New fossils from Jebel Irhoud, Morocco, and the pan-African origin of Homo sapiens. Nature 546(7657):289-292.

Kahn, J. 2006. Genes, race, and population: Avoiding a collision of categories. American Journal of Public Health 96(11):1965-1970.

Keel, T. 2018. Divine variations: How Christian thought became racial science. 1st ed. Stanford, CA: Stanford University Press.

Kerem, B.-S., M. Rommens, J. A. Buchanan, D. Markiewicz, T. K. Cox, A. Chakravarti, M. Buchwald, and L.-C. Tsui. 1989. Identification of the cystic fibrosis gene: Genetic analysis. Science 245(4922):1073-1080.

Khan, A. T., S. M. Gogarten, C. P. McHugh, A. M. Stilp, T. Sofer, M. L. Bowers, Q. Wong, L. A. Cupples, B. Hidalgo, A. D. Johnson, M.-L. M. McDonald, S. T. McGarvey, M. R. G. Taylor, S. M. Fullerton, M. P. Conomos, and S. C. Nelson. 2022. Recommendations on the use and reporting of race, ethnicity, and ancestry in genetic research: Experiences from the NHLBI TOPMed program. Cell Genomics 2(8):100155.

Khera, A. V., M. Chaffin, K. G. Aragam, M. E. Haas, C. Roselli, S. H. Choi, P. Natarajan, E. S. Lander, S. A. Lubitz, P. T. Ellinor, and S. Kathiresan. 2018. Genome-wide polygenic scores for common diseases identify individuals with risk equivalent to monogenic mutations. Nature Genetics 50(9):1219-1224.

Kidd, J. M., S. Gravel, J. Byrnes, A. Moreno-Estrada, S. Musharoff, K. Bryc, J. D. Degenhardt, A. Brisbin, V. Sheth, R. Chen, S. F. McLaughlin, H. E. Peckham, L. Omberg, C. A. Bormann Chung, S. Stanley, K. Pearlstein, E. Levandowsky, S. Acevedo-Acevedo, A. Auton, A. Keinan, V. Acuña-Alonzo, R. Barquera-Lozano, S. Canizales-Quinteros, C. Eng, E. G. Burchard, A. Russell, A. Reynolds, A. G. Clark, M. G. Reese, S. E. Lincoln, A. J. Butte, F. M. De La Vega, and C. D. Bustamante. 2012. Population genetic inference from personal genome data: Impact of ancestry and admixture on human genomic variation. American Journal of Human Genetics 91(4):660-671.

King, R. C., W. D. Stansfield, and P. K. Mulligan. 2014. A dictionary of genetics. 8th ed. Oxford University Press.

Kossek, E. E., P. M. Buzzanell, B. J. Wright, C. Batz-Barbarich, A. C. Moors, C. Sullivan, K. Kokini, A. S. Hirsch, K. Maxey, and A. Nikalje. 2022. Implementing diversity training targeting faculty microaggressions and inclusion: Practical insights and initial findings. The Journal of Applied Behavioral Science 002188632211323.

Kramer, H. J., A. M. Stilp, C. C. Laurie, A. P. Reiner, J. Lash, M. L. Daviglus, S. E. Rosas, A. C. Ricardo, B. O. Tayo, M. F. Flessner, K. F. Kerr, C. Peralta, R. Durazo-Arvizu, M. Conomos, T. Thornton, J. Rotter, K. D. Taylor, J. Cai, J. Eckfeldt, H. Chen, G. Papanicolau, and N. Franceschini. 2017. African ancestry-specific alleles and kidney disease risk in Hispanics/Latinos. Journal of the American Society of Nephrology 28(3):915-922.

Kremer, B., E. Almqvist, J. Theilmann, N. Spence, H. Telenius, Y. P. Goldberg, and M. R. Hayden. 1995. Sex-dependent mechanisms for expansions and contractions of the CAG repeat on affected Huntington disease chromosomes. American Journal of Human Genetics 57(2):343-350.

Lander, E. S., L. M. Linton, B. Birren, C. Nusbaum, M. C. Zody, J. Baldwin, K. Devon, K. Dewar, M. Doyle, W. FitzHugh, R. Funke, D. Gage, K. Harris, A. Heaford, J. Howland, L. Kann, J. Lehoczky, R. LeVine, P. McEwan, K. McKernan, J. Meldrim, J. P. Mesirov, C. Miranda, W. Morris, J. Naylor, C. Raymond, M. Rosetti, R. Santos, A. Sheridan, C. Sougnez, Y. Stange-Thomann, N. Stojanovic, A. Subramanian, D. Wyman, J. Rogers, J. Sulston, R. Ainscough, S. Beck, D. Bentley, J. Burton, C. Clee, N. Carter, A. Coulson, R. Deadman, P. Deloukas, A. Dunham, I. Dunham, R. Durbin, L. French, D. Grafham, S. Gregory, T. Hubbard, S. Humphray, A. Hunt, M. Jones, C. Lloyd, A. McMurray, L. Matthews, S. Mercer, S. Milne, J. C. Mullikin, A. Mungall, R. Plumb, M. Ross, R. Shownkeen, S. Sims, R. H. Waterston, R. K. Wilson, L. W. Hillier, J. D. McPherson, M. A. Marra, E. R. Mardis, L. A. Fulton, A. T. Chinwalla, K. H. Pepin, W. R. Gish, S. L. Chissoe, M. C. Wendl, K. D. Delehaunty, T. L. Miner, A. Delehaunty, J. B. Kramer, L. L. Cook, R. S. Fulton, D. L. Johnson, P. J. Minx, S. W. Clifton, T. Hawkins, E. Branscomb, P. Predki, P. Richardson, S. Wenning, T. Slezak, N. Doggett, J. F. Cheng, A. Olsen, S. Lucas, C. Elkin, E. Uberbacher, M. Frazier, R. A. Gibbs, D. M. Muzny, S. E. Scherer, J. B. Bouck, E. J. Sodergren, K. C. Worley, C. M. Rives, J. H. Gorrell, M. L. Metzker, S. L. Naylor, R. S. Kucherlapati, D. L. Nelson, G. M. Weinstock, Y. Sakaki, A. Fujiyama, M. Hattori, T. Yada, A. Toyoda, T. Itoh, C. Kawagoe, H. Watanabe, Y. Totoki, T. Taylor, J. Weissenbach, R. Heilig, W. Saurin, F. Artiguenave, P. Brottier, T. Bruls, E. Pelletier, C. Robert, P. Wincker, D. R. Smith, L. Doucette-Stamm, M. Rubenfield, K. Weinstock, H. M. Lee, J. Dubois, A. Rosenthal, M. Platzer, G. Nyakatura, S. Taudien, A. Rump, H. Yang, J. Yu, J. Wang, G. Huang, J. Gu, L. Hood, L. Rowen, A. Madan, S. Qin, R. W. Davis, N. A. Federspiel, A. P. Abola, M. J. Proctor, R. M. Myers, J. Schmutz, M. Dickson, J. Grimwood, D. R. Cox, M. 
V. Olson, R. Kaul, C. Raymond, N. Shimizu, K. Kawasaki, S. Minoshima, G. A. Evans, M. Athanasiou, R. Schultz, B. A. Roe, F. Chen, H. Pan, J. Ramser, H. Lehrach, R. Reinhardt, W. R. McCombie, M. de la Bastide, N. Dedhia, H. Blocker, K. Hornischer, G. Nordsiek, R. Agarwala, L. Aravind, J. A. Bailey, A. Bateman, S. Batzoglou, E. Birney, P. Bork, D. G. Brown, C. B. Burge, L. Cerutti, H. C. Chen, D. Church, M. Clamp, R. R. Copley, T. Doerks, S. R. Eddy, E. E. Eichler, T. S. Furey, J. Galagan, J. G. Gilbert, C. Harmon, Y. Hayashizaki, D. Haussler, H. Hermjakob,
K. Hokamp, W. Jang, L. S. Johnson, T. A. Jones, S. Kasif, A. Kaspryzk, S. Kennedy, W. J. Kent, P. Kitts, E. V. Koonin, I. Korf, D. Kulp, D. Lancet, T. M. Lowe, A. McLysaght, T. Mikkelsen, J. V. Moran, N. Mulder, V. J. Pollara, C. P. Ponting, G. Schuler, J. Schultz, G. Slater, A. F. Smit, E. Stupka, J. Szustakowki, D. Thierry-Mieg, J. Thierry-Mieg, L. Wagner, J. Wallis, R. Wheeler, A. Williams, Y. I. Wolf, K. H. Wolfe, S. P. Yang, R. F. Yeh, F. Collins, M. S. Guyer, J. Peterson, A. Felsenfeld, K. A. Wetterstrand, A. Patrinos, M. J. Morgan, P. de Jong, J. J. Catanese, K. Osoegawa, H. Shizuya, S. Choi, Y. J. Chen, and J. Szustakowki. 2001. Initial sequencing and analysis of the human genome. Nature 409(6822):860-921.

Landsteiner, K. 1961. On agglutination of normal human blood. Transfusion 1(1):5-8.

Leffler, E. M., K. Bullaughey, D. R. Matute, W. K. Meyer, L. Ségurel, A. Venkat, P. Andolfatto, and M. Przeworski. 2012. Revisiting an old riddle: What determines genetic diversity levels within species? PLoS Biology 10(9):e1001388.

Lettre, G., A. U. Jackson, C. Gieger, F. R. Schumacher, S. I. Berndt, S. Sanna, S. Eyheramendy, B. F. Voight, J. L. Butler, C. Guiducci, T. Illig, R. Hackett, I. M. Heid, K. B. Jacobs, V. Lyssenko, M. Uda, M. Boehnke, S. J. Chanock, L. C. Groop, F. B. Hu, B. Isomaa, P. Kraft, L. Peltonen, V. Salomaa, D. Schlessinger, D. J. Hunter, R. B. Hayes, G. R. Abecasis, H. E. Wichmann, K. L. Mohlke, and J. N. Hirschhorn. 2008. Identification of ten loci associated with height highlights new biological pathways in human growth. Nature Genetics 40(5):584-591.

Lewontin, R. C. 1972. The apportionment of human diversity. In Evolutionary biology. Vol. 6, edited by T. Dobzhansky, M. K. Hecht, and W. C. Steere. New York: Springer. Pp. 381-398.

Mallick, S., H. Li, M. Lipson, I. Mathieson, M. Gymrek, F. Racimo, M. Zhao, N. Chennagiri, S. Nordenfelt, A. Tandon, P. Skoglund, I. Lazaridis, S. Sankararaman, Q. Fu, N. Rohland, G. Renaud, Y. Erlich, T. Willems, C. Gallo, J. P. Spence, Y. S. Song, G. Poletti, F. Balloux, G. van Driem, P. de Knijff, I. G. Romero, A. R. Jha, D. M. Behar, C. M. Bravi, C. Capelli, T. Hervig, A. Moreno-Estrada, O. L. Posukh, E. Balanovska, O. Balanovsky, S. Karachanak-Yankova, H. Sahakyan, D. Toncheva, L. Yepiskoposyan, C. Tyler-Smith, Y. Xue, M. S. Abdullah, A. Ruiz-Linares, C. M. Beall, A. Di Rienzo, C. Jeong, E. B. Starikovskaya, E. Metspalu, J. Parik, R. Villems, B. M. Henn, U. Hodoglugil, R. Mahley, A. Sajantila, G. Stamatoyannopoulos, J. T. Wee, R. Khusainova, E. Khusnutdinova, S. Litvinov, G. Ayodo, D. Comas, M. F. Hammer, T. Kivisild, W. Klitz, C. A. Winkler, D. Labuda, M. Bamshad, L. B. Jorde, S. A. Tishkoff, W. S. Watkins, M. Metspalu, S. Dryomov, R. Sukernik, L. Singh, K. Thangaraj, S. Pääbo, J. Kelso, N. Patterson, and D. Reich. 2016. The Simons Genome Diversity Project: 300 genomes from 142 diverse populations. Nature 538(7624):201-206.

Manolio, T. A., F. S. Collins, N. J. Cox, D. B. Goldstein, L. A. Hindorff, D. J. Hunter, M. I. McCarthy, E. M. Ramos, L. R. Cardon, A. Chakravarti, J. H. Cho, A. E. Guttmacher, A. Kong, L. Kruglyak, E. Mardis, C. N. Rotimi, M. Slatkin, D. Valle, A. S. Whittemore, M. Boehnke, A. G. Clark, E. E. Eichler, G. Gibson, J. L. Haines, T. F. C. Mackay, S. A. McCarroll, and P. M. Visscher. 2009. Finding the missing heritability of complex diseases. Nature 461(7265):747-753.

Marks, J. 2017. Is science racist? New York: John Wiley & Sons.

Marx, A. W. 1998. Making race and nation: A comparison of South Africa, the United States, and Brazil. Cambridge: Cambridge University Press.

Mir, G., S. Salway, J. Kai, S. Karlsen, R. Bhopal, G. T. H. Ellison, and A. Sheikh. 2013. Principles for research on ethnicity and health: The Leeds consensus statement. European Journal of Public Health 23(3):504-510.

Molina, N. 2006. Fit to be citizens: Public health and race in Los Angeles, 1879-1939. Berkeley: University of California Press.

Morales, J., D. Welter, E. H. Bowler, M. Cerezo, L. W. Harris, A. C. McMahon, P. Hall, H. A. Junkins, A. Milano, E. Hastings, C. Malangone, A. Buniello, T. Burdett, P. Flicek, H. Parkinson, F. Cunningham, L. A. Hindorff, and J. A. L. MacArthur. 2018. A standardized framework for representation of ancestry data in genomics studies, with application to the NHGRI-EBI GWAS catalog. Genome Biology 19(1):21.

Morning, A. 2008. Ethnic classification in global perspective: A cross-national survey of the 2000 census round. Population Research and Policy Review 27(2):239-272.

Morning, A. 2011. The nature of race: How scientists think and teach about human difference. Oakland, CA: University of California Press.

Mourant, A. E. 1977. The distribution of the human blood groups. 2nd ed. Oxford, UK: Blackwell Scientific Publications.

Mychaleckyj, J. C., A. Havt, U. Nayak, R. Pinkerton, E. Farber, P. Concannon, A. A. Lima, and R. L. Guerrant. 2017. Genome-wide analysis in Brazilians reveals highly differentiated Native American genome regions. Molecular Biology and Evolution 34(3):559-574.

Narasimhan, V. M., N. Patterson, P. Moorjani, N. Rohland, R. Bernardos, S. Mallick, I. Lazaridis, N. Nakatsuka, I. Olalde, M. Lipson, A. M. Kim, L. M. Olivieri, A. Coppa, M. Vidale, J. Mallory, V. Moiseyev, E. Kitov, J. Monge, N. Adamski, N. Alex, N. Broomandkhoshbacht, F. Candilio, K. Callan, O. Cheronet, B. J. Culleton, M. Ferry, D. Fernandes, S. Freilich, B. Gamarra, D. Gaudio, M. Hajdinjak, É. Harney, T. K. Harper, D. Keating, A. M. Lawson, M. Mah, K. Mandl, M. Michel, M. Novak, J. Oppenheimer, N. Rai, K. Sirak, V. Slon, K. Stewardson, F. Zalzala, Z. Zhang, G. Akhatov, A. N. Bagashev, A. Bagnera, B. Baitanayev, J. Bendezu-Sarmiento, A. A. Bissembaev, G. L. Bonora, T. T. Chargynov, T. Chikisheva, P. K. Dashkovskiy, A. Derevianko, M. Dobeš, K. Douka, N. Dubova, M. N. Duisengali, D. Enshin, A. Epimakhov, A. V. Fribus, D. Fuller, A. Goryachev, A. Gromov, S. P. Grushin, B. Hanks, M. Judd, E. Kazizov, A. Khokhlov, A. P. Krygin, E. Kupriyanova, P. Kuznetsov, D. Luiselli, F. Maksudov, A. M. Mamedov, T. B. Mamirov, C. Meiklejohn, D. C. Merrett, R. Micheli, O. Mochalov, S. Mustafokulov, A. Nayak, D. Pettener, R. Potts, D. Razhev, M. Rykun, S. Sarno, T. M. Savenkova, K. Sikhymbaeva, S. M. Slepchenko, O. A. Soltobaev, N. Stepanova, S. Svyatko, K. Tabaldiev, M. Teschler-Nicola, A. A. Tishkin, V. V. Tkachev, S. Vasilyev, P. Velemínský, D. Voyakin, A. Yermolayeva, M. Zahir, V. S. Zubkov, A. Zubova, V. S. Shinde, C. Lalueza-Fox, M. Meyer, D. Anthony, N. Boivin, K. Thangaraj, D. J. Kennett, M. Frachetti, R. Pinhasi, and D. Reich. 2019. The formation of human populations in South and Central Asia. Science 365(6457):eaat7487.

Nature Human Behaviour. 2022. Science must respect the dignity and rights of all humans. Nature Human Behaviour 6(8):1029-1031.

Ng, S. B., A. W. Bigham, K. J. Buckingham, M. C. Hannibal, M. J. McMillin, H. I. Gildersleeve, A. E. Beck, H. K. Tabor, G. M. Cooper, H. C. Mefford, C. Lee, E. H. Turner, J. D. Smith, M. J. Rieder, K. Yoshiura, N. Matsumoto, T. Ohta, N. Niikawa, D. A. Nickerson, M. J. Bamshad, and J. Shendure. 2010. Exome sequencing identifies MLL2 mutations as a cause of Kabuki syndrome. Nature Genetics 42(9):790-793.

Nelson, S. C., J. H. Yu, J. K. Wagner, T. M. Harrell, C. D. Royal, and M. J. Bamshad. 2019. A content analysis of the views of genetics professionals on race, ancestry, and genetics. AJOB Empirical Bioethics 9(4):222-234.

Nielsen, R., J. M. Akey, M. Jakobsson, J. K. Pritchard, S. Tishkoff, and E. Willerslev. 2017. Tracing the peopling of the world through genomics. Nature 541(7637):302-310.

NIH (National Institutes of Health). 2007. Biological sciences curriculum study. https://www.ncbi.nlm.nih.gov/books/NBK20363/ (accessed December 8, 2022).

NIMHD (National Institute on Minority Health and Health Disparities). 2017. Workshop examines the use of race and ethnicity in genomics and biomedical research. Bethesda, MD: NIMHD. https://www.nimhd.nih.gov/news-events/features/inside-nimhd/nimhd-nhgri-wrkshp.html (accessed December 2, 2022).

Nobel Prize Outreach AB. 2022. “Karl Landsteiner – facts.” NobelPrize.org. https://www.nobelprize.org/prizes/medicine/1930/landsteiner/facts/ (accessed November 17, 2022).

Nobles, M. 2000. Shades of citizenship: Race and the census in modern politics. Redwood City, CA: Stanford University Press.

Novembre, J., and A. Di Rienzo. 2009. Spatial patterns of variation due to natural selection in humans. Nature Reviews Genetics 10(11):745-755.

Novembre, J., and B. M. Peter. 2016. Recent advances in the study of fine-scale population structure in humans. Current Opinion in Genetics & Development 41:98-105.

Obasogie, O. K. 2010. Do blind people see race? Social, legal, and theoretical considerations. Law & Society Review 44(3-4):585-616.

OMB (U.S. Office of Management and Budget). 1977. Directive no. 15 race and ethnic standards for federal statistics and administrative reporting. https://transition.fcc.gov/Bureaus/OSEC/library/legislative_histories/1195.pdf (accessed December 12, 2022).

OMB (U.S. Office of Management and Budget). 1997. Revisions to the standards for the classification of federal data on race and ethnicity. https://www.whitehouse.gov/wp-content/uploads/2017/11/Revisions-to-the-Standards-for-the-Classification-of-Federal-Data-on-Race-and-Ethnicity-October30-1997.pdf (accessed December 5, 2022).

Onishi, N., and C. Méheut. 2021. Heating up culture wars, France to scour universities for ideas that “corrupt society.” New York Times, February 21, 2021.

Pääbo, S. 2014. The human condition—a molecular approach. Cell 157(1):216-226.

Painter, N. 2010. The history of white people. New York: W.W. Norton & Company.

Panofsky, A., and C. Bliss. 2017. Ambiguity and scientific authority: Population classification in genomic science. American Sociological Review 82(1):59-87.

Panzeri, I., and J. A. Pospisilik. 2018. Epigenetic control of variation and stochasticity in metabolic disease. Molecular Metabolism 14:26-38.

Pascoe, P. 2009. What comes naturally: Miscegenation law and the making of race in America. Oxford University Press.

Price, A. L., N. Patterson, F. Yu, D. R. Cox, A. Waliszewska, G. J. McDonald, A. Tandon, C. Schirmer, J. Neubauer, G. Bedoya, C. Duque, A. Villegas, M. C. Bortolini, F. M. Salzano, C. Gallo, G. Mazzotti, M. Tello-Ruiz, L. Riba, C. A. Aguilar-Salinas, S. Canizales-Quinteros, M. Menjivar, W. Klitz, B. Henderson, C. A. Haiman, C. Winkler, T. Tusie-Luna, A. Ruiz-Linares, and D. Reich. 2007. A genomewide admixture map for Latino populations. American Journal of Human Genetics 80(6):1024-1036.

Provine, W. B. 1971. The origins of theoretical population genetics. Chicago, IL: University of Chicago Press.

Provine, W. B., and E. S. Russell. 1986. Geneticists and race. American Zoologist 26(3):857-887.

Raffington, L., and D. W. Belsky. 2022. Integrating DNA methylation measures of biological aging into social determinants of health research. Current Environmental Health Reports 9(2):196-210.

Raffington, L., D. W. Belsky, M. Kothari, M. Malanchini, E. M. Tucker-Drob, and K. P. Harden. 2021. Socioeconomic disadvantage and the pace of biological aging in children. Pediatrics 147(6):e2020024406.

Reardon, J. 2009. Race to the finish: Identity and governance in an age of genomics. Princeton, NJ: Princeton University Press.

Reich, D. 2018. Who we are and how we got here: Ancient DNA and the new science of the human past. Oxford, United Kingdom: Oxford University Press.

Roberts, D. 2011. Fatal invention: How science, politics, and big business re-create race in the twenty-first century. New York: The New Press.

Rocha, C. S., R. Secolin, M. R. Rodrigues, B. S. Carvalho, and I. Lopes-Cendes. 2020. The Brazilian Initiative on Precision Medicine (BIPMed): Fostering genomic data-sharing of underrepresented populations. npj Genomic Medicine 5(1):42.

Rosenberg, N. A. 2021. A population-genetic perspective on the similarities and differences among worldwide human populations. Human Biology 92(3):135-152.

Ruiz-Linares, A., K. Adhikari, V. Acuña-Alonzo, M. Quinto-Sanchez, C. Jaramillo, W. Arias, M. Fuentes, M. Pizarro, P. Everardo, F. de Avila, J. Gómez-Valdés, P. León-Mimila, T. Hunemeier, V. Ramallo, C. C. Silva de Cerqueira, M. W. Burley, E. Konca, M. Z. de Oliveira, M. R. Veronez, M. Rubio-Codina, O. Attanasio, S. Gibbon, N. Ray, C. Gallo, G. Poletti, J. Rosique, L. Schuler-Faccini, F. M. Salzano, M. C. Bortolini, S. Canizales-Quinteros, F. Rothhammer, G. Bedoya, D. Balding, and R. Gonzalez-José. 2014. Admixture in Latin America: Geographic structure, phenotypic diversity and self-perception of ancestry based on 7,342 individuals. PLoS Genetics 10(9):e1004572.

Sachidanandam, R., D. Weissman, S. C. Schmidt, J. M. Kakol, L. D. Stein, G. Marth, S. Sherry, J. C. Mullikin, B. J. Mortimore, D. L. Willey, S. E. Hunt, C. G. Cole, P. C. Coggill, C. M. Rice, Z. Ning, J. Rogers, D. R. Bentley, P.-Y. Kwok, E. R. Mardis, R. T. Yeh, B. Schultz, L. Cook, R. Davenport, M. Dante, L. Fulton, L. Hillier, R. H. Waterston, J. D. McPherson, B. Gilman, S. Schaffner, W. J. Van Etten, D. Reich, J. Higgins, M. J. Daly, B. Blumenstiel, J. Baldwin, N. Stange-Thomann, M. C. Zody, L. Linton, E. S. Lander, and D. Altshuler. 2001. A map of human genome sequence variation containing 1.42 million single nucleotide polymorphisms. Nature 409(6822):928-933.

Sadarangani, M., A. Marchant, and T. R. Kollmann. 2021. Immunological mechanisms of vaccine-induced protection against COVID-19 in humans. Nature Reviews Immunology 21(8):475-484.

Sanders, E. R. 1969. The Hamitic hypothesis; its origins and functions in time perspective. Journal of African History 10(4):521-532.

Sankar, P., and M. K. Cho. 2002. Toward a new vocabulary of human genetic variation. Science 298(5597):1337-1338.

Schor, P. 2017. Counting Americans: How the US census classified the nation. Oxford: Oxford University Press.

Shah, J. B., D. Pueschl, B. Wubbenhorst, M. Fan, J. Pluta, K. D’Andrea, A. P. Hubert, J. S. Shilan, W. Zhou, A. A. Kraya, A. Llop Guevara, C. Ruan, V. Serra, J. Balmaña, M. Feldman, P. J. Morin, A. Nayak, K. N. Maxwell, S. M. Domchek, and K. L. Nathanson. 2022. Analysis of matched primary and recurrent BRCA1/2 mutation-associated tumors identifies recurrence-specific drivers. Nature Communications 13(1).

Smedley, A., and B. D. Smedley. 2012. Race in North America: Origin and evolution of a worldview. 4th ed. Boulder, CO: Westview Press.

Snowden, F. M., Jr. 1983. Before color prejudice: The ancient view of blacks. Cambridge, MA: Harvard University Press.

Stepan, N. 1982. The idea of race in science: Great Britain, 1800-1960. Palgrave MacMillan.

Stern, A. M. 2015. Eugenic nation: Faults and frontiers of better breeding in modern America. Berkeley: University of California Press.

Strickberger, M. W. 1985. Genetics. 3rd ed. New York: Macmillan.

Takezawa, Y., K. Kato, H. Oota, T. Caulfield, A. Fujimoto, S. Honda, N. Kamatani, S. Kawamura, K. Kawashima, R. Kimura, H. Matsumae, A. Saito, P. E. Savage, N. Seguchi, K. Shimizu, S. Terao, Y. Yamaguchi-Kabata, A. Yasukouchi, M. Yoneda, and K. Tokunaga. 2014. Human genetic research, race, ethnicity and the labeling of populations: Recommendations based on an interdisciplinary workshop in Japan. BMC Medical Ethics 15(1).

TallBear, K. 2013. Native American DNA: Tribal belonging and the false promise of genetic science. Minneapolis: University of Minnesota Press.

U.K. Biobank. 2022. Apply for access. https://www.ukbiobank.ac.uk/enable-your-research/apply-for-access (accessed December 5, 2022).

UN (United Nations). 2017. Ethnocultural characteristics. https://unstats.un.org/unsd/demographic/sconcerns/popchar/popcharmethods.htm (accessed October 19, 2022).

Van Ausdale, D., and J. R. Feagin. 1996. Using racial and ethnic concepts: The critical case of very young children. American Sociological Review 61:779-793.

Venter, J. C., M. D. Adams, E. W. Myers, P. W. Li, R. J. Mural, G. G. Sutton, H. O. Smith, M. Yandell, C. A. Evans, R. A. Holt, J. D. Gocayne, P. Amanatides, R. M. Ballew, D. H. Huson, J. R. Wortman, Q. Zhang, C. D. Kodira, X. H. Zheng, L. Chen, M. Skupski, G. Subramanian, P. D. Thomas, J. Zhang, G. L. Gabor Miklos, C. Nelson, S. Broder, A. G. Clark, J. Nadeau, V. A. McKusick, N. Zinder, A. J. Levine, R. J. Roberts, M. Simon, C. Slayman, M. Hunkapiller, R. Bolanos, A. Delcher, I. Dew, D. Fasulo, M. Flanigan, L. Florea, A. Halpern, S. Hannenhalli, S. Kravitz, S. Levy, C. Mobarry, K. Reinert, K. Remington, J. Abu-Threideh, E. Beasley, K. Biddick, V. Bonazzi, R. Brandon, M. Cargill, I. Chandramouliswaran, R. Charlab, K. Chaturvedi, Z. Deng, V. Di Francesco, P. Dunn, K. Eilbeck, C. Evangelista, A. E. Gabrielian, W. Gan, W. Ge, F. Gong, Z. Gu, P. Guan, T. J. Heiman, M. E. Higgins, R. R. Ji, Z. Ke, K. A. Ketchum, Z. Lai, Y. Lei, Z. Li, J. Li, Y. Liang, X. Lin, F. Lu, G. V. Merkulov, N. Milshina, H. M. Moore, A. K. Naik, V. A. Narayan, B. Neelam, D. Nusskern, D. B. Rusch, S. Salzberg, W. Shao, B. Shue, J. Sun, Z. Wang, A. Wang, X. Wang, J. Wang, M. Wei, R. Wides, C. Xiao, C. Yan, A. Yao, J. Ye, M. Zhan, W. Zhang, H. Zhang, Q. Zhao, L. Zheng, F. Zhong, W. Zhong, S. Zhu, S. Zhao, D. Gilbert, S. Baumhueter, G. Spier, C. Carter, A. Cravchik, T. Woodage, F. Ali, H. An, A. Awe, D. Baldwin, H. Baden, M. Barnstead, I. Barrow, K. Beeson, D. Busam, A. Carver, A. Center, M. L. Cheng, L. Curry, S. Danaher, L. Davenport, R. Desilets, S. Dietz, K. Dodson, L. Doup, S. Ferriera, N. Garg, A. Gluecksmann, B. Hart, J. Haynes, C. Haynes, C. Heiner, S. Hladun, D. Hostin, J. Houck, T. Howland, C. Ibegwam, J. Johnson, F. Kalush, L. Kline, S. Koduru, A. Love, F. Mann, D. May, S. McCawley, T. McIntosh, I. McMullen, M. Moy, L. Moy, B. Murphy, K. Nelson, C. Pfannkoch, E. Pratts, V. Puri, H. Qureshi, M. Reardon, R. Rodriguez, Y. H. Rogers, D. Romblad, B. Ruhfel, R. Scott, C. Sitter, M. Smallwood, E. Stewart, R. Strong, E. Suh, R. Thomas, N. N. Tint, S. Tse, C. Vech, G. Wang, J. Wetter, S. Williams, M. Williams, S. Windsor, E. Winn-Deen, K. Wolfe, J. Zaveri, K. Zaveri, J. F. Abril, R. Guigo, M. J. Campbell, K. V. Sjolander, B. Karlak, A. Kejariwal, H. Mi, B. Lazareva, T. Hatton, A. Narechania, K. Diemer, A. Muruganujan, N. Guo, S. Sato, V. Bafna, S. Istrail, R. Lippert, R. Schwartz, B. Walenz, S. Yooseph, D. Allen, A. Basu, J. Baxendale, L. Blick, M. Caminha, J. Carnes-Stine, P. Caulk, Y. H. Chiang, M. Coyne, C. Dahlke, A. Deslattes Mays, M. Dombroski, M. Donnelly, D. Ely, S. Esparham, C. Fosler, H. Gire, S. Glanowski, K. Glasser, A. Glodek, M. Gorokhov, K. Graham, B. Gropman, M. Harris, J. Heil, S. Henderson, J. Hoover, D. Jennings, C. Jordan, J. Jordan, J. Kasha, L. Kagan, C. Kraft, A. Levitsky, M. Lewis, X. Liu, J. Lopez, D. Ma, W. Majoros, J. McDaniel, S. Murphy, M. Newman, T. Nguyen, N. Nguyen, M. Nodell, S. Pan, J. Peck, M. Peterson, W. Rowe, R. Sanders, J. Scott, M. Simpson, T. Smith, A. Sprague, T. Stockwell, R. Turner, E. Venter, M. Wang, M. Wen, D. Wu, M. Wu, A. Xia, A. Zandieh, and X. Zhu. 2001. The sequence of the human genome. Science 291(5507):1304-1351.

von Dungern, E., and L. Hirschfeld. 1962. Concerning heredity of group specific structures of blood. Transfusion 2(1):70-74.

Vyas, D. A., L. G. Eisenstein, and D. S. Jones. 2020. Hidden in plain sight—Reconsidering the use of race correction in clinical algorithms. New England Journal of Medicine 383(9):874-882.

Wagner, J. K., J. H. Yu, J. O. Ifekwunigwe, T. M. Harrell, M. J. Bamshad, and C. D. Royal. 2017. Anthropologists’ views on race, ancestry, and genetics. American Journal of Physical Anthropology 162(2):318-327.

Waldman, S., D. Backenroth, É. Harney, S. Flohr, N. C. Neff, G. M. Buckley, H. Fridman, A. Akbari, N. Rohland, S. Mallick, I. Olalde, L. Cooper, A. Lomes, J. Lipson, J. Cano Nistal, J. Yu, N. Barzilai, I. Peter, G. Atzmon, H. Ostrer, T. Lencz, Y. E. Maruvka, M. Lämmerhirt, A. Beider, L. V. Rutgers, V. Renson, K. M. Prufer, S. Schiffels, H. Ringbauer, K. Sczech, S. Carmi, and D. Reich. 2022. Genome-wide data from medieval German Jews show that the Ashkenazi founder event pre-dated the 14th century. Cell 185(25):4703-4716.e4716.

Watson, M. S., M. A. Lloyd-Puryear, and R. R. Howell. 2022. The progress and future of US newborn screening. International Journal of Neonatal Screening 8(3):41.

Wellcome Trust Case Control Consortium. 2007. Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature 447(7145):661-678.

West, K. M., E. Blacksher, and W. Burke. 2017. Genomics, health disparities, and missed opportunities for the nation’s research agenda. JAMA 317(18):1831.

Wilder, C. S. 2013. Ebony and ivy: Race, slavery, and the troubled history of America’s universities. New York: Bloomsbury Press.

Williamson, S. H., M. J. Hubisz, A. G. Clark, B. A. Payseur, C. D. Bustamante, and R. Nielsen. 2007. Localizing recent adaptive evolution in the human genome. PLoS Genetics 3(6):e90.

Wimmer, A. 2015. Race-centrism: A critique and a research agenda. Ethnic and Racial Studies 38(13):2186-2205.

Yudell, M. 2014. Race unmasked: Biology and race in the twentieth century. New York: Columbia University Press.

Yudell, M., D. Roberts, R. DeSalle, S. Tishkoff, and 70 signatories. 2020. NIH must confront the use of race in science. Science 369(6509):1313-1314.

Zhou, W., M. Kanai, K.-H. H. Wu, H. Rasheed, K. Tsuo, J. B. Hirbo, Y. Wang, A. Bhattacharya, H. Zhao, S. Namba, I. Surakka, B. N. Wolford, V. Lo Faro, E. A. Lopera-Maya, K. Läll, M.-J. Favé, J. J. Partanen, S. B. Chapman, J. Karjalainen, M. Kurki, M. Maasha, B. M. Brumpton, S. Chavan, T.-T. Chen, M. Daya, Y. Ding, Y.-C. A. Feng, L. A. Guare, C. R. Gignoux, S. E. Graham, W. E. Hornsby, N. Ingold, S. I. Ismail, R. Johnson, T. Laisk, K. Lin, J. Lv, I. Y. Millwood, S. Moreno-Grau, K. Nam, P. Palta, A. Pandit, M. H. Preuss, C. Saad, S. Setia-Verma, U. Thorsteinsdottir, J. Uzunovic, A. Verma, M. Zawistowski, X. Zhong, N. Afifi, K. M. Al-Dabhani, A. Al Thani, Y. Bradford, A. Campbell, K. Crooks, G. H. de Bock, S. M. Damrauer, N. J. Douville, S. Finer, L. G. Fritsche, E. Fthenou, G. Gonzalez-Arroyo, C. J. Griffiths, Y. Guo, K. A. Hunt, A. Ioannidis, N. M. Jansonius, T. Konuma, M. T. M. Lee, A. Lopez-Pineda, Y. Matsuda, R. E. Marioni, B. Moatamed, M. A. Nava-Aguilar, K. Numakura, S. Patil, N. Rafaels, A. Richmond, A. Rojas-Muñoz, J. A. Shortt, P. Straub, R. Tao, B. Vanderwerff, M. Vernekar, Y. Veturi, K. C. Barnes, M. Boezen, Z. Chen, C.-Y. Chen, J. Cho, G. D. Smith, H. K. Finucane, L. Franke, E. R. Gamazon, A. Ganna, T. R. Gaunt, T. Ge, H. Huang, J. Huffman, N. Katsanis, J. T. Koskela, C. Lajonchere, M. H. Law, L. Li, C. M. Lindgren, R. J. F. Loos, S. MacGregor, K. Matsuda, C. M. Olsen, D. J. Porteous, J. A. Shavit, H. Snieder, T. Takano, R. C. Trembath, J. M. Vonk, D. C. Whiteman, S. J. Wicks, C. Wijmenga, J. Wright, J. Zheng, X. Zhou, P. Awadalla, M. Boehnke, C. D. Bustamante, N. J. Cox, S. Fatumo, D. H. Geschwind, C. Hayward, K. Hveem, E. E. Kenny, S. Lee, Y.-F. Lin, H. Mbarek, R. Mägi, H. C. Martin, S. E. Medland, Y. Okada, A. V. Palotie, B. Pasaniuc, D. J. Rader, M. D. Ritchie, S. Sanna, J. W. Smoller, K. Stefansson, D. A. van Heel, R. G. Walters, S. Zöllner, A. R. Martin, C. J. Willer, M. J. Daly, and B. M. Neale. 2022. Global biobank meta-analysis initiative: Powering genetic discovery across human disease. Cell Genomics 2(10):100192.

Zuberi, T. 2003. Thicker than blood: How racial statistics lie. Minneapolis: University of Minnesota Press.
