Using Population Descriptors in Genetics and Genomics Research
A New Framework for an Evolving Field

Committee on the Use of Race, Ethnicity, and Ancestry as Population Descriptors in Genomics Research
Board on Health Sciences Policy
Health and Medicine Division
Committee on Population
Division of Behavioral and Social Sciences and Education

Consensus Study Report

PREPUBLICATION COPY - Uncorrected Proofs
NATIONAL ACADEMIES PRESS 500 Fifth Street, NW, Washington, DC 20001

This project has been funded with federal funds under Contract No. HHSN263201800029I (75N98021F00009) between the National Academy of Sciences and the Department of Health and Human Services, National Institutes of Health: All of Us Research Program; National Cancer Institute; National Heart, Lung, and Blood Institute; National Human Genome Research Institute; Eunice Kennedy Shriver National Institute of Child Health and Human Development; National Institute of Dental and Craniofacial Research; National Institute of Diabetes and Digestive and Kidney Diseases; National Institute of Environmental Health Sciences; National Institute of Nursing Research; National Institute on Aging; National Institute on Drug Abuse; National Institute on Minority Health and Health Disparities; NIH Office of Behavioral and Social Sciences Research; and NIH Office of Science Policy. Any opinions, findings, conclusions, or recommendations expressed in this publication do not necessarily reflect the views of any organization or agency that provided support for the project.

International Standard Book Number-13: 978-0-309-XXXXX-X
International Standard Book Number-10: 0-309-XXXXX-X
Digital Object Identifier: https://doi.org/10.17226/26902

This publication is available from the National Academies Press, 500 Fifth Street, NW, Keck 360, Washington, DC 20001; (800) 624-6242 or (202) 334-3313; http://www.nap.edu.

Copyright 2023 by the National Academy of Sciences. National Academies of Sciences, Engineering, and Medicine and National Academies Press and the graphical logos for each are all trademarks of the National Academy of Sciences. All rights reserved.

Printed in the United States of America.

Suggested citation: National Academies of Sciences, Engineering, and Medicine. 2023. Using population descriptors in genetics and genomics research: A new framework for an evolving field. Washington, DC: The National Academies Press. https://doi.org/10.17226/26902.
The National Academy of Sciences was established in 1863 by an Act of Congress, signed by President Lincoln, as a private, nongovernmental institution to advise the nation on issues related to science and technology. Members are elected by their peers for outstanding contributions to research. Dr. Marcia McNutt is president.

The National Academy of Engineering was established in 1964 under the charter of the National Academy of Sciences to bring the practices of engineering to advising the nation. Members are elected by their peers for extraordinary contributions to engineering. Dr. John L. Anderson is president.

The National Academy of Medicine (formerly the Institute of Medicine) was established in 1970 under the charter of the National Academy of Sciences to advise the nation on medical and health issues. Members are elected by their peers for distinguished contributions to medicine and health. Dr. Victor J. Dzau is president.

The three Academies work together as the National Academies of Sciences, Engineering, and Medicine to provide independent, objective analysis and advice to the nation and conduct other activities to solve complex problems and inform public policy decisions. The National Academies also encourage education and research, recognize outstanding contributions to knowledge, and increase public understanding in matters of science, engineering, and medicine.

Learn more about the National Academies of Sciences, Engineering, and Medicine at www.nationalacademies.org.
Consensus Study Reports published by the National Academies of Sciences, Engineering, and Medicine document the evidence-based consensus on the study's statement of task by an authoring committee of experts. Reports typically include findings, conclusions, and recommendations based on information gathered by the committee and the committee's deliberations. Each report has been subjected to a rigorous and independent peer-review process, and it represents the position of the National Academies on the statement of task.

Proceedings published by the National Academies of Sciences, Engineering, and Medicine chronicle the presentations and discussions at a workshop, symposium, or other event convened by the National Academies. The statements and opinions contained in proceedings are those of the participants and are not endorsed by other participants, the planning committee, or the National Academies.

Rapid Expert Consultations published by the National Academies of Sciences, Engineering, and Medicine are authored by subject-matter experts on narrowly focused topics that can be supported by a body of evidence. The discussions contained in rapid expert consultations are considered those of the authors and do not contain policy recommendations. Rapid expert consultations are reviewed by the institution before release.

For information about other products and activities of the National Academies, please visit www.nationalacademies.org/about/whatwedo.
COMMITTEE ON THE USE OF RACE, ETHNICITY, AND ANCESTRY AS POPULATION DESCRIPTORS IN GENOMICS RESEARCH¹

ARAVINDA CHAKRAVARTI (Cochair), Director, Center for Human Genetics and Genomics; Muriel G & George W Singer Professor of Neuroscience and Physiology, Professor of Medicine, New York University Grossman School of Medicine
CHARMAINE ROYAL (Cochair), Robert O. Keohane Professor of African & African American Studies, Biology, Global Health, and Family Medicine & Community Health; Director, Center on Genomics, Race, Identity, Difference and Center for Truth, Racial Healing & Transformation, Duke University
KATRINA ARMSTRONG, Executive Vice President, Health and Biomedical Sciences; Dean of the Vagelos College of Physicians and Surgeons and the Faculties of Health Sciences; Harold and Margaret Hatch Professor of the University; Columbia University
MICHAEL BAMSHAD, Professor and Chief, Division of Genetic Medicine, Department of Pediatrics; Allan and Phyllis Treuer Endowed Chair, Genetics and Development, University of Washington and Seattle Children's Hospital
LUISA N. BORRELL, Distinguished Professor, Department of Epidemiology & Biostatistics, Graduate School of Public Health & Health Policy, City University of New York
KATRINA CLAW, Assistant Professor, Department of Biomedical Informatics, School of Medicine; Faculty, Colorado Center for Personalized Medicine, University of Colorado Denver - Anschutz Medical Campus
CLARENCE C. GRAVLEE, Associate Professor, Department of Anthropology, University of Florida
MARK D. HAYWARD, Professor of Sociology; Centennial Commission Professor in Liberal Arts; Faculty Research Associate, Population Research Center, University of Texas at Austin
RICK KITTLES, Senior Vice President for Research; Professor of Community Health and Preventive Medicine, Morehouse School of Medicine
SANDRA SOO-JIN LEE, Professor of Medical Humanities & Ethics; Chief of the Division of Ethics, Department of Medical Humanities & Ethics, Vagelos College of Physicians & Surgeons, Columbia University

¹ See Appendix F, Disclosure of Unavoidable Conflict of Interest.
ANDRÉS MORENO-ESTRADA, Professor, Advanced Genomics Unit, Centro de Investigación y de Estudios Avanzados del Instituto Politécnico Nacional (CINVESTAV), Mexico
ANN MORNING, James Weldon Johnson Professor of Sociology, New York University
JOHN P. NOVEMBRE, Professor, Department of Human Genetics, Department of Ecology & Evolution, University of Chicago
MOLLY PRZEWORSKI, Professor, Department of Biological Sciences, Department of Systems Biology, Columbia University
DOROTHY E. ROBERTS, George A. Weiss University Professor of Law & Sociology; Raymond Pace & Sadie Tanner Mossell Alexander Professor of Civil Rights; Professor of Africana Studies; Director, Penn Program on Race, Science & Society, University of Pennsylvania
SARAH A. TISHKOFF, David and Lyn Silfen University Professor, Departments of Genetics and Biology; Director, Center for Global Genomics & Health Equity, University of Pennsylvania
GENEVIEVE L. WOJCIK, Assistant Professor of Epidemiology, Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health

Study Staff

SARAH H. BEACHY, Study Director
SAMANTHA SCHUMM, Associate Program Officer
LEAH CAIRNS, Study Codirector (until October 2022)
KATHRYN ASALONE, Associate Program Officer
MEREDITH HACKMANN, Associate Program Officer
LYDIA TEFERRA, Research Assistant
APARNA CHERAN, Senior Program Assistant (from June 2022)
MICHAEL K. ZIERLER, Science Writer
ANDREW M. POPE, Senior Director, Board on Health Sciences Policy (until July 2022)
CLARE STROUD, Senior Director, Board on Health Sciences Policy (from July 2022)
MALAY K. MAJMUNDAR, Director, Committee on Population
Reviewers

This Consensus Study Report was reviewed in draft form by individuals chosen for their diverse perspectives and technical expertise. The purpose of this independent review is to provide candid and critical comments that will assist the National Academies of Sciences, Engineering, and Medicine in making each published report as sound as possible and to ensure that it meets the institutional standards for quality, objectivity, evidence, and responsiveness to the study charge. The review comments and draft manuscript remain confidential to protect the integrity of the deliberative process.

We thank the following individuals for their review of this report:

WENDY CHUNG, Columbia University
DANA A. GLEI, Georgetown University
EVELYNN M. HAMMONDS, Harvard University
CHANITA HUGHES-HALBERT, University of Southern California
BENJAMIN NEALE, Harvard Medical School
NEIL R. POWE, University of California, San Francisco
ERICA RAMOS, Genome Medical
ALIYA SAPERSTEIN, Stanford University
THE REVEREND ROBERT JEMONDE TAYLOR, Duke Cancer Institute Community Advisory Council
SHARON F. TERRY, Genetic Alliance
HONGYU ZHAO, Yale University

Although the reviewers listed above provided many constructive comments and suggestions, they were not asked to endorse the conclusions or recommendations of this report, nor did they see the final draft before its release. The review of this report was overseen by SUSAN J. CURRY of the University of Iowa and LINDA C. DEGUTIS of the Yale School of Public Health. They were responsible for making certain that an independent examination of this report was carried out in accordance with the standards of the National Academies and that all review comments were carefully considered. Responsibility for the final content rests entirely with the authoring committee and the National Academies.
Acknowledgments

The study committee and project staff acknowledge that the National Academies of Sciences, Engineering, and Medicine is physically housed on the traditional land of the Nacotchtank (Anacostan) and Piscataway Peoples, past and present. The committee and staff honor with gratitude the land itself and the people who have stewarded it throughout the generations. They honor and respect the enduring relationship that exists between these peoples and nations and this land. The committee and staff thank these peoples for their resilience in protecting this land and aspire to uphold our responsibilities to their example.

The committee and staff also acknowledge the countless number of people who have participated, both willingly and unwillingly, in biomedical research, as well as those who have raised the issues addressed in this study for many years.

The study committee and project staff would like to thank the study sponsor (the 14 institutes, program, and offices of the National Institutes of Health) for their leadership on this issue and for their vision and commitment to developing and supporting this project. The committee and staff express their gratitude to the many experts who shared their diverse perspectives and advice with the committee throughout the process and during the public sessions.

The committee is grateful for the staff within the Health and Medicine Division who provided support and guidance for the project, along with their collaborators in the Division of Behavioral and Social Sciences and Education.
Contents

LIST OF BOXES, FIGURES, AND TABLES
PREFACE
ABBREVIATIONS
SUMMARY

SECTION I: PAST AND CURRENT USE OF POPULATION DESCRIPTORS IN GENETICS AND GENOMICS RESEARCH
  Section I Overview

1 POPULATION DESCRIPTORS IN HUMAN GENETICS RESEARCH: GENESIS, EVOLUTION, AND CHALLENGES
  The Study of Human Genetic Variation
  What Is a Study Using Genetic Information Trying to Accomplish?
  Classification of Genomics Study Types
  Features of Human Genome Variation
  Population Classification Schemes in Genetics and Genomics Research
  Attempts to Address the Use of Race, Ethnicity, and Ancestry in the Genomic Era
  Why Is This Study Important? Why Another Study? Why Now?
  What Is the Goal of This Report?
  References

2 A MULTIPLICITY OF DESCRIPTORS IN GENETICS AND GENOMICS RESEARCH
  Introduction
  A Range of Descent-Associated Population Descriptors
  The Importance of Environmental Factors in Genetics and Genomics Research
  References

SECTION II: RECOMMENDATIONS
  Section II Overview

3 GUIDING PRINCIPLES
  Introduction
  Principles
  Synergy Among and Tension Between Guiding Principles
  References

4 REQUISITES FOR SUSTAINED CHANGE
  Introduction
  Typological Thinking
  Environmental Factors
  Community Engagement
  References

5 GUIDANCE FOR SELECTION AND USE OF POPULATION DESCRIPTORS IN GENOMICS RESEARCH
  Introduction
  The Importance of Transparency and Specificity When Selecting and Reporting Population Descriptors
  Conclusion and Recommendations
  Tools for Selecting and Using Population Descriptors in Genetics and Genomics Research
  Considerations for Harmonization of Population Descriptors across Studies
  References

6 IMPLEMENTATION AND ACCOUNTABILITY
  Introduction
  Implementation Across the Genomics Research Ecosystem
  Recommendations
  Mechanisms of Accountability
  Parting Thoughts
  References

APPENDIXES
A STUDY APPROACH AND METHODS
B GLOSSARY
C TABLE OF INTERNATIONAL PROGRAMS
D DECISION TREE FOR THE USE OF POPULATION DESCRIPTORS IN GENOMICS RESEARCH
E COMMITTEE AND STAFF BIOSKETCHES
F DISCLOSURE OF UNAVOIDABLE CONFLICT OF INTEREST
List of Boxes, Figures, and Tables

BOXES
S-1 Key Terminology and Definitions
1-1 Race, Science, and Society: A Reference List
1-2 Statement of Task
2-1 Key Terminology and Definitions
5-1 Key Terminology for This Chapter
5-2 Common Data Elements for Researchers to Include as Metadata to Help Harmonize Across Studies
5-3 Concise Language for Genetic Similarity: The Abbreviation + -Like System
6-1 Example Checklist that Funders of Genetics and Genomics Research Can Implement for Researchers
6-2 Example Checklist that Journals Can Implement for Genomics Researchers

FIGURES
S-1 A framework for change
2-1 Visualization of genealogical vs. genetic ancestry
2-2 U.S. Census race and ethnicity categories over time (1790-2020)
II-1 A framework for change
D-1 Decision tree for the use of population descriptors in genomics research

TABLES
S-1 Recommended approaches for the use of population descriptors by genomics study type
1-1 Comparison of Classification Schemes Used in Three Studies Using Genetics from Three Distinct Global Contexts
5-1 Recommended Approaches for the Use of Population Descriptors by Genomics Study Type
A-1 Literature Search Terms
C-1 International Programs
Preface

Human genetics studies that assess the contributions of genes to phenotypes can be conducted using either relatives or groups of distantly related individuals ("unrelated" in a colloquial sense). In both instances, geneticists search for a pattern of genetic (sequence or allelic) variation that can distinguish between different forms of a phenotype, for example individuals with sickle cell anemia from those with sickle cell trait, based on the known rules of Mendelian genetic transmission. Although the expected similarity or dissimilarity of closely related individuals largely depends on gene transmission rules, that between more distantly related individuals mostly depends on their remote ancestral histories, such as where, when, and how their common ancestors arose. This information is partially captured by the affiliation of an individual with a population; however, how a population should be defined for any specific question in genetics research is less clear. Nevertheless, for any human genetics research, now extended to entire genomes, it is critical to clearly describe who is selected for a study, why, and how. Researchers also need to specify the criteria used to describe participants, including the use of population descriptors. Unfortunately, genetics studies have not named individuals consistently or in a principled manner, often reflexively using race and ethnicity without great thought or justification. Though seldom studied, measures of the environments associated with study individuals and groups are also germane to our understanding of genetic traits and disorders and need to be included.

In recent years, genetic information has become far more accessible. The number of human genetics and genomics studies is rapidly increasing, and many such studies are led by investigators who were not primarily trained in human genetics. While this study focuses mainly on knowledge from human genetics and genomics, we acknowledge that knowledge from many other sources (oral, archaeological, traditional, community, etc.) serves to inform our identities, history, relationships to other humans, and our traits and diseases. It is time for us to reshape how genetics studies are conceptualized, conducted, and interpreted.

This commissioned study describes an effort to clarify the scientific rationales for describing research participants and their group labels. We start with a historical view of how we got to our current state, then proceed to examine how else we could achieve our scientific aims, and follow with our recommendations and suggested implementations to improve genetic and genomic science. Our overarching goal is to motivate researchers to consider when population descriptors are necessary, which ones are appropriate for a specific type of genetics study design, whether multiple descriptors are necessary, and what additional information is needed for genetic dissection of phenotypes. Accordingly, this report is divided into two sections: the first is "Past and Current Use of Population Descriptors," and the second is "Recommendations."

Aravinda Chakravarti and Charmaine Royal, Cochairs, Committee on the Use of Race, Ethnicity, and Ancestry as Population Descriptors in Genomics Research
Abbreviations

AAA       American Anthropological Association
AAPA      American Association of Physical Anthropologists
AFR       African "superpopulation"
AMA       American Medical Association
APA       American Psychological Association
APOE4     apolipoprotein E gene
BBJ       BioBank Japan
BIPMed    Brazilian Initiative on Precision Medicine
CDC       Centers for Disease Control and Prevention
CDE       common data element
CEPH      northern and western European ancestry in Utah
CEU       northern European in Utah
CKB       China Kadoorie Biobank
CONSORT   Consolidated Standards of Reporting Trials
COREQ     COnsolidated criteria for REporting Qualitative research
COVID-19  coronavirus disease 2019
DNA       deoxyribonucleic acid
EUR       European "superpopulation"
FAPESP    Research Innovation and Dissemination Centers funded by the São Paulo Research Foundation
GBR       British in England and Scotland
GIH       Gujaratis sampled in Houston
gnomAD    Genome Aggregation Database
GWAS      genome-wide association study
HAALSI    Health and Aging in Africa: A Longitudinal Study of an INDEPTH Community in South Africa
HAAO      3-hydroxyanthranilate 3,4-dioxygenase
HapMap    International Haplotype Map Project
HGP       Human Genome Project
HIV       human immunodeficiency virus
HLA       human leukocyte antigen
HMD       Health and Medicine Division
INDEPTH   International Network for the Demographic Evaluation of Populations and Their Health
JAMA      Journal of the American Medical Association
KBP       Korean Biobank Project
KYNU      kynureninase
LD        linkage disequilibrium
MeSH      Medical Subject Headings
mRNA      messenger RNA
MXB       Mexican Biobank
NAD       nicotinamide adenine dinucleotide
NHGRI     National Human Genome Research Institute
NHLBI     National Heart, Lung, and Blood Institute
NIH       National Institutes of Health
NIMHD     National Institute on Minority Health and Health Disparities
NYU       New York University
OMB       Office of Management and Budget
PCA       Principal component analysis
PCORI     Patient-Centered Outcomes Research Institute
PCSK9     proprotein convertase subtilisin/kexin type 9
PEL       Peruvian in Lima, Peru
PERSIAN   Prospective Epidemiological Research Studies in Iran
PGS       polygenic scores
PKU       phenylketonuria
PRISMA    Preferred Reporting Items for Systematic Review and Meta-Analysis
PUR       Puerto Rican in Puerto Rico
QBB       Qatar Biobank
RNA       ribonucleic acid
SALL4     spalt like transcription factor 4
TOPMed    Trans-Omics for Precision Medicine
TSI       Tuscans of Italy
UCLA      University of California, Los Angeles
UK        United Kingdom
UKB       United Kingdom Biobank
UMAP      Uniform manifold approximation and projection
UN        United Nations
U.S.      United States
WHO       World Health Organization
YRI       Yoruba in Ibadan, Nigeria
1000G     1000 Genomes Project; the international 1000 genomes sequence variation project