Appendix D
Decision Tree for the Use of Population Descriptors in Genomics Research

PREPUBLICATION COPY - Uncorrected Proofs
PART A: WHAT IS THE SOURCE OF YOUR DATA?

Novel data/Internal to your collaboration
  Collect individual-level data following norms of community engagement. Provide clear instructions on how downstream users can respect consent and any collaborative agreements with study participants regarding population descriptors.
  Are data consented for broad re-use in research?
    YES: With consent, collect additional population descriptor types and data along non-genetic dimensions, anticipating multiple possible study types during re-use. Go to B STUDY PURPOSE and C ENVIRONMENT.
    NO: See B STUDY PURPOSE to guide use of population descriptors.

Externally generated pre-existing data
  Is the data individual- or group-level?
    Individual-level data: Review consents/community agreements provided by the pre-existing data. While following existing consent structures, use the available metadata (e.g., geographic origin data, ethnicity data) to form the population descriptors. If it is necessary to create new labels, share and describe the formation of the new individual-level labels/descriptors when publishing. Go to B STUDY PURPOSE.
    Group-level data: Evaluate whether using existing sources of group-level descriptors will produce valid and trustworthy results or will be misleading. Then: Are the available pre-existing group-level descriptors appropriate for your study design?
      YES: Proceed with analysis. Emphasize any limitations of the pre-existing descriptors in all communications of results. If modifications to the group-level data were made, share and describe the formation of the new labels when publishing.
      NO: Do not proceed with your study.

FIGURE D-1 Decision tree for the use of population descriptors in genomics research. The decision tree follows on the next several pages.
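The branching in Part A can be read as a small function. The following is only a sketch under stated assumptions: the function name `part_a_outcome`, its parameter names, and the shortened outcome strings are hypothetical and are not part of the figure; the wording of each outcome is abbreviated from the boxes above.

```python
from typing import Optional


def part_a_outcome(
    source: str,
    *,
    broad_reuse: Optional[bool] = None,
    level: Optional[str] = None,
    descriptors_appropriate: Optional[bool] = None,
) -> str:
    """Return an abbreviated Part A outcome for one path through the tree.

    All names here are illustrative; the outcome strings paraphrase the figure.
    """
    if source == "novel":
        # Novel data: the only question is whether consent covers broad re-use.
        if broad_reuse is None:
            raise ValueError("novel data: state whether consent covers broad re-use")
        if broad_reuse:
            return "collect additional descriptors with consent; go to B and C"
        return "go to B"

    if source == "pre-existing":
        if level == "individual":
            return "review consents and form descriptors from metadata; go to B"
        if level == "group":
            if descriptors_appropriate is None:
                raise ValueError("group-level data: state whether descriptors fit the design")
            if descriptors_appropriate:
                return "proceed with analysis; emphasize descriptor limitations"
            return "do not proceed with the study"
        raise ValueError("pre-existing data: level must be 'individual' or 'group'")

    raise ValueError("source must be 'novel' or 'pre-existing'")
```

For example, `part_a_outcome("pre-existing", level="group", descriptors_appropriate=False)` follows the group-level NO branch and returns the "do not proceed" outcome.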
PART B: WHAT IS THE PURPOSE OF YOUR STUDY?

  Gene Discovery: go to B1
  Trait Prediction: go to B2
  Elucidating Cellular and Biological Mechanisms: go to B3
  Understanding Health Disparities: go to B4
  Studying Human Evolutionary History: go to B5

FIGURE D-1 Continued
PART B1: GENE DISCOVERY
(B: What is the purpose of your study? For other purposes: Trait Prediction, go to B2; Elucidating Cellular and Biological Mechanisms, go to B3; Understanding Health Disparities, go to B4; Studying Human Evolutionary History, go to B5.)

Is the inheritance of the trait complex or Mendelian?

Complex
  Use a genetic relatedness matrix (i.e., pedigree-informed or based on genetic similarity) and/or factor loadings (e.g., principal components) to determine and describe the study population as well as control for genetic background.
    Preferred: Genetic similarity
    Should not be used: Genetic ancestry

Mendelian
  If the trait is only seen in simplex families (i.e., likely due to de novo mutations)...
    Then: Population descriptors not needed. Type the variants themselves.
  Could there be a need to recruit more individuals to the study before analysis?
    YES: Kinship/descent-associated descriptors may be useful to find additional carriers (e.g., close or distant relatives) if done at a fine scale (e.g., genetic similarity, geographic origins, ethnicity of already sampled carriers).
      Preferred: Geographic origins; Ethnicity/Indigeneity; Genetic similarity; Genetic ancestry (fine-scale only)
      Should not be used: Race; Genetic ancestry (continental/large-scale)
    NO: Use a pedigree, genetic relatedness matrix (i.e., pedigree-informed or based on genetic similarity) and/or factor loadings (e.g., principal components) to control for genetic background/analyze transmission patterns.
      Preferred: Genetic similarity
      Should not be used: Genetic ancestry
  If the variant frequency will be compared to frequencies in a panel of reference populations...
    Then: Possibly no population descriptor needed.
      Preferred: Genetic similarity
      Should not be used: Race; Ethnicity/Indigeneity; Genetic ancestry (continental/large-scale); Geography

For all types
  If you need to control for environment (e.g., as a covariate, to prevent spurious associations and/or to improve power), then go to C ENVIRONMENT.

FIGURE D-1 Continued
PART B2: TRAIT PREDICTION
(B: What is the purpose of your study? For other purposes: Gene Discovery, go to B1; Elucidating Cellular and Biological Mechanisms, go to B3; Understanding Health Disparities, go to B4; Studying Human Evolutionary History, go to B5.)

Is the inheritance of the trait complex or Mendelian?

Complex
  Preferred: Genetic similarity
  Should not be used: Genetic ancestry

Mendelian
  Is the genetic basis (e.g., large effect locus and/or modifier loci) identified and validated?
    YES: Population descriptors no longer needed. Type the variants themselves.
    NO: Genetic similarity could be a useful proxy of similar modifier loci.
      Preferred: Genetic similarity
      Should not be used: Race; Ethnicity/Indigeneity; Genetic ancestry; Geography

For all types
  If you need to control for environment (e.g., due to gene by environment interaction), then go to C ENVIRONMENT.

FIGURE D-1 Continued
PART B3: ELUCIDATING CELLULAR AND BIOLOGICAL MECHANISMS
(B: What is the purpose of your study? For other purposes: Gene Discovery, go to B1; Trait Prediction, go to B2; Understanding Health Disparities, go to B4; Studying Human Evolutionary History, go to B5.)

Are you interested in genetic or environmental perturbations to underlying mechanisms?

Genetic perturbations to pathway
  Preferred: Genetic similarity
  You might consider measures of genetic similarity to identify individuals with similar allele frequencies and evolutionary histories.

Environmental perturbations
  Go to C ENVIRONMENT.

Neither
  No population descriptors necessary given mechanisms are expected to be universal.

FIGURE D-1 Continued
PART B4: UNDERSTANDING HEALTH DISPARITIES
(B: What is the purpose of your study? For other purposes: Gene Discovery, go to B1; Trait Prediction, go to B2; Elucidating Cellular and Biological Mechanisms, go to B3; Studying Human Evolutionary History, go to B5.)

Are the genetic data being used to...

Control for genetics (although not the primary focus of the research question)
  Preferred: Genetic similarity
  Should not be used: Genetic ancestry

Study impact of environmental exposures
  Go to C ENVIRONMENT.

Understand health disparities mediated by social processes of racism
  YES:
    Preferred: Genetic similarity
    May be used in some cases: Genetic ancestry
    Go to C ENVIRONMENT.
  NO:
    Preferred: Genetic similarity
    Should not be used: Genetic ancestry
    Go to C ENVIRONMENT.

FIGURE D-1 Continued
PART B5: STUDYING HUMAN EVOLUTIONARY HISTORY
(B: What is the purpose of your study? For other purposes: Gene Discovery, go to B1; Trait Prediction, go to B2; Elucidating Cellular and Biological Mechanisms, go to B3; Understanding Health Disparities, go to B4.)

In general
  Preferred: Genetic ancestry; Genetic similarity; Geography
  May be used in some cases: Ethnicity/Indigeneity
  Should not be used: Race
  No population descriptor may be needed (e.g., in an ancestral recombination graph).

If ancient DNA...
  Then: Avoid conflating cultural and genetic group namings (see Eisenmann et al., 2018).

FIGURE D-1 Continued
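The three-way descriptor key used in Part B5 lends itself to a simple lookup. The sketch below is illustrative only: the `Status` enum, the `B5_GENERAL` table, and `check_descriptors` are hypothetical names, and the table covers just the "In general" list for evolutionary-history studies, not the other parts of the tree.

```python
from enum import Enum


class Status(Enum):
    PREFERRED = "preferred"
    MAYBE = "may be used in some cases"
    AVOID = "should not be used"


# "In general" guidance from Part B5 (names lowercased for lookup).
B5_GENERAL = {
    "genetic ancestry": Status.PREFERRED,
    "genetic similarity": Status.PREFERRED,
    "geography": Status.PREFERRED,
    "ethnicity/indigeneity": Status.MAYBE,
    "race": Status.AVOID,
}


def check_descriptors(proposed):
    """Group proposed descriptor names by Part B5 status.

    Returns (report, unknown): report maps each Status to the matching
    names; unknown lists names the Part B5 list does not mention.
    """
    report = {status: [] for status in Status}
    unknown = []
    for name in proposed:
        key = name.strip().lower()
        if key in B5_GENERAL:
            report[B5_GENERAL[key]].append(key)
        else:
            unknown.append(name)
    return report, unknown
```

A study plan could pass its intended descriptors to `check_descriptors` and review anything that lands under `Status.AVOID` or in the unknown list before analysis.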
PART C: ARE YOU TRYING TO...?

Understand the impact of an environmental effect and may need to control for genetic background
  Genetic similarity

Control for the environment

Do you have observations of the specific relevant environmental variables necessary to understand your trait?
  YES: Incorporate these variables.
  NO: Are you using pre-existing data, such as from a biobank?
    YES: Carefully consider use of proxy variables if needed. Possible proxies:
      May be used in some cases: Geography; Ethnicity/Indigeneity; Race (for only a subset of health disparities studies)
    NO: Collect information for as many potential environmental factors as possible and describe their source.

FIGURE D-1 Continued
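The questions about environmental variables in Part C reduce to two yes/no answers. As a minimal sketch, assuming a hypothetical function `environment_variable_plan` with hypothetical parameter names, and covering only the "Do you have observations..." sub-tree rather than the whole of Part C:

```python
def environment_variable_plan(have_observations: bool,
                              using_preexisting_data: bool = False) -> dict:
    """Return the Part C action and any possible proxy variables.

    The action strings paraphrase the figure; proxies are listed only on
    the branch where the figure names them, and all are marked there as
    "may be used in some cases".
    """
    if have_observations:
        return {
            "action": "incorporate the observed environmental variables",
            "possible_proxies": [],
        }
    if using_preexisting_data:
        return {
            "action": "carefully consider proxy variables if needed",
            "possible_proxies": [
                "geography",
                "ethnicity/indigeneity",
                "race (only for a subset of health disparities studies)",
            ],
        }
    return {
        "action": ("collect information for as many potential environmental "
                   "factors as possible and describe their source"),
        "possible_proxies": [],
    }
```

Note that the second argument is ignored when observations are available, mirroring the figure, where the biobank question is asked only on the NO branch.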