The panel was tasked with making recommendations on measures of sex, gender identity, and sexual orientation, with attention to how these recommendations may be applied differently in three specific settings: surveys and research studies, administrative settings, and clinical settings. To address the setting-specific question, we considered two primary aspects of data collection for each setting:
- the purpose of collecting sex, gender identity, and sexual orientation data; and
- why data collection would change in each setting.
In this chapter we discuss the purpose of data collection in each setting to determine which measures are relevant, and we touch on how the purpose and characteristics of each setting might influence how these data are collected.
The committee first categorized data collection contexts into the three settings included in the statement of task (surveys and research, administrative settings, and clinical settings) based on the purpose of data collection and the subject of the data being collected. Though these three settings at first seem very different, in practice the committee found that some data collection contexts served multiple purposes and thus defied easy classification. For example, clinical trials could be considered under both research and clinical settings, while vital statistics could be considered under both research and administrative settings. Classifying health data was also complicated because the purpose and mode of data collection for public health purposes and for clinical trials can be similar to those of surveys and research. However, data collected in medical settings, clinical trials, and public health surveillance often include detailed medical and biometric information whose measurement and interpretation may vary by sex traits and sex assigned at birth in ways that are not relevant for general population surveys or nonhealth-related research.
The panel ultimately defined surveys and research settings to include population enumeration, social research, and demography. Health surveys were classified with public health surveillance, medical records, and clinical trials under the broad heading of clinical settings to account for the role sex traits and sex assigned at birth, alongside gender, may play in all these contexts. We defined administrative settings as reflecting two distinct data collection realms: (1) vital statistics and other data collection for the purpose of legal identification and (2) program and personnel administration. These administrative settings are distinguished from each other because, although vital statistics data are often used for research purposes, their use for legal identification and other administrative purposes means that they often need to meet regulatory or legal requirements that do not apply to other data collection contexts.
The most common data collection method for enumeration, social research, and demographic data collection is general population surveys. General population surveys are characterized by their ability to enumerate and collect data from representative samples of the population. They describe demographics at a high level and provide generalizable data about large groups in the population. As such, surveys aim to obtain consistent, comparable, and reliable information about a population as a whole. Federal statistical agencies, social scientists, demographers, and policy makers alike depend on data from general population demographic surveys and censuses for a variety of purposes. For example, data from these sources are used to assess social and political attitudes; develop research, policy, program, and funding priorities; and assess population-level disparities in order to identify groups most at risk of negative outcomes and to plan responses accordingly.
Collecting data on sex, gender, and sexual orientation can inform national population estimates, allow for prevalence estimation within and among geographic or sociodemographic categories, and allow for statistical comparisons on socioeconomic, demographic, and survey-specific topics (e.g., health measures, crime victimization, unemployment, and program participation; see Federal Committee on Statistical Methodology, 2021). Such variables are also used to compute statistical weights and as demographic controls and covariates in statistical models. The lack of data collection on sexual orientation, gender identity, transgender experience, and intersex status as demographic measures in the decennial census or other large-scale federal population surveys, such as the American Community Survey, means there is no “gold standard” against which data collections can perform weighting adjustments or assess data quality and nonresponse bias for LGBTQI+ populations.
Although general population surveys in the United States have not consistently included measures to identify LGBTQI+ populations, over the past two decades a number of surveys have introduced measures of sexual orientation and—to a lesser extent—gender identity. Most population surveys that are not focused on collecting health-related information include topics that represent the social and behavioral aspects of an individual’s life, which suggests that gender, rather than sex traits, is more relevant for understanding these outcomes in the population. Even in surveys that collect health-related information, this information is generally collected to assess health and health disparities in the population, as well as the role played by interpersonal and structural determinants of health. Therefore, the more important measures to gather in these surveys are those associated with proximal and distal minority stressors: gender identity, transgender experience and identity, intersex status, and sexual orientation. In these survey contexts, data about specific sex traits are needed only in circumstances in which knowledge of these traits is necessary to accurately direct skip patterns for survey questions, interpret responses, or calculate values for composite measures.
The wording of questions on general population surveys has to be understood by the population as a whole and short enough to be administered in a reasonable amount of time to maintain respondents’ interest and participation. Therefore, to produce high-quality estimates of LGBTQI+ populations, general population survey measures that are used to collect sexual orientation, gender identity, transgender experience and identity, and intersex status must reduce respondent burden by being simple to administer and understandable to both members of LGBTQI+ communities and the general population who are not LGBTQI+. When survey data collection efforts are focused on LGBTQI+ populations, however, it is less important that measures be comprehensible to those who are not LGBTQI+. In these circumstances, using community-specific terminology allows for better measures of the diverse array of identities in these populations.
Vital Statistics and Legal Identification
In the context of vital statistics and legal identification, there are multiple purposes for data collection because vital statistics—such as birth, death, and marriage certificates—are used in two primary ways: (1) by researchers to generate population estimates and conduct research related to the demographic characteristics and health of the population and (2) by individuals to establish identity and make legal or financial claims.1 Other personally identifiable legal and administrative documents, such as passports, Social Security records, and Internal Revenue Service files, are also used by researchers and demographers to study characteristics of the population and are sometimes linked to survey data and other administrative records.2 When used for research purposes, these data are typically deidentified and aggregated to protect personal privacy; however, when used for other purposes, this information is often directly linked to a specific individual. This combination of data needs and uses that cross between public and private domains highlights both the importance of consistent measurement to facilitate data quality and linkage across domains, and the need to establish a clear rationale and process for collecting these data to ensure that individuals are not required to disclose personal information in ways that may put them at risk for discrimination or violence (Ashley, 2021).
For vital statistics and legal identification, the only measure of sex, gender identity, or sexual orientation that is routinely collected is a single binary measure of sex or gender. As in other domains, there is considerable variation in whether data collection fields, internal coding, and public reporting explicitly reference “sex,” “gender,” some combination of the two, or neither. The clearest designations are the single measure of sex of an infant or decedent on original birth certificates and death certificates, respectively, both of which are completed by a proxy respondent and based on physical examination of one or more of the individual’s sex traits (that may not all correspond to the same sex). In other instances, sex or gender information is not queried explicitly but rather inferred from gendered relationship status terms that are built into the form’s design, such as when parents are identified as “mother” and “father” on their child’s birth certificate or spouses are designated as “bride” and “groom” on marriage licenses.3 Thus, many of the problems that plague the measurement of sex or gender that are discussed in Chapter 2, including a lack of precision in terminology and failure to include gender identities beyond the female/male binary, also arise in administrative data collection.
1 The Supreme Court recently noted that birth certificates are “more than a mere marker of biological relationships,” they are “a form of legal recognition” (Marisa N. Pavan, et al. v. Nathaniel Smith. 582 U.S. Supreme Court of The United States. No. 16-992; cited in Epps, 2018). The case involved a dispute in Arkansas over whether female spouses of women who give birth should be listed as parents on a child’s birth certificate. The Court ruled for the plaintiffs because male spouses who are not the child’s biological father have routinely been listed as parents on birth certificates.
2 Examples are the National Longitudinal Mortality Study (2014), Mortality Disparities in American Communities (2017), and the U.S. Census Bureau’s Small Area Health Insurance Estimates Program; see https://www.census.gov/data/datasets/time-series/demo/sahie/estimatesacs.html.
Overall, in vital statistics, data collection on transgender experience or sexual orientation is rare, while data collection on intersex status is nonexistent. This is partly because many of these documents serve foremost as forms of identification, and sexual orientation, transgender experience, transgender identity, and intersex status are not necessary for identification purposes. Similarly, most protected characteristics that were once thought to be necessary for identification, such as race and ethnicity, are no longer included on U.S. identity documents (including the public versions of birth certificates) because doing so facilitated segregation and discrimination (Adair, 2019; Erhardt, 1962). In contrast, data on sex or gender are routinely collected and reported on identity and other administrative documents in ways that may facilitate sex segregation in such settings as the military, restrooms, education, and athletics (Cohen, 2011). In general, data on protected characteristics, including sex and gender, are collected and widely reported across a range of administrative data without a clear and documented purpose (Ashley, 2021).
In vital statistics data, a clear distinction can be made between the data needed for statistical purposes, such as monitoring population health, and the data used for individual identity documents. For example, the U.S. standard birth certificate was revised in 1949 to include a line that specifically demarcates the fields above the line as ones that appear on certified birth certificate copies and the fields below the line as for statistical purposes only. At that time, both race and parents’ marital status were moved below the line (Shteyler, Clarke, and Adashi, 2020, citing Wipfler, 2016). This approach separates the information necessary to fully document the circumstances surrounding a “live birth” from information provided for the purposes of individual identification. Similar “lines of demarcation” also appear on marriage and death certificates.
The data below this line of demarcation are collected purely for research and population estimates and are generally reported in aggregate, which makes disclosure of the individual data unlikely. This separation opens the possibility that information about sexual orientation, gender identity, transgender experience and identity, and intersex/DSD [differences in sex development] status could be added to vital statistics data to document social and economic disparities in treatment and outcomes experienced by LGBTQI+ people without increasing their risk of discrimination or mistreatment.
The possible implementation of routine data collection of measures of sexual orientation, gender identity, and transgender experience in vital statistics data poses several challenges that may affect data quality, particularly for measures of identity. One challenge is the role of proxy reporting, in which the respondent is providing information on someone else, a practice that is necessary when collecting data on infants and decedents. The primary concern with proxy reporting is that the resulting data will not reflect how a person would have identified if they could have responded for themselves, which can depend on the degree to which the respondent may have personal knowledge about the person whose data are being reported. For example, while spouses may report their own or their partner’s characteristics in marriage records, death records are generally completed by a more distal proxy respondent (e.g., a physician or a coroner) who may never have met the decedent. Proxy reporting is likely to result in undercounts for marginalized populations: evaluations of such proxy reporting of racial and ethnic identity on death records have found misclassification rates of more than 40 percent for American Indians and Alaska Natives (National Center for Health Statistics, 2016), which has limited the use of these data for research and public health surveillance for this population.
Similar concerns about proxy reporting for sexual orientation and gender identity data have prompted some to suggest that a system that links death certificates with electronic health records would be preferable to better reflect an adult decedent’s self-identification (Mays and Cochran, 2019). Other researchers are exploring the feasibility of using proxy reports to collect sexual orientation and gender identity data for death certificates (Haas et al., 2019); California recently passed legislation to initiate a 3-year pilot study of collecting both sexual orientation and gender identity data on death certificates (Bajko, 2021; California Legislative Assembly, 2021). Such data collection would enable much-needed research on mortality disparities faced by sexual and gender minorities, but additional pilots in more jurisdictions would be needed to demonstrate widespread feasibility.
Data collection in the context of vital statistics is further complicated by wide variations in jurisdictional control over both the collection and associated statistical standards. The National Center for Health Statistics (NCHS) publishes federal guidelines for data collection on birth, marriage, and death certificates.4 However, the actual design of certificates is determined at the state level and implemented variously in hospitals, funeral homes, and local government offices (see, e.g., National Research Council, 2009; Hahn et al., 2002). These data are reported electronically by state administrative agencies to NCHS, which then produces standardized vital statistics data for the United States.
The number and variety of jurisdictions for the same type of vital statistics record creates a patchwork of practices across the United States. At least 14 U.S. states have also begun to allow nonbinary designations on birth certificates,5 which is currently implemented through petitions to change the original birth certificate designation, either by an adult or by the parents of a newborn child. Only “female” and “male” are currently available for designation at birth on the standard U.S. birth certificate, and NCHS data record standards do not currently recognize a third category, such as nonbinary, on birth certificate data when they are transferred electronically to the federal government. Moreover, because the United States does not have a population registry system, when an original birth certificate is modified to reflect an individual’s current gender, this information is not transferred to the federal government or reflected in national vital statistics data.
Policies and practices for collection of sex and gender data related to legal identification are also changing rapidly but unevenly across the United States. For example, the federal government recently announced that it is now accepting self-identified “female” and “male” designations for all U.S. passport applications (U.S. Department of State, 2021).6 The U.S. Department of State subsequently announced that the first passport with a nonbinary designation had been issued (Reuters, 2021) with plans to expand availability once the relevant changes can be implemented in its data systems.7 At least 18 states allow nonbinary designations on driver’s licenses or state IDs (as of October 2021), though they vary in whether a change can be obtained through self-identification alone or whether medical documentation or other legal requirements must also be met.8 These policy changes have implications beyond the context of vital statistics and legal identification: broader recognition of nonbinary designations in administrative settings could also raise awareness of gender identity terminology and prompt broader acceptance of this language.
4 Although each state determines how its vital statistics data will be collected, state-level data are reported to and aggregated at the national level by the National Center for Health Statistics (NCHS). To improve comparability across states, NCHS publishes standardized birth certificate (https://www.cdc.gov/nchs/data/dvs/birth11-03final-acc.pdf) and death certificate (https://www.cdc.gov/nchs/data/dvs/DEATH11-03final-ACC.pdf) forms, and a standard marriage license (https://www.cdc.gov/nchs/data/misc/hb_marr.pdf) that individual states can choose to adopt.
5 The Movement Advancement Project maintains information on state identity document laws and policies: see https://www.lgbtmap.org/equality-maps/identity_document_laws.
6 Prior to this change, “female” and “male” designations required external attestation from a recognized source, such as a birth certificate or documentation from a medical professional, and could not be based solely on a self-report.
7 As of January 2022, no timeline for the implementation had been announced.
Similar policy changes are occurring around the world, with at least a dozen countries implementing nonbinary designations on passports and other legal identification documents or beginning the process of removing gender from their identity documents entirely (BusinessTech Staff, 2021; González Cabrera, 2021; Holzer, 2018). As these changes are made, it is important for policy makers and data systems administrators to recognize that barriers to changing these legal sex and gender markers—and inconsistency across documents—can restrict the ability of transgender people to travel and vote and may subject them to harassment and discrimination if the accuracy of their identity documents is called into question (Fielding, 2020; Quinan and Bresser, 2020). For immigrants to the United States who are recognized as nonbinary in their country of origin, navigating the complex patchwork of identification practices across jurisdictions that may inconsistently recognize their gender can affect their ability to conduct the business of their everyday lives.
In summary, unlike data collected in surveys and research settings, the collection and coding of data in vital statistics and legal identification settings are constrained by regulatory and legal policies and requirements. Furthermore, most of the data collected are linked to specific individuals for the purposes of establishing identity, and thus, reported demographic characteristics could be used to facilitate segregation, discrimination, and violence against individuals. In these settings, information on sexual orientation, gender identity, transgender experience and identity, and intersex/DSD status is generally not collected, because it is not needed for purposes of identification. At the same time, data on sex or gender are often collected without a clearly designated purpose and without clarity regarding which of the two constructs is of interest. The collection of sex and gender data on legal identification documents can have negative repercussions for people whose recorded sex or gender on their documents does not match their current gender, which underscores the need to ensure that both the collection and reporting of this information are done with a clear purpose that outweighs the potential harms.
The introduction of sexual orientation, gender identity, transgender experience, and intersex/DSD status to vital statistics data could potentially be done at low risk to individuals when information that is collected for research purposes can be clearly separated from data that are used for legal identification purposes. Although the need for proxy reporting of identity on death records could affect the quality of these reports, as noted above, there are preliminary efforts to assess whether measures of sexual orientation and gender identity can be feasibly collected for research purposes. If such efforts demonstrate this can be done, it will provide important data for population enumeration and for monitoring discrimination against LGBTQI+ people. Meeting those needs is likely to require national standards for data collection on sex, gender identity, and sexual orientation.
8 Several states have also attempted to explicitly bar such changes. The Movement Advancement Project provides up-to-date information on state policies: see https://www.lgbtmap.org/equality-maps/identity_document_laws.
Program and Personnel Administration
Program and personnel administration includes data systems that facilitate the functioning of many systems: employment; schools and other educational institutions; child welfare departments; the criminal justice system; and federal- and state-funded programs providing social and human services related to health, insurance coverage, housing, employment, credit and other economic resources, and nutrition. Administrators in these systems and programs collect data on the people they encounter for multiple reasons: to maintain records and ensure individuals are receiving appropriate services; to describe the populations of people needing and using services, including demographic characteristics that may relate to disparities in access, quality, or outcomes; to determine access or assignment to “sex-segregated” facilities or programs, such as bathrooms, prisons, detention centers, locker rooms, sports teams, and sex-segregated schools; and to determine eligibility for funding and programs.
Measuring sex or gender, sexual orientation, and transgender experience is important in these contexts because of documented disparities, discrimination, and barriers in access to services in all of them (NASEM, 2018, 2020). In addition to well-documented discrimination and disparities among women, both cisgender and transgender, across many of these areas (Wilson et al., 2021), there are demonstrated disparities for people identified as lesbian, gay, bisexual, transgender, and questioning in incarceration (Wilson et al., 2017; Meyer et al., 2016; Marksamer and Tobin, 2014), housing (Wilson, O’Neill, and Vasquez, 2021; O’Neill, Wilson, and Herman, 2020; Romero, Goldberg, and Vasquez, 2020; Wilson et al., 2020), child welfare (Irvine and Canfield, 2016; Wilson and Kastanis, 2015), and education (Aragon et al., 2014). Some studies show disparities for sexual minority populations as defined by same-sex sexual behavior in prisons and jails (Zaller et al., 2020; Brinkley-Rubinstein et al., 2019; Harawa et al., 2018) and for gender nonconforming populations in relation to food insecurity and in such settings as foster care, prisons and jails, homeless shelters, and health care (Russomanno and Jabson Tree, 2020; Ecker, Aubry, and Sylvestre, 2019; Eisenberg et al., 2019; Glick et al., 2019; Lagos, 2018; Streed, McCarthy, and Haas, 2018; Gonzales and Henning-Smith, 2017; Irvine and Canfield, 2016; Wilson et al., 2014).
Due to the wide range of contexts that are covered under program and personnel administration, it was not possible for the committee to evaluate the full range of data collection practices that are currently in use in all of them. Even in the same context, data collection practices can vary widely. For example, although the Equal Employment Opportunity Commission requires all private-sector employers with 100 or more employees to report specific demographic data describing their workforce by sex, employers have flexibility in designing their data collection tools, and there is considerable variation in whether their data collection fields and internal coding explicitly reference sex, gender, some combination of the two, or neither. For applicants, some employers collect data on gender, transgender, or sexual orientation identities.9 These practices are consistent with Supreme Court rulings, which have found that the legal prohibition of discrimination based on sex extends more broadly to protections against discrimination based on sexual orientation and gender identity.10 To our knowledge, information on intersex status has not been collected in any administrative setting.
As is the case for vital statistics and legal identification data, an important feature of these administrative data is the ability to link this information to a specific individual. Although these data are often collected to monitor and measure disparities in treatment, this linkage to identifiable individuals can also contribute to segregation, discrimination, and harassment of individuals. Thus, collection of information on sex, gender identity, transgender experience and identity, sexual orientation, and intersex/DSD status in this setting could put individuals at risk if their data are disclosed or misused. Combined with the well-documented disparities faced by cisgender women and LGBTQI+ people, this risk underscores the importance of collecting only the minimum data that are necessary to meet specific administrative goals and of ensuring that protections are in place that restrict the use of these data to the furthering of those goals. For example, in some settings, such as in employment records or applications for social or business services, asking about an aspect of sex, such as sex assigned at birth, can be considered invasive or inappropriate by transgender people who do not wish to disclose this information to an employer, business contact, or social services coordinator. In other settings, however, asking about a specific definition of sex, such as sex assigned at birth, may be important: these settings include programs and residential facilities that allow for assignment based on gender identity and yet are required to provide or connect people to health care, such as detention centers and child welfare case management. In these contexts, sex assigned at birth can serve as an imperfect but necessary proxy for specific health care needs.
9 For example, the Biden administration employment application form asks applicants to report their gender, transgender identity, and sexual orientation: see https://www.whitehouse.gov/get-involved/join-us/.
In addition to ensuring only necessary data are collected, the possibility of disclosure can be minimized in administrative settings by enacting data protections that restrict data access and making disclosure voluntary for respondents. For example, in employment-related contexts, there are legal restrictions on when and how data on protected characteristics, such as age, gender, race, and ethnicity, may be collected. It is illegal for employers to ask about this information on employment applications because it can be used to facilitate discrimination in hiring. However, the collection of this information about employees allows employers to monitor their hiring practices and identify potential discriminatory behaviors. It also facilitates mandatory reporting on employee characteristics to the federal government for large employers. For this reason, many employers ask applicants to voluntarily complete a form that asks for information about protected characteristics, which the employer then keeps separated from job application materials.
There is another significant form of data collection in many administrative systems that is consistently needed and likely never used by people outside the system itself: case management notes. On the one hand, a case management file is an opportunity to ask more detailed questions, provide space for personalized labels, and add flexibility to document shifts in identity over time, and thus it may provide a rich source of information both about individual identities and service needs and about population trends more broadly. On the other hand, this level of detail also makes case management files difficult to use outside of the specific purpose for which these data are collected because they require either significant staff time to manually review and extract data or the application of technological approaches, such as natural language processing, that are not currently widely used.
Data collected in case management files may be directly collected by staff from individuals, and they are often used by staff to inform interpersonal interactions and guide the provision of services. As in other administrative settings, this access can leave individuals vulnerable to mistreatment, and some respondents may prefer not to disclose this information due to fear of mistreatment or loss of services. For this reason, data collection and use of case management files requires high levels of competence among staff when they ask about or discuss sexuality and gender, particularly when sexual orientation and gender identity questions are open ended. It remains unclear whether any groups have tested the efficacy of proposed
In administrative settings, data collection often serves purposes that require data users to be able to link data to a specific individual. This ability to link data heightens the risk of disclosure of individual information and of mistreatment of vulnerable populations through segregation, harassment, discrimination, and violence. LGBTQI+ populations are at increased risk of disparate treatment across a wide range of administrative contexts. Although the collection of data on sex, gender identity, and sexual orientation can facilitate mistreatment, these data are also necessary to document its occurrence, as well as design and implement policies and procedures to counter it. For this reason, it is important that data collection in administrative settings serve a clearly defined purpose, be limited to data that are needed to support that purpose, and minimize the likelihood of data disclosure or misuse.
Clinical settings include a wide variety of contexts in which sex, gender, and sexual orientation are critical for health and well-being, at both individual and population levels: they include health surveys, public health surveillance, clinical trials, and medical records. In health surveys and public health surveillance, these data are critical for identifying and addressing disparities between groups on health-related outcomes and understanding the social determinants of health. In clinical trials and other biomedical research, these data can help ensure that research questions and findings apply across the diversity of natural population variation in sex, gender, and sexual orientation. In medical care settings, collecting these data is important for building trusting relationships between providers and patients, promoting culturally appropriate care (Bi, Cook, and Chin, 2021), identifying and tracking health conditions and risk factors at both individual and population levels (Sell and Krims, 2021), improving the quality and safety of health care systems (Bonvicini, 2017), and facilitating the processing of administrative functions, such as billing.
Stratifying clinical performance data by social risk category (including not only sexual orientation, gender identity, and intersex status, but also such factors as race, ethnicity, and socioeconomic status) is a foundational step for improving the quality of care and advancing health equity for marginalized populations (Chin, 2021, 2020). Increasingly, public and private payers are reporting stratified clinical performance data and linking the results to financial rewards or penalties, and the Centers for Medicare & Medicaid Services’ Innovation Center (2021) lists stratified performance data as a pillar in its strategy to advance health equity. Thus, collecting data on sex, gender identity, and sexual orientation in medical settings is important for quality improvement and the advancement of health equity.
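As a minimal sketch of what stratification involves, the following Python example computes a binary quality measure separately for each reported category. The helper `stratified_rates`, the field names `gender_identity` and `screened`, and all of the records are hypothetical and invented for illustration; they are not drawn from any payer's reporting specification.

```python
from collections import defaultdict


def stratified_rates(records, stratum_key, outcome_key):
    """Return {stratum: (numerator, denominator, rate)} for a binary measure."""
    counts = defaultdict(lambda: [0, 0])  # stratum -> [numerator, denominator]
    for rec in records:
        # Missing or empty responses form their own stratum rather than
        # being dropped or folded into another category.
        stratum = rec.get(stratum_key) or "not reported"
        counts[stratum][1] += 1
        if rec[outcome_key]:
            counts[stratum][0] += 1
    return {s: (num, den, num / den) for s, (num, den) in counts.items()}


# Synthetic records, invented purely for illustration.
records = [
    {"gender_identity": "woman", "screened": True},
    {"gender_identity": "woman", "screened": True},
    {"gender_identity": "man", "screened": True},
    {"gender_identity": "man", "screened": False},
    {"gender_identity": "nonbinary", "screened": False},
    {"gender_identity": None, "screened": True},
]

rates = stratified_rates(records, "gender_identity", "screened")
for stratum, (num, den, rate) in sorted(rates.items()):
    print(f"{stratum}: {num}/{den} = {rate:.2f}")
```

Keeping "not reported" as a separate stratum is a design choice in this sketch: it makes the extent of nonresponse visible instead of hiding it in another category.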
In health contexts, each of these characteristics may be independently relevant: gender identity; sex assigned at birth; transgender experience and identity; intersex status; sex traits, including chromosomes, gonads, internal and external genitalia, secondary sex characteristics, and hormones; the components of sexual orientation (identity, behavior, and attraction); and gender pronouns. It is important not to use any one as a proxy for any other (see Chapters 1 and 2). Although sex assigned at birth may provide additional information beyond gender identity that is useful for improving care (Burgess et al., 2019), it is insufficient as a proxy measure for sex traits, because specific sex traits can have direct effects on the risk for or manifestation of a range of health conditions (Traglia et al., 2017), from acute abdominal pain (Kim and Kim, 2018) to genetic disorders (Traglia et al., 2017), cancers (Dorak and Karpuzoglu, 2012), infertility, and osteoporosis (Dy et al., 2011). Some laboratory values, such as hemoglobin concentration, and some clinical decision tools (e.g., for atherosclerotic cardiovascular disease) are interpreted within sexually bivariate ranges that reflect the effect of sex traits on physiologic processes. Anatomic inventories have been proposed as more specific strategies for collecting data regarding sex traits, though these questions may not be relevant or practical in all health care contexts (Grasso et al., 2021). Measures of gender identity and transgender experience can also be independently relevant for assessing patient risk. For example, transgender people demonstrate a higher prevalence of cardiovascular disease than cisgender people (Streed et al., 2021; Caceres et al., 2020).
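One way to picture the principle that no characteristic should stand in for another is a record structure that stores each one separately and never derives one from another. The Python sketch below is illustrative only: the class `PatientSogiRecord`, its field names, and the `DECLINED` value are assumptions made for this example and do not correspond to any electronic health record standard.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

DECLINED = "declined to answer"  # hypothetical marker for an opt-out response


@dataclass
class PatientSogiRecord:
    """Hypothetical record: every characteristic is its own optional field."""
    gender_identity: Optional[str] = None
    sex_assigned_at_birth: Optional[str] = None
    transgender_experience: Optional[str] = None
    intersex_status: Optional[str] = None
    pronouns: Optional[str] = None
    sexual_orientation_identity: Optional[str] = None
    sexual_behavior: Optional[str] = None
    sexual_attraction: Optional[str] = None
    # Anatomic inventory: organ name -> present (True) or absent (False).
    anatomic_inventory: Dict[str, bool] = field(default_factory=dict)

    def has_organ(self, organ: str) -> Optional[bool]:
        """Return the recorded value, or None if unknown.

        The answer is never inferred from sex assigned at birth or
        gender identity.
        """
        return self.anatomic_inventory.get(organ)


# Two illustrative records with identical demographic fields.
with_inventory = PatientSogiRecord(
    gender_identity="man",
    sex_assigned_at_birth="female",
    sexual_orientation_identity=DECLINED,
    anatomic_inventory={"cervix": True},
)
without_inventory = PatientSogiRecord(
    gender_identity="man",
    sex_assigned_at_birth="female",
)

print(with_inventory.has_organ("cervix"))     # recorded, so a definite answer
print(without_inventory.has_organ("cervix"))  # not recorded, so unknown (None)
```

In this sketch an unrecorded sex trait stays unknown, which prompts a question to the patient rather than a guess based on another field.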
At the heart of effective patient care is a strong, trusting relationship between clinicians and patients that facilitates clear communication and shared decision making. Collection of data related to sexual orientation, gender identity, transgender experience and identity, and intersex/DSD status by health care providers is critical to fostering that trust and providing care that is respectful and culturally appropriate (Bi, Cook, and Chin, 2021; Cook, Gunter, and Lopez, 2017). This can only be achieved if health care professionals engage in reflection, empathy, and partnership with patients; understand the effects of exposure to marginalization and discrimination; recognize and reduce their personal biases (Vela et al., 2022); and are sensitive with terminology and language, such as using appropriate pronouns and avoiding invasive questions about identity when they are not relevant for providing care (Suen et al., 2022; Knutson et al., 2016).
As in administrative settings, however, information collected in clinical settings is linked to a specific identifiable individual and informs interpersonal interactions, so the collection of these data can also leave individuals vulnerable to mistreatment. Too often, implicit and explicit bias from health care professionals and discrimination by health care delivery organizations harm LGBTQI+ people (National Public Radio, Robert Wood Johnson Foundation, and Harvard T.H. Chan School of Public Health, 2017; Peek et al., 2016). Robust nondiscrimination policies, training of health care providers, and other structural changes to health care delivery organizations to promote better care for marginalized populations have become increasingly recognized as essential to the provision of high-quality care (Bi, Cook, and Chin, 2021; Cook, Gunter, and Lopez, 2017; DeMeester et al., 2016; U.S. Department of Health and Human Services, 2013).
The collection and use of data on sex traits, sexual orientation, gender identity, and intersex status are challenging areas for health care delivery organizations, health plans, and payers. These data have to fill a complex set of needs: measuring disparities to improve population health, informing health care and health services research, enabling respectful patient-provider interactions, and identifying sex-trait-related differences to provide appropriate health care to individuals. These needs create many points of access to the data in these systems. Although health data are protected from unauthorized disclosure by the Health Insurance Portability and Accountability Act (HIPAA), it remains crucial for organizations to have policies in place that clearly establish procedures and conditions under which authorized access to the data is granted to those within health care systems in order to minimize the possibility that providing the data can result in an individual’s mistreatment.
When data are collected in clinical settings, it is important to consider which data elements are needed for patient screening and population health purposes (i.e., demographic analysis) and which are needed for specific clinical purposes (e.g., Pap tests are indicated only for people with a cervix). It is then necessary to develop privacy protections around the disclosure and sharing of these data both in and outside of the clinical context. Unlike population surveys, clinical settings provide many points of contact in which information can be collected, which necessitates the development of workflows and organizational policies that identify when, how, and by whom data are collected to ensure that patient privacy is adequately protected (Antonio et al., 2022).
Data in clinical settings are most commonly collected and accessed through electronic health record systems, which have utility not only for clinical care, but also for research and public health purposes. These systems have different interfaces for collecting data on sex, gender identity, transgender experience and identity, intersex status, and sexual orientation. The underlying terminology involves international code sets, such as Systematized Nomenclature of Medicine and Health Level 7 International (HL7), that affect how these data are collected and transmitted among systems. The U.S. Office of the National Coordinator for Health Information Technology has identified various terms that can be used to capture sex assigned at birth, gender identity, and sexual orientation in electronic health records, but little work has been done on how to measure intersex status as a demographic measure.
Although electronic health records are an important source of these data, administrative decisions do not depend entirely on the data that have been collected by the health care provider. Regardless of what information is entered into the patient’s medical record, health insurers may make their decision to cover some procedures on the basis of the sex that is noted on the patient’s insurance policy. This sex designation may or may not be the same as the patient’s sex assigned at birth. In the electronic health record systems used by many institutions in health care, a common additional data element is “administrative sex/gender,” which refers to the designation of people as male, female, or another gender for such activities as hospital room assignments and insurance billing. This data element cannot be readily mapped onto self-identified sex or gender and cannot be considered a demographic measure, but it nevertheless affects the treatment individuals receive in a clinical setting.
The panel was tasked with making recommendations on measures of sex, sexual orientation, and gender identity with attention to how these recommendations can be applied differently in three settings: surveys and research studies, administrative settings, and clinical settings. We found that the most relevant factors that distinguished these settings were the use of the data, the identifiability of respondents, and the risk of data disclosure. In considering the collection of data on sex, gender identity, and sexual orientation, it is important to recognize that LGBTQI+ people are often subject to mistreatment, segregation, harassment, discrimination, and violence; consequently, reporting this information may pose risks to respondents in some situations. Because of this potential risk, we strongly advise that respondents always be able to opt out of providing this information, particularly in contexts where their responses can be linked to personally identifiable information and where the risk of disclosure is high. Even when individuals are not at risk of being identified, such as when data are made publicly available in aggregated form, there is the potential for these data to be misused or misinterpreted to justify harmful treatment or policies. Thus, it is important to weigh the need for and benefits of collecting these data against the risk of harm that doing so may pose to respondents.
The three broad settings differ from each other in important ways that can affect the collection of sex, gender identity, and sexual orientation data. The first—and most important—factor is the potential for data disclosure. In considering this factor, surveys and research settings can be distinguished from the other settings because the information is usually reported in aggregated form or with personally identifiable information removed, which means the possibility of disclosure is low. In contrast, information that is collected in administrative and clinical settings can generally be linked to a specific individual. When LGBTQI+ individuals can be identified as such, it increases the risk that they can be targeted as members of these communities and suffer harm.
In clinical settings, data privacy is protected through HIPAA, which imposes penalties for the disclosure of medical record data. However, these protections cannot prevent mistreatment by those within the health care system, so it is imperative that in this setting there are clear organizational policies and workflows in place to control the collection of, access to, and use of these data. Although such clear legal protections against disclosure do not exist in all administrative contexts, clearly defined plans to restrict unauthorized access to the data need to be in place before they are collected, particularly when such data are used to inform interpersonal interactions. When this information is collected and reported in the identification portion of vital statistics records or in other identification documents, it may be inappropriate and potentially harmful to collect data that enable the identification of sexual and gender minority populations.
Another way in which the three settings differ is the purpose that the data collection serves. Although sex, gender identity, and sexual orientation data can be used to document group-based disparities in treatment and outcomes in all three data collection settings, data collection also serves a unique purpose in each setting that informs the specific measures of sex, gender identity, and sexual orientation that are collected:
- When data collection is conducted solely for the purposes of establishing identity, measures of gender identity, sexual orientation, transgender experience or identity, or intersex/DSD status are not needed, and their collection could facilitate segregation, harassment, and discrimination.
- When data are collected to improve interpersonal interaction and communication between case managers or health care personnel and service recipients or to provide appropriate services and care, measures of gender identity, sexual orientation identity or behavior, and transgender experience or identity may be relevant. In some circumstances, information on sex as a biological variable may also be needed as an imperfect proxy for sex traits in order to establish need or qualification for specific programs and services.
- Data that are collected to ensure that an individual receives appropriate health care services require the inclusion of detailed measures of sex, gender identity, and sexual orientation, including information about specific sex traits, intersex/DSD status, sex assigned at birth, gender identity and pronouns, transgender experience and identity, and sexual orientation identity, attraction, and behavior.
- When data are collected to enumerate populations and conduct research that elucidates the structural mechanisms through which population-based disparities are created and could be addressed, measures that can identify the relevant sexual and gender minority populations, such as sexual orientation identity, gender identity, transgender experience, and intersex status, are the most relevant.
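The purpose-specific guidance in the list above can be sketched as a simple lookup that limits collection to the measures relevant for a stated purpose. The purpose labels, field names, and the helper `measures_to_collect` in this Python example are hypothetical simplifications made for illustration; the mapping paraphrases the list and is not a normative specification.

```python
# Hypothetical purpose labels mapped to the measures the list above treats
# as relevant. Only the measures discussed in the list are covered here.
RELEVANT_MEASURES = {
    # Establishing identity: none of these measures are needed.
    "legal_identification": frozenset(),
    # Interpersonal interaction and service provision. Sex may also be
    # needed in some circumstances; that case is omitted from this sketch.
    "service_interaction": frozenset({
        "gender_identity",
        "sexual_orientation_identity",
        "sexual_behavior",
        "transgender_experience",
    }),
    # Individual health care: the most detailed set of measures.
    "health_care": frozenset({
        "sex_traits",
        "intersex_status",
        "sex_assigned_at_birth",
        "gender_identity",
        "pronouns",
        "transgender_experience",
        "sexual_orientation_identity",
        "sexual_attraction",
        "sexual_behavior",
    }),
    # Population enumeration and research on disparities.
    "population_research": frozenset({
        "sexual_orientation_identity",
        "gender_identity",
        "transgender_experience",
        "intersex_status",
    }),
}


def measures_to_collect(purpose, candidate_fields):
    """Keep only the candidate fields that are relevant for the purpose."""
    relevant = RELEVANT_MEASURES[purpose]
    return [name for name in candidate_fields if name in relevant]


candidates = ["gender_identity", "sexual_orientation_identity",
              "sex_traits", "pronouns"]
print(measures_to_collect("legal_identification", candidates))
print(measures_to_collect("population_research", candidates))
print(measures_to_collect("health_care", candidates))
```

Making the purpose an explicit argument means a form cannot request a measure without someone first stating why it is being collected.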
Even when collecting data on sex, gender identity, and sexual orientation is relevant in a specific context, data collection efforts need to balance the benefits of the data with the risks associated with unauthorized data disclosure and the potential misuse of data by those with authorized access. Protections need to be in place to minimize the risk to individuals of providing this information, particularly when it can be linked to specific individuals. When possible, data should be deidentified and aggregated. These protections serve not only to protect sexual and gender minorities, but also to ensure the collection of reliable data that accurately reflect the experiences of these populations.
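As a rough illustration of deidentification and aggregation, the Python sketch below reduces individual records to category counts and withholds counts for small groups. Small-cell suppression and the threshold of 5 are assumptions introduced for this example, not recommendations from the text, and the function name, field names, and records are hypothetical.

```python
from collections import Counter


def aggregate_with_suppression(records, key, min_cell_size=5):
    """Count records per category and suppress small cells.

    Only the value stored under `key` is read, so direct identifiers such
    as names never reach the output. Counts below `min_cell_size` are
    replaced with None (suppressed). The default threshold is an assumption.
    """
    counts = Counter(rec.get(key) or "not reported" for rec in records)
    return {
        category: (n if n >= min_cell_size else None)
        for category, n in counts.items()
    }


# Synthetic records, invented purely for illustration.
records = (
    [{"name": f"person-{i}", "sexual_orientation_identity": "heterosexual"}
     for i in range(6)]
    + [{"name": f"person-{i}", "sexual_orientation_identity": "bisexual"}
       for i in range(6, 8)]
    + [{"name": "person-8", "sexual_orientation_identity": None}]
)

table = aggregate_with_suppression(records, "sexual_orientation_identity")
print(table)
```

A published total can still reveal a suppressed count by subtraction, so a real release would need additional safeguards beyond this sketch.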