Part of the panel’s charge was to develop principles and guidelines for evaluating measures of sex, gender identity, and sexual orientation and making modifications to these measures to tailor them for specific data collection circumstances and populations. The panel developed five guiding principles for data collection.
- People deserve to count and be counted (inclusiveness). A key purpose of data collection is to gather information that can help researchers, policy makers, service providers, and other stakeholders understand diverse populations and create policies, programs, and budgets that meet these populations’ needs. Both quantitative and qualitative data, regardless of how they are collected, reflect the identities and experiences of people and communities that deserve to be heard and respected. Everyone should be able to see themselves, and their identities, represented in surveys and other data collection instruments.
- Use precise terminology that reflects the constructs of interest (precision). Sex, gender, and sexual orientation are complex and multidimensional, and identifying the components of these constructs that are of interest and measuring them using appropriate terminology is critical for collecting high-quality data. Questions should clearly reflect which component(s) of sex, gender, and sexual orientation are being measured in order to maximize the reliability of the description of the U.S. population, and one construct should not be used as a proxy for another.
- Respect identity and autonomy (autonomy). Questions about dimensions of identity, by definition, are asking about a person’s sense of self. Data collection has to allow respondents to self-identify whenever possible, and any proxy reporting should reflect what is known about how a person self-identifies. All data collection activities also have to require well-informed consent from potential respondents, with no penalty for those who opt out of sharing personal information about themselves or other household members. This principle encompasses data collection for legal documents intended for individual identification, and external authorization or attestation should not be required when someone reports, or wishes to change, their gender identity.
- Collect only necessary data (parsimony). Data collection is not an end unto itself: data should only be gathered in pursuit of a specific and well-defined goal, such as documenting or understanding disparities and inequities between populations or meeting legal reporting requirements, and data that are not essential to achieve that goal should not be collected.
- Use data in a manner that benefits respondents and respects their privacy and confidentiality (privacy). After collection, aggregate data should be analyzed at the most granular level possible, and research findings should be shared with respondents and their communities to ensure that they benefit from the data they have shared. Throughout all the steps of analysis and dissemination, data on sex, gender, and sexual orientation, which may be sensitive and vulnerable to misuse, have to be analyzed, maintained, and shared only under rigorous privacy and confidentiality standards. Similarly, when data are collected within tribal nations, preapproved tribal research and data collection, analytic, and dissemination protocols need to be followed to ensure data integrity and community benefit and to ensure that rigorous privacy and confidentiality standards are upheld.
These principles establish criteria that can be used to assess the measures of sex, gender identity, and sexual orientation presented in this report. The panel focused on identifying measures of these concepts that would be appropriate for use in the general population. We recognize that they may not be adequate for use with specific subpopulations, such as within LGBTQI+ communities, and that these measures may need to be adapted or modified for use in those communities. These criteria can also be used when considering modifications to the recommended measures. They are in keeping with standard practices for ethical data collection in human subjects, such as those developed as evaluation criteria by the Office of Management and Budget (1997) for reviewers to use when considering revisions to federal measures of race and ethnicity.
The panel developed these principles and criteria at the outset of our task and then modified and refined them throughout our deliberative process to ensure that our recommendations adhere to them.
As noted in Chapter 1, data on sex and gender have often been conflated, though they are conceptually distinct and may differ from each other (Westbrook and Saperstein, 2015). For example, some surveys ask a single question, “Are you male or female?” that is sometimes referred to as a measure of sex (e.g., National Health Interview Survey), other times as a measure of gender (e.g., Pew Research Center, California Health Interview Survey), and sometimes variably referred to as both in the same survey (e.g., Health and Retirement Study, General Social Survey). Although the terms “male” and “female” are conceptually sex-specific terminology, and the terms “woman,” “man,” “girl,” and “boy” are conceptually gender-specific terminology, in practice, clear differentiation between sex and gender response categories is not common in large population surveys and other data collections. Moreover, most people do not recognize a conceptual distinction between sex terminology and gender terminology (Hall et al., 2021; Schudson, Beischel, and van Anders, 2019; Pryzgoda and Chrisler, 2000), which is likely both a cause and a consequence of continued conceptual conflation and inconsistent use of terminology in data collection and everyday life (Stuhlsatz, Bracey, and Donovan, 2020). This conflation suggests that use of the appropriate terminology may not be sufficient to signal to respondents what they are being asked to report.
When the question stem wording does not specify the information being collected (i.e., sex or gender), respondents must decide which to report, and the resulting data will conflate these concepts. Data users will be unable to determine whether the data reflect sex or gender for any given respondent, which may lead to mismeasurement among those for whom sex and gender differ. In fact, this occurred with data from the U.S. Department of Veterans Affairs after a 2011 directive required medical providers to provide care based on gender identity (Burgess et al., 2019). Thus, it is important that efforts to collect sex and gender data are precise and make it clear to respondents which information is being collected and why.
Beyond the conflation of the concepts of sex and gender, surveys that use a single measure of sex or gender do not capture the underlying complexity and fluidity of these concepts. As noted in the panel’s conceptual definitions in Chapter 1, individuals may have sex traits or gender characteristics (identity, expression, social and cultural expectations) that correspond to different sex or gender categories, respectively. For such people, a single overall measure of sex (or gender) will serve as an imperfect proxy for some of these traits or characteristics. And because both sex traits and gender characteristics may also change over time, there are opportunities for misinterpretation and misuse when these concepts are treated as immutable. For example, when longitudinal surveys assume that sex and gender are stable over time, collect this information only during the first interview, and then carry it forward (e.g., the Panel Study of Income Dynamics, the Medical Expenditure Panel Survey, and the Health and Retirement Study), the data can be misinterpreted or misapplied in subsequent waves of data collection. Consideration of the experiences of two populations that fall under the LGBTQI+ umbrella, people with intersex traits and transgender people, highlights the problems with this approach.
Individuals with variations in sex traits, including sex chromosomes, sex hormones, reproductive anatomy, and secondary sex traits, with which a person is born or naturally develops are referred to as people with intersex traits. Biologically, intersex variations are highly heterogeneous, can involve any sex trait, and may not be apparent from an external examination. Those that result in obvious external anatomic diversity, sometimes called “ambiguous genitalia,” are relatively uncommon, accounting for about 1 in 2,000 (0.05%) births (Blackless et al., 2000). Some children may be identified as having intersex traits through prenatal testing. While experts report that this is occurring more commonly than previously, the frequency of this is unknown (Smet, Scott, and McLennan, 2020). Most people with intersex traits are born with genitals that appear to be male or female; consequently, the majority of people with intersex traits are not identified as having an intersex variation until later in life, often in adolescence or adulthood. Some people with intersex traits may go undiagnosed entirely, and most children born with any intersex trait are assigned a binary sex at birth.
When a child is born with genital differences, the process of assigning sex at birth is highly complex. Best practices recommend that a team of medical, surgical, and mental health experts work together with the child’s family to recommend a binary sex assignment (Finlayson, 2021). Clinicians consider available research on gender identity outcomes along with the child’s anatomy, sex chromosomes, hormone exposure, and likely puberty, as well as the family’s individual culture and values. For some children, this process may involve genetic testing and exploratory surgery over the course of months, during which many parents experience high levels of stress and uncertainty. The end goal is to recommend a sex assignment that reflects the gender with which the child is most likely to identify, with the understanding that this may shift over time. In fact, evidence suggests that people with intersex traits are far less likely to have cisgender experiences than people without intersex traits.1 For example, one systematic review and meta-analysis found that the overall rate of gender dysphoria2 among persons with intersex variations was 15 percent, with variability among specific conditions (Babu and Shah, 2021). This is markedly higher than is found in the general population, in which even the highest estimates of prevalence using the broadest definitions of gender dysphoria range from 0.5 to 1.3 percent (Zucker, 2017).
Thus, experiences within the intersex/DSD (differences in sex development) population highlight the complexity of defining sex, as well as how differences between sex traits can emerge over time. Similar complexities arise for transgender people, whose gender is different from the sex they were assigned at birth. Some transgender people pursue medical gender affirmation, which may change some sex traits, making a single measure of sex a poor proxy for other sex traits or for gender.
Like sex, gender is a multidimensional concept, and therefore single measures are unlikely to capture its complexity. Conceptually, gender comprises identity, expression, and social status and norms, and without explicit direction regarding the dimension on which they should base their response, respondents may report their gender on the basis of any of these dimensions, although these dimensions may differ and may be fluid across social contexts. In many ways, the measurement of gender remains in its nascent stages, with research proceeding primarily along the lines of developing a two-step measure that seeks to identify transgender populations by separately assessing sex assigned at birth and gender identity. A more limited line of research has focused on the development of measures of gender expression that broadly fall into two types: continuum measures of femininity and masculinity (e.g., Gender Identity in the U.S. Surveillance, 2014) and classification into categories such as androgynous, butch, femme, or gender nonconforming (Malatino and Stoltzfus-Brown, 2020).
1 There is considerable diversity in the intersex/DSD (difference of sex development) population on the point of whether intersex is an identity. One of the only population-based studies of intersex people (Rosenwohl-Mack et al., 2020) asked respondents to report their current gender identity. Respondents were allowed to select “all that apply”; more than 60 percent of respondents selected “Intersex” as their gender identity. Thus, it appears that many intersex people see intersex as a gender identity.
2 Gender dysphoria refers to “clinically significant distress or impairment related to a strong desire to be of another gender, which may include desire to change primary and/or secondary sex characteristics. Not all transgender or gender diverse people experience dysphoria” (American Psychiatric Association, 2013).
Although scholars have long stressed the need to distinguish between sex and gender in social and medical research (Annandale and Hunt, 1990; Bird and Rieker, 1999), there is growing recognition of the potential harms that can arise from mismeasurement (e.g., when gender identity is reported as sex or vice versa) or misuse (e.g., using binary sex as a universal proxy for sex traits). These concerns are particularly acute in health care, where clinical decisions are sometimes tied to sex-related differences and where gender identity affects social interactions between health care professionals and patients in ways that can affect the quality of care (Morrison, Dinno, and Salmon, 2021; Clayton and Tannenbaum, 2016; Heidari et al., 2016).
The European Association of Science Editors convened in 2016 to discuss how to ensure better representation of sex and gender in medical and social research. They issued the Sex and Gender Equity in Research (SAGER) guidelines for more systematic collecting and reporting of sex and gender in research (Heidari et al., 2016). These guidelines recommend that researchers report and justify the use of both sex and gender data, as well as detail the implications of each in research discussions whenever possible. Acknowledging the problems with the measurement of sex in medical records, the guidelines further recommend that medical records that include a measure of sex also include information documenting how that information was collected (e.g., through patient self-report or genetic testing) in order to better assess its suitability and reliability for specific uses in medical contexts. The SAGER guidelines have been endorsed by groups such as the U.K. Committee on Publication Ethics and have been influential in changing publication standards for research journals internationally and (more unevenly) in the United States (Hankivsky, Springer, and Hunting, 2018).
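The record-keeping recommendation described above can be sketched as a small data structure that stores a recorded sex value together with how it was obtained. This is a minimal illustration only: the field names, the `CollectionMethod` categories, and the `suitable_for` helper are hypothetical and are not part of the SAGER guidelines, and a real system would define suitability according to its own clinical or research purpose.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class CollectionMethod(Enum):
    """Hypothetical categories for how a sex value was obtained."""
    PATIENT_SELF_REPORT = "patient self-report"
    GENETIC_TESTING = "genetic testing"
    UNKNOWN = "unknown"


@dataclass(frozen=True)
class SexRecord:
    """A recorded sex value plus documentation of its provenance."""
    value: str                 # recorded value, e.g. "female"
    method: CollectionMethod   # how the value was collected
    collected_on: date         # when the value was collected


def suitable_for(record: SexRecord, accepted: set) -> bool:
    """Return True if the record's collection method is one the intended use accepts."""
    return record.method in accepted


record = SexRecord("female", CollectionMethod.PATIENT_SELF_REPORT, date(2021, 3, 1))
# Illustrative use that accepts only values documented by genetic testing.
ok = suitable_for(record, {CollectionMethod.GENETIC_TESTING})
```

Storing the collection method alongside the value lets a later user judge whether the record fits a given purpose instead of assuming that every recorded value was obtained in the same way.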
In recent years there has also been a movement among international statistical agencies to develop measures for collecting data on gender. In 2019, the Economic Commission for Europe Conference of European Statisticians issued an in-depth review of the measurement of gender identity. The review noted that, in English, sex and gender are conceptually separate dimensions even though they are often “used interchangeably in everyday life unless the distinction is made clear in the context” (United Nations Economic Commission for Europe, 2019, p. 3). The English-speaking countries of Australia, Canada, and New Zealand and the countries of the United Kingdom have all begun to revise their national data collection standards, and in some cases, their national census, to include measures of gender (Australian Bureau of Statistics, 2021, 2020; Office of National Statistics, 2021; Statistics Canada, 2021; Stats NZ, 2021). Each of these countries has moved towards standards that require their data collection agencies to collect data on gender by default.
In the United States, the National Institutes of Health (NIH) draws a similar conceptual distinction between sex and gender. Consistent with an Institute of Medicine (2001) report focused on exploring the biological contributions to human health, NIH policy requires that differences in health-related risks and outcomes related to sex as a biological variable be considered to strengthen the rigor of science (Clayton, 2018) and improve understanding of sex differences. In the guidance, NIH describes sex and gender (National Institutes of Health, 2015, pp. 1–2):
[Sex is] a biological variable defined by characteristics encoded in DNA, such as reproductive organs and other physiological and functional characteristics. Gender refers to social, cultural, and psychological traits linked to human males and females through social context. In most cases, the term “sex” should be used when referring to animals. Both sex and gender and their interactions can influence molecular and cellular processes, clinical characteristics, as well as health and disease outcomes.
CONCLUSION 1: Gender encompasses identity, expression, and social position. A person’s gender is associated with but cannot be reduced to either sex assigned at birth or specific sex traits. Therefore, data collection efforts should not conflate sex as a biological variable with gender or otherwise treat the respective concepts as interchangeable. In addition, in many contexts, including human subjects research and medical care, collection of data on gender is more relevant than collection of data on sex as a biological variable, particularly for the purposes of assessing inclusion and monitoring discrimination and other forms of disparate treatment.
Although the distinction between gender as a social construction and sex as a biological factor can seem clear on its face, in practice, aspects of gender shape most experiences in everyday life, from internalized psychological processes to structural constraints (e.g., through sexism and other forms of gender discrimination). It is difficult to disentangle the independent effects of sex and gender on other outcomes because of their combined biological and environmental or contextual influences. Gender-based social structures and expectations can influence behaviors and can either create or magnify differences that might otherwise appear to be based in biology due to correlations with sex as a biological variable; however, these processes can only be understood if measures of gender are also routinely collected by default.
RECOMMENDATION 1: The standard for the National Institutes of Health should be to collect data on gender and report it by default. Collection of data on sex as a biological variable should be limited to circumstances where information about sex traits is relevant, as in the provision of clinical preventive screening or for research investigating specific genetic, anatomical, or physiological processes and their connections to patterns of health and disease. In human populations, collection of data on sex as a biological variable should be accompanied by collection of data on gender.
The panel acknowledges that sex as a biological variable is often meaningful to measure in surveys, research studies, and clinical settings as it can affect the health and well-being of people or populations in terms of reproductive anatomy, biologic mechanisms linked to hormones, cell physiology, metabolism, and chromosomal configurations in biological systems. However, because these aspects of sex may differ from each other and do not exclusively determine gender, standard binary measures of sex are an inadequate proxy for the primary measurement of gender and sex traits, especially among sexual and gender diverse populations.
To address our statement of task, we attend to the constructs of sex and gender by focusing on the measurement of sex assigned at birth, gender identity, and intersex status in self-reported data collection efforts. Together, these measures allow for the identification of individuals for whom binary measures of sex serve as a poor proxy for sex traits, as well as those for whom sex and gender may be different. These measures do not represent the full complexity of either sex or gender, but they do improve on the measurement of gender by distinguishing gender identity from other dimensions of gender, as well as from sex assigned at birth. Providing measures of sex and gender that allow both intersex and transgender people to accurately represent themselves is an important step in aligning measurement practices with the diversity of human experience.
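As a minimal sketch of how the three self-reported measures named above might be combined in analysis, the Python below derives two flags from one response record. The response categories, the field names, the `CORRESPONDING_IDENTITY` mapping, and the `derive_flags` helper are hypothetical illustrations, not the panel's recommended question wording or response options. Unreported answers are kept as unknown and are not imputed.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical mapping used only for illustration: the gender identity
# response that conventionally corresponds to each sex assigned at birth.
CORRESPONDING_IDENTITY = {"female": "woman", "male": "man"}


@dataclass
class Response:
    """One respondent's self-reported answers; None means not reported."""
    sex_assigned_at_birth: Optional[str]  # e.g. "female", "male"
    gender_identity: Optional[str]        # e.g. "woman", "man", "nonbinary"
    intersex_status: Optional[bool]       # True, False, or None


def derive_flags(r: Response) -> dict:
    """Derive analytic flags without guessing at missing answers."""
    expected = CORRESPONDING_IDENTITY.get(r.sex_assigned_at_birth)
    if expected is None or r.gender_identity is None:
        # Unknown: an item was not reported or falls outside the mapping.
        differ = None
    else:
        differ = r.gender_identity != expected
    return {
        "sex_and_gender_differ": differ,
        # Binary sex may be a poor proxy for sex traits when this is True.
        "intersex_reported": r.intersex_status,
    }


example = Response("female", "man", False)
flags = derive_flags(example)
```

Keeping the three answers as separate fields, and deriving flags only at analysis time, avoids using one construct as a proxy for another and leaves nonresponse visible in the data.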