As noted in Part I of this report, existing measures tend to conflate the constructs of sex and gender, typically by asking a single question that attempts to reflect one or the other or perhaps some combination of the two. One of the principles guiding the panel’s recommendations is to provide questions that use precise terminology and ensure construct validity by measuring what they say they are measuring. Key to that effort is distinguishing measures of “gender” from measures of “sex.”
This chapter begins by discussing our use of relevant sex and gender terminology and establishing how the measures we focus on in this chapter relate to broader constructs. We then review existing approaches that incorporate varying measures of sex and gender and use differing strategies for enumerating transgender people. Finally, we offer our recommendations for measurement practices that include both cisgender and transgender people and conclude with recommendations for future research.
Measures of gender include questions about a person’s gender identity, which reflects their internal understanding of their own gender, as well as questions about gender expression, which is how a person expresses their gender to others. Questions about gender identity can be designed with categorical responses, asking whether people identify as men, women, or another gender, such as nonbinary, or they can be gradational, asking people to place themselves on scales of femininity and masculinity (Lindqvist et al., 2021; Magliozzi et al., 2016). The latter approach draws on a long-standing line of research in psychology that demonstrates femininity and masculinity should not be seen as polar “opposites” because people can also be low on both scales, high on both scales, or somewhere in between (Bem, 1974).
To date, gender scales have most commonly been used to measure gender (non)conformity in studies of health disparities (Hart et al., 2019) and to understand growing gender diversity among U.S. youth and young adults (Ho and Mussap, 2019; Johfre and Saperstein, 2019; Lowry et al., 2018; Wilson et al., 2017; Wylie et al., 2010). There also are recent efforts in survey research to combine traditional categorical approaches with gender scales to better represent diversity both within gender identity categories and among them (e.g., Alexander et al., 2021). However, such studies typically require multi-item modules that make them less feasible for use in general population surveys or administrative data systems. Furthermore, they are not intended to distinguish between transgender and cisgender people in order to provide representative estimates of the transgender population. Thus, in this chapter we focus our review of gender measures on the most promising measures of categorical gender identity that can identify transgender people and that have been used most frequently in general population assessments of adults.
Measures of sex can include self-reported items that reference a person’s sex as it was assigned on their original birth certificate or as it is currently represented on their legal documents. These classifications in government records are only rough categorical proxies for more detailed and often continuous measures of sex traits, including aspects of anatomy (such as internal organs or external genitalia), physiology (such as hormone milieu), or genetics (such as chromosomes). Studies of sex traits show that human variation is not fully captured by a male–female binary distinction (Montañez, 2017). However, until recently, these were the only designations offered on most U.S. identity documents, including passports and driver’s licenses (see Chapter 3).
Because of their ubiquity in many data systems, binary sex categories have often been used in general survey research and in administrative and health contexts to describe and explain differences that may have roots in biology, social norms, or some combination of the two.1 Direct measures of sex traits better represent specific biological mechanisms that can produce observed sex differences, but such measures are not commonly used, even in health research and clinical settings (see Chapter 3). We consider sex trait measures further in Chapter 7, in the context of promising approaches to enumerate intersex populations. In keeping with the panel’s recommendation to collect gender by default (see Chapter 2), which emphasizes the importance of measuring gender, we limit our review of sex measures to the role a measure of sex assigned at birth can play as part of a broader strategy of improved gender measurement and the enumeration of both cisgender and transgender people.
As noted in Part I, the absence of construct validity in most measures of sex and gender also contributes to using inconsistent terminology to describe binary distinctions between females and males (sex terms) or men and women (gender terms), in both research reports and everyday speech. Many of the measures we review in this chapter use a combination of sex and gender terminology in their question wording and answer options that continues to conflate the two constructs. For example, the sex terms of female and male frequently appear as responses to questions about both sex assigned at birth and gender identity. The practice of using sex terms in gender identity questions makes it challenging to maintain consistent terminology in our discussion; it also raises concerns about construct validity for these items. Given our focus in this chapter on improving gender measurement to include transgender and cisgender people, when not discussing a specific measure that uses sex terms, we use the gender terms, men and women, especially when discussing the conceptual underpinnings of different measures and the interpretation of resulting data.
Currently, there are two types of transgender-inclusive measures of gender: one-step measures that attempt to identify transgender people using a single question and two-step measures that include a broader measure of gender identity and try to enumerate transgender and cisgender people. Two-step measures consist of a two-question sequence (commonly asking for sex assigned at birth and current gender, though other variations exist) that is intended to be used as a pair. When cross-tabulated, these two-step measures provide counts of cisgender women and men, transgender women and men, and people who identify using terms outside of the gender binary, such as nonbinary, genderqueer, and, for some American Indian and Alaska Native (AIAN) people, Two-Spirit or their tribal or culturally and linguistically specific term.2 We first review the two-step measurement approach and contrast it with existing one-step measures.
2 Such terms represent culturally distinct “third” or nonbinary gender identities, such as fa’afafine in Samoan, Māhūwahine in Native Hawaiian, or Nádleehi in Navajo/Diné.
The two-step measurement approach started as a pair of questions, one about sex assigned at birth and the other about gender identity, used for screening purposes to identify transgender people in health research settings (Melendez et al., 2006; Kenagy, 2005). This design was quickly adopted for the purposes of public health research in LGBTQI+ community settings (e.g., Sausa et al., 2009), before being tested more broadly as a way to identify both cisgender and transgender people among U.S. adults (Reisner et al., 2014; Tate et al., 2013). A two-step measure was first tested for use in a general population survey as part of the California Health Interview Survey (Grant et al., 2015), and similar versions have since been included on several federally sponsored national surveys, including the Census Bureau’s high-profile Household Pulse Survey. The National Center for Health Statistics will also field a two-step measure on its flagship survey, the National Health Interview Survey, beginning in 2022. Table 6A-1 in the annex to this chapter provides information on two-step measures used in these and other surveys.
Although sex assigned at birth is an imperfect proxy for anatomical, genetic, and physiological sex traits, it has utility in health contexts—including survey research, clinical trials, public health surveillance, and medical settings—for purposes ranging from clinical decision support to exploring the role of sex traits in health status and the etiology of disease. In addition, asking for the sex assigned to someone at birth, instead of just a person’s “sex,” avoids problems inherent in assuming that sex is an absolute and static representation of sex traits by grounding the question in the experience of having been labeled with a sex, rather than identifying with it. As such, a two-step measure that includes sex assigned at birth is being collected for the National Institutes of Health’s major precision medicine research initiative, the All of Us program,3 as well as on national and local case report forms for surveillance of conditions such as HIV/AIDS4 and COVID-19.5 A two-step measure has also been incorporated in clinical data systems such as the U.S. Department of Veterans Affairs’ electronic medical record (EMR)6 and is reflected in standards for EMR terminology codified by the U.S. Office of the National Coordinator for Health Information Technology.7
5 For example, see https://www.dhhs.nh.gov/dphs/cdcs/covid19/covid19-reporting-form.pdf. The GenderSci Lab at Harvard University notes, however, that many jurisdictions still do not collect data inclusive of transgender and nonbinary identities in COVID-19 case reports: https://www.genderscilab.org/blog/unknown-covid19-gendersex-reporting.
In terms of construct validity, the two-step design is a clear improvement over previous “Are you male or female?” measures for several reasons. First, it clearly distinguishes between two key constructs, sex assigned at birth and gender identity, and measures each of them directly. Second, it allows for enumeration of both cisgender and transgender people by recognizing that a person’s gender identity can either differ from or be the same as the sex they were assigned at birth. Third, it offers gender identity responses that acknowledge a range of potential identities rather than constraining everyone to identify with binary categories, in line with the diversity of gender identity and expression that is well documented across cultures (Devun, 2021; LaFleur, 2021; Thorne et al., 2019; Snorton, 2017; Pruden and Edmo, 2016; Stryker, 2008; Fieland et al., 2007; Walters et al., 2006; Meyerowitz, 2002).
The two-step measurement approach also was designed to reflect the broadest definition of the transgender population, which categorizes as “transgender” any person whose gender identity is different from their sex assigned at birth, regardless of whether they identify with the word “transgender.” This definition is often called “transgender experience” (e.g., Puckett et al., 2020) or “transgender history” (as in the Scottish census question, described below). Not everyone with transgender experience or history expressly identifies as “transgender”: they may identify simply as men or women, or they may describe their gender identity using terms outside of the man/woman binary, such as genderqueer, genderfluid, gender-nonconforming, nonbinary, agender, bigender, or Two-Spirit.
In one study of a two-step measure that offered the gender identity responses of “male,” “female,” or “transgender,” half of the respondents categorized as transgender were recorded as such because they selected a binary gender identity different from the sex that was assigned to them at birth; the other half chose the term “transgender” to describe their gender identity (Truman et al., 2019). Thus, people who explicitly endorse the term “transgender” to describe themselves are a subset of the larger group of people who have transgender experience. Most two-step designs account for this key distinction by measuring both sex assigned at birth and gender identity so that transgender people can be identified either directly by their gender identity response or indirectly by providing different responses for the two items.
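As a minimal sketch of how the two items combine, the following Python function cross-classifies one respondent's answers. It assumes the three gender identity responses described in the study above (“male,” “female,” and “transgender”) and binary responses for sex assigned at birth. The function name, the category labels, and the handling of missing or other responses are illustrative assumptions, not any survey's actual coding rules.

```python
from collections import Counter


def classify_two_step(sex_at_birth, gender_identity):
    """Cross-classify one respondent's answers to a two-step measure.

    Hypothetical coding scheme: both arguments are strings or None (missing).
    """
    sab = (sex_at_birth or "").strip().lower()
    gi = (gender_identity or "").strip().lower()

    # Identified directly: the respondent selected "transgender" as their identity.
    if gi == "transgender":
        return "transgender (identity)"

    if gi in ("male", "female"):
        if sab not in ("male", "female"):
            # Without sex assigned at birth, a binary identity cannot be
            # resolved as cisgender or transgender.
            return "unclassified"
        if gi == sab:
            return "cisgender man" if gi == "male" else "cisgender woman"
        # Identified indirectly: binary identity differs from sex assigned at birth.
        return "transgender man" if gi == "male" else "transgender woman"

    # Any other non-empty answer is kept as a residual category.
    if gi:
        return "other response"
    return "unclassified"


# Made-up responses for illustration only; not survey data.
responses = [
    ("female", "female"),
    ("male", "male"),
    ("female", "male"),
    ("male", "transgender"),
    (None, "female"),
]
counts = Counter(classify_two_step(sab, gi) for sab, gi in responses)
print(counts)
```

In this sketch, a respondent who skips the sex assigned at birth item and selects a binary identity is left unclassified rather than assumed to be cisgender.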
A variety of one-step measures have also been used to try to identify transgender people. A common one-step approach simply adds “transgender” as a response option to a binary sex or gender question, resulting in a measure such as, “Are you male, female, or transgender?”8 This type of measure substantially underperforms in identifying people with transgender experience, because many such respondents select female or male to describe themselves (Schilt and Bratter, 2015; Tate et al., 2013). One study conducted in a clinical setting found that the proportions of transgender men and women nearly doubled when a two-step measure was used rather than a single question that required respondents to choose whether to use the term “transgender” to identify themselves (Tordoff et al., 2019; for similar results among youth, see Kidd et al., 2021). A second variation on a one-step measure, such as one used by the Gallup Poll, includes “transgender” alongside “gay,” “lesbian,” and “bisexual” responses to attempt to count the collective LGBT population. Both approaches are problematic, for two reasons: (1) they fail to account for people with transgender experience who do not use the term “transgender” to describe themselves, and (2) they conflate transgender experience with either gender identity or sexual orientation, which are different constructs (Grant et al., 2015).
Another approach to identifying transgender people asks, “Do you consider yourself transgender?” with yes or no response options. A version of this question has been used on the Behavioral Risk Factor Surveillance System (BRFSS) since 2014 (Flores et al., 2016; Conron et al., 2012).9 If used as a stand-alone measure of the transgender population, however, questions like this may not count all men and women with transgender experience because, as noted above, not all people with transgender experience identify as transgender. The one-step approach also does not work well in some survey modes: Using a single item on an online general population survey—with or without providing respondents with a definition of transgender—results in a much higher estimate of people who identify as transgender than is found in other surveys with interviewer-assisted modes (Saperstein and Westbrook, 2021). The “Do you consider yourself transgender?” question also had low test-retest reliability relative to subsequent responses for the same individuals on both a two-step measure and another similar one-step transgender identity question (Saperstein and Westbrook, 2021). Together, these findings suggest a higher rate of “false positives” for this question format in an online self-completion context. Due to the low prevalence of transgender people in the population, even a small number of errors among cisgender people would have an outsized effect on population estimates for transgender people.
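The arithmetic behind this last point can be sketched as follows. The prevalence and error rates used here are hypothetical values chosen for illustration, not estimates from any survey, and the function name is an assumption.

```python
def expected_observed_share(true_prevalence, false_positive_rate,
                            false_negative_rate=0.0):
    """Expected share of respondents recorded as transgender under misclassification."""
    # Transgender respondents who are correctly recorded as transgender.
    true_positives = true_prevalence * (1.0 - false_negative_rate)
    # Cisgender respondents who are recorded as transgender in error.
    false_positives = (1.0 - true_prevalence) * false_positive_rate
    return true_positives + false_positives


# Hypothetical inputs: 0.5 percent true prevalence and 0.5 percent of
# cisgender respondents answering "yes" in error.
true_prevalence = 0.005
observed = expected_observed_share(true_prevalence, false_positive_rate=0.005)
inflation = observed / true_prevalence

print(f"observed share: {observed:.4%}")
print(f"inflation factor: {inflation:.2f}")
```

Under these assumed values, an error rate of half a percent among cisgender respondents would roughly double the estimated size of the transgender population, because cisgender respondents make up nearly the entire sample.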
9 The BRFSS measure includes a follow-up question for people who answer yes to allow for more detailed responses of transgender male-to-female; female-to-male; and transgender, gender nonconforming. Since 2019, it has also included a question about sex assigned at birth as an optional item. See https://www.cdc.gov/brfss/data_documentation/pdf/BRFSS-SOGI-StatBrief-508.pdf.
Our review of existing one-step approaches underscores two major challenges for enumerating transgender populations: (1) devising a measure that is inclusive not only of people who identify explicitly as transgender but also people with transgender experience; and (2) avoiding false positives from cisgender respondents who do not understand the question. The two-step design minimizes both types of measurement error by using question wording and answer options that make it easier for cisgender people to provide appropriate responses and by accounting for both transgender identity and transgender experience.
Figure 6-1 illustrates this distinction by showing how a two-step approach provides a more comprehensive count of the broader universe of transgender people than existing single-item alternatives. As discussed above, the population of people who identify as transgender either in response to a single question (such as “Are you male, female, or transgender?”) or on the gender identity item of the most common two-step measure is a subset of the people who would answer “yes” to the question “Do you consider yourself transgender?” Both sets of people who endorse a transgender identity on single-item measures are, in turn, a subset of the people who would be categorized as transgender based on their responses to a two-step measure that includes both the sex assigned at birth and gender identity questions. Although this two-step measure can also miss some transgender people if, for example, they prefer not to answer the question about sex assigned at birth and do not identify explicitly as transgender on the second step, the two-step approach provides better conceptual and empirical fit to the task of enumerating transgender people than one-step measures that are currently in use.
Importantly, the two-step approach also enumerates cisgender people, including separate counts for cisgender men and cisgender women. Even in the absence of the type of false positives described above, a stand-alone item, such as “Do you consider yourself transgender?,” does not provide a count of cisgender people because the people who answer “no” can be a combination of cisgender people and some people with transgender experience. The question “Are you male, female, or transgender?” also fails to count cisgender people because the male and female responses can be selected by both cisgender people and people with transgender experience. In essence, accounting for the difference between sex assigned at birth and gender identity is a key component to being able to measure gender for both cisgender and transgender people.
The panel acknowledges that space on forms and survey questionnaires, as well as respondents’ time, are not infinite and that, all else equal, a single-item measure would be preferable to reduce both costs and respondent burden. However, given the evidence, we cannot endorse existing single-item approaches to measuring gender or identifying transgender people.
Evidence of Two-Step Measure Performance in the United States
Over the past decade, the two-step measurement approach has been tested extensively and used in the United States among both transgender and cisgender people and English-speaking adults of all ages (Interagency Technical Working Group on Sexual Orientation and Gender Identity Items in the Household Pulse Survey, 2021; Saperstein and Westbrook, 2021; Suen et al., 2020; Burgess et al., 2019; Holzberg et al., 2019; Smith and Son, 2019; Truman et al., 2019; Haider et al., 2018; Federal Interagency Working Group, 2016a, 2016b; Lombardi and Banik, 2016; Deutsch and Buchholz, 2015; Grant et al., 2015; Reisner et al., 2015). Research to date has focused on the use of a two-step measure in general population surveys and in health contexts and has established that adults generally do not find questions about their sex assigned at birth or gender identity particularly sensitive or difficult to answer.
In terms of overall feasibility, a randomized multisite trial of adding sexual orientation and gender identity questions to patient intake forms in several Midwestern medical clinics found that 82 percent of the middle-aged and older adults surveyed endorsed the importance of collecting gender identity data, and just 3 percent expressed discomfort with the sexual orientation and gender identity questions they were asked; the same proportion expressed discomfort with other questions not about sexual orientation and gender identity (Rullo et al., 2018). Similarly, research finds considerably more positive than negative feedback from respondents who answered a two-step measure in online general population surveys (Medeiros et al., 2020). The two-step approach is designed primarily for self-completion, though behavior coding has also demonstrated its ease of use in interviewer-assisted telephone surveys (Jans et al., 2015). Previous research on the feasibility of the two-step measure for proxy reporting in the United States also suggests it will perform as well as reporting of other demographic characteristics in such contexts (Holzberg et al., 2019).
The two-step design performs as well as or better than other standard demographic items on measures of nonresponse and test-retest reliability. As noted above, a two-step measure has been implemented in several federally sponsored national surveys, including the National Crime Victimization Survey (starting in 2016), the General Social Survey (starting in 2018), and the Household Pulse Survey (beginning in August 2021). Strong evidence of this measure’s feasibility, across all such surveys, has been consistently low nonresponse rates, generally on the order of 1 percent per item. Table 6-1 shows the nonresponse rates and the prevalence rates for the transgender population in national samples of U.S. adults. For example, Truman et al. (2019) reported a combined item nonresponse of 1.3 percent for both sex assigned at birth and gender identity in the 2016 National Crime Victimization Survey (NCVS). Nonresponse rates in the General Social Survey (GSS) are lower for each item, and the combined nonresponse dropped from 0.9 to 0.3 percent between 2018 and 2020. Nonresponse rates in other large national studies with nonprobability samples are similar, with a 1.3 percent nonresponse rate for sex assigned at birth in the All of Us program, and a 1.6 percent nonresponse rate for gender identity in the Household Pulse Survey. These studies indicate that, as measured by item nonresponse, sex assigned at birth and gender identity are an order of magnitude less sensitive than other common questions, such as personal earnings and household income (Saperstein and Westbrook, 2021; Grant et al., 2015).
TABLE 6-1 Item Nonresponse Rates and Transgender Population Prevalence for Two-Step Measures
| |NCVS|2018 GSS|2020 GSS|Pulse|
|Item Nonresponse Percentage| | | | |
|Sex Assigned at Birth|–|0.9|0.3|–|
|Transgender Prevalence Rate| | | | |
|Transgender Identity Alone|–|0.1|0.3|0.3|
NOTES: Transgender experience responses include people who selected a binary gender that differed from their sex assigned at birth and people who selected a transgender identity regardless of their sex assigned at birth. Respondents who selected the residual gender identity option (e.g., “none of these”) are not included in the estimates shown because their specific identities either were not collected or not publicly available; they may or may not correspond to nonbinary identities. Those represent an additional 0.18% of responses in the National Crime Victimization Survey (NCVS), 0.07% of responses in the 2018 GSS, 0.39% for the 2020 GSS and 1.1% for Pulse. The Pulse Survey is a nonprobability sample and should not be interpreted as providing nationally representative population estimates. The Census Bureau imputes missing sex assigned at birth in Pulse data (see Jesdale 2021a). Because of this, nonresponse rates are not available for this survey and the transgender calculation removes cases with imputed sex assigned at birth which otherwise produces an outsized number of respondents with transgender experience (Conron and O’Neill 2021). NCVS does not report transgender identity responses alone, though they note that 51.7% of respondents with transgender experience were coded as such because they selected “transgender” as their identity (Truman et al., 2019). Nonresponse rates calculated by committee members from publicly accessible online data tools for the General Social Survey (GSS) and detailed data tables for the U.S. Census Pulse Survey.
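For readers who want to compute comparable figures from their own data, the following is a minimal sketch of unweighted item nonresponse for the two items. The record layout and field names are hypothetical, and treating “combined” nonresponse as a missing answer on at least one of the two items is an assumption; the surveys cited above may define and weight it differently.

```python
def item_nonresponse_rates(records, items):
    """Return percent missing for each item and for any of the items combined."""
    n = len(records)
    if n == 0:
        raise ValueError("no records")
    rates = {}
    for item in items:
        missing = sum(1 for r in records if r.get(item) is None)
        rates[item] = 100.0 * missing / n
    # Assumed definition: a record counts toward "combined" nonresponse
    # if at least one of the listed items is missing.
    any_missing = sum(1 for r in records if any(r.get(i) is None for i in items))
    rates["combined"] = 100.0 * any_missing / n
    return rates


# Made-up records; None marks a skipped or refused item.
records = [
    {"sex_at_birth": "female", "gender_identity": "female"},
    {"sex_at_birth": "male", "gender_identity": None},
    {"sex_at_birth": None, "gender_identity": None},
    {"sex_at_birth": "male", "gender_identity": "male"},
]
rates = item_nonresponse_rates(records, ["sex_at_birth", "gender_identity"])
print(rates)
```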
When general population surveys began testing the two-step approach, concerns were raised that the inclusion of questions about sex assigned at birth or gender identity would negatively affect overall survey completion because of the perceived sensitivity of the questions or because the presence of nonbinary response options would offend significant numbers of respondents. This concern prompted many surveys to include their two-step measures toward the end of the questionnaire to minimize breakoffs (where respondents stop answering entirely and fail to complete the rest of the survey). More recently, however, some surveys have begun placing the two-step measure either toward the beginning or in the middle of the questionnaire, with other demographic questions. For example, the GSS changed the placement of the two-step measure between its 2018 and 2020 waves and, as noted above, item nonresponse declined in 2020, when the questions were asked with other demographic items midway through the survey.10 In the Household Pulse Survey, sex assigned at birth and gender identity are included as the sixth and seventh questions overall (between questions about education and marital status and questions about sexual orientation and household size).11 These developments in overall survey placement, along with consistently low item nonresponse, underscore both the feasibility and acceptability of implementing a two-step gender measure in the United States.
There is also emerging evidence on test-retest reliability for the two-step approach. For example, stability rates were 99.4 percent for sex assigned at birth and 98.7 percent for gender identity responses between the 2018 and 2020 GSS (Saperstein, 2022). These stability rates are higher than previous studies have found for self-identified race, ethnicity, and religion, and they are comparable to stability rates for reported country of birth and interviewer-classified sex and gender (see Smith and Son, 2011). The panel is not aware of any research that directly compares the reliability of self-reported sex assigned at birth with the official designation on an original birth certificate. However, responses for sex assigned at birth generally would not be expected to change over time, except in rare cases where someone found out as an adult that their sex was recorded differently than they had previously been told. Responses to gender identity are expected to be fluid over the life course, but for an as-yet-unknown proportion of people. Including repeated measures of gender identity in longitudinal surveys would help clarify the degree of fluidity researchers can expect and provide appropriate recognition that a person’s gender identity can change over time. At present, however, it appears that relatively few respondents are likely to change their reported gender identity, even over the span of several years.
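A stability rate of this kind can be sketched as the share of panel respondents who give the same answer in two waves. The sketch below uses made-up data and assumes that respondents with a missing answer in either wave are excluded; the cited analysis may treat missing data and survey weights differently, and the function name is hypothetical.

```python
def stability_rate(wave1, wave2):
    """Percent of respondents giving the same answer in both waves.

    wave1 and wave2 map respondent IDs to responses; respondents absent
    from either wave, or with a missing response (None), are excluded.
    """
    common = [
        rid for rid in wave1
        if rid in wave2 and wave1[rid] is not None and wave2[rid] is not None
    ]
    if not common:
        raise ValueError("no respondents answered in both waves")
    same = sum(1 for rid in common if wave1[rid] == wave2[rid])
    return 100.0 * same / len(common)


# Made-up panel responses for illustration only.
wave_a = {"r1": "woman", "r2": "man", "r3": "man", "r4": "woman", "r5": None}
wave_b = {"r1": "woman", "r2": "man", "r3": "nonbinary", "r4": "woman", "r5": "man"}
print(stability_rate(wave_a, wave_b))
```

Applying the same function to repeated measures in a longitudinal survey would give one simple summary of how often reported gender identity changes between waves.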
There are other areas of two-step measure performance that would benefit from further study, and there is notable variation in the specific question wording and answer options used both among general population surveys and between survey research and clinical contexts. We discuss these outstanding issues and the challenges they raise for measurement in detail below. However, the panel recognizes that the same could be said for many demographic items already in regular use, and we do not believe it is appropriate to hold questions about sex assigned at birth or gender identity to higher testing and performance standards than have been used for other items before their widespread adoption. We conclude that there is sufficient evidence in the United States to support using a two-step approach to measuring gender for general population enumeration and research among English-speaking adults, as well as in health contexts, including medical settings, clinical trials, and public health surveillance.
Comparison with Other English-Speaking Countries
Although the United States has built up a strong evidence base in support of using a two-step measure of gender in assessments of the general adult population, it lags other English-speaking countries in implementing such measures in the census and other flagship national surveys, such as the American Community Survey. For example, in 2021 both Canada (Statistics Canada, 2021) and the United Kingdom (Office for National Statistics, 2021) implemented two-step approaches to measuring sex assigned at birth and gender identity in their national censuses. Australia debuted a nonbinary sex question for its 2021 count (Australian Bureau of Statistics, 2020) and published updated national standards in January 2021 that recommended the two-step approach (Australian Bureau of Statistics, 2021). New Zealand also revised its national statistical standard in 2021 to include a two-step measure (Stats NZ, 2021) and has announced it will be implementing new questions for its next census in 2023.12
12 Other countries have also revised their censuses to provide nonbinary sex or gender categories. Nepal was the first to do so, in 2011, and India (2011) and Pakistan (2017) followed. However, these countries have not released counts of nonbinary residents, citing privacy concerns (United Nations Economic Commission for Europe, 2019), and their censuses rely more on enumerators than does the U.S. census, which likely results in the undercounting of nonbinary people.
Canada replaced the existing binary sex question on its census form with one that specifies sex assigned at birth and added a second question about current gender. The answer options for the sex assigned at birth component are male and female, while the responses for the current gender component are male, female, and a free-text option. The second question also specifies that current gender may be different from what is indicated on legal documents. Results from the 2021 Canadian census have yet to be released, but a 2019 content test of the same items found that 0.07 percent of the population provided a nonbinary gender identity, with a total transgender population estimate of 0.35 percent. This result was comparable to the estimated transgender population from the 2018 Survey of Safety in Public and Private Spaces, 0.24 percent, which was Canada’s first national survey to use a two-step measure. In the 2019 test, the combined rate of nonresponse and invalid responses to the new gender identity question was 0.10 percent on electronic questionnaires. On paper questionnaires, nonresponse rose to 1 percent for sex assigned at birth and 8 percent for current gender, with most of the latter concentrated among people over the age of 70.
The 2021 census of England and Wales also used a two-question approach but adopted a different format. After its pretesting, the Office for National Statistics (ONS) recommended the existing sex question remain largely unchanged, except for reversing the answer options so female is listed before male and adding a note that a gender identity question would follow later in the questionnaire. The new gender identity question asked: “Is the gender you identify with the same as your sex registered at birth?” The responses offered were yes and no; for those who responded “no,” a free-text field asked for their current gender. Although this question uses different wording than other two-step approaches, it similarly aims to account for both transgender identity and transgender experience. Reporting for sex was mandatory by law, but reporting gender identity was voluntary and limited to ages 16 and up. Results from this census are not yet available.
Overall, measurement recommendations across these countries are quite similar. They all are either testing or actively using a two-step gender measure that references sex assigned at birth and gender identity, and they all affirm the importance of providing inclusive counts of both cisgender and transgender people in national statistics. They also all endorse the principle that gender should be the default construct for data collection rather than sex (or sex assigned at birth) alone. Two key points of divergence are whether to offer a nonbinary response to sex assigned at birth and how best to account for both transgender identity and transgender experience.
In terms of implementing a nonbinary measure of sex, to date, only Australia added “non-binary sex” as a third response option for its 2021 census. Both Canada and the United Kingdom opted to retain binary male/female responses, while New Zealand continued to recommend binary responses but noted in the text of its revised standards that offering the option “Another term” with a write-in line may be necessary to account for birth registrations “that are neither male nor female.” As discussed in the next chapter, both Australia and New Zealand also provide optional questions on variation in sex characteristics in their revised national standards to better account for intersex populations. No country includes “intersex” as a response option for sex assigned at birth.
Recommendations and practices also diverge on how to provide inclusive measures of the transgender population. Australia explicitly
recommends against using a single-item transgender identity question and also discourages including “transgender” as a response option in a general gender question, in both cases citing concerns about data quality. As part of its precensus planning for England and Wales, the United Kingdom’s ONS reported testing several question formats to gauge transgender identity and experience: it found that nonresponse was higher with a question that used the term “transgender” (5%, compared with 1% for a question that did not use the term) and that a much lower percentage of respondents deemed it an “acceptable” question to ask (71%, compared with 90% for the two-step measure). Despite this, Scotland is planning to include a combined transgender identity and experience question as the second step in its 2022 census after finding that respondents found it acceptable in their pretesting. Scotland’s new census form13 asks first, “What is your sex?” with female and male response options, and then asks, “Do you consider yourself to be trans, or have a trans history?” with no and yes as response options. For those who respond “yes,” a free-text field is offered for them to “describe your trans status (for example, non-binary, trans man, trans woman).” In settings where it is important to identify transgender people but where privacy protections are deemed inadequate for collecting data on sex assigned at birth, New Zealand endorses asking “Do you consider yourself to be transgender?” in addition to a gender identity question that offers the answer options of female, male, and a free-text field for another gender.
The panel carefully considered the range of recommendations offered by these countries along with evidence on item performance, when available. As in the United States, there is overall support for a two-step measurement approach, but considerable variation in the specific format of the two items. With results from the latest census rounds in these countries still pending, we could not factor in the new measures’ performance in those important and high-profile contexts. Given the absence of that evidence, along with variation among countries in the changes that are allowed to sex/gender markers and in options for nonbinary registration in government documents, as well as potential differences in the specific gender terminology locally deemed “acceptable,” the panel gave more weight to the results of testing the two-step measures conducted in the United States. However, when relevant to the rationale for our recommended measures, we include comparisons to these countries’ approaches in our discussion below.
In this section we present our recommended two-step question for gender identity that makes it possible to identify those with transgender experience, followed by a discussion of considerations for the recommended measure. The subsequent two sections separately discuss the two components of the measure: sex assigned at birth and gender identity.
RECOMMENDATION 4: The panel recommends that the National Institutes of Health use the following pair of questions assessing sex assigned at birth and gender identity:
Q1: What sex were you assigned at birth, on your original birth certificate?
- Female
- Male
(Don’t know)
(Prefer not to answer)
Q2: What is your current gender? [Mark only one]
- Female
- Male
- Transgender
- [If respondent is AIAN:] Two-Spirit
- I use a different term: [free text]
(Prefer not to answer)
Overall Measure Considerations
In keeping with our recommendation to collect gender by default, this two-step gender measure, which allows respondents to report both binary and nonbinary gender identities and can identify both cisgender and transgender people, can replace existing standalone questions that measure either sex or gender. This two-step measure is best included with other demographic measures, such as race, ethnicity, and age, given the demonstrated acceptability of answering these questions among the general population, as indicated by the very low rates of nonresponse noted above. In addition to the issue of placement with demographic questions, other general considerations about this measure include maintaining the questions as a pair, how to address skip patterns, the order in which to present the two components, and whether to include a confirmation question. Below, we discuss each of these considerations in turn and provide a summary in Table 6-2.
The two-step measure is designed to be presented in sequence and used in tandem to produce counts for both transgender and cisgender people. For cisgender people, the count will include cisgender men (male at birth, male current gender) and cisgender women (female at birth, female current gender). For people with transgender experience, the count will include transgender men (female at birth, male current gender) and transgender women (male at birth, female current gender),14 transgender-identified people (who report any sex at birth and expressly choose to identify using the term “transgender”) and people with other nonbinary gender identities who report any sex at birth and select either Two-Spirit or “I use a different term”15 to write in a different gender identity.
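The cross-classification described above can be sketched in code. The following Python function is a minimal illustration and not part of the panel's recommendation: the response codes, the category labels, and the function name `classify_two_step` are hypothetical, and the handling of missing answers is an assumption. Write-in responses are flagged for review rather than assigned to a category, because they cannot be assumed to represent nonbinary identities.

```python
from typing import Optional

# Hypothetical response codes; a real instrument would define its own.
FEMALE, MALE = "female", "male"
TRANSGENDER = "transgender"
TWO_SPIRIT = "two_spirit"
DIFFERENT_TERM = "different_term"


def classify_two_step(sex_at_birth: Optional[str],
                      current_gender: Optional[str]) -> str:
    """Cross-classify one respondent's answers to Q1 and Q2.

    None stands for a missing answer (don't know, prefer not to
    answer, or skipped).
    """
    # These categories apply whatever sex at birth was reported.
    # Assumption: a missing Q1 answer is treated the same way.
    if current_gender == TRANSGENDER:
        return "transgender-identified"
    if current_gender == TWO_SPIRIT:
        return "Two-Spirit"
    if current_gender == DIFFERENT_TERM:
        # Write-ins must be examined before they are coded.
        return "write-in, needs review"

    # Assumption: a female/male current gender cannot be
    # cross-classified without a valid answer to Q1.
    if sex_at_birth not in (FEMALE, MALE) or current_gender not in (FEMALE, MALE):
        return "unclassified"

    if sex_at_birth == current_gender:
        return "cisgender woman" if current_gender == FEMALE else "cisgender man"
    return "transgender woman" if current_gender == FEMALE else "transgender man"
```

In this sketch every category is labeled explicitly, including the cisgender categories, so that no group is left unmarked in the resulting counts.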
In analyzing responses to the two-step measure, we advise against collapsing the data to report counts of females, males, and transgender people, which leaves the cisgender categories unmarked. Data for each item can also be analyzed separately to examine disparities along different dimensions, such as for all people who were assigned male at birth or all people who selected “female” for their current gender. However, we recommend that the questions always be asked as a pair: we do not recommend collecting sex assigned at birth as a standalone item in any data collection context.
We note that some data collections routinely use a single question with female/male response options to drive skip patterns: for example, the American Community Survey skips a question about recent births for people who are reported as male. In such cases, we suggest substituting sex assigned at birth responses to program skip logics.16 In cases for which detailed information on physical sex traits is needed (such as some health surveys or clinical settings), sex assigned at birth alone may not be adequate to drive survey skip patterns, and other methods, such as an organ inventory, may be more appropriate (see Chapter 3).
14 We remind readers that we use the gender terms cisgender man and cisgender woman and transgender man and transgender woman to reflect that the underlying concept being measured is gender, even when some gender identity answer options are provided using the sex terminology, female and male.
15 Analysts cannot assume that all write-in responses for “I use a different term” will represent nonbinary gender identities. Write-ins need to be examined to determine whether responses represent nonbinary identities, are consistent with existing binary categories (and can be recoded accordingly), or need to be treated as (uncodable) missing data. The latter may include “protest” write-ins or other off-topic responses (see Jaroszewski et al., 2018; Saperstein and Westbrook, 2021).
16 Jans et al. (2015) offer sample wording for transition text used by the California Health Interview Survey in its interviewer script before a section of the survey on prostate screening: “These next questions may be relevant to you because you were assigned male at birth. If not, please let me know and I will skip them.”
In general population surveys, the two-step measure typically has been
ordered with sex assigned at birth asked first, followed by gender identity. This represents a chronological order from past to present and helps to provide context for the second question for cisgender people who may not be used to thinking about gender as an identity. Smaller-scale cognitive interview studies among transgender people have reported that some respondents expressed concerns about the question order, and suggested the opposite order—with gender identity first—because it conveys more respect for self-identification (e.g., Lombardi and Banik, 2016). There is evidence from several small split-panel studies that varying the question order did not yield significantly different response distributions (Amaya, 2020; Sanderson and Immerwahr, 2019 as cited in Federal Committee on Statistical Methodology, 2020; Saperstein and Westbrook, 2021). Previous research has also suggested that ensuring the two items are presented on the same page, especially in online formats, may resolve ordering concerns for some (Bauer et al., 2017). However, there is insufficient evidence to demonstrate that switching the order avoids issues of sensitivity, as other transgender people may find it offensive when sex assigned at birth is asked as the second step because it could be seen as a check on gender identity. Thus, the panel’s recommendation follows the question ordering with sex assigned at birth first and gender identity second because it has been used successfully in both general population surveys and health contexts; however, further research on this point is warranted.
Several general population surveys, including the NCVS and the Household Pulse Survey, follow the two-step measure with a confirmation question to reduce false positive responses. The exact wording of the confirmation question varies, but it typically asks: “Just to confirm, you said you were assigned [X] at birth and you currently identify as [Y], is that correct?” It is intended to reduce false positives by giving people who gave differing responses to sex assigned at birth and current gender a chance to revise their responses. The panel suggests including a follow-up confirmation question in automated data collections. However, other than cost-effectiveness, there is insufficient evidence that supports the practice of asking a confirmation question only of people who will be categorized as transgender because they gave different answers to the two components (while ignoring potential false negatives). For this topic, too, future research is warranted to consider the unequal survey burden this practice places on an already marginalized population.
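In an automated instrument, the confirmation step might look like the sketch below. The function name `confirmation_prompt` and the `ask_everyone` flag are hypothetical and introduced only for illustration; the prompt text follows the typical wording quoted above. The flag is included because the common practice is to ask only respondents whose two answers differ, and the evidence supporting that practice is, as noted, insufficient.

```python
from typing import Optional


def confirmation_prompt(sex_at_birth: str,
                        current_gender: str,
                        ask_everyone: bool = False) -> Optional[str]:
    """Return the confirmation question text, or None if it is not asked.

    By default the question is shown only when the two answers differ,
    which mirrors the common practice described in the text. Setting
    ask_everyone=True (a hypothetical option) shows it to all respondents.
    """
    if not ask_everyone and sex_at_birth == current_gender:
        return None
    return (
        f"Just to confirm, you said you were assigned {sex_at_birth} at birth "
        f"and you currently identify as {current_gender}, is that correct?"
    )
```

A respondent who answers “no” would be returned to the two questions to revise their responses; that routing is left out of the sketch because it depends on the survey platform.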
Considerations for the Component of Sex Assigned at Birth
Specific formatting considerations for the sex assigned at birth question in our recommended measure include the wording of the question stem, the wording and ordering of response options, and whether to include a free-text option.
TABLE 6-2 Summary of Findings on Recommended Two-Step Gender Measure and Evaluation Criteria
|Question Stem|Question Responses|Evaluation Criteria|Evaluation|
|Which of the following best represents how you think of yourself?|(Prefer not to answer)|Previous use in population-based data collection||
|||Testing: comprehension and validity||
|||Populations included in testing||
|||Testing: errors and nonresponse||
|||Adjustments to previously tested item included in recommended measure||
|||Weaknesses and challenges||
Our recommended question includes a reference to “your original birth certificate” in the question stem. This wording is not universal across general population surveys that have asked questions on sex assigned at birth and is not used in the All of Us Program, in particular. The panel considered the merits of including this reference and concluded it helps to clarify the question, particularly for people who may have changed the sex designation on their original birth certificate (see, e.g., Miller, Wilson, and Ryan, 2021). Being explicit that the question is asking for sex assignment on a specific government record also helps to distinguish it from the second step self-identification question about current gender.
We recommend that only two categories—female and male—be offered for sex assigned at birth, as they are the only options available on original birth certificates in the United States. We recognize that some U.S. measures of sex assigned at birth include a third category of intersex (e.g., All of Us and the 2018 GSS); however, intersex is not currently an available designation at the time of birth in the United States. It is standard practice in the United States for children born with intersex variations to be assigned either male or female at or shortly after birth, and the majority of adults with intersex variations identify as male or female (Shteyler, 2021; Rosenwohl-Mack et al., 2020; Almasri et al., 2018; Lee et al., 2006).17 At present, “X” designations and other nonbinary sex markers are only available in a subset of states on a case-by-case basis by petition (see Chapter 3). Thus, for now, offering just two sex responses is consistent with current U.S. vital statistics practice for original birth certificates.
For responses to the question on sex at birth, many surveys list male first and female second. However, this ordering has little justification and lacks consistent scientific rationale. Current best practices recommend randomizing response options for survey items, but this is generally emphasized for other purposes, such as when listing candidates on ballots or when varying which end of an agree–disagree scale appears first. There is no existing body of evidence that explicitly tests response ordering for demographic questions, though it is common practice to order nominal response categories either alphabetically or by expected population size to help respondents make sense of otherwise unordered lists. Listing female first fits both criteria. This response order is currently being used by the All of Us program, following pretesting (Cronin et al., 2019). It was also used in the 2018 GSS, following a survey experiment that randomly varied the answer order in an online sample of U.S. adults (Saperstein et al., 2019). The ONS also opted to change the answer order for the 2021 census of England and Wales after extensive testing, prompted by focus-group respondents highlighting inconsistent response ordering across questionnaire
items (Office for National Statistics, 2021). Consequently, our recommendation lists female first, with a caveat that optimal answer option ordering for all demographic questions would benefit from further empirical study.
17 See also Courtney Finlayson’s 2021 public testimony to the panel.
Under the panel’s guiding principle of allowing respondents the autonomy to self-identify, we generally recommend offering open-ended write-in response options for identity questions. However, “sex assigned at birth, on your original birth certificate” asks respondents to report what was recorded on an official government record; therefore, our recommendation does not include a free-text option to provide other responses. It does include the option for people to say they do not know what sex they were assigned at birth or to decline to answer. Whether these responses should be provided explicitly depends on the context. Signaling that people have the right to opt out of responding is especially important in settings in which voluntary consent has not been established and responses could be individually identifying (such as employment records and college or grant applications). In such settings, and in line with our guiding principle affirming autonomy, our recommendation uses the wording “prefer not to answer” rather than such alternatives as “decline to state” or “refused.” For data collections where respondents can easily skip over items if they do not wish to answer, it is not necessary to provide an explicit “prefer not to answer” response.
Specific Considerations for the Component of Current Gender
As for the component of sex assigned at birth, specific considerations for the current gender component of the recommended measure include the wording of the question stem, the wording and ordering of response options, and whether to include a free-text option. Other considerations related to implementing the gender identity measure, such as whether to allow for multiple responses, are covered in the section below on recommended research areas.
In considering the wording on gender, the panel considered other approaches, such as “How do you describe yourself” or “Do you consider yourself to be….” We concluded it was important to include the word “gender,” given the guiding principle of using precise terminology. The recommended question stem also contains the qualifier “current,” which is used in several surveys using the two-step method, including the NCVS, GSS, and the National Longitudinal Study of Adolescent to Adult Health (Add Health), and helps to convey that the response can change over time.
Other surveys specify gender identity in the question stem (the National Intimate Partner and Sexual Violence Survey) or offer additional text noting that “gender is how you feel inside” (High School Longitudinal Study of 2009) or that current gender “may be different from sex assigned at
birth and may be different from what is indicated on legal documents” (2021 Canadian census). The panel appreciates the specificity of these items, but incorporating lengthy definitional text in the question wording adds cognitive burden, which can reduce respondent comprehension (Yan and Tourangeau, 2008; Holbrook et al., 2006; Knauper et al., 1997). Such clarifications could be provided instead in telephone interviewer scripts or incorporated as information help boxes in automated data collection. The panel also weighed concerns that using subjective language, such as “consider yourself to be,” might inadvertently minimize gender identity by implying it is less factual than sex assigned at birth. This concern, combined with the goal of limiting both questionnaire space and cognitive burden, made the simple and straightforward wording of “What is your current gender?” most appealing among the existing alternatives.
The panel’s recommended question uses female and male answer options, which is intended to keep the response categories consistent between sex assigned at birth and current gender. Several U.S. surveys have opted to offer gender terms instead (All of Us and the 2018 GSS), and Australia recommends offering both (female and woman, male and man). The U.S. surveys that include gender terms did so after extensive pretesting, though the panel is not aware of published evidence that directly compares either respondent comprehension of the two types of response labels or overall item performance. In the absence of other evidence, the panel’s recommendation of female/male terminology is in keeping with response options that are used by the majority of current two-step measures. As we note below, further research on this issue is needed given concerns about conceptual conflation and construct validity.
The panel’s recommended answer options for current gender include “transgender,” based in part on research conducted using the 2016 NCVS, which showed that including a transgender category as an answer option in the second step is necessary to fully enumerate the transgender population in the United States (Truman et al., 2019). The panel considered measures that used a “nonbinary” response either in addition to, or instead of, a “transgender” response option, but decided against recommending this, given the absence of any published evidence of testing a “nonbinary” answer option in general population surveys of adults. This, too, is a subject for future research. Overall, given the impossibility of providing a truly exhaustive list, the panel preferred a shorter list of responses that is augmented with a write-in option for those who do not wish to identify with these listed responses.
The panel does not recommend including more detailed subcategories of transgender experience or identity (e.g., transgender male/man or
transgender female/woman),18 particularly in general population surveys, for several reasons. First, even in large-scale surveys, the sample sizes for these responses are likely to be small and, for data privacy reasons, would need to be aggregated for reporting, especially at lower levels of geography, to ensure the confidentiality of individuals’ information. Second, regardless of data collection setting, the panel decided it was inconsistent to label some responses as intended only for transgender men and women while leaving the “female” and “male” responses, presumably intended for cisgender people, unmarked (i.e., they are not labeled equivalently as “cisgender woman” or “cisgender man”). Such a formulation would recognize transgender men and women, but it would incorrectly imply that they cannot (or should not) identify with existing binary categories. Furthermore, the specific wording of these response options is inconsistent across the data collection systems that do use them, and the various sex and gender terms may not be universally understood (e.g., some people might interpret “transgender man/male” to mean a transgender woman who was assigned male at birth). Finally, if respondents want to specify that they use such terms as “transgender woman,” “transgender man,” or any other, they can do so through the write-in response.
The panel’s recommended wording, “I use a different term,” as the lead-in response to the write-in option follows Australia’s recommendation and is in keeping with the guiding principle of allowing respondents to self-identify. The panel concluded that this wording is better and more affirming than other options typically used in the United States, such as “Something else,” “Other,” or “None of these,” which can have negative and dehumanizing connotations for populations not reflected in a response list. The New Zealand standards recommend “Another gender (please state),” which also offers a more neutral alternative for the write-in response.
18 We note that such detailed response options are common practice in electronic medical records.
Finally, as with sexual orientation, the panel’s recommendation includes the Two-Spirit category in general population surveys, provided they collect racial identification data prior to gender identity and can be automated to ensure that this response option is available only to AIAN respondents. When this is not possible, respondents who want to identify explicitly as Two-Spirit can write in that response. We note that the Indian Health Service recommends using a two-step measure with the inclusion of a Two-Spirit response option, as well as a write-in gender response option, in its strategic plan for gender-affirming care (NPAIHB, 2020). In addition, in the only national study of people who identify as AIAN LGBTQI+ Two-Spirit, 28 percent of the sample identified their gender as Two-Spirit and not as female, male, or transgender (HONOR Project; Cassels et al., 2010). Although small sample sizes may prohibit detailed analysis of these responses in most general population surveys (which are not designed to oversample AIAN respondents19), the panel concluded that the aims of signaling inclusion and explicitly acknowledging Indigenous identities were paramount. Accurate AIAN data, inclusive of gender representation, are crucial as policy makers need national datasets to shape funding allocations and develop policy interventions in order to specifically serve AIAN communities as mandated by federal law.
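The conditional display of the Two-Spirit option can be sketched as follows, assuming an automated instrument that has already collected racial identification before the gender question. The function name `current_gender_options` and the `is_aian` flag are hypothetical; the option labels follow the response options discussed in this section.

```python
from typing import List


def current_gender_options(is_aian: bool) -> List[str]:
    """Build the response list for the current gender question.

    is_aian is assumed to come from an earlier racial identification
    item; the Two-Spirit option is shown only when it is True.
    """
    options = ["Female", "Male", "Transgender"]
    if is_aian:
        options.append("Two-Spirit")
    # The write-in option is always offered, so respondents who are not
    # shown the Two-Spirit option can still write in that response.
    options.append("I use a different term: [free text]")
    return options
```

When the instrument cannot be automated in this way, the write-in line remains the route for respondents who want to identify explicitly as Two-Spirit.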
The panel’s recommended two-step gender measure is the result of a review of currently available evidence. We concluded there is sufficient evidence to support asking for sex assigned at birth and current gender as part of a two-step approach to gender measurement for general population enumeration and research among English-speaking adults, as well as in health contexts, including medical settings, clinical trials, and public health surveillance. We also identified a number of potential challenges to even the best existing measures. Although the two-step approach is the strongest option available for a gender measure that identifies both transgender and cisgender respondents, additional research is needed to address concerns expressed by transgender people, scholars, and other stakeholders about some aspects of its operationalization to date (see, e.g., Glick et al., 2018). In addition to the issues already noted above, these include the following concerns:
- whether the sex assigned at birth component should be replaced with a question about transgender identity in nonresearch and nonhealth administrative settings;
- whether the current gender question should be measured as “mark all that apply”;
- the need to reevaluate and expand answer options over time, particularly with regard to nonbinary responses; and
- the need for further assessment of item performance across all survey modes, including proxy reporting, in languages other than English, for all major U.S. racial and ethnic populations, and among youth.
In this section we briefly review these additional considerations and offer recommendations for future research.
19 The National Congress of American Indians (2021) issued a report decrying inadequate federal data collection strategies stating that the state of current data collection methods leads to invisibility of AIANs in national studies. The report called for oversampling of AIAN populations for large-scale studies of the general U.S. population not only to ensure accurate data, but also to meet federal mandates and trust responsibilities to AIAN communities and populations.
Alternative Measures to Sex Assigned at Birth
Sex assigned at birth has specific uses in large population surveys and health-related contexts, both for enumerating transgender and cisgender populations and for guiding clinical research and practice. In other contexts, however, collecting information about sex assigned at birth may be considered invasive or unnecessary, particularly when it is directly associated with an identifiable individual and not covered by privacy laws, such as the Health Insurance Portability and Accountability Act (HIPAA). These settings may include employee records, beneficiary files for services unrelated to health care or health insurance coverage, and applications for employment or credit and other services. In these circumstances, it may be inappropriate to ask transgender people to disclose their sex assigned at birth, but still important to distinguish between transgender and cisgender people in order to ensure access to appropriate services and to monitor disparate treatment.
As noted above, the New Zealand statistical standards recommend a modified two-step measure that first asks about gender identity broadly (with the answer options “male”; “female”; and “another gender, please state”) and then asks the yes/no question “Do you consider yourself transgender?” for circumstances where respondent privacy cannot be assured or where it is otherwise undesirable or unnecessary to ask about sex assigned at birth (Stats NZ, 2021). A similar option is currently being used on the employment application form for the Biden Administration.20 Other alternatives to using sex assigned at birth to enumerate transgender people could include a modified two-step approach that combines a current gender question with a second question like those used in either the Scottish census or the census of England and Wales: “Do you consider yourself to be trans, or have a trans history?” or “Is the gender you identify with the same as your sex registered at birth?” Both of these have the potential to offer more inclusive counts of the transgender population because they more explicitly include people with transgender experience who do not identify with the term “transgender.”
More research is needed to identify circumstances in which collecting data on sex assigned at birth as part of a two-step gender measure is inadvisable and where alternative question designs may be preferable (see, e.g., Alpert et al., 2021). Alternative measures that do not rely on sex assigned at birth may also become necessary, more broadly, if sex designation is moved
below the line of demarcation on original birth certificates (see Shteyler, Clarke, and Adashi, 2020). If that occurs, a person’s sex assigned at birth would become available only for research purposes, and future generations would not see such designations on their birth certificates.
Another topic that needs research is assessment of the performance and acceptability of the recommended two-step measure in comparison with alternative approaches on enrollment forms for nonhealth-related services and programs. Such research needs to cover different domains, such as employment, education, social services, business services, and criminal justice, where concerns about identifiability and disclosure may be different. As noted above, the transgender populations identified using different measurement approaches cannot be assumed to be identical or directly comparable: people who might be classified as transgender using the panel’s recommended two-step method may not endorse a transgender identity on an alternative measure. Considerations of when to use different approaches require assessment of the relative importance of generating more complete counts of the transgender population and respect for individual privacy around either sex assigned at birth or transgender experience.
Allowing Multiple Response Options
Our recommended current gender question is “mark only one,” which aligns with most of the gender identity measures used in federally sponsored surveys (see Appendix A). Allowing for only one response greatly simplifies coding, classification, and tabulation. However, the panel recognizes a conceptual limitation in the recommended measure for transgender people who identify with binary gender categories but who also want to indicate that they would use the term “transgender” to describe themselves (Miller, Wilson, and Ryan, 2021). The panel’s recommended measure requires a forced choice between endorsing a binary gender identity and a transgender identity. Although there is no evidence that this forced choice decreases the feasibility of fully enumerating the transgender population in the United States, the panel recognizes the tension caused by the misalignment between the construct conceptualization and a forced-choice measurement of gender identity.
Some large-scale surveys, such as All of Us, allow respondents to select more than one answer option for their current gender. Allowing multiple responses for gender identity is also a feature of New Zealand’s revised national standard. Slight alterations to the recommended question wording, such as “Which of the following best describes your current gender?,” could also be considered for surveys that want to retain the ease of analysis of the “mark only one” option while acknowledging that the existing responses are not mutually exclusive. Alternatively, surveys could address
the conceptual misalignment directly by offering forced-choice response options that are mutually exclusive (at least at a given time), such as “man,” “woman,” “nonbinary,” and “I use a different term.” Testing of these options would further improve the construct validity of the recommended current gender item.
Incorporating Nonbinary Responses
As noted above, the panel considered including “nonbinary” as a current gender response option, either in addition to or instead of “transgender.” Current research shows that “nonbinary” and other similar labels are selected almost as frequently as “transgender” in samples of LGBTQI+ populations, and the response has been offered explicitly in a range of contexts (e.g., Tordoff et al., 2019), particularly among youth (Vivienne et al., 2021). Nonetheless, no systematic research has demonstrated that the category “nonbinary” is a commonly agreed upon term to represent nonbinary identities across cultures and age groups in the United States, and there is no published evidence to date that it performs well among cisgender people (i.e., that it does not increase false positives, other errors, or nonresponse rates). In addition, “nonbinary” and other similar terms include a range of identities that can reflect resistance to binary gender identities rather than endorsing a transgender identity per se, and thus these terms may be used by people who are or are not transgender (see, e.g., Wilson and Meyer, 2021; Thorne et al., 2019; Streed, McCarthy, and Haas, 2018).
In this respect, it is important to note that the panel makes a distinction between a current gender question for use in the general population and the collection of gender identity in LGBTQI+ community settings because the recommended measure may not capture the full range of gender identities in the LGBTQI+ population. Also, while the recommended write-in option may allow respondents to self-report their identities in their own words, these descriptors may be harder to standardize and aggregate, and efforts to do so may result in representing respondents in ways they did not intend (e.g., recategorizing someone who writes in “woman of trans experience” into the “transgender” category). As a result, community-based studies focused on the diversity in LGBTQI+ populations may require additional categories that are reflective of language, race and ethnicity, tribal affiliation, region, and age.
We note that a nonbinary response option for sex assigned at birth, in addition to female and male, may need to be considered in the future. As noted above and discussed in more detail in Part I, there have been rapid changes to state laws regarding amendments to birth certificates and other legal identity documents over the past few years, and more than a dozen states currently allow adults (or parents of newborns) to change a birth certificate classification to nonbinary. It is unclear how many parents may have filed these amendments in the states that allow them. Nevertheless, some current toddlers are being raised with a nonbinary sex at birth or no marker at all (Newberry, 2019), which will have implications not only for how parents report them on current surveys, but also how they will report themselves in the future.
Overall, the answer options for both aspects of this measure may need to be periodically revisited in light of new developments. These developments may include not only changes in federal and state laws related to identity documents and the collection of vital statistics, but also changes in the pattern and prevalence of write-in gender identities. Studies of new gender identities as they emerge, particularly in LGBTQI+ populations (e.g., Vivienne et al., 2021; Frohard-Dourlent et al., 2017), will also aid in coding and classifying write-in responses in ways that most accurately reflect respondent intent.
Although self-reporting in surveys is preferred and in keeping with the panel’s principle to respect identity and autonomy, many U.S. official statistics are derived from surveys that depend on a single household respondent to provide proxy responses for other household members. Examples include the decennial census, the American Community Survey, and the Current Population Survey. These surveys are the source for high-profile statistics, including the unemployment rate and congressional reapportionment. Other English-speaking countries, including Canada, England, and Wales, have collected gender identity using proxy reporting in their censuses, but little quantitative research has been conducted on proxy reporting of gender identity or transgender experience in the United States.
In 2016, the federal interagency work group on measures of sexual orientation and gender identity listed proxy testing among its highest research agenda priorities (Federal Interagency Working Group, 2016c). Several recent feasibility studies (Holzberg et al., 2019; Kuhne et al., 2019; Ortman et al., 2017) indicate both sexual orientation and gender identity can be successfully collected by proxy. However, there is no evidence on this issue among nationally representative probability samples. More research is needed using pilot tests, methods panels, and other small-scale quantitative experiments to understand the measurement error properties and best practices for collecting both sexual orientation and gender identity by proxy.
Expanding Research on General Population Surveys
Testing for the two-step gender measure has largely focused on general acceptance of the questions, understanding of response options, and item performance among the general population of English-speaking adults. However, testing has been either more limited or not done at all in other populations. The needed research includes testing of translations and nonresponse in languages beyond Spanish and examining response invariance and preferred answer options across people of all races and ethnic backgrounds, regardless of their native language.
Spanish speakers were included in cognitive interviews for the two-step approach, and the questions in federally sponsored national surveys have been offered in both English and Spanish. Collectively, the results indicate that questions about gender identity tend to perform better among Spanish speakers than questions about sexual orientation, both in terms of overall comprehension and item nonresponse. For example, even when older Spanish-speaking cisgender adults were unfamiliar with the term “transgender,” they were still able to find an appropriate response for their gender identity and report it without difficulty (Michaels et al., 2017).
There have been few studies of comprehension testing in languages other than Spanish, and a commonly cited challenge for setting international standards for two-step data collection is the absence of words distinguishing between “sex” and “gender” in some languages (United Nations Economic Commission for Europe, 2019). The California Health Interview Survey translated its two-step measure into Spanish, Vietnamese, Korean, Cantonese, Mandarin, and Tagalog, but small cell sizes for languages other than Spanish restrict detailed analysis of the results. Sanderson and Immerwahr (2019, as cited in Federal Committee on Statistical Methodology, 2020) analyzed nonresponse for the five non-English languages included in their 2018 survey of New Yorkers and found the highest rates of nonresponse among Russian speakers (2.6% for gender identity, 0.9% for sex assigned at birth) and the lowest rates among Bengali and Haitian Creole speakers (0% on either item); Spanish speakers fell in between (1.7% for gender identity and 0.8% for sex assigned at birth) and were the only non-English language group in which any respondents identified as transgender. In contrast, Canada reported finding a lower proportion of nonbinary individuals from French-language questionnaires than English-language questionnaires in its 2019 content test, despite qualitative pretests showing the concepts were understood by French-speaking nonbinary people. Thus, translation and cultural equivalency remain open questions for research, particularly for the specific combination of question wording and answer options included in the panel’s recommended current gender question.
Evaluating Best Practices for Gender Identity Measures Among Youth
As noted above, the panel found insufficient evidence to support setting age standards for asking about current gender. The other countries we considered tended to set age minimums on their gender identity items, recommending they be asked only of people ages 16 and older. When explicitly justified, this was noted as being determined by the age of majority or when someone was legally considered an adult for the purposes of consent and was similar to age minimums for asking about sexual orientation. In the United States, some studies have asked gender identity questions of younger children. For example, in 2017, the Centers for Disease Control and Prevention piloted a single-item question to identify transgender students on the Youth Risk Behavior Survey (YRBS), which samples both middle and high school students (see Johns et al., 2019). Cognitive interviews indicated the item performed well, and it is now included among the YRBS optional question list for use along with a binary sex question. In 2019, a two-step measure was also tested in clinical settings among adolescents aged 12 to 18; in this pilot test, clinics that used the two-step measure identified six times more adolescents as transgender or gender diverse (1.3% at the pilot clinics and 0.2% at other clinics), and less than 2 percent of adolescents found the question confusing, offensive, or uncomfortable (Lau et al., 2021). This study, along with a pilot of the two-step approach among youth in foster care (Wilson et al., 2016), suggests younger children can understand and answer questions about their gender identity; however, for younger respondents, some answer options may need to be altered to include “boy” and “girl” alongside either male or man and female or woman. Further research is needed to appropriately adapt gender measures for children and adolescents.
RECOMMENDATION 5: To improve the quality and inclusivity of the recommended two-step gender measure—sex assigned at birth and current gender—the National Institutes of Health should fund and conduct research on the following topics:
- Explicit tests of the use of terminology for sex (female/male) and gender (woman/man) for current gender responses, along with optimal answer option ordering, and the utility of a confirmation question. Testing should also confirm optimal ordering of the two-step components in both survey research and in other settings.
- Alternative two-step gender measures that offer an inclusive count of both cisgender and transgender people for use in contexts where the privacy and confidentiality of sex assigned at birth responses cannot be assured or where specific information on sex assigned at birth is unnecessary but identifying transgender people for the purposes of service delivery or monitoring disparities is still desirable.
- Assessment of the inclusion of “nonbinary” as an answer option either instead of or in addition to “transgender” and of the utility of allowing multiple gender identity responses.
- Periodic reevaluation of write-in gender identity responses and how they change over time and may vary in different settings (e.g., among LGBTQI+ samples in comparison with general population samples and in clinical settings in comparison with surveys), as well as periodic reevaluation of listed response options.
- Evaluation of the utility of including a nonbinary response when asking about sex assigned at birth, particularly if nonbinary sex markers on birth certificates become more widely available, and consideration of how nonbinary gender identities should be counted in terms of cisgender or transgender status.
- Expanded testing of the recommended two-step gender measure beyond general population assessments of English-speaking adults, including updated translations and studies of response equivalence, as well as further testing among youth and in settings where a single respondent replies for all household members.
Table 6A-1 details examples of two-step gender measures in national and international surveys.
TABLE 6A-1 Examples of Two-Step Gender Measures in National and International Surveys
| First Item Stem | First Item Response Options | Second Item Stem | Second Item Response Options | Source(s) |
|---|---|---|---|---|
| What is your gender? | | What sex were you assigned at birth, on your original birth certificate? | | Add Health (Wave V) |
| What was your biological sex assigned at birth? | | What terms best express how you describe your gender identity? (Check all that apply) | | All of Us Program |
| Do you think of yourself as: | | What sex was originally listed on your birth certificate? | | CDC Recommendations (2020) |
| What sex were you assigned at birth? (For example, on your birth certificate.) | | What is your current gender? | | |
| What sex were you assigned at birth, on your original birth certificate? | | What is your current gender identity? (Check all that apply) | | The GenIUSS Group (2014, Promising GI measure) |
| What sex were you assigned at birth (what the doctor put on your birth certificate)? (select one) | | What is your gender? Your gender is how you feel inside and can be the same or different than your biological or birth sex. (check all that apply) | | HSLS:09 (2016 follow-up) |
| What sex were you at birth? | | Do you currently consider yourself to be: | | |
| What sex were you assigned at birth, on your original birth certificate? | | Do you currently describe yourself as…? | | NCVS, U.S. Census Pulse Survey |
| What was your sex at birth? | | Do you consider yourself to be: | | |
| First, I’d like to confirm your gender. What sex were you assigned at birth, on your original birth certificate? | | How do you describe your gender identity? | | |
| What sex were you assigned at birth, on your original birth certificate? | | How do you describe yourself? | | NORC recommendations for CMS (2017) |
| What sex were you assigned at birth, on your original birth certificate? | | How do you currently describe your gender? (Check the ONE that best applies to you) | | |
| Is the person: | | How [do/does] [you/Person’s name/they] describe [your/their] gender? Gender refers to current gender, which may be different to sex recorded at birth and may be different to what is indicated on legal documents. Please [tick/mark/select] one box | | Sex: Census (2021); Gender identity: Recommendations (January 2021) |
| What was this person’s sex at birth? Sex refers to sex assigned at birth. | | What is this person’s gender? Refers to current gender which may be different from sex assigned at birth and may be different from what is indicated on legal documents. | | Canada Census (2021) |
| What is your sex? A question about gender identity will follow later on in the questionnaire. | | Is the gender you identify with the same as your sex registered at birth? | | England and Wales Census (2021) |
| What was your sex at birth? (for example what was recorded on your birth certificate) | | What is your gender? | | New Zealand Recommendations (April 2021) |
| What is your sex? | | Do you consider yourself to be trans, or have a trans history? | | Scotland Census (2021) |
NOTES: Add Health, National Longitudinal Study of Adolescent to Adult Health; CDC, Centers for Disease Control and Prevention; GSS, General Social Survey; HSLS:09, High School Longitudinal Study of 2009; NATS, National Adult Tobacco Survey; NCVS, National Crime Victimization Survey; NHIVBS, National HIV Behavioral Surveillance System; NISVS, National Intimate Partner and Sexual Violence Survey; NORC, National Opinion Research Center; START, Survey of Today’s Adolescent Relationships and Transitions.