Private-Sector Collection of Data on Race, Ethnicity, Socioeconomic Position, and Acculturation and Language Use
When an individual sees a physician or checks into a hospital, basic information on his or her health conditions and history, demographics, and insurance status is collected to provide physicians, nurses, and health care workers with health status and health care information and to process payment for services. Likewise, when an individual or family applies directly for health insurance or uses health insurance benefits, the health insurance company collects basic background information on the individual or family members and may use it to underwrite the insurance policy, to target outreach efforts that improve health and health care knowledge or aid in disease management, and to better understand the needs and health care services usage of the insurer’s enrollees.
Data from these systems are collected as part of a health service interaction, but they are also used for statistical purposes—that is, to understand and draw inferences about how health care services are utilized, who is utilizing them, and the effect treatments have on health status, among many other questions. Such private-sector record systems can provide a rich set of data to understand disparities. These records are especially important sources of data for understanding health care disparities because they contain information on health care treatment (diagnoses, services received, procedure codes, and billed charges) that could not be collected in a survey setting without significant costs and respondent burden. Data collected by hospitals are aggregated at the state and federal level for these statistical purposes; data collected by physicians and by health plans are aggregated to a lesser extent.
In this chapter, we describe some of these private-sector data systems and their approach to the collection of data on race, ethnicity, socioeconomic position (SEP), and language. Private-sector data collections do not fall under the same regulatory framework as federal and state-based data collection, and in this chapter we briefly review the legal environment for private-sector collection of racial and ethnic data. We then review which data on race, ethnicity, SEP, and language are collected by health insurers, providers and medical groups, and hospitals, and the barriers to the collection of these data. The chapter ends with the panel’s conclusions and recommendations for the improvement of these data collection systems. In these recommendations, the panel encourages the Department of Health and Human Services to press for private-sector collection of data on race, ethnicity, socioeconomic position, and language use, to exercise leadership in setting standards for the collection of these data, and to develop mechanisms to make data linkages possible. Although these recommendations are made to promote data collection for statistical and research purposes in order to better understand and design programs to eliminate disparities in health and health care, the chapter also highlights how the organizations collecting the data could use them to improve programs, services, outreach, and treatment for individuals and groups, and to monitor and reduce disparities.
PRIVATE-SECTOR DATA COLLECTION SYSTEMS
We define private-sector data systems as those that collect data as part of a patient encounter with a medical professional at a hospital, clinic, nursing home, or medical group practice,1 as well as those that collect data as part of private health insurance enrollment or a claims submission process (not including enrollment or claims data from Medicare, Medicaid, or State Children’s Health Insurance Program [SCHIP]). In general, these data come from three sources: hospitals and nursing homes, health insurers,2 and private physicians and physician medical groups. It should be noted that these private record collection systems sometimes share data on a limited basis with states and with federal data systems; for example, the National Hospital Discharge Survey (NHDS), which was discussed in Chapter 4, is a federally sponsored survey of records collected by hospitals on inpatient care, and the two federal cancer registry systems (the Surveillance, Epidemiology, and End Results [SEER] system and the National Program of Cancer Registries [NPCR]) collect some information from medical and laboratory records. Thus, the review in this chapter of the collection of data on race, ethnicity, SEP, and acculturation and language use by private record systems has implications for other data systems described in the previous chapters.
1 We do not include records of encounters at Veterans Health Administration or VA hospitals in this chapter, although data on veterans’ use of health care services have been used to study racial and ethnic disparities in health care (for example, Jha et al., 2001). We also do not include records of patient encounters at Indian Health Service centers (see Chapter 4).
2 Our definition of health insurers encompasses indemnity health insurers, managed care plans, health plans, and health maintenance organizations (HMOs).
Hospitals collect information on patients during the admissions process. These data typically include background information on the patient (which may include race, ethnicity, income, and/or education level); initial health conditions and symptoms; health insurance coverage, if any (including coverage through Medicaid, Medicare, or SCHIP); and services and treatment received while at the hospital. Information on the patient’s treatment during the hospital stay is compiled in a document called the discharge abstract, which is completed when the patient is discharged.3 Many of these discharge abstracts are forwarded to state health agencies or to the federal government for statistical purposes. The National Hospital Discharge Survey collects medical records from a sample of nonfederal short-stay hospitals (military and VA hospitals are therefore excluded). The Healthcare Cost and Utilization Project collects hospital discharge data from 33 state organizations and generates a national data set, the Nationwide Inpatient Sample (NIS), which is a 20 percent sample of hospitals in the nation, and a State Inpatient Database (SID), which represents all acute-care discharges in the participating state.
There is some standardization of billing forms that hospitals use to bill for services provided. These include the Uniform Bill (UB92), which is currently used by hospitals, nursing facilities, and clinics to bill third-party insurers and government programs such as Medicaid, and the HCFA 1500, which is used to bill for professional services, such as physician or laboratory visits.
Health insurance data on individuals are collected both when the individual applies for health insurance (enrollment) and when the individual uses a health care service covered under the health insurance plan. Enrollment data include basic demographic information and may include medical history, employment status, and ability to pay premiums. Insurance claims forms typically include member (patient) identification information, dates of services, diagnoses, procedure codes, and billed charges.
Data collected from medical groups, physicians’ offices, and group practices are similar to those collected by hospitals. They may include general information on the patient’s demographic characteristics (e.g., age, gender) as well as on treatments received, diagnoses, and payment status.
Private-sector data collection does not, in general, fall under the same rubric of laws and policies requiring the collection of racial and ethnic data as collections sponsored by the federal or state governments. Privately collected data that are part of federal programs (e.g., Medicare) do fall under the DHHS Inclusion Policy, so the OMB standards apply to them. Furthermore, hospitals that seek reimbursement for Medicare or Medicaid fall under discrimination enforcement provisions of Title VI of the Civil Rights Act. Some states have laws regarding the collection of racial and ethnic data (see Youdelman, 2002, and NRC, 2003). The Health Insurance Portability and Accountability Act (HIPAA) and the Employee Retirement Income Security Act (ERISA), both of which establish some national standards for the regulation of health care in the private sector, are other important parts of the legal backdrop that may affect private-sector collection of racial and ethnic data. We briefly review the legal framework in this section.
The Civil Rights Act and Title VI
Title VI of the Civil Rights Act of 1964 prohibits discrimination on the basis of race or national origin in services rendered through federal programs. Compliance with Title VI is required in order for hospitals to receive reimbursement under Medicare and Medicaid programs,4 although it does not apply to physicians’ offices or group practices (see the paper by Nerenz and Currier in Appendix F).
The enforcement of the Civil Rights Act requires the documentation of the absence of discriminatory treatment. The law therefore requires hospitals (and health care providers, health plans, and other organizations that receive federal funds) to collect data and maintain records that can be used to monitor disparities (Perot and Youdelman, 2001) and show compliance with the law. Title VI does not, however, require any specific data collection. Rather, Department of Justice regulations that implemented Title VI require data collection to document compliance with the DOJ regulations5 (DHHS regulations on Title VI are less explicit). The courts have ruled that specification and enforcement of requirements for data collection under Title VI are at the discretion of federal agencies that run the programs covered by the act.6
Title VI also prohibits discrimination on the basis of national origin. Because a person’s primary language has been accepted as a proxy for national origin, primary language data collection could be part of Title VI enforcement (Youdelman, 2002).
The legality of the collection of data on race, ethnicity, acculturation and language use, and SEP by health insurers has received much attention from federal agencies, advocacy groups, and researchers. Such data collection is seen by some as an important source of information both for measuring and understanding disparities in health care utilization and treatment and for identifying opportunities for targeted preventive programs. Yet fears of compromising patient privacy or of redlining by health insurers (the practice of charging different prices or denying coverage to individuals based on race, ethnicity, or national origin) leave room for a great deal of uncertainty in perceptions of data collection. Although disclosing confidential information and redlining are both illegal, such fears remain (and are not necessarily unfounded).
A recent review of the legalities of collecting data on race, ethnicity, and language concluded that there are no federal laws or regulations barring the collection of these data by health insurers (Perot and Youdelman, 2001). Only four states (California, Maryland, New Hampshire, and New Jersey) explicitly prohibit or partially prohibit the collection of racial, ethnic, SEP, and language data on insurance application forms in the individual or group insurance market (National Health Law Program, in press).7 Five other states have regulations that could discourage the collection of data on race, ethnicity, and SEP: in Connecticut, Iowa, Minnesota, South Dakota, and Washington, when a health plan seeks state approval for its application forms, those that ask about race or ethnicity may be either disapproved or scrutinized closely.
Health Insurance Portability and Accountability Act
Another component of the legal backdrop for the collection of data on race, ethnicity, and language from private-sector sources is HIPAA. HIPAA, among other things, imposes national standards for electronic transactions with which all health insurers, health care clearinghouses, and providers (e.g., hospitals and physicians) that conduct business electronically must comply. The law requires DHHS to use the HIPAA transaction and code set standards for all covered transactions, including certain claims and enrollments. Transaction sets help standardize the business interactions among health care providers, health plan payers, and health plan sponsors; code sets define the data element values used in the standard transactions. Every time health care providers electronically transmit a claim to an insurer and in turn to the Centers for Medicare and Medicaid Services (CMS), HIPAA requires the use of adopted standard transactions.
Under HIPAA, Designated Standards Maintenance Organizations (DSMOs) are responsible for maintaining the content and standards for covered transactions according to Implementation Guidelines. These guides define the content and code sets of standardized data for constructing a HIPAA-compliant transaction (45 CFR Part 162, see U.S. DHHS, 2002), as well as the elements that providers and insurance plans must report in electronic health care claims and benefits enrollment transactions.
The HIPAA Implementation Guidelines state that each element in a transaction standard must be designated by the industry as either “required,” “situational,” or “not used” for completion of the transaction. Under this classification system, required elements must be included on the standard transaction, situational elements are used in certain circumstances but not others, and not-used elements are not currently reported. In the health care claims and benefits enrollment transaction standards, the fields for race and ethnicity are currently designated as “not used” and therefore are not among the HIPAA requirements for data collection.
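The three-way designation described above can be pictured as a simple lookup. The following is a minimal sketch only: the element names and the designations for `member_id` and `coverage_end_date` are hypothetical illustrations, not taken from the Implementation Guidelines; only the "not used" status of the race and ethnicity field follows the text.

```python
from enum import Enum


class Usage(Enum):
    REQUIRED = "required"
    SITUATIONAL = "situational"
    NOT_USED = "not used"


# Hypothetical element names. Only the race/ethnicity designation
# reflects the text; the other two entries are illustrative assumptions.
ENROLLMENT_ELEMENTS = {
    "member_id": Usage.REQUIRED,
    "coverage_end_date": Usage.SITUATIONAL,
    "race_or_ethnicity": Usage.NOT_USED,
}


def must_report(element, situation_applies=False):
    """Return True if the element has to appear on the transaction."""
    usage = ENROLLMENT_ELEMENTS[element]
    if usage is Usage.REQUIRED:
        return True
    if usage is Usage.SITUATIONAL:
        # Reported only when the triggering circumstance holds.
        return situation_applies
    # NOT_USED elements are never reported under the current standard.
    return False
```

Under this sketch, changing the race and ethnicity entry from `Usage.NOT_USED` to another designation is the kind of modification that would have to come through the standards-change process described in this section.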
The final adopted standards were published in the Federal Register on February 20, 2003.8 Although data on race and ethnicity are not part of the HIPAA requirements for data collection under these adopted standards, there is a mechanism through which such data could become required. The DSMOs, in consultation with other committees with interests in the standards, called the Data Content Committees (DCCs),9 can recommend changes to the standards. The secretary of DHHS can also modify any established standard, with some limitations regarding the criteria for which modifications can be made and the frequency of modifications (45 CFR Part 162, 2002). Any change the secretary proposes must be considered in consultation with the DCCs and, where relevant, the National Committee on Vital and Health Statistics.
9 These DCCs include the National Uniform Billing Committee, the National Uniform Claim Committee, the Workgroup for Electronic Data Interchange, and the American Dental Association.
This section describes current practices in collecting data on race, ethnicity, acculturation and language, and SEP by hospitals, health insurers, and medical groups, and identifies barriers to the collection of these data. The discussion is informed by two papers commissioned by the panel, both of which are included in the appendixes to this volume. Nerenz and Currier review racial and ethnic data collected by hospitals, health plans, and medical group practices. Bocchino discusses the results of interviews with health plans regarding their racial and ethnic data collection practices.
Data Collection by Hospitals
Hospitals play a major role in a community’s health care delivery system and the health of its workforce. As communities become more diverse, hospitals are challenged to enhance their capacity to design and implement programs and treatment protocols that reduce or eliminate health disparities. Racial and ethnic data provide an important foundation for designing such programs and protocols. There is some evidence that hospitals have recognized the need for the collection of data on race and ethnicity. The Health Research and Education Trust (HRET—an affiliate of the American Hospital Association), in collaboration with Michigan State University, surveyed a nationally representative sample of hospitals regarding their practices in collecting data on race and ethnicity (see the paper by Nerenz and Currier in Appendix F). Results from this survey show that hospitals collect these data for a number of reasons—for example:
to use for quality improvement measures,
to target hospital marketing efforts,
to fulfill requirements by law or regulation, and
to improve community relations.
The HRET and Michigan State survey provided a snapshot of data collection practices showing that 79 percent of responding hospitals said they collected racial and ethnic data on their patients.10 The patient was the source of this information in a majority of hospitals, but a substantial percentage of hospitals reported that clerks code patient race and ethnicity by observation.
The infrastructure for collecting and using data on race and ethnicity in the hospital industry is undeveloped and faces several barriers. Because racial, ethnic, and other socioeconomic data are not needed to pay a claim, they are not universally or uniformly collected in hospital discharge abstracts, which are the major source of hospital patient diagnostic and treatment data for research and public health statistics. This lack of a standardized approach to the collection of patients’ racial, ethnic, and socioeconomic data is a major barrier to understanding and eliminating disparities, not only for hospitals but also for state and national data collection systems that rely on the data collected by local hospitals. The result is the under-reporting of information on race and ethnicity, variability in methods of capturing the data, and misclassification. A New York study compared the records of individuals with two separate hospital admissions and found that the recorded classifications agreed 93 percent of the time, but that agreement for groups other than blacks and whites was lower (Blustein, 1994).
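The kind of consistency check described above (comparing the classification recorded at two separate admissions of the same person) can be sketched in a few lines. The records below are invented toy values for illustration only; they are not data from the New York study, and the function names are hypothetical.

```python
from collections import defaultdict

# Toy records, invented for illustration: the race/ethnicity code
# recorded at a person's first and second admission.
pairs = [
    ("white", "white"),
    ("white", "white"),
    ("black", "black"),
    ("asian", "other"),
    ("asian", "asian"),
    ("hispanic", "white"),
]


def agreement_rate(pairs):
    """Share of people whose two recorded codes are identical."""
    matches = sum(1 for first, second in pairs if first == second)
    return matches / len(pairs)


def agreement_by_group(pairs):
    """Agreement rate within each group, keyed by the first-admission code."""
    by_group = defaultdict(list)
    for first, second in pairs:
        by_group[first].append((first, second))
    return {group: agreement_rate(rows) for group, rows in by_group.items()}


overall = agreement_rate(pairs)
per_group = agreement_by_group(pairs)
```

Breaking the rate out by group matters because a high overall figure can conceal much lower agreement for smaller groups, which is the pattern the study reported.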
The lack of uniformity in data collection among hospitals that do capture racial and ethnic data is a barrier to making comparisons across providers and communities. Results from the HRET survey (discussed in the paper by Nerenz and Currier in Appendix F) show that while the majority of hospitals are collecting data on race, ethnicity, and language, they are collecting them very differently, a finding consistent with other studies that document the quality of racial and ethnic data collection by hospitals. The HRET survey found that the majority (70 percent) of responding hospitals that collect racial and ethnic data did not see any drawbacks to collecting the data. However, drawbacks of data collection cited by the remaining 30 percent include:
problems associated with the quality and accuracy of the data,
discomfort of admitting clerk in asking the patient for the information,
concerns that patients might be insulted or offended if asked about their race and ethnicity,
patients often did not fit the given racial or ethnic categories,
fears that the data might not be kept confidential, and
concerns that the collection of racial and ethnic data might be used to profile patients and discriminate in the provision of care.
Of hospitals not collecting racial and ethnic data, the majority cited the belief that it was unnecessary. Other reasons given for not collecting the information included the time and resources involved in collecting and managing the data and concerns about the classification system.
A convergence of national awareness about health disparities and the increasing market and policy incentives for the hospital industry to respond have resulted in two important industry initiatives:
HRET has been working with a consortium of six hospitals and health systems to develop a uniform framework for collecting racial, ethnic, and primary language data in hospitals. As part of this effort, HRET has conducted site visits and, in collaboration with Michigan State University, conducted a survey of hospital data collection and use practices. The goal of this initiative is to inform the development of a systematic and uniform framework for the collection and use of data on race, ethnicity, and language across hospitals.
The American Hospital Association (AHA) has added two new questions to its annual survey, which is a survey of the more than 6,000 AHA member hospitals. Beginning in 2003, hospitals were asked whether they collect information on the patient’s race, ethnicity, and primary language spoken. This information will be used to provide important baseline and trend information about data collection practices across hospitals over time.
The first of these two initiatives will be particularly useful for illuminating how the barriers to collecting these data (such as those identified by hospitals in the paper by Nerenz and Currier in Appendix F) are experienced and ultimately addressed.
Data Collection by Health Insurers
Very few health insurance companies collect data on race, ethnicity, acculturation and language, and SEP. A few recent studies have surveyed health insurers to learn about their practices in collecting these data. For example, Bocchino interviewed 30 health insurers that were members of the American Association of Health Plans (AAHP) about their racial and ethnic data collection practices.11 This study found that a quarter of these 30 insurers asked about race and ethnicity on enrollment forms completed by accepted applicants, but that these voluntary questions were frequently left unanswered.12 It was unclear from the study whether the data were collected solely for those requesting individual or family coverage or for employer-based coverage as well. In another small survey, Nerenz and Currier found that while few insurers collect racial and ethnic data, those that do use the information to prepare language translations of materials, for quality improvement purposes, and to inform disease management programs. Nerenz and Currier also report, however, that an informal 1999 survey of large not-for-profit health insurers involved in the Health Maintenance Organization Research Network found that most insurers did not collect data on the race and ethnicity of their members.
11 See her paper in Appendix G. Sixteen of these insurers were chosen for interview because they were known to have initiated race- and ethnicity-related projects. The remaining 14 were randomly selected from the 2002 AAHP Industry Survey respondent lists.
Bocchino reports that health insurers’ enrollment forms have included questions on language preference for a number of years, but that these fields are usually optional and are often left blank. One major health insurer shared with the panel its data collection practices for various insurance products. For some (but not all) products, this insurer collects primary language usage data from its members on an optional basis on application forms. For two products, this insurer collected salary information on application forms, but this was an exception to its standard practice. Otherwise, no other SEP data were collected by the groups in this plan.
Awareness of the potential utility of the collection of racial and ethnic data by health insurers was raised recently when Aetna, which insures 14 million individuals, announced that it would begin collecting racial and ethnic data from its members either once they were accepted for coverage or when they requested a change in coverage (Winslow, 2003). Since the initiative began in September 2002, about 64,000 enrollees in 13 states and the District of Columbia have been asked to voluntarily indicate their race, ethnicity, and language preference, and about 80 percent have provided the information.13 Aetna reports that the data will be used to “create more culturally focused disease management and wellness programs for our multicultural membership.”14 Aetna is the first health plan to publicly announce such an effort.
Other health insurers have initiated more limited efforts on racial and ethnic data collection and in support of cultural competence activities. These efforts and activities include the formation of CEO-level task forces, the collection of racial and ethnic data on membership satisfaction surveys, the use of language preference data from enrollment forms to target health plan materials, and collaboration with state public health departments to link member files with state public health data files and surveys (as discussed in the paper by Bocchino).
Many health insurers report interest in collecting these data to target prevention and treatment programs. In her survey of AAHP health insurers, Bocchino reports that data on race, ethnicity, language preference, and other individual characteristics such as sex, age, education, and geographic location are important for health insurers to appropriately target information and programs to improve the health of enrolled members. In Bocchino’s study, health insurers that collect these data reported that when the data were provided by enrollees and recorded on their enrollment forms, they were in fact useful for identifying populations at higher risk for chronic conditions and for targeting appropriate preventive care programs. Health insurers have also used these data to support disease management activities.
There are, however, barriers to the collection of racial, ethnic, acculturation and language, and SEP data by health insurers. Perhaps the most significant obstacles are concerns about the legalities of collecting these data and about perceptions among those who provide the information that the data might be used for discriminatory purposes. In the Bocchino study, almost two-thirds of the responding insurers cited legal concerns as the most important deterrent to collection. Fremont and Lurie (see their paper in Appendix D) indicate that some plans fear that collecting such data would increase their exposure to litigation over potential privacy violations or violations of civil rights laws, or that potential members, or the employers who choose health insurers for their employees, would respond negatively to the collection of these data for fear that they may be used detrimentally.
Given concerns about perceived reasons for collecting racial and ethnic data at enrollment, several alternative methods of data collection have been suggested. One alternative is to collect these data on members once they have already enrolled, an approach that may allay fears that the data will be used to discriminate in the underwriting process. Another alternative is for HMOs and preferred provider organizations (products that have defined physician and hospital networks) to ask providers to collect these data at the point of service. However, some of the standardization and perception problems discussed above with regard to hospital medical records also apply to this type of data collection, since providers may be hesitant to ask these questions of patients for fear of the same negative perceptions and of increased litigation exposure. Collecting such data at the point of service would also mean that the data would need to be collected repeatedly (every time an individual used a service), whereas it would be less burdensome for insurers to collect the data only once. Furthermore, collection at the point of service would mean that data would not be available for the large number of health plan members who are insured but have not used their coverage (Fiscella, 2002) and that information on race and ethnicity would be collected only on the subset of members who submit claims. Any characterization of the health insurers’ membership would thus be limited to those seeking care, or roughly two-thirds of a typical insurer’s membership. This would limit the generalizability of study results. A third option may be to conduct a separate survey of already enrolled members to collect these data. High costs and the quality of the data are important considerations for this type of collection.
Health insurers interviewed by Bocchino cited the costs of collecting information and of updating systems to collect and store the information as a barrier, as well as the requirement for approval by various government agencies of any revisions on enrollment forms (e.g., a state insurance department).
Data Collection by Medical Groups
Very little is known about the collection of racial, ethnic, language, and SEP data by private physicians’ offices and group practices. While Nerenz and Currier found that some practices collect racial and ethnic data on their patients, the information is usually collected for their own purposes (e.g., internal quality improvement and disease management activities) and is therefore not standardized, consistently collected, or made available for public use.
Quality of Care Measures
The Consumer Assessment of Health Plans Survey (CAHPS) and the Health Plan Employer Data and Information Set (HEDIS) quality of care measures are systems of quality measurement that are widely used by health plans for their HMO products.15 Both of them are required as part of the voluntary accreditation process administered by the National Committee for Quality Assurance (NCQA), which sponsors development of the HEDIS measures. Many purchasers—including Medicare, the Federal Employees Health Benefits Program (FEHBP), and many private purchasers and state Medicaid programs—require reporting of one or both of these quality measures.
CAHPS was developed under a program sponsored by the Agency for Healthcare Research and Quality (AHRQ) with the objective of producing a standardized instrument for surveys of health plan members.16 CAHPS consists of a core of items that are widely applicable, together with several sets of supplementary items designed for special populations (e.g., children, Medicare beneficiaries, people with chronic diseases). The CAHPS quality items ask for overall ratings of care as well as for reports on more specific aspects of respondents’ experiences, such as waiting times or ease of obtaining particular services. The survey typically also collects very limited information on health status or conditions. Although it is primarily designed for HMO members, versions of the survey have been used to evaluate care in the Medicare fee-for-service sector and are being developed to evaluate medical groups, hospitals, and nursing homes. Since its introduction in 1998, CAHPS has been widely implemented, and therein lies its primary importance to this discussion, although much of the following description of its content is also applicable to the various other surveys used in hospitals, medical groups, and other health care institutions.
CAHPS includes limited content on race, ethnicity, SEP, and language use. In the standard instrument, race and ethnicity are measured by items that closely follow the standard OMB categories. The survey has been translated into Spanish and some other languages, and when these translated versions are used, they can serve as an indication of the respondent’s language preference. Concerns have been raised, however, about the precise equivalence of Spanish- and English-language versions of the survey.
The only SEP measure is an item on education. Education is widely used for “case mix” adjustment—that is, to adjust for the component of scores that is attributable to differences in the composition of the membership of the different insurers rather than to differences in quality. More-educated (presumably higher-SEP) members tend to give lower ratings, likely reflecting higher expectations rather than poorer care. The racial and ethnic measures, on the other hand, are not typically used for adjustment, but have been used for subgroup analyses. Geographical linkages to CAHPS data are also possible. Although the main purpose of CAHPS is to support comparative reporting on health insurers, individual-level data are collected by some survey sponsors and are compiled on a voluntary basis in the National CAHPS Benchmarking Database to be used in research.
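To make the idea of case-mix adjustment concrete, the following minimal sketch reweights each plan’s mean rating within education strata to a common reference mix (direct standardization). This is a simplified stand-in chosen for illustration and is not presented as the adjustment procedure actually used for CAHPS; the plan data, stratum labels, and reference mix are all invented.

```python
def stratum_means(ratings):
    """ratings: list of (education_level, score) -> mean score per level."""
    totals = {}
    for level, score in ratings:
        running_sum, count = totals.get(level, (0.0, 0))
        totals[level] = (running_sum + score, count + 1)
    return {level: s / n for level, (s, n) in totals.items()}


def raw_mean(ratings):
    """Unadjusted mean rating across all respondents."""
    return sum(score for _, score in ratings) / len(ratings)


def adjusted_mean(ratings, reference_mix):
    """Weight each stratum mean by its share in a common reference mix.

    Assumes every stratum in reference_mix has at least one respondent.
    """
    means = stratum_means(ratings)
    return sum(share * means[level] for level, share in reference_mix.items())


# Invented example: both plans get identical ratings within each
# education level, but plan A enrolls more less-educated members.
reference_mix = {"hs_or_less": 0.5, "college": 0.5}
plan_a = [("hs_or_less", 9), ("hs_or_less", 9), ("hs_or_less", 9), ("college", 7)]
plan_b = [("hs_or_less", 9), ("college", 7), ("college", 7), ("college", 7)]

raw_a, raw_b = raw_mean(plan_a), raw_mean(plan_b)
adj_a = adjusted_mean(plan_a, reference_mix)
adj_b = adjusted_mean(plan_b, reference_mix)
```

In this toy case the raw means differ (8.5 versus 7.5) only because the plans enroll different mixes of members; after adjustment both plans score 8.0, which is the sense in which adjustment removes the component of scores attributable to membership composition.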
16 See http://www.ahcpr.gov/qual/cahpsix.htm for a description of this survey.
The HEDIS quality of care measures are used to estimate rates of provision of selected screening and preventive and chronic disease treatment services to eligible populations, based on a combination of administrative data and medical record reviews. Thus they collect no data on race, ethnicity, or SEP beyond those already included in these records. Analyses of race, ethnicity, and SEP effects in HEDIS measures have been conducted either using Medicare administrative data on race (Schneider, Zaslavsky, and Epstein, 2002) or through geographic linkages (Zaslavsky et al., 2000).
Data collected by private-sector groups could be invaluable sources for better understanding disparities in health care. These data could be used for statistical purposes by governments and private researchers to monitor the status and understand the causes of disparities in health care. As discussed by Fremont and Lurie, the data may also be directly useful to insurers, medical groups, and hospitals, which could use them to monitor differences in utilization of health care services.
For example, differences in utilization among members of the same health plan with the same coverage may reveal areas where improved quality of care is needed. Many health insurers provide targeted case management and support services for people with chronic conditions such as asthma, diabetes, and congestive heart failure. Racial and ethnic data could be used to develop culturally appropriate outreach for patient enrollment in these programs and ensure that follow-up support services are culturally appropriate. Moreover, data on race and ethnicity would enhance health insurer data sets and contribute to understanding disparities in preventive and palliative care among commercially insured populations. Finally, these enhanced data sets could be used to assess potential changes in service patterns associated with race or ethnicity among health care providers.
Hospitals could use data for quality improvement measures. Monitoring quality measures by race and ethnicity could identify areas for which quality improvement could be targeted. In addition, hospitals and large medical groups could use the data to gauge the need for services targeted to specific ethnic or minority groups (e.g., translators, educational materials) to improve quality of care. And consumers of health care services and health insurance could use information generated from these data, if it were publicly available, to make informed choices about the performance of services and health plans.
Educational programs developed by hospitals, insurers, or public health organizations may be more appropriately targeted to individuals and groups if information on race, ethnicity, and acculturation and language use is available to guide these efforts. Fremont and Lurie cite a study that found that a mass media campaign to educate the public about steps to take to avoid sudden infant death syndrome was less effective among black mothers than among white mothers because the messages were not appropriately targeted to black women (Malloy, 1998).
The panel’s review of current practices by the private sector (hospitals, health insurers, and medical group practices) has revealed that the collection of data on race, ethnicity, language, and SEP is neither common nor standardized. When hospitals collect racial and ethnic data on their patients, reports show that the reporting is fairly complete. However, the data are not reported in a standardized format, and accuracy for groups other than white and black is suspect. Few health insurers collect data on race, ethnicity, and language, and among those that do, individuals often do not provide the information. Finally, even less is known about the racial and ethnic data collected by medical groups. The collection of SEP data is probably even rarer in these privately based data collection systems; the only such information collected by hospitals is the source of payment for patients. Health insurers rarely collect information on education or income.
The panel believes that an opportunity to learn more about disparities is missed because private medical and insurance organizations do not routinely collect information on race and ethnicity, acculturation and language, and SEP. The lack of data from these sources is a serious weakness in the current systems of health data collection. DHHS could remedy this problem by intervening to ensure that these data are collected uniformly. Health insurers and hospitals have expressed interest in collecting these data but worry that, without a federal or state mandate, the collection of the data will be greeted with suspicion. Federal leadership is needed to help legitimize and regularize the collection of these data across states and health systems.
RECOMMENDATION 6-1: DHHS should require health insurers, hospitals, and private medical groups to collect data on race, ethnicity, socioeconomic position, and acculturation and language.
The panel does not have the expertise to assess whether DHHS has the statutory authority necessary to require private entities to collect data on race, ethnicity, acculturation and language use, and SEP. However, there appear to be several possible options for DHHS to pursue such requirements through existing laws, regulations, and initiatives.
HIPAA is one such vehicle, although it does not currently require the collection of these data. The race and ethnicity elements in the standard set of claims and enrollment transactions are currently designated as “not used” and thus are not reported. The secretary of DHHS is in a position to propose changes to the current HIPAA standards. Strong leadership from DHHS would be needed to guide the proposed changes through the process of approval from the DSMOs in consultation with other industry committees. The case for the proposed changes would be strengthened by the argument that the collection of racial and ethnic data is essential to meeting the Healthy People 2010 initiative to eliminate disparities in health and health care. Because HIPAA only covers electronic transactions, the addition of racial and ethnic data to the standards would not cover all transactions, but it would significantly enhance both the amount of data available for studying disparities and the effectiveness of interventions designed to eliminate disparities.17
A potential vehicle for standardizing the collection of data on race, ethnicity, SEP, and language use among hospitals is the Joint Commission on Accreditation of Healthcare Organizations (JCAHO), which is an independent nonprofit organization that is a standard-setting and accrediting body for more than 16,000 health care organizations. JCAHO already requires the collection of primary language information and could add requirements for the collection of racial and ethnic data.
Aligning incentives for hospitals to collect and use data on race and ethnicity is key to overcoming the barriers that now contribute to incomplete and non-comparable data. Given the importance of these data for public health and research, efforts at the hospital, governmental, regulatory, and national levels are essential for overcoming the barriers. The HRET and AHA efforts are to be commended and are a step in the right direction for improving the completeness and utility of racial, ethnic, and socioeconomic data collected by the nation’s hospitals.
Creative ideas for overcoming concerns and disincentives to collect data on race, ethnicity, SEP, and acculturation and language use are needed. There are, however, examples of barriers that have been overcome. Leading hospital systems have proven that the collection of these data is possible. These hospitals have served as laboratories for collecting and using racial and ethnic data in a hospital environment. Despite the difficulties and limitations of the data, hospitals that made the investment have demonstrated the utility of the data by understanding—and improving services for—the community they serve. These hospitals can target quality improvement interventions and measure their effectiveness, comply with grant reporting requirements, and compete more effectively for research and service grants. They can also design their workforce to match the communities they serve and thus underscore their commitment to their mission and to their donors and communities.
Since the sources of these data are usually records rather than surveys, extensive data on socioeconomic position and acculturation and language use may be infeasible to collect without sizable costs and time commitments. However, measures of education and occupation are more easily collected, as are proxy measures of acculturation and language such as place of birth, generation status, and primary language.
DHHS is a powerful player in health care transactions conducted by private entities through the Medicare and Medicaid programs. Racial and ethnic data for some Medicare enrollees are already available through Social Security Administration records. But as we described in Chapter 4, such data are not available for all enrollees, will not be available for some future enrollees, and are not always reported consistent with recent OMB standards for the collection of these data. The department could, through its administration of the Medicare program, require the collection of data on race, ethnicity, SEP, and primary language to fill in gaps. Providers, hospitals, and other entities seeking reimbursement for services provided under Medicare would then be required to provide data on the race and ethnicity of each individual who receives service.
DHHS could also promote standardized state-level collection of these data through each state’s Medicaid program. Data on race and ethnicity are reported on the Medicaid enrollment forms, and the OMB standard categories are supposed to be used. However, DHHS could more strongly enforce the collection of these data and also offer guidance and technical assistance to help the states implement procedures to collect the data.
In collecting such data, hospitals, health plans, and medical groups should be aware that some individuals may be reluctant to provide the information. Respondents should be informed that they are volunteering to provide these data and should also be informed about how the data will be used. This approach may help assuage fears about confidentiality breaches and may encourage individuals to provide the data.
Promoting Standardized Collection of Data
DHHS should work with hospitals and health insurers to determine the best way to collect standardized data, using the OMB standards for collecting racial and ethnic data as a base. Further detail may be required for some hospitals or health insurers that serve a large number of individuals from smaller population subgroups beyond the OMB standard categories. DHHS should also work with hospitals, health plans, and related groups to determine which SEP measures could reasonably be collected on enrollment or admissions forms. Collection of these data will necessarily be limited, because extended collection of wealth and income data is not feasible for these record systems. Education level may be the most practical item to collect and the least sensitive for individuals to provide.
In setting up data systems and standards for the collection of such data, DHHS and industry agents should try to design systems that avoid repeatedly collecting the same information on individuals. This will reduce the burden both for respondents and for those collecting the information.
In developing standards for data collection, it is also critically important to provide clear information that indicates how the data will be used and that the data are provided on a voluntary basis. Providing this information can help alleviate fears that the data will be used for discriminatory purposes. This information should be provided at the data collection point, which in most cases would be when the patients and plan enrollees fill out forms. Acknowledging the risks associated with the collection and use of data on race and ethnicity is part of the due diligence of the collection of these data by hospitals, health plans, and medical groups. Building trust by protecting the data from improper use or disclosure is essential. If the patients are told that providing their socioeconomic and demographic data will result in more translation services or community prevention programs, then these should be implemented. DHHS should work with industry agents and legal experts to develop the information to be given to individuals who are asked to provide the data.
RECOMMENDATION 6-2: DHHS should provide leadership in developing standards for collecting data on race, ethnicity, socioeconomic position, and acculturation and language use by health insurers, hospitals, and private medical groups.
Linking Geocoded Data from the Private Sector to Federal Data
As noted throughout this chapter, only very limited data on race and ethnicity are typically collected in private-sector health care information systems. Implementation of this report’s recommendations would greatly enhance the data infrastructure available for understanding and eliminating disparities. However, if these recommendations cannot be implemented such that high-quality data are produced, linking aggregate-level data on race, ethnicity, SEP, and language use may be needed to bridge the gaps. In general, provider, hospital, and insurance claim forms contain the claimant’s address. The Bureau of the Census provides aggregated data on race, ethnicity, and SEP for census geographical units (Zip Code tabulation areas, tracts, or block groups), and these aggregated data, when geographically linked with data from private-sector records, can be used as proxies for individual data on race, ethnicity, or SEP.
Two technical issues are critical to implementation of such linkages. First, software must be used to link addresses to census geographical units. Second, census data must be linked to the addresses, with suitable protections for confidentiality. If the data will be disseminated beyond the plan for research purposes, the second step requires special care because the precise combination of values of the sociodemographic variables might identify the subject’s geographical area and thus pose a risk of disclosure of confidential information about individual plan members. Methods have been developed for masking such data by rounding and/or adding random noise. Such masked data sets can be analyzed with appropriate corrections for the effects of masking. But development of the specific procedures and parameters required to implement data masking requires particular statistical expertise that is not likely to be found within health insurers. Considerable resources would be required to accomplish it. Furthermore, a uniform procedure should be followed so that data will be comparable across the private-sector units generating the data.
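As a rough illustration of masking by rounding and adding random noise, the sketch below perturbs each linked area-level variable with Gaussian noise and then rounds it to a coarse grid. The noise distribution, the parameter values, and all names are assumptions chosen for illustration; selecting parameters that actually control disclosure risk is the statistical task the text says requires specialized expertise.

```python
import random

def mask_value(value, noise_sd, round_to, rng):
    """Add zero-mean Gaussian noise, then round to the nearest multiple of round_to."""
    noisy = value + rng.gauss(0.0, noise_sd)
    return round(noisy / round_to) * round_to

def mask_records(records, fields, noise_sd=2.0, round_to=5.0, seed=None):
    """Return copies of records with the named numeric fields masked.

    records: list of dicts holding area-level variables (hypothetical layout).
    fields: names of the variables to mask; other keys are copied unchanged.
    noise_sd and round_to are illustrative defaults, not recommended values.
    """
    rng = random.Random(seed)          # seeded generator for reproducible masking
    masked = []
    for rec in records:
        out = dict(rec)                # copy so the input records are not modified
        for field in fields:
            out[field] = mask_value(rec[field], noise_sd, round_to, rng)
        masked.append(out)
    return masked

# Hypothetical linked record with two census-derived percentages.
linked = [{"member_id": 1, "pct_poverty": 12.3, "pct_college": 40.1}]
masked_linked = mask_records(linked, ["pct_poverty", "pct_college"], seed=0)
```

Applying one agreed set of parameters (here `noise_sd` and `round_to`) across all organizations is one way to read the call for a uniform procedure: analysts who know the noise variance and rounding width can then apply the same corrections to every masked file.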
DHHS could greatly facilitate the routine generation of high-quality, uniform, and nondisclosing geographically linked data sets by providing a linking service that could be used by private- and public-sector health care organizations. Such a service could be administered, for example, through a Web site. An organization would anonymously submit a file containing member addresses and would receive in return a file of masked geographical variables at several levels. Although geocoding is an imperfect process, typically 85 percent of addresses in a health care file might be geocoded down to the block group level; cases that cannot be geocoded might be either imputed or analyzed using variables aggregated to higher levels of geography.
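The fallback to higher levels of geography mentioned above might look like the following sketch. It assumes the geocoding step has already produced, for each address, an area identifier at each level where a match succeeded; the level names, identifiers, and table layout are hypothetical and do not describe any existing service.

```python
# Geographic levels ordered from finest to coarsest (hypothetical labels).
LEVELS = ("block_group", "tract", "zcta")

def link_with_fallback(geocodes, census_tables):
    """Attach census variables at the finest level that could be geocoded.

    geocodes: dict mapping level name to an area identifier, or None when
        the address could not be geocoded at that level.
    census_tables: dict mapping level name to {area_id: variables}.
    Returns the level used and its variables, or None for both if no level matched.
    """
    for level in LEVELS:
        area_id = geocodes.get(level)
        table = census_tables.get(level, {})
        if area_id is not None and area_id in table:
            return {"level": level, "variables": table[area_id]}
    return {"level": None, "variables": None}

# Hypothetical aggregate tables and two geocoding results.
tables = {
    "block_group": {"BG-001": {"pct_poverty": 18.0}},
    "tract": {"T-01": {"pct_poverty": 15.0}},
    "zcta": {"Z-9": {"pct_poverty": 12.0}},
}
full_match = link_with_fallback(
    {"block_group": "BG-001", "tract": "T-01", "zcta": "Z-9"}, tables
)
partial_match = link_with_fallback(
    {"block_group": None, "tract": None, "zcta": "Z-9"}, tables
)
```

Returning the level alongside the variables lets an analyst account for the coarser resolution of the fallback cases, or choose imputation for them instead.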
The greatest expertise in the federal government for solving the problems involved in establishing such a service resides in the Bureau of the Census. Within DHHS, the NCHS has been a leader in dealing with confidentiality issues. Alternatively, a private-sector vendor with the necessary geocoding expertise could be recruited, although such vendors do not typically deal with the related confidentiality issues.
RECOMMENDATION 6-3: DHHS should establish a service that would geocode and link addresses of patients or health plan members to census data, with suitable protections of privacy, and make this service available to facilitate development of geographically linked analytic data sets.