National Academies Press: OpenBook

Evaluation of Compensation Data Collected Through the EEO-1 Form (2023)

Chapter: Appendix C: Data Storage, Security, and Management Procedures

Suggested Citation:"Appendix C: Data Storage, Security, and Management Procedures." National Academies of Sciences, Engineering, and Medicine. 2023. Evaluation of Compensation Data Collected Through the EEO-1 Form. Washington, DC: The National Academies Press. doi: 10.17226/26581.

Appendix C

Data Storage, Security, and Management Procedures

DATA STORAGE AND SECURITY PROCEDURES

The National Academies of Sciences, Engineering, and Medicine received the data and delivered them to RTI International, which set up the files in a secure environment and allowed only approved personnel to access the files, according to the data-handling procedures for controlled-use data described in the project contract. Only RTI International staff who had been trained in controlled-use data handling and who had signed affidavits of nondisclosure for this project worked with the data files. RTI International reviewed all output for potential disclosure issues and cleaned data of such issues before sending them to the National Academies. Table cells based on fewer than three observations were suppressed to protect confidentiality.
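The small-cell suppression rule can be illustrated with a minimal sketch. The threshold of three comes from the text above; the record layout, the field names (`occupation`, `pay_band`), and the function name are hypothetical, and the sketch assumes that one input record counts as one observation. It is not RTI International's actual disclosure-review code.

```python
from collections import Counter

# Threshold from the text: cells based on fewer than three observations are suppressed.
MIN_CELL_SIZE = 3

def suppress_small_cells(records, key):
    """Count records per table cell; cells below the threshold are returned as None."""
    counts = Counter(key(r) for r in records)
    return {cell: (n if n >= MIN_CELL_SIZE else None) for cell, n in counts.items()}

# Hypothetical records; field names are illustrative only.
records = [
    {"occupation": "Professionals", "pay_band": 5},
    {"occupation": "Professionals", "pay_band": 5},
    {"occupation": "Professionals", "pay_band": 5},
    {"occupation": "Technicians", "pay_band": 2},
    {"occupation": "Technicians", "pay_band": 2},
]

table = suppress_small_cells(records, key=lambda r: (r["occupation"], r["pay_band"]))
```

Here the cell with three records keeps its count, while the cell with two records is replaced by `None` to mark it as suppressed.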

FILE STRUCTURE

The data files delivered by the National Opinion Research Center (NORC) came in two primary formats: data from filers and data from individual establishments. Data were also divided into multiple files, separated by year (2017 and 2018) and by data type (employer/establishment/employee). The data-collection instruments provided two separate structures, with the online-entry version containing large matrices (one for each occupation) and the data-upload version having one record per sex-race/ethnicity-occupation-pay band (SROP) cell. The files delivered by NORC followed the data-upload version and excluded cells that had both zero employees and zero hours worked, which is more efficient for data-storage purposes. For data analysis, RTI International prepared multiple versions of the files, with one version having one record per SROP cell and another having one record per establishment; some analyses were better suited to one format and some to the other format.
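The relationship between the two analytic layouts can be sketched as follows. This is an illustration only, not the SAS code RTI International used: the field names (`estab_id`, `sex`, `race_eth`, `occupation`, `pay_band`, `employees`, `hours`) and both function names are hypothetical. The sketch assumes that a cell absent from the delivered file had zero employees and zero hours, as the text describes.

```python
from collections import defaultdict

def to_establishment_records(srop_rows):
    """Collapse one-record-per-SROP-cell rows into one record per establishment."""
    wide = defaultdict(dict)
    for row in srop_rows:
        cell = (row["sex"], row["race_eth"], row["occupation"], row["pay_band"])
        wide[row["estab_id"]][cell] = (row["employees"], row["hours"])
    return dict(wide)

def cell_value(wide, estab_id, cell):
    """Return (employees, hours) for a cell; omitted cells are treated as zero."""
    return wide.get(estab_id, {}).get(cell, (0, 0))

# Hypothetical SROP rows for a single establishment.
rows = [
    {"estab_id": "E1", "sex": "F", "race_eth": "Hispanic",
     "occupation": "Professionals", "pay_band": 4, "employees": 2, "hours": 3900},
    {"estab_id": "E1", "sex": "M", "race_eth": "White",
     "occupation": "Technicians", "pay_band": 3, "employees": 1, "hours": 2000},
]

wide = to_establishment_records(rows)
```

Storing only the non-empty cells keeps the long file compact, while the per-establishment view restores the zeros on demand.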

A notable feature for analytic considerations is the difficulty of obtaining data at the firm level. In the case of firms with only a single establishment, firms’ data were directly available in the establishment file. Often, and particularly for large firms, firms acted as their own filers, again making firm-level data readily available. However, for those firms that used professional employer organizations (PEOs) as their filers, multiple firms could be included in a single filing, and there was no separate ID to distinguish a firm from a filer. NORC attempted to enumerate the firms within each PEO’s submission by creating a universe file: Comp2EmployerUniverse. If NORC was able to separate the firm from the filing PEO, the firm was given a unique ID (USERID), and the PEO ID was retained as USERID2. Otherwise, USERID was the ID for the PEO and could be associated with multiple firms. NORC also provided the federal employment identification number (FEIN or EIN), but a firm may have multiple EINs, so EINs have limitations as firm identifiers. RTI International’s approach, as recommended by NORC, was to match the analytic data files with the universe file, using a combination of EIN and USERID (U.S. EEOC, 2020h). Given NORC’s substantial data-editing efforts, this approach should be largely successful, although some matching errors are likely to remain.
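A match on the combination of EIN and USERID can be sketched as below. The identifiers `EIN`, `USERID`, and `USERID2` come from the text; everything else (the function name, the `estab_id` field, the sample values) is hypothetical. The text does not say exactly how the two keys were combined, so this sketch assumes an exact match on the pair and simply sets aside rows that fail to match.

```python
def match_to_universe(analytic_rows, universe_rows):
    """Attach universe-file information to analytic rows by (EIN, USERID)."""
    # Assumption: each (EIN, USERID) pair appears at most once in the universe file.
    index = {(u["EIN"], u["USERID"]): u for u in universe_rows}
    matched, unmatched = [], []
    for row in analytic_rows:
        u = index.get((row["EIN"], row["USERID"]))
        if u is None:
            unmatched.append(row)  # left for manual review
        else:
            matched.append({**row, "USERID2": u.get("USERID2")})
    return matched, unmatched

# Hypothetical universe and analytic records.
universe = [
    {"EIN": "11-1111111", "USERID": "F001", "USERID2": "PEO9"},
    {"EIN": "22-2222222", "USERID": "PEO9", "USERID2": None},
]
analytic = [
    {"EIN": "11-1111111", "USERID": "F001", "estab_id": "E1"},
    {"EIN": "33-3333333", "USERID": "PEO9", "estab_id": "E2"},
]

matched, unmatched = match_to_universe(analytic, universe)
```

Keeping the unmatched rows separate, rather than dropping them, makes the residual error from imperfect firm identification visible.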

Due to the difficulty in identifying firms, there were also difficulties identifying firm sizes. PEOs provided a consolidated report, which included the size of each establishment, but if a report contained multiple Type 6 reports (for establishments with fewer than 50 employees), it was difficult to associate those establishments with firms. Firm sizes were estimated by summing the sizes of the establishments.
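The size estimate described here amounts to a grouped sum, as in this minimal sketch. The `firm_key` field is a hypothetical stand-in for whatever firm identifier results from the matching step, and `employees` is an assumed name for establishment size.

```python
from collections import defaultdict

def estimate_firm_sizes(establishments):
    """Estimate each firm's size as the sum of its establishments' sizes."""
    sizes = defaultdict(int)
    for est in establishments:
        sizes[est["firm_key"]] += est["employees"]
    return dict(sizes)

# Hypothetical establishments; firm_key values are illustrative only.
establishments = [
    {"firm_key": ("11-1111111", "F001"), "employees": 40},
    {"firm_key": ("11-1111111", "F001"), "employees": 120},
    {"firm_key": ("22-2222222", "PEO9"), "employees": 30},
]

firm_sizes = estimate_firm_sizes(establishments)
```

Any establishment assigned to the wrong firm shifts both firms' estimated sizes, which is why identification errors carry over into firm size.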

Since EEOC’s primary planned use of the data files was with regard to establishments (where enforcement efforts are targeted), and because of the difficulties in properly identifying firms, this report primarily focused on establishments.

Another key aspect of the file structure is that the files were not designed to support merges of Components 1 and 2, or of Component 2 over time. The difficulty with identifying firms is one reason, but another is that establishment IDs were not consistent across databases, so establishment IDs cannot be used to support merges.

DATA ISSUES

As described predominantly in Chapter 5, much of the data appeared to be of high quality, but there were instances of substantial errors or likely errors large enough to potentially have a major impact on statistics produced from the data. Major errors or likely errors include the following:

  • Some reported numbers of employees were so large that they would produce major changes to national estimates. For example, one firm reported having more than 245 million employees, and another reported having over 86 million. A total of 33 firms in 2017 and 29 firms in 2018 were excluded from the database because they reported more employees than the largest U.S. employer (i.e., greater than 1.4 million). For perspective, note that employees at franchises are counted separately. For example, Forbes listed McDonald’s as the fourth largest employer in the world in 2015, with 1.9 million employees;¹ however, that number included franchises, and 93 percent of McDonald’s restaurants are franchises (McDonald’s Corporation, 2021). Without franchises, McDonald’s had about 200,000 employees worldwide in 2020, and fewer than 25 percent were in the U.S. (McDonald’s Corporation, 2021).
  • Similarly, some reports of hours worked were physically impossible, requiring workers to work more hours than exist in a year. For this study, reports of hours that reflected working more than 16 hours per day every day of the year (i.e., 5,840 hours) were assigned a red flag and excluded from the primary analyses in Chapters 6 and 7. This was a conservative adjustment designed to allow for naturally occurring extreme values, with the intention that analysts could make different filtering decisions (or data corrections) depending on specific analytic needs.
  • Some inconsistencies appeared in the data: non-zero employee counts reported with zero hours worked, non-zero hours worked reported with zero employees, discrepancies between Components 1 and 2, and discrepancies between 2017 and 2018 in the Component 2 data. The discrepancies between 2017 and 2018 could reflect real change over time, though the differences were sometimes large enough to make that highly unlikely. The differences between Components 1 and 2 could also reflect real change, but to a much lesser degree, because both components were to be based on a pay period within the same October–December quarter, which lessens the likelihood of large changes.
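The screening rules in the list above can be sketched as simple flag functions. The two thresholds (1.4 million employees and 16 hours per day for 365 days, or 5,840 hours) come from the text; the flag names and function names are hypothetical. The sketch assumes the hours limit applies to average hours per employee within a cell, which the text does not state explicitly, and the flag variables in RTI International's technical memos may be defined differently.

```python
# Thresholds taken from the text.
MAX_PLAUSIBLE_EMPLOYEES = 1_400_000   # size of the largest U.S. employer
MAX_HOURS_PER_EMPLOYEE = 16 * 365     # 16 hours a day, every day: 5,840 hours

def flag_cell(employees, hours):
    """Return the set of (hypothetical) flag names raised by one SROP cell."""
    flags = set()
    if employees > 0 and hours == 0:
        flags.add("employees_without_hours")
    if employees == 0 and hours > 0:
        flags.add("hours_without_employees")
    # Assumption: the limit is applied to average hours per employee in the cell.
    if employees > 0 and hours / employees > MAX_HOURS_PER_EMPLOYEE:
        flags.add("implausible_hours")
    return flags

def exceeds_largest_employer(total_employees):
    """True if a reported firm size is larger than the largest U.S. employer."""
    return total_employees > MAX_PLAUSIBLE_EMPLOYEES
```

For instance, `flag_cell(2, 12000)` raises `implausible_hours` because 6,000 hours per employee exceeds 5,840. Returning flags instead of deleting records leaves the filtering decision to the analyst, in line with the intention stated above.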

___________________

¹ https://www.forbes.com/sites/niallmccarthy/2015/06/23/the-worlds-biggest-employers-infographic/?sh=5132bfb9686b

DOCUMENTATION

The analysis programs prepared by RTI International are available in a single compressed zip file, with separate folders for the file construction and for each chapter containing data analysis (Chapters 3–7). Within each chapter folder, the SAS programs are identified based on which tables each program produced. Note, however, that table numbering changed when the report authors extracted the key data to be discussed in the report; a crosswalk provided in the zip file compares the original and final table numbers. The zip file also contains technical memos describing how data merges were performed and how the flag variables were created to filter out problematic data, as well as a report on the geocoding performed to support matching establishments across time.