INNOVATIONS IN
FEDERAL STATISTICS
Combining Data Sources While
Protecting Privacy
Panel on Improving Federal Statistics for
Policy and Social Science Research Using Multiple Data Sources and
State-of-the-Art Estimation Methods
Robert M. Groves and Brian A. Harris-Kojetin, Editors
Committee on National Statistics
Division of Behavioral and Social Sciences and Education
A Report of
THE NATIONAL ACADEMIES PRESS
Washington, DC
www.nap.edu
THE NATIONAL ACADEMIES PRESS 500 Fifth Street, NW Washington, DC 20001
This activity was supported by a grant from the Laura and John Arnold Foundation with additional support from the National Academy of Sciences Kellogg Fund. Support for the work of the Committee on National Statistics is provided by a consortium of federal agencies through a grant from the National Science Foundation (award number SES-1024012). Any opinions, findings, conclusions, or recommendations expressed in this publication do not necessarily reflect the views of any organization or agency that provided support for the project.
International Standard Book Number-13: 978-0-309-45428-5
International Standard Book Number-10: 0-309-45428-X
Digital Object Identifier: 10.17226/24652
Additional copies of this publication are available for sale from the National Academies Press, 500 Fifth Street, NW, Keck 360, Washington, DC 20001; (800) 624-6242 or (202) 334-3313; http://www.nap.edu.
Copyright 2017 by the National Academy of Sciences. All rights reserved.
Printed in the United States of America
Suggested citation: National Academies of Sciences, Engineering, and Medicine. (2017). Innovations in Federal Statistics: Combining Data Sources While Protecting Privacy. Washington, DC: The National Academies Press. doi: 10.17226/24652.

The National Academy of Sciences was established in 1863 by an Act of Congress, signed by President Lincoln, as a private, nongovernmental institution to advise the nation on issues related to science and technology. Members are elected by their peers for outstanding contributions to research. Dr. Marcia McNutt is president.
The National Academy of Engineering was established in 1964 under the charter of the National Academy of Sciences to bring the practices of engineering to advising the nation. Members are elected by their peers for extraordinary contributions to engineering. Dr. C. D. Mote, Jr., is president.
The National Academy of Medicine (formerly the Institute of Medicine) was established in 1970 under the charter of the National Academy of Sciences to advise the nation on medical and health issues. Members are elected by their peers for distinguished contributions to medicine and health. Dr. Victor J. Dzau is president.
The three Academies work together as the National Academies of Sciences, Engineering, and Medicine to provide independent, objective analysis and advice to the nation and conduct other activities to solve complex problems and inform public policy decisions. The National Academies also encourage education and research, recognize outstanding contributions to knowledge, and increase public understanding in matters of science, engineering, and medicine.
Learn more about the National Academies of Sciences, Engineering, and Medicine at www.national-academies.org.

Reports document the evidence-based consensus of an authoring committee of experts. Reports typically include findings, conclusions, and recommendations based on information gathered by the committee and committee deliberations. Reports are peer reviewed and are approved by the National Academies of Sciences, Engineering, and Medicine.
Proceedings chronicle the presentations and discussions at a workshop, symposium, or other convening event. The statements and opinions contained in proceedings are those of the participants and are not necessarily endorsed by other participants, the planning committee, or the National Academies of Sciences, Engineering, and Medicine.
For information about other products and activities of the National Academies, please visit nationalacademies.org/whatwedo.
PANEL ON IMPROVING FEDERAL STATISTICS FOR POLICY AND SOCIAL SCIENCE RESEARCH USING MULTIPLE DATA SOURCES AND STATE-OF-THE-ART ESTIMATION METHODS
ROBERT M. GROVES (Chair), Provost, Georgetown University
MICHAEL E. CHERNEW, Department of Health Care Policy, Harvard Medical School
PIET DAAS, Department of Corporate Services, Information Technology, and Methodology, Statistics Netherlands
CYNTHIA DWORK, John A. Paulson School of Engineering and Applied Sciences and Radcliffe Institute for Advanced Study, Harvard University
OPHIR FRIEDER, Department of Computer Sciences, Georgetown University
HOSAGRAHAR V. JAGADISH, Computer Science and Engineering, University of Michigan
FRAUKE KREUTER, Joint Program in Survey Methodology, University of Maryland, and Statistics and Methodology, University of Mannheim and Institute for Employment Research
SHARON LOHR, Vice President, Westat, Rockville, Maryland
JAMES P. LYNCH, Department of Criminology and Criminal Justice, University of Maryland
COLM O’MUIRCHEARTAIGH, Harris School of Public Policy Studies, University of Chicago
TRIVELLORE RAGHUNATHAN, Institute for Social Research, University of Michigan
ROBERTO RIGOBON, Sloan School of Management, Massachusetts Institute of Technology
MARC ROTENBERG, President, Electronic Privacy Information Center, Washington, DC
BRIAN HARRIS-KOJETIN, Study Director
HERMANN HABERMANN, Senior Program Officer
GEORGE SCHOEFFEL, Research Assistant
AGNES GASKIN, Administrative Assistant
COMMITTEE ON NATIONAL STATISTICS
LAWRENCE D. BROWN (Chair), Department of Statistics, The Wharton School, University of Pennsylvania
FRANCINE BLAU, Department of Economics, Cornell University
MARY ELLEN BOCK, Department of Statistics (Emerita), Purdue University
MICHAEL CHERNEW, Department of Health Care Policy, Harvard Medical School
JANET CURRIE, Department of Economics, Princeton University
DONALD DILLMAN, Social and Economic Sciences Research Center, Washington State University
CONSTANTINE GATSONIS, Department of Biostatistics and Center for Statistical Sciences, Brown University
JAMES S. HOUSE, Survey Research Center, Institute for Social Research, University of Michigan
THOMAS MESENBOURG, U.S. Census Bureau (Retired)
SUSAN MURPHY, Department of Statistics and Institute for Social Research, University of Michigan
SARAH NUSSER, Office of the Vice President for Research, Iowa State University
COLM O’MUIRCHEARTAIGH, Harris School of Public Policy Studies, University of Chicago
RUTH PETERSON, Criminal Justice Research Center, Ohio State University
ROBERTO RIGOBON, Sloan School of Management, Massachusetts Institute of Technology
EDWARD SHORTLIFFE, Department of Biomedical Informatics, Columbia University and Arizona State University
CONSTANCE F. CITRO, Director
BRIAN HARRIS-KOJETIN, Deputy Director
Acknowledgments
This report of the Panel on Improving Federal Statistics for Policy and Social Science Research Using Multiple Data Sources and State-of-the-Art Estimation Methods is the product of contributions from many colleagues, whom we thank for their generous sharing of their time and expertise.
The panel is grateful to the Laura and John Arnold Foundation for funding this study, and to foundation staff Stuart Buck and Meredith McPhail for their help and guidance throughout the study. The panel also is grateful for the supplemental funding provided by the National Academy of Sciences Kellogg Fund.
The panel thanks Katherine Wallman, recently retired chief statistician at the U.S. Office of Management and Budget, and the heads of the principal statistical agencies for their valuable insights: Mary Bohman, Economic Research Service; Peggy Carr, National Center for Education Statistics; John R. Gawalt, National Center for Science and Engineering Statistics; Erica L. Groshen, Bureau of Labor Statistics; Hubert Hamer, National Agricultural Statistics Service; Patricia Hu, Bureau of Transportation Statistics; Barry Johnson, Statistics of Income Division of the Internal Revenue Service; Brian Moyer, Bureau of Economic Analysis; Jeri Mulrow, Bureau of Justice Statistics; John W.R. Phillips, Office of Research, Evaluation, and Statistics in the Social Security Administration; Charles J. Rothwell, National Center for Health Statistics; Adam Sieminski, Energy Information Administration; and John H. Thompson, U.S. Census Bureau. Their contributions and support to the panel during our initial meeting as well as their support and encouragement throughout the study have been invaluable in helping the panel understand the challenges and constraints that the federal statistical agencies face and their dedication to providing high-quality information for the public good.
The panel also thanks the many individuals who participated in one or more of the panel’s three workshops; a list of the workshop presenters can be found in Appendix A. The panel is likewise grateful to Steve Eglash (Stanford University) for his work examining issues of data access for private-sector companies.
At the National Academies of Sciences, Engineering, and Medicine, the panel would not have been able to complete its work efficiently without a capable staff. Connie Citro, director of the Committee on National Statistics, had the vision and perseverance to make this study a reality. Mary Ellen O’Connell, director of the Division of Behavioral and Social Sciences and Education, and Robert Hauser, previous director of the division, provided both institutional leadership and substantive insights. The division’s Kirsten Sampson-Snyder was extremely helpful in coordinating the review process, and Eugenia Grohman provided meticulous and thorough editing that greatly improved the readability of the report.
For the Committee on National Statistics, both Agnes Gaskin, administrative assistant, and Anthony Mann, program coordinator, provided considerable assistance in managing the logistics of this panel and its meetings in various geographic and institutional locations. Hermann Habermann, senior program officer, provided valuable feedback and guidance on drafts of this report, and George Schoeffel, research assistant, cheerfully assisted with every aspect of the study, performing countless tasks to make this report possible. Most critically, Brian Harris-Kojetin served as study director and not only kept the panel focused on the relevant tasks at hand, but also contributed much expertise to the report and responded to the many comments and questions from panel members and reviewers. Without his technical skill, organizational skill, and insight, this report would not be nearly what it is.
This report has been reviewed in draft form by individuals chosen for their diverse perspectives and technical expertise. The purpose of this independent review is to provide candid and critical comments that will assist the institution in making the published report as sound as possible and to ensure that the report meets the institutional standards for objectivity, evidence, and responsiveness to the study charge. The review comments and draft manuscript remain confidential to protect the integrity of the deliberative process.
We thank the following individuals for their participation in the review of the report: John M. Abowd, research and methodology, U.S. Census Bureau; Cynthia Z.F. Clark, independent consultant, McLean, Virginia; Arthur B. Kennickell, research and statistics, Board of Governors of the Federal Reserve System; Partha Lahiri, Joint Program in Survey Methodology, University of Maryland; Thomas A. Louis, Department of Biostatistics, Bloomberg School of Public Health, Johns Hopkins University; Nancy Mathiowetz, Department of Sociology (emerita), University of Wisconsin–Milwaukee; Thomas L. Mesenbourg, U.S. Census Bureau (retired); and Alan M. Zaslavsky, Department of Health Care Policy, Harvard Medical School.
Although the reviewers listed above provided many constructive comments and suggestions, they were not asked to endorse the report’s conclusions or recommendations, nor did they see the final draft of the report before its release. The review of this report was overseen by Michael Hout, Department of Sociology, New York University, and Alicia L. Carriquiry, Department of Statistics and Sciences, Iowa State University. They were responsible for making certain that an independent examination of this report was carried out in accordance with institutional procedures and that all review comments were carefully considered. Responsibility for the final content of this report rests entirely with the authoring panel and institution.
Robert M. Groves, Chair
Panel on Improving Federal Statistics for
Policy and Social Science Research
Using Multiple Data Sources and
State-of-the-Art Estimation Methods
Contents
Federal Statistics and Evidence-Based Policy Making
2 CURRENT CHALLENGES AND OPPORTUNITIES IN FEDERAL STATISTICS
The U.S. Federal Statistical System
The Importance of Federal Statistics
Threats to the Survey Paradigm
Increasing Demands for More Detailed and More Timely Information
3 USING GOVERNMENT ADMINISTRATIVE AND OTHER DATA FOR FEDERAL STATISTICS
Use of Administrative Records by Other National Statistical Offices
Access to and Use of Administrative Data by Federal Statistical Agencies
State and Local Government Administrative Data for Federal Statistics
Challenges in Using Administrative Data for Federal Statistics
Combining Survey and Administrative Data Sources
4 USING PRIVATE-SECTOR DATA FOR FEDERAL STATISTICS
Dimensions of New Data Sources
Using Private-Sector Data Sources for Statistics
Challenges to Using Private-Sector Data Sources for Federal Statistics
5 PROTECTING PRIVACY AND CONFIDENTIALITY WHILE PROVIDING ACCESS TO DATA FOR RESEARCH USE
Tension between Private Lives and Public Policies
Legal Foundation for Privacy and Confidentiality
Approaches to Protecting Privacy
6 ADVANCING THE PARADIGM OF COMBINING DATA SOURCES
Using Administrative Records and Other Sources of Data for Federal Statistics