6 Quality Frameworks for Statistics Using Multiple Data Sources
Pages 109-132

From page 109...
... The theme and the caution have been driven by the relatively recent novelty of the simultaneous use of multiple data sources and the fact that some potential new sources of data present new issues of data quality. We begin this chapter with a discussion of quality frameworks for survey data and then briefly review additional quality features and some extensions of these frameworks for administrative and private-sector data sources.
From page 110...
... The first is a measurement inference, in which the questions answered or items sought from a sample unit are viewed as proxies for the true underlying phenomenon of interest. For example, every month interviewers asked a sample person: "what were you doing ...
[FIGURE 6-1  Total survey error framework (panels: Measurement, Representation).]
From page 111...
... The second inferential step concerns the measurement of a subset of units of the target population. In this case, inference based on probability sampling is the foundation of government statistical agencies throughout the world.
From page 112...
... rather than biases, as well as the ability to calculate standard errors directly from the sample, while determining nonresponse bias requires information external to the survey. Some federal statistical agencies have created "quality profiles" for some major surveys to bring together what is known about the various sources of error in a survey.
From page 113...
... Other error sources are noted in more general narrative form, such as statements that surveys are also subject to nonsampling errors. Since standard errors are typically the only quantitative metric available at the estimate level, it is easy for users to conclude that this measure conveys the overall quality of the estimate.
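As a rough illustration of the point above, the sketch below computes the standard error that typically accompanies a published estimate. It assumes simple random sampling and uses made-up counts; the function name and numbers are hypothetical, and real federal surveys use complex designs that need design-based variance estimators. Note that the result reflects sampling error only and says nothing about nonresponse or measurement bias.

```python
import math


def proportion_standard_error(successes: int, n: int) -> float:
    """Standard error of a sample proportion.

    Assumes simple random sampling with no finite population
    correction (an illustrative simplification, not a survey's
    actual variance method).
    """
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    return math.sqrt(p * (1 - p) / n)


# Hypothetical sample: 50 of 1,000 respondents report being unemployed.
p_hat = 50 / 1000
se = proportion_standard_error(50, 1000)

# Approximate 95% confidence interval (normal approximation).
lower, upper = p_hat - 1.96 * se, p_hat + 1.96 * se
print(f"estimate={p_hat:.3f}, se={se:.4f}, 95% CI=({lower:.3f}, {upper:.3f})")
```

A small standard error here would not imply a high-quality estimate overall, since nonsampling errors are not captured by this calculation.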
From page 114...
... BOX 6-1 Eurostat Quality Assurance Framework The Eurostat Quality Assurance Framework (European Statistical System Committee, 2013) describes activities, methods, and tools that can provide guidance to national statistical offices to fulfill the principles in the European Statistics Code of Practice (European Commission, 2011)
From page 115...
... . This framework has five major output quality components: 1.
From page 116...
... Timeliness Though timeliness is described in the European Statistical System Committee's quality framework in terms of the timing and punctuality of reports, it is important to recognize that existing systems have tailored their reporting mechanisms and their reporting requirements to the practical constraints and limitations in place at the time the systems were established. For example, the unemployment rate, calculated from the Current Population Survey (CPS)
From page 117...
... RECOMMENDATION 6-1 Federal statistical agencies should adopt a broader framework for statistical information than total survey error to include additional dimensions that better capture user needs, such as timeliness, relevance, accuracy, accessibility, coherence, integrity, privacy, transparency, and interpretability.
From page 118...
... In contrast, survey designers spend a great deal of time and effort developing and pretesting survey instruments to ensure they are obtaining the information they want from respondents and minimizing measurement errors. Electronic survey instruments often include consistency checks and acceptable ranges of responses to further ensure that potential problems with data entry or responses are resolved at the point of collection.
From page 119...
... Concepts of Interest and Other Quality Features The kinds of statistical models discussed in Chapter 2 when using multiple data sources underscore the need for more attention to quality features that have not received as much attention in traditional statistics (see also Groves and Schoeffel, in press)
From page 120...
... Moreover, it will not be possible to separate linkage errors from coverage errors. The level of measurement in the multiple data sources may vary.
From page 121...
... To this end, federal statistical agencies should create collaborative research programs to address the many challenges in using administrative data for federal statistics. (National Academies of Sciences, Engineering, and Medicine, 2017b, Recommendation 3-1, p.
From page 122...
... BOX 6-2 The Data Quality Assessment Tool for Administrative Data The Data Quality Assessment Tool for Administrative Data, commonly referred to as the Tool, was developed by the Federal Committee on Statistical Methodology's Data Quality Working Group (Iwig et al., 2013)
From page 123...
... . There have been efforts to examine the processes that generate administrative data and population registers (see Wallgren and Wallgren, 2007)
From page 124...
... The emergence of computational linguistics since the early days of survey coding may offer help in this aspect of big data quality. Some of the data used in federal statistics could be combinations of data arising from sensors.
From page 125...
... However, the data generated by the software bots are not a person- or business-measurement unit eligible for a survey or census measurement. Although this might be viewed as a type of coverage error in traditional total survey error terms, it has such a distinct source that it needs its own attention.
From page 126...
... have taken a more specific approach and described error sources for a specific social media data source, creating a "total Twitter error." The authors posit that major classes of errors occur during the data extraction and the analysis process. CONCLUSION 6-5 New data sources require expanding and further development of existing quality frameworks to include new components and to emphasize different aspects of quality.
From page 127...
... We discuss how some key quality characteristics of these sources might interact and permit enhanced federal statistics if these sources were combined. Our goal here is illustrative rather than prescriptive, and the responsible federal statistical agencies would need to conduct a much more in-depth review of the alternative sources and the methods for combining them than we can do here.
From page 128...
... • Data collection takes considerable time, so that annual estimates are only available months after the end of the year in which the crimes occurred. Alternative Approaches Alternative data sources on crime include the Uniform Crime Reports (UCR)
From page 129...
... In addition, unit missing data is assessed on the basis of jurisdictions that have ...
3  See http://www.latimes.com/local/la-me-crimestats-lapd-20140810-story.html [August 2017]
From page 130...
... Third, the CPI is used to adjust wages: more than 2 million workers are covered by collective bargaining agreements that tie wages to inflation.5 BLS produces thousands of component indexes, by areas of the country, ...
4  The NCS-X program is designed to generate nationally representative incident-based data on crimes reported to law enforcement agencies. It comprises a sample of 400 law enforcement agencies to supplement the existing NIBRS data by providing their incident data to their state or the federal NIBRS program.
From page 131...
... The market basket of goods is based on the Consumer Expenditure Survey (CE), which is a household survey conducted each year in a probability sample of households selected from all urban areas in the United States.
From page 132...
... 6  Nonresponse errors would be zero if all selected websites were able to be scraped. Sampling errors would be zero if the relevant universe was completely covered.

