

5 Metadata and Standards
Pages 95-130



From page 95...
... When that is done, the users do not have to figure out how to interpret each element of the metadata. Standardized metadata allow automation of data analysis, transfer, and aggregation.
From page 96...
... In the statistical community, efforts to manage and use metadata to describe data, datasets, and the methodology used to create them began in the 1970s. In the 1980s the efforts expanded to include research data libraries, data archives, and national statistical offices; and they expanded even more widely in the 1990s as the online digital revolution exploded.
From page 97...
... Metadata stored in formal databases as numbers, codes, or entries from controlled vocabularies are designed to be machine readable; they are active when used to control the execution of some system in a particular way. In this report, statistical metadata (hereafter, metadata)
From page 98...
... If a user constructs a new dataset based on integrating data produced at several statistical offices, then the new dataset needs its own metadata. For scientific research data, metadata are particularly important in facilitating the machine readability of a dataset (the automatic use of a dataset by software)
From page 99...
... • Datatype (in the case of marital status, nominal datatype), and
• Universe (say, adults in the United States)
From page 100...
... will come up again later in this report, for example in our reviews of the Data Documentation Initiative (DDI) and the Generic Statistical Information Model (GSIM)
From page 101...
... Dataset: Consumer Expenditure Public Use Microdata – Interview (https://www.bls.gov/cex/pumd/data/sas/intrvw20p1.zip; SAS). Variables: Fam_Size (numeric; positive integer); Fam_Type (category; married couple only = 1; married couple, own children only, oldest child < 6 = 2).
Figure 5-1 Example of a simple dataset description in XML.
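The values listed for Figure 5-1 can be arranged as a small XML document. The following is a minimal sketch in Python, not a reproduction of the figure's markup: the element and attribute names (`dataset`, `title`, `location`, `format`, `variable`, `datatype`, `valueDomain`, `category`, `code`) are assumptions chosen for illustration and do not come from any particular metadata standard.

```python
import xml.etree.ElementTree as ET


def build_dataset_description() -> ET.Element:
    """Build a small XML description of a dataset (element names are hypothetical)."""
    dataset = ET.Element("dataset")
    ET.SubElement(dataset, "title").text = (
        "Consumer Expenditure Public Use Microdata – Interview"
    )
    ET.SubElement(dataset, "location").text = (
        "https://www.bls.gov/cex/pumd/data/sas/intrvw20p1.zip"
    )
    ET.SubElement(dataset, "format").text = "SAS"

    # A numeric variable with a simple value domain.
    fam_size = ET.SubElement(dataset, "variable", name="Fam_Size")
    ET.SubElement(fam_size, "datatype").text = "numeric"
    ET.SubElement(fam_size, "valueDomain").text = "positive integer"

    # A categorical variable: each category pairs a code with its meaning.
    fam_type = ET.SubElement(dataset, "variable", name="Fam_Type")
    ET.SubElement(fam_type, "datatype").text = "category"
    categories = [
        ("1", "married couple only"),
        ("2", "married couple, own children only, oldest child < 6"),
    ]
    for code, label in categories:
        ET.SubElement(fam_type, "category", code=code).text = label
    return dataset


def category_labels(dataset: ET.Element, variable_name: str) -> dict:
    """Return {code: label} for a categorical variable, or {} if it has none."""
    for variable in dataset.findall("variable"):
        if variable.get("name") == variable_name:
            return {c.get("code"): c.text for c in variable.findall("category")}
    return {}


if __name__ == "__main__":
    root = build_dataset_description()
    print(ET.tostring(root, encoding="unicode"))
    print(category_labels(root, "Fam_Type"))
```

Because the codes and their meanings are stored as structured elements rather than free text, software can look up what a code such as `1` means without a person reading a codebook, which is the point the chapter makes about machine-readable metadata.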
From page 102...
... was designed as a language for representing metadata about Web resources, or information about physical resources that can be identified on the Web. This framework is not meant to be particularly human readable, but rather is meant to work behind the scenes to facilitate search and retrieval (see Figure 5-2 for an example)
From page 103...
... Figure 5-2 A simple dataset description in RDF.
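RDF describes a resource as a set of subject-predicate-object statements. The sketch below is an illustration of that idea only and does not reproduce Figure 5-2: it assumes the dataset's download URL serves as the subject and borrows the Dublin Core `title` and `format` terms as predicates. It writes the statements in N-Triples syntax using plain string handling instead of an RDF library.

```python
# Namespace for Dublin Core terms, used here as an assumed vocabulary.
DCT = "http://purl.org/dc/terms/"


def iri(value: str) -> str:
    """Wrap an identifier in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(value: str) -> str:
    """Quote a string literal, escaping backslashes and double quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_ntriples(triples) -> str:
    """Serialize (subject, predicate, object) tuples, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


# The resource being described (the subject of every statement below).
dataset = iri("https://www.bls.gov/cex/pumd/data/sas/intrvw20p1.zip")

triples = [
    (
        dataset,
        iri(DCT + "title"),
        literal("Consumer Expenditure Public Use Microdata – Interview"),
    ),
    (dataset, iri(DCT + "format"), literal("SAS")),
]

if __name__ == "__main__":
    print(to_ntriples(triples))
```

Each line of output is a complete statement that stands on its own, which suits indexing and merging by search and retrieval software better than reading by people.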
From page 104...
... Statistical Agencies, in Plain Language," 10 July 2020. The SCOPE Metadata team is an informal, longstanding interagency group of U.S.
From page 105...
... These four goals are each subdivided into three or four principles (15 in total) that bear on data and metadata management and dissemination.
From page 106...
... federal statistical agencies, collaborative projects are very likely the right way to proceed. In addition to all of the above, achieving success requires that management and technical staff be supportive.
From page 107...
... RISKS AND BENEFITS
Given the costs and time that must be devoted to training and developing tools to adopt and make use of any of the six metadata standards described in detail later in this chapter, there is understandable hesitancy about building the capabilities of statistical metadata management. Some agencies view making informed use of metadata standards as "a bridge too far." However, the costs are not excessive, and the benefits will extend long into the future.
From page 108...
... The Federal Data Strategy uses the phrase "Define once; use many times." Metadata are reused when they help describe many resources. For instance, a variable used in every dataset produced by a monthly survey only needs to be described one time.
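The reuse principle can be sketched as a shared registry that datasets point to by identifier. This is an illustrative sketch only: the class names, the registry layout, the survey titles, and the wording of the definition are all hypothetical and are not taken from the report or from any metadata standard.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class VariableDescription:
    """Metadata for one variable, written once and shared by reference."""
    name: str
    datatype: str
    definition: str


@dataclass
class Dataset:
    """A dataset that points at shared variable descriptions by identifier."""
    title: str
    variable_ids: list = field(default_factory=list)


# Hypothetical registry: each variable is described exactly once.
registry = {
    "Fam_Size": VariableDescription(
        name="Fam_Size",
        datatype="numeric",
        definition="Number of members in the family (illustrative wording)",
    ),
}

# Three monthly releases of a hypothetical survey reuse the same description.
monthly_datasets = [
    Dataset(title=f"Monthly survey 2020-{month:02d}", variable_ids=["Fam_Size"])
    for month in range(1, 4)
]


def describe(dataset: Dataset, registry: dict) -> list:
    """Resolve a dataset's variable identifiers to their shared descriptions."""
    return [registry[variable_id] for variable_id in dataset.variable_ids]


if __name__ == "__main__":
    for dataset in monthly_datasets:
        for variable in describe(dataset, registry):
            print(dataset.title, "->", variable.name, variable.datatype)
```

Every release resolves to the same description object, so a correction to the definition is made in one place and applies to all datasets that reference it.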
From page 109...
... This problem repeats often, and metadata systems are built rarely. Metadata management functions are not in the original plans for many systems and statistical programs, so retrofitting metadata into a system built later does not happen easily.
From page 110...
... Recommendation 5.1: The Interagency Council on Statistical Policy should develop and implement a multi-agency pilot project to explore and evaluate employing existing metadata standards and tools to accomplish data sharing, data access, and data reuse.
From page 111...
... Such a pilot project should serve as a foundation for additional follow-up projects. These projects should lead to statistical agencies developing policies on the use of metadata standards and tools in documenting methods and retaining their input data and official estimates for future use.
From page 112...
... org is identified as a dataset in Google data search. • The open-source .Stat platform is an SDMX-based and CSPA-inspired system for statistics data and metadata storage and dissemination, available from the Statistical Information System Collaboration Community (SIS-CC)
From page 113...
... Interview. Sampled households for the Diary survey are given a form to record small expenses for each of 2 consecutive weeks, and those in the Interview survey answer questions about large or recurring expenses for each of the previous 3 months for four consecutive quarters.
From page 114...
... The DDI Codebook and Dublin Core metadata provide rich, structured metadata that cover all dimensions of a data collection operation: objectives, concepts, methods, scope and coverage, universe, sampling methods, processing methods (editing and imputation methods), quality and relationships to other datasets, and a detailed data dictionary.
From page 115...
... At the U.S. Census Bureau
A group of analysts at the Census Bureau are using metadata standards for data discovery.
From page 116...
... federal statistical agencies can achieve some uniformity and interoperability among their data and metadata systems in terms of both data stores and services. When left to their own devices, the agencies build systems in their own ways.
From page 117...
... federal statistical system finds itself fighting in the effort to overcome the differences the individual approach has fostered.
Participation in Standards Development
Standards are specifications designed to solve the problems described above and, in general, to satisfy the business requirements of stakeholders (those organizations with a material interest in the outcome of the standards development process)
From page 118...
... The U.S. federal statistical agencies have many requirements in common, and these requirements are mostly shared with other statistical offices around the world as well.
From page 119...
... Standards are built using precise language called provisions.
From page 120...
... Generic Statistical Information Model and the DDI-Lifecycle standard, do specify the attributes necessary to record metadata about frequency. When a system has statements claiming conformance to some standards around its use, the receiver of data from that system already knows what to expect.
From page 121...
... Figure 5-3 Conforming to standards -- efficiencies gained.
From page 122...
... These six are the Generic Statistical Business Process Model. Generic Statistical Information Model.
From page 123...
... This generic conceptual framework is designed to support modernizing, streamlining, and aligning the work of statistical offices, such as the principal U.S. federal statistical agencies, and is one of the building blocks for modernizing official statistics.
From page 124...
... Common Statistical Data Architecture
The Common Statistical Data Architecture (CSDA), developed under the UNECE, is a reference architecture and set of guidelines for managing statistics data and metadata throughout the statistical life cycle.
From page 125...
... , and in any case, the same principles apply. CSDA stresses that statistical information should be treated as an asset.
From page 126...
... federal statistical agencies are ongoing. More statistical offices in the United States and around the world are adopting DDI3 for their metadata needs, including the Bureau of Labor Statistics, Statistics Canada, and the Australian Bureau of Statistics.
From page 127...
... The current revision work being done on the standard will improve its capability for the exchange of statistical information. SDMX consists of three main elements: • Technical standards (including the Information Model)
From page 128...
... federal statistical system increases the need for metadata, generally, and standards in particular. Metadata constitute the information needed to understand statistical data, designs, and processing.
From page 129...
... prioritize and emphasize the importance and benefits of federal statistical agency staff engaging in international metadata standards and tool development, and (2) organize a discussion among statistical agencies that leads to an effective, coordinated, and accountable approach for staff in agencies that produce federal statistics to contribute to international metadata standards and tool development.
From page 130...
... TRANSPARENCY IN STATISTICAL INFORMATION
Individual agency staff should attend and participate in such continuing education programs with the goals of gaining professional familiarity with metadata standards and tools and improving the transparency of statistical methods, operations, and analysis in their data programs.


This material may be derived from roughly machine-read images, and so is provided only to facilitate research.