8 Standards for Metadata and Work Processes
Pages 81-98



From page 81...
... JAGADISH: AUTOMATING THE CAPTURE OF DATA TRANSFORMATION METADATA H.V. Jagadish began with a presentation on automating the capture of data transformation metadata.
From page 82...
... The statistical packages that could be converted include delimited text, SPSS, SAS, Stata, R, and Excel. An SDTL could be useful because there are many ways to implement transformations in different statistical programming languages, including how missing values are treated.
From page 83...
... DANIEL GILLMAN: DATA DOCUMENTATION INITIATIVE Dan Gillman then spoke about the DDI. His goal was to describe the efforts they are engaged in at the Bureau of Labor Statistics (BLS)
From page 84...
... The lifecycle includes the edit system, the estimation system, and the ultimate data products to the user. They want to show how similar variables change over time.
From page 85...
... They wanted to show that they could account for changes from one year to the next, and selected education and hospitalization and health insurance as the variables to follow from beginning to end to see what happens as the survey and production environment change. In order to compare variables, they built a "correspondence tree" that can show how things look across surveys, over time, and throughout the lifecycle, as well as a "code comparison," which compares the codes, the small numeric values used to represent categories in variables.
From page 86...
... What they want to do is show a process model for each of those systems and show what is happening to the data from input processing to output throughout the entire processing cascade. DAVID BARRACLOUGH: STATISTICAL DATA AND METADATA EXCHANGE David Barraclough from the OECD spoke about the Statistical Data and Metadata eXchange (SDMX)
From page 87...
... Also, the employee would like to make datasets user friendly and compatible with shared coding and related things. Barraclough continued that an international organization or a data receiver would like to save time and avoid errors when processing files from providers, since they can come in a variety of formats.
From page 88...
... Barraclough then delved into the business case for using SDMX. Using SDMX saves resources by reusing exchange systems across domains and agencies, and through reuse of statistical metadata and methodology.
From page 89...
... The global registry is a tool that lives in the cloud and was designed to host global data structure definitions such as balance of payments, national accounts, and foreign direct investment. There are also cross-domain code lists, such as seasonal adjustments.
From page 90...
... They have a checklist for SDMX design projects, which addresses questions such as how someone who wants to create a reporting framework, or simply a set of datasets, should go about it so that it works well. In this design phase, the user first maps the data flows of what he or she wants to do, then defines a concept scheme (for national accounts, for example, all of the concepts used to describe those accounts), and then defines code lists.
From page 91...
... GSBPM is a reference model that describes and defines the set of business processes to produce official statistics. It provides a standard framework and harmonized terminology to help statistical organizations modernize their statistical production processes, as well as to share methods and components.
From page 92...
... The information objects contextualize or are described using the Generic Statistical Information Model (GSIM)
From page 93...
... GSBPM can be used to describe statistical processes based on any kind of input data -- survey, administrative records, etc. GSBPM is a practical model that can be and is being applied by many National Statistical Offices.
From page 94...
... Denk added that GSBPM provides a standard framework for a flexible model using harmonized terminology, which describes and defines the set of business processes used to produce official statistics. It is used to help statistical organizations move from topical stove-pipes (product-centric)
From page 95...
... Furthermore, GSIM enables a high degree of automation of the statistical business process that supports reproducibility. Denk stated that, in addition, GSIM facilitates capacity building in statistical organizations and helps in assessing existing statistical information systems and processes.
From page 96...
... The idea is that CSPA brings all of the standards that are being developed by the UNECE and partnering organizations together, and gives a reference to statistical organizations who want to take advantage of all of these standards by giving them tools or at least services where they can communicate at a technical level. Denk pointed out that the High-Level Group for the Modernisation of Official Statistics was set up by the UNECE Conference of European Statisticians in 2010 to oversee and coordinate international work relating to statistical modernization, and they are responsible for all of these standards.
From page 97...
... Muñoz added to the earlier discussion about the United States taking part in this effort. He had heard a participant say that they are searching for a transformation language for statistical information, and he noted that there is already a team under the SDMX technical working group that is developing a validation and transformation language.
From page 98...
... Levenstein commented that there are things that have to be developed in DDI to make it more useful for administrative data, but that is one reason why having the federal statistical agencies involved in these kinds of organizations at this level would be helpful. She closed by noting that engaging the federal statistical agencies in discussions about the development of standards would lead to standards and products that are more useful to the agencies.