7 Improving Data Collection and Dissemination
Pages 35-44


From page 35...
... The agency's science and technology innovation (STI) indicators program faces several challenges: Traditional surveys face increasing expense, declining response rates, and lengthy time lags between when data are gathered and when derived indicators and other statistics can be published.
From page 36...
... During the panel's workshop, Alicia Robb (of the Kauffman Foundation) encouraged NCSES to explore the use of administrative records to produce STI indicators, but she also cautioned that ownership issues associated with use of those data will have to be addressed before they could become a reliable complementary data source to traditional survey data.
From page 37...
... Web Scraping
In addition to improving survey methods and using administrative records databases directly, another potential avenue for acquiring data is web scraping, that is, collecting data publicly available on the web. This approach is distinct from web-based survey methods, which use the web to administer a survey.
From page 38...
... A fundamental question requires more examination: What kind of statistical methodology to apply to data from web scraping? There are other, related questions: What are the tradeoffs with using web-based data sources instead of survey data?
From page 39...
... For example, the national unemployment rate, gross domestic product, and consumer price index are periodically updated without diluting the measure's importance. The Billion Prices Project at the Massachusetts Institute of Technology uses an algorithm that collects prices daily from hundreds of online retailers worldwide, creating, among other things, a daily price index for the United States.[8]
Footnote 7: LinkedIn and similar data could be quite useful for questions involving relative rather than absolute measures.
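As a rough illustration of how scraped prices might be turned into a daily index like the one described in the page 39 excerpt, the Python sketch below chains day-over-day geometric means of price ratios (a Jevons-style index) over items observed on both days. This is an assumption chosen for illustration, not the Billion Prices Project's documented method; the function names and the data layout (one dictionary of item prices per day) are hypothetical.

```python
from math import exp, log


def jevons_ratio(prev_prices, curr_prices):
    """Geometric mean of price ratios for items present on both days.

    Both arguments are hypothetical {item_id: price} dictionaries with
    strictly positive prices.
    """
    common = [item for item in prev_prices if item in curr_prices]
    if not common:
        raise ValueError("no items observed on both days")
    log_sum = sum(log(curr_prices[i] / prev_prices[i]) for i in common)
    return exp(log_sum / len(common))


def daily_index(snapshots, base=100.0):
    """Chain day-over-day ratios into an index series.

    snapshots: list of (day_label, {item_id: price}) tuples in date order.
    Returns a list of (day_label, index_level), starting at `base`.
    """
    series = []
    level = base
    prev = None
    for day, prices in snapshots:
        if prev is not None:
            # Items that appear or disappear are ignored for that day's link.
            level *= jevons_ratio(prev, prices)
        series.append((day, level))
        prev = prices
    return series
```

Chaining matched-item ratios is one simple way to cope with products entering and leaving online catalogs, which is among the statistical questions the chapter raises about web-based data; it does not address representativeness or weighting.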
From page 40...
... report, "Innovation Inducement Prizes at the National Science Foundation," and the National Science Foundation's new Innovation Corps Program could also serve as useful models, although these resources are focused more specifically on technology commercialization. If the contest is designed to address the statistical questions around the usefulness of web-based data sources, it will be necessary to supply some sample data, and this might affect negotiations with companies.
From page 41...
... NCSES will have to proceed with caution as it considers integration of frontier tools and datasets into its indicators production processes.
MULTIMODAL DATA DEVELOPMENT
One issue that needs to be explored is the feasibility of blending the use of administrative records, scientometric tools, and survey techniques to produce more accurate data on STI human capital measures and other indicators that NCSES produces, such as R&D input and performance measures.
From page 42...
... Employment dynamics, including worker mobility trends in science and engineering occupations, could be developed by linking Census Bureau, BLS, and BEA data. Existing research data centers or data enclaves could facilitate platforms for data integration and potentially data comparability with other nations that also follow similar data administration policies.[15] NCSES already has the infrastructure at the National Opinion Research Center (NORC)
From page 43...
... into practice at NCSES would represent a paradigm shift for the agency, at a critical time when they are reaping benefits from investments in revised surveys during the past four to five years. Therefore, the panel recommends that NCSES in the near term undertake pilot work to determine how its indicators program can incorporate the new techniques with traditional survey methods.
From page 44...
... analyzing the huge and growing amount of information on the Internet for similar purposes; pilot programs or experiments to produce a subset of indicators using web tools; and convening a workshop of experts on multimodal data development, to explore the new territory of developing metrics and indicators from surveys, administrative records, and scientometric sources.

