Currently Skimming:

Session 1: Introduction to Big Data
Pages 11-22

The Chapter Skim interface presents what we've algorithmically identified as the most significant single chunk of text within every page in the chapter.


From page 11...
... FRONTIERS IN MASSIVE DATA ANALYSIS AND THEIR IMPLEMENTATION Daniel Crichton, Director, Center for Data Science and Technology, Jet Propulsion Laboratory Mr.
From page 12...
... In between, data triage is conducted; if the data are too massive, data reduction steps may be necessary to reduce their overall size. However, any data reduction requires inferences to be developed about the data, adding uncertainty to the data.
From page 13...
... 3. How can we use advanced data science methods to systematically derive scientific inferences from massive, distributed science measurements and models?
From page 14...
... These constitute a massive data set that has been brought together but that was not originally designed to be integrated. The data science infrastructure, such as the data, algorithms, and machines, informs data analytics.
From page 15...
... [Figure labels: Data Capture; Data Analysis] FIGURE 1 The elements of systematic analysis of massive data for NASA. SOURCE: Dan Crichton, Jet Propulsion Laboratory, presentation to the committee on February 5, 2014, Slide 17.
From page 16...
... IBM AND BIG DATA Jed Pitera, Manager, Computational Chemistry and Materials Science, IBM Research-Almaden Dr. Pitera explained that he manages IBM's research team in computational chemistry and materials science at IBM Research-Almaden.
From page 17...
... The discovery phase focuses on conducting computational experiments, mining the literature (the published literature, as well as unpublished laboratory documentation), and finding new materials or repurposing existing ones.
From page 18...
... BIG DATA FOR BIOSECURITY Dave Shepherd, Program Manager, Homeland Security Advanced Research Projects Agency, Department of Homeland Security Mr. Shepherd began by noting that he works in biology programs with homeland security applications and that his portfolio does not include materials or manufacturing.
From page 19...
... One example where excellent surrogate data for biosecurity exist, Mr. Shepherd noted, is the data on the spread of antimicrobial resistance.
From page 20...
... Mr. Shepherd envisions a new model in which clinical data holders participate in a national, distributed, interconnected grid similar to the collaborative model that underlies Lawrence Livermore National Laboratory's Earth Systems Grid Federation for the climate modeling community.
From page 21...
... One participant mentioned the Air Force's digital twin program, in which the digital representation of a material keeps information about the material properties; perhaps it would be valuable to include an actual sample to examine along with the digital twin. However, other workshop participants also pointed out that critical policy issues would need to be addressed, such as the amount of material to retain, access criteria, and other issues.
From page 22...
... Swink asked what steps the community should take now to move forward in data management. One participant suggested looking to the NSF program EarthCube as a model for how to work across different communities to develop ontologies and names. The materials community may suffer from the lack of conversation about ontologies.


This material may be derived from roughly machine-read images, and so is provided only to facilitate research.