PART I Participants' Expectations for the Workshop
Pages 3-12



From page 3...
... PART I Participants' Expectations for the Workshop. Session Chair: Daryl Pregibon, AT&T Laboratories.
From page 5...
... I am now looking at information retrieval, with the natural language people at the University of Pennsylvania, trying to see what techniques can be pulled from things like PC8, and how well they scale when one gets much bigger data sets. I am interested in seeing what techniques people have for deciding, for example, which variables out of the space of 100,000 are relevant, and using those for applications.
From page 6...
... We have huge amounts of data on the fiber as it progresses through different stages of manufacture. One of the problems is mapping the corresponding centimeters of glass through the different stages and combining these data sets.
From page 7...
... At one time, I felt that data analysis was the big white spot in statistics; now I guess that large data sets are becoming the big white spot of data analysis. Lixin Zeng (University of Washington)
From page 8...
... I have worked extensively in shrinkage estimation, hierarchical modeling, and variable selection. These methods do work on moderate- to small-sized data sets.
From page 9...
... I started on a project in about 1979 for the Department of Energy, and so I have some experience, though not currently with the massive data sets of today. I have a long-time interest in graphics for large data sets and data analysis management, and in how to keep track of what is done with these data sets.
From page 10...
... My interest in the subject of this workshop has developed over probably 10 years of working on Navy and related remote sensing problems. Contrary to the accepted definition, I define massive data sets as data sets with more data than we can currently process, so that we are not using all of the data that is there.
From page 11...
... The health care data sets that I will be talking about this afternoon are special, in that not only are they very large or even massive, but the human input that goes into ...
From page 12...
... I did not know I was interested in massive data sets until Daryl Pregibon invited me to this conference and I started reading the position papers. Prior to this, the biggest data set I ever worked on had a paltry 40,000 cases and a trifling hundred variables per case, and so I thought the position papers were extremely interesting, and I have a lot to learn.

