The Future of Statistical Software: Proceedings of a Forum (1991)

Chapter: Afternoon Session Opening Remarks

Suggested Citation: "Afternoon Session Opening Remarks." National Research Council. 1991. The Future of Statistical Software: Proceedings of a Forum. Washington, DC: The National Academies Press. doi: 10.17226/1910.

Afternoon Session Opening Remarks

Forrest Young

University of North Carolina, Chapel Hill

There was talk at various times this morning about standardization and occasionally about certification. The Panel on Guidelines for Statistical Software is about neither of those, and it is important to emphasize that. Our business is guidelines, not issuing seals of approval.

Consider the three topics of exactness, richness, and guidance. For exactness, it is certainly possible to set standards, and even judging whether something deserves a seal of approval is a possibility, although I am not saying I think that is a good thing to do. For richness and guidance, it is hard to know how one would decide that something deserves a seal of approval; standards, let alone certification, may not be possible. In any case, the panel aims only to state guidelines.

Another theme that came up several times in the morning was layering, that there should be different layers of the software system. I tend to see this as related to this afternoon's featured topic of guidance in that there can perhaps be an outer layer of a statistical system whose purpose is to guide the relatively unsophisticated user.

In my ideal data analysis environment, such a layer would be there for the more naive user, but would not have to be there; it need not be used by a more sophisticated user. There would be several layers. Perhaps the innermost layer would be just a language. A complete system would need to have more layers put on the outside to help people who are less sophisticated in terms of the data analysis, but who are very interested in the application.
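The layering idea can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than anything presented at the forum: the function names fit_line and guided_fit_line are hypothetical, and the threshold of 10 observations is arbitrary. The inner layer only computes; the optional outer layer calls it and adds advice.

```python
from statistics import mean


def fit_line(x, y):
    """Inner layer: least-squares fit of y = a + b*x, with no checks or advice."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def guided_fit_line(x, y, min_n=10):
    """Outer layer (optional): same fit, plus notes for a less experienced user.

    min_n is an arbitrary illustrative threshold, not a statistical rule.
    """
    notes = []
    if len(x) < min_n:
        notes.append(f"Only {len(x)} observations; the fit may be unstable.")
    return fit_line(x, y), notes


coefficients, notes = guided_fit_line([0, 1, 2], [1, 3, 5])
```

A sophisticated user can call fit_line directly and never see the guidance; the outer layer adds to the inner one without changing it.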

Another theme from this morning was that of strategy, which is a central idea in guidance. Paul Tukey mentioned that one ought to have a strategy for doing regression modeling. Also, Paul Velleman presented two strategies. One is an original strategy for doing statistical analysis based on batch submission of analyses, where one first reads in the data and then specifies the strategy, afterward producing output. That is a very linear strategy without any choices in it. Later, he presented a much more involved strategy, more in tune with exploratory data analysis, where the data are read at the beginning and displayed, whereupon the user is faced with many options: dealing with outliers, diagnosing problems in the data, putting the data into subgroups, or transforming the data. That is another idea of a strategy in data analysis. Strategies are important for providing guidance.

In a paper I presented at the ASA conference in August of 1990 on that topic [Lubinsky and Young, 1990], there were a couple of slides on guidance showing my ideas along this line. Figure 3 is a mock-up of a proof-of-concept system that David Lubinsky of AT&T Bell Laboratories and I worked on. There is a window with a cyclic graph in it. As Paul Velleman pointed out this morning, it has an entry point, but no exit. This represents the process of data analysis. It is never finished. But you can exit at any point you want. There is no specified plan of things that must be done before you can quit. But when you do exit, the system would suggest a thing to do.

FIGURE 3: One possible way of guiding a data analysis. Reprinted, with permission, from Lubinsky and Young [1990]. Copyright © 1990 by American Statistical Association.

For example, the grayed-in box is suggesting that the first thing to do is to select the data. When that has been done, a sub-strategy might be given, a recursive definition of a strategy. A new strategy box opens up that focuses both on variables and observations or cases. When that is finished, that box closes.

Then the user goes to the next set of possible things that the strategy would suggest, either describing the data, transforming the data, or defining a model. As the flow indicates, if you describe the data, you still can again transform data or define the model, and conversely for transforming. But once you have a model defined, the only thing the strategies then suggest you do is to fit the model. Fitting the model itself is recursively defined. Within that one would see a more involved strategy depicting what to do.
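The strategy graph just described can be sketched as a small data structure. This is a minimal illustration under stated assumptions, not the Lubinsky and Young system: the class names Step and Strategy, the method suggest, and the step identifiers are all hypothetical. The order of the two steps inside the data-selection sub-strategy is assumed, and the successors of the model-fitting step are left empty because the remarks do not say what follows it.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Step:
    """One box in a strategy graph; sub holds an optional nested strategy."""
    name: str
    next_steps: list[str] = field(default_factory=list)
    sub: Strategy | None = None


@dataclass
class Strategy:
    """A directed, possibly cyclic graph of steps with one entry and no exit."""
    entry: str
    steps: dict[str, Step]

    def suggest(self, done: str | None = None) -> list[str]:
        """Steps suggested after `done`; the entry step if nothing is done yet."""
        if done is None:
            return [self.entry]
        return list(self.steps[done].next_steps)


# Sub-strategy opened by "select data" (the ordering here is an assumption).
select_sub = Strategy(
    entry="choose_variables",
    steps={
        "choose_variables": Step("choose_variables", ["choose_observations"]),
        "choose_observations": Step("choose_observations"),
    },
)

regression = Strategy(
    entry="select_data",
    steps={
        "select_data": Step(
            "select_data",
            ["describe_data", "transform_data", "define_model"],
            sub=select_sub,
        ),
        "describe_data": Step("describe_data", ["transform_data", "define_model"]),
        "transform_data": Step("transform_data", ["describe_data", "define_model"]),
        "define_model": Step("define_model", ["fit_model"]),
        # What follows fitting is not stated in the remarks; left empty here.
        "fit_model": Step("fit_model"),
    },
)
```

Calling regression.suggest() returns the entry step, and regression.suggest("define_model") returns only the fitting step, matching the flow described above. Because suggest only offers options and never forces a path, the user can stop at any point.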

This is one possible way of guiding a data analysis. Where does this strategy graph come from? It comes from an expert. Somewhere, an expert at multiple regression must have sat down and created this graph. In fact, this graph was created by Lubinsky and me after looking at the book by Daniel and Wood [1980], where such a strategy for doing multiple regression appears on the inside front cover. There are also analogous graphs presented for principal components in a factor analysis, for example. Such sources for guidance strategies are available, and experts can certainly be consulted for strategies to guide data analyses.

References

Daniel, C., and F.S. Wood, 1980, Fitting Equations to Data, John Wiley & Sons, New York.


Lubinsky, D.J., and F.W. Young, 1990, Guiding data analysis, Proceedings of Section on Computational Statistics, American Statistical Association, Alexandria, Va.

This book presents guidelines for the development and evaluation of statistical software designed to ensure minimum acceptable statistical functionality as well as ease of interpretation and use. It consists of the proceedings of a forum that focused on three qualities of statistical software: richness—the availability of layers of output sophistication, guidance—how the package helps a user do an analysis and do it well, and exactness—determining if the output is "correct" and when and how to warn of potential problems.