Currently Skimming:

4. Improvements Attainable in Performance Evaluation
Pages 31-46

The Chapter Skim interface presents what we've algorithmically identified as the most significant single chunk of text within every page in the chapter.


From page 31...
... This chapter describes methods for characterizing applications and architectures and points toward an emerging and promising approach for accomplishing their pairing. It is hoped that this description of what might be attainable will encourage the development of supercomputer performance evaluation as a science.
From page 32...
... Comparative measurements should be made on a constant set of applications and across a spectrum of systems. As pointed out in Chapter 2, the measurement process should cycle through the construction of abstract theoretical models of applications and architectures, the design of experiments to measure specific parameters relative to those models, the development of metrics and their use in a fully understood environment, and the consequent recreation of basic models.
From page 33...
... The first distinction is made on the basis of the number of "programs" being executed at once, where a program is assumed to need a single instruction location register in some control unit for its execution. The second distinction is based on the ability of the control unit to sequence one or more operation types at once, where an operation type corresponds roughly to a single operation code.
From page 34...
... Some of the steps may involve tradeoffs between mathematical properties, like local accuracy or rate of convergence, and more algorithmic aspects, like computational complexity. Having just said this, we must nevertheless admit that massively parallel architectures are having an impact on the development of mathematical models.
From page 36...
... Whereas a typical method of solution involves finite difference or finite element techniques on fixed or iteratively defined meshes, this implementation assigns discrete values to some of the independent variables by the discrete ordinates method and then solves the resulting system of uncoupled ordinary linear differential equations.
From page 37...
... In particular, how significant to the application are classes of instructions such as floating point operations, integer arithmetic, memory references, jumps, or address computations?
o For vector architectures, how much vectorization is being exploited, how much of the vector power can be used concurrently, what are the average vector lengths, and what are the significant strides?
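The questions above about instruction classes, vector lengths, and strides can be made concrete with a small sketch. It assumes that instruction counts per class and a list of (length, stride) records for vector operations have already been collected by some measurement tool. The function names and the input layout are hypothetical and are not taken from the chapter.

```python
from collections import Counter


def instruction_mix(counts):
    """Return the fraction of executed instructions in each class.

    `counts` maps a class name (e.g. "flop", "memory") to a raw count.
    The class names are illustrative assumptions.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no instructions counted")
    return {name: n / total for name, n in counts.items()}


def vector_profile(vector_ops):
    """Summarize vector operations given as (length, stride) pairs.

    Returns the average vector length and the strides ordered from
    most to least frequent.
    """
    ops = list(vector_ops)
    if not ops:
        raise ValueError("no vector operations recorded")
    average_length = sum(length for length, _ in ops) / len(ops)
    stride_counts = Counter(stride for _, stride in ops)
    return average_length, stride_counts.most_common()


if __name__ == "__main__":
    mix = instruction_mix({"flop": 60, "memory": 30, "jump": 10})
    avg_len, strides = vector_profile([(64, 1), (32, 1), (64, 8)])
    print(mix)
    print(avg_len, strides)
```

A profile of this kind only answers the "how much" questions. Whether a given stride is significant for performance still depends on the memory system of the target architecture.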
From page 38...
... FIVE STAGES IN PERFORMANCE EVALUATION
The following approach attempts to synthesize the considerations described above, outlining several stages in the process of performance evaluation. The goals are both to determine performance capabilities of existing systems and to predict performance of future systems on specified programs, representative of larger classes of applications.
From page 39...
... Modifications could be high-level language changes to enable execution, syntax changes to permit a compiler to recognize vector and parallel sections of code, algorithmic changes to enhance the suitability of a specific architecture, or total reconsideration of the underlying mathematical model. Clearly, each of these considerations has an associated cost.
From page 40...
...
o V = degree of vectorization (natural and obtainable by a vectorizing compiler), average vector lengths, strides
o P = degree and type of parallelism, granularity, balance
o M = memory references, number relative to floating point operations, access patterns, likelihood of occurring in M1 to M4 as defined above
o I/O = I/O requirements (if they exist beyond the capacity of M4).
From page 41...
... An initial attempt at mapping applications to architectures will be made at this point, and with it an assessment of the performance of specific systems given a predefined applications set. The results will be in the form of a set of ordered pairs of performance information; specifically, each ordered pair will provide the net processing rate of a particular implementation of an application on an architecture, executed within a defined environment.
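One possible way to record such results is sketched below. It assumes that each measurement stores an operation count and an elapsed time, and that the net processing rate is their quotient. The class, the field names, and the pairing of an (application, architecture) key with a rate are illustrative assumptions, not the report's own definition of the ordered pairs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    """One run of an application implementation on one architecture."""
    application: str   # name of the application implementation (hypothetical)
    architecture: str  # name of the system measured (hypothetical)
    environment: str   # compiler, OS, load conditions, and so on
    operations: float  # operations performed by the run
    seconds: float     # elapsed time of the run

    @property
    def net_rate(self):
        """Net processing rate in operations per second."""
        if self.seconds <= 0:
            raise ValueError("elapsed time must be positive")
        return self.operations / self.seconds


def as_ordered_pairs(measurements):
    """Return ((application, architecture), net rate) for each run."""
    return [((m.application, m.architecture), m.net_rate)
            for m in measurements]


if __name__ == "__main__":
    runs = [Measurement("app-A", "arch-X", "env-1", 2.0e9, 4.0)]
    print(as_ordered_pairs(runs))
```

Keeping the environment in the record matters because the same implementation on the same architecture can yield different rates under a different compiler or system load.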
From page 42...
... , and extensions of Amdahl's Law (Amdahl, 1967) will direct us toward expectations of total performance based on the relative weights assigned to the various processing modes for a given workload.
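As an illustration of the idea, the sketch below uses one common generalization of Amdahl's Law: if a workload spends a fraction of its work in each processing mode and each mode runs at its own rate, the overall rate is the weighted harmonic mean of the mode rates. The excerpt is truncated, so the chapter's exact formulation may differ. The function names and the example weights are assumptions.

```python
def effective_rate(fractions, rates):
    """Overall rate for work split across modes with different rates.

    `fractions[i]` is the share of total work done in mode i (shares
    must sum to 1) and `rates[i]` is the processing rate in that mode.
    """
    if len(fractions) != len(rates):
        raise ValueError("fractions and rates must have equal length")
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    if any(r <= 0 for r in rates):
        raise ValueError("rates must be positive")
    # Time per unit of work is the sum of the per-mode times.
    return 1.0 / sum(f / r for f, r in zip(fractions, rates))


def amdahl_speedup(enhanced_fraction, factor):
    """Classic two-mode Amdahl's Law: speedup when `enhanced_fraction`
    of the work runs `factor` times faster and the rest is unchanged."""
    return 1.0 / ((1.0 - enhanced_fraction) + enhanced_fraction / factor)


if __name__ == "__main__":
    # Half the work at rate 1, half at rate 10 (illustrative weights).
    print(effective_rate([0.5, 0.5], [1.0, 10.0]))
    print(amdahl_speedup(0.9, 10.0))
```

The two-mode function is the special case of the weighted form with a baseline rate of 1. In both, the slowest mode dominates unless its share of the work is very small.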
From page 43...
... Requirements for fulfilling the goals, such as tools for making and analyzing measurements, can have an immediate impact on current performance evaluation work. Certainly, architecture designers, application developers, and system engineers will all profit from the establishment of a body of information, collected in a scientific manner, describing the analysis and measurement of a variety of applications on a spectrum of existing computer systems.
From page 44...
... In Supercomputers: Design and Applications. Los Alamitos, California: Computer Society, Institute of Electrical and Electronics Engineers.
From page 45...
... 1985. SNEX: Semianalytic solution of the one-dimensional discrete ordinates transport equation with diamond differenced angular fluxes.


This material may be derived from roughly machine-read images, and so is provided only to facilitate research.