
2 What Is Computer Performance?
Pages 53-79

The Chapter Skim interface presents what we've algorithmically identified as the most significant single chunk of text within every page in the chapter.


From page 53...
... In addition, a growing gap between processor performance and memory bandwidth, thermal-power challenges and increasingly expensive energy use, threats to the historical rate of increase in transistor density, and a broad new class of computing applications pose a wide-ranging set of new challenges to the computer industry. Meanwhile, societal expectations for increased technology performance continue apace and show no signs of slowing, which underscores the need for ways to sustain exponentially increasing performance in multiple dimensions.
From page 54...
... As a result, software investment on this model has accumulated over the years and has led to the de facto standardization of one instruction set, the Intel x86 architecture, and to the dominance of one desktop operating system, Microsoft Windows. The committee believes that the slowing in the exponential growth in computing performance, while posing great risk, may also create a tremendous opportunity for innovation in diverse hardware and software infrastructures that excel as measured by other characteristics, such as low power consumption and delivery of throughput cycles.
From page 55...
... For those systems, making each transaction run as fast as possible is not the best design goal. It is better, for example, to use a larger number of lower-speed processors to maximize throughput while minimizing power consumption.
From page 56...
... See Box 2.1 for a discussion of embedded computing performance as distinct from that of more traditional desktop systems. In general, power considerations are likely to lead to a large variety of specialized processors.
From page 57...
... · Increasing computer performance enhances human productivity.
· One classic measure of single-processor performance combines operating frequency, instruction count, and instructions per cycle: execution time is the instruction count divided by the product of operating frequency and instructions per cycle.
From page 58...
... · Instruction-level parallelism has been extensively mined, but there is now broad interest in data-level parallelism (for example, due to graphics processing units) and thread-level parallelism (for example, due to chip multiprocessors)
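To make the distinction in that bullet concrete, here is a minimal C sketch (an assumed illustration, not code from the report) of a loop whose iterations are fully independent; that independence across data elements is what SIMD units and GPUs exploit as data-level parallelism.

/* Illustrative sketch (not from the report): data-level parallelism.
 * Every iteration applies the same multiply-add to a different element
 * and no iteration depends on another, so a compiler can vectorize the
 * loop for SIMD hardware, or the elements can be processed in parallel
 * on a GPU. */
#include <stddef.h>

void saxpy(size_t n, float a, const float *x, float *y)
{
    for (size_t i = 0; i < n; i++) {
        y[i] = a * x[i] + y[i];
    }
}

Thread-level parallelism, by contrast, runs separate instruction streams on the separate cores of a chip multiprocessor; a sketch in that style appears with the discussion of parallel-programming methods below.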
From page 59...
... CPU performance is the driver that forces the many other system components and features that contribute to overall performance to keep up and avoid becoming bottlenecks.

PERFORMANCE AS MEASURED BY RAW COMPUTATION

The classic formulation for raw computation in a single CPU core identifies operating frequency, instruction count, and instructions per cycle.

[Footnote 2: Consider the fact that the term "computer system" today encompasses everything from small handheld devices to Netbooks to corporate data centers to massive server farms that offer cloud computing to the masses.]
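Written out in conventional notation (the symbols below are the standard textbook ones, not quoted from the report), that formulation is

\[
\text{execution time} \;=\; \frac{\text{instruction count}}{\text{IPC} \times f}
\;=\; \text{instruction count} \times \text{CPI} \times \frac{1}{f},
\]

where \(f\) is the operating frequency, IPC is the average number of instructions completed per cycle, and CPI \(= 1/\text{IPC}\). The raw computation rate in instructions per second is therefore \(f \times \text{IPC}\): raising the frequency, completing more instructions per cycle, or reducing the instruction count a task requires all shorten execution time.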
From page 60...
... The last 15 years have seen dramatic increases in the operating frequency of CPU cores. As an unfortunate side effect of that growth, the maximum operating frequency has often been used as a proxy for performance by much of the popular press and industry marketing campaigns.
From page 61...
... Some performance assessments focus on the peak capabilities of the machines; for example, the peak performance of the IBM Power 7 is six instructions per cycle, and that of the Intel Pentium, four. In reality, those and other sophisticated CPU cores actually sustain an average of slightly more than one instruction per cycle when executing many programs.
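The gap between peak and sustained figures is easy to quantify with the formulation above; the 3-GHz clock and the one-billion-instruction program in the following comparison are assumed purely for illustration:

\[
t_{\text{peak}} \;=\; \frac{10^{9}}{6 \times 3\times10^{9}\ \text{Hz}} \;\approx\; 0.06\ \text{s},
\qquad
t_{\text{sustained}} \;=\; \frac{10^{9}}{1 \times 3\times10^{9}\ \text{Hz}} \;\approx\; 0.33\ \text{s}.
\]

A core marketed on its six-instructions-per-cycle peak thus delivers roughly one-sixth of that rate on typical programs, which is why sustained rather than peak figures are the better predictor of delivered performance.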
From page 62...
... However, the techniques also highlight the importance of the full suite of hardware components in modern computer systems, the communication that must occur among them, and the software technologies that help to automate application development in order to take advantage of the parallelism opportunities provided by the hardware.

COMPUTATION AND COMMUNICATION'S EFFECTS ON PERFORMANCE

The raw computational capability of CPU cores is an important component of system-level performance, but it is by no means the only one.
From page 63...
... In fact, there are several methods of taking advantage of the potential of parallelism offered by additional CPU cores, each with distinct advantages and associated challenges.

[Footnote 6: Nonvolatile storage does not require power to retain its information.]
From page 64...
... · The second method takes advantage of the additional CPU cores to improve the turnaround time of a particular program more dramatically by running different parts of the program in parallel. This method requires programmers to use parallel-programming constructs; historically, this task has proved fairly difficult even for the most advanced programmers.
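As a minimal sketch of what such parallel-programming constructs involve (this example is assumed for illustration and is not from the report; it presumes a POSIX system, a fixed count of four threads, and a trivially divisible array-sum workload), the C program below runs different parts of one computation on different cores:

/* Illustrative sketch (not from the report): thread-level parallelism with
 * POSIX threads. Each thread sums one slice of a shared array; the main
 * thread then combines the partial sums. Compile with: cc -pthread */
#include <pthread.h>
#include <stdio.h>

#define N        1000000
#define NTHREADS 4

static double data[N];

struct slice {
    size_t begin, end;   /* half-open range [begin, end) handled by one thread */
    double partial;      /* partial sum produced by that thread */
};

static void *sum_slice(void *arg)
{
    struct slice *s = arg;
    s->partial = 0.0;
    for (size_t i = s->begin; i < s->end; i++)
        s->partial += data[i];
    return NULL;
}

int main(void)
{
    pthread_t tid[NTHREADS];
    struct slice slices[NTHREADS];
    size_t chunk = N / NTHREADS;

    for (size_t i = 0; i < N; i++)
        data[i] = 1.0;                       /* trivial test data */

    for (int t = 0; t < NTHREADS; t++) {     /* partition the work across threads */
        slices[t].begin = t * chunk;
        slices[t].end   = (t == NTHREADS - 1) ? N : (t + 1) * chunk;
        pthread_create(&tid[t], NULL, sum_slice, &slices[t]);
    }

    double total = 0.0;
    for (int t = 0; t < NTHREADS; t++) {     /* wait for each thread, combine results */
        pthread_join(tid[t], NULL);
        total += slices[t].partial;
    }
    printf("sum = %f\n", total);
    return 0;
}

Even in this toy example the programmer must partition the work, create and join the threads, and combine the partial results by hand, which hints at why the report describes the task as historically difficult for realistic programs.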
From page 65...
... As a result, the development and optimization of such programs are quite different from those of the others mentioned above. In addition to the methods described above, computer scientists are actively researching new ways to exploit multiple CPU cores, multiple computer systems, and parallelism for future systems.
From page 66...
... This technology breakthrough inaugurated the modern computing era. In 1965, Gordon Moore observed that the transistor density on integrated circuits was doubling with each new technology generation, and he projected that this would continue into the future. (See Appendix C ...
From page 67...
... By the late 1980s, the power-consumption characteristics of the BJT-based computer systems hit a breaking point; around the same time, the early use of FET-based integrated circuits had demonstrated both power and cost advantages over the BJT-based technologies. Although the underlying transistors were not as fast, their characteristics enabled far greater integration potential and much lower power consumption.
From page 68...
... ASSESSING PERFORMANCE WITH BENCHMARKS

As discussed earlier in this chapter, another big challenge in understanding computer-system performance is choosing the right hardware and software metrics and measurements. As this committee has already discussed, the peak-performance potential of a machine is not a particularly good metric in that the inevitable overheads associated with the use of other system-level resources and communication can diminish delivered performance substantially.
From page 69...
... The underlying performance of the memory system can be even more important than the raw computational capability of the CPU cores involved. That can be seen as an example of throughput as performance (see Box 2.5).
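A short C sketch can illustrate why the memory system can dominate; this STREAM-triad-style loop is an assumed example rather than one taken from the report. Each iteration performs only two floating-point operations but touches three large arrays, so when the arrays are much larger than the caches, the loop's running time on most machines is set by sustainable memory bandwidth rather than by the core's peak instruction throughput.

/* Illustrative sketch (not from the report): a bandwidth-bound kernel.
 * Two flops per iteration versus three memory accesses per iteration
 * means memory bandwidth, not raw computational capability, limits the
 * achieved performance once the arrays overflow the caches. */
#include <stddef.h>

void triad(size_t n, double scalar,
           const double *b, const double *c, double *a)
{
    for (size_t i = 0; i < n; i++) {
        a[i] = b[i] + scalar * c[i];
    }
}

The throughput such a loop actually achieves is an instance of the "throughput as performance" view mentioned above.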
From page 70...
... Such a benchmark would be bound by the speed ...

THE INTERPLAY OF SOFTWARE AND PERFORMANCE

Although the amazing raw performance gains of the microprocessor over the last 20 years have garnered most of the attention, the overall performance and utility of computer systems are strong functions of both hardware and software. In fact, as computer systems have deployed more hardware, they have depended more and more on software technologies to harness their computational capability.
From page 71...
... Those high-level programming constructs make it easier for programmers to develop correct, complex programs quickly. Abstraction tends to buy increased programmer productivity at the cost of reduced software performance, but past increases in single-processor performance essentially hid much of that cost.
From page 72...
... Higher computer clock rates generally yield faster time-to-solution, but, just as there are physical limits on how fast a jackhammer's chisel can be driven downward and then retracted for the next blow, there are several immutable physical constraints on the upper limit of those clocks, and the attainable speedups are not always proportional to the clock-rate improvement. How much a computer system can accomplish per clock cycle varies widely from system to system and even from workload to workload in a given system.
From page 73...
... Although GPUs are just as constrained by the exponentially rising power dissipation of modern silicon as are general-purpose processors, GPUs are one to two orders of magnitude more energy-efficient for suitable workloads and can therefore accomplish much more processing within a similar power budget. Applying multiple jackhammers to the pavement has a direct analogue in the computer industry that has recently become the primary development avenue for hardware vendors: "multicore." The computer industry's pattern has been for hardware makers to leverage a new silicon process technology to make a software-compatible chip that is substantially faster than any previous chip.
From page 74...
... The reason that industry is ill prepared is that an enormous amount of existing software does not use thread-level or data-level parallelism; such parallelism was not needed for performance improvements, because users could simply buy new hardware to obtain them. However, only programs that have these types of parallelism will see improved performance in the chip-multiprocessor era.
From page 75...
... Although expert programmers in such application domains as graphics, information retrieval, and databases have successfully exploited those types of parallelism and attained performance improvements with increasing numbers of processors, those applications are the exception rather than the rule. The main obstacle is writing software that expresses the kind of parallelism that chip-multiprocessor hardware can exploit, because doing so requires new software-engineering processes and tools.
From page 76...
... The baggage is there, but the magic of Moore's law is that so many additional transistors are made available in each new generation that there have always been enough to reimplement the baggage and to incorporate enough innovation to stay competitive. Over time, such non-x86-compatible but worthy competitors as DEC's Alpha, SGI's MIPS, Sun's SPARC, and the Motorola/IBM PowerPC architectures either have found a niche in market segments, such as cell phones or other embedded products, or have disappeared.
From page 77...
... The rapidly growing software base for portable applications running on ARM processors has made the compatible series of processors licensed by ARM the dominant processors for embedded and portable applications. As seen in the dominance of the System/360 architecture for mainframe computers, x86 for personal computers and networked servers, and the ARM architecture for portable appliances, there will be an opportunity for a new architecture or architectures as the industry moves to multicore, parallel computing systems.
From page 78...
... In summary, the sustained viability of the computer-systems industry is heavily influenced by an underlying virtuous cycle that connects continuing customer perception of value, financial investments, and new products getting to market quickly. Although one of the primary indicators of value has traditionally been the ever-increasing performance of each individual compute node, the next round of technology improve ...
From page 79...
... As a result, many computer systems under development are betting on the ability to exploit multiple processors and alternative forms of parallelism in place of the traditional increases in the performance of individual computing nodes. To make good on that bet, there need to be substantial breakthroughs in the software-engineering processes that enable the new types of computer systems.

