**Suggested Citation:**"6 Numerical and Algorithmic Characteristics of HECC That Will Be Required by the Selected Fields." National Research Council. 2008.

*The Potential Impact of High-End Capability Computing on Four Illustrative Fields of Science and Engineering*. Washington, DC: The National Academies Press. doi: 10.17226/12451.

6 Numerical and Algorithmic Characteristics of HECC That Will Be Required by the Selected Fields

This chapter discusses the numerical and algorithmic characteristics of HECC that will be needed in the four fields examined in Chapters 2-5. In particular, it addresses tasks (e) and (f) of the charge: "Identify the numerical and algorithmic characteristics of the high-end capability computing requirements needed to address the scientific questions and technological problems identified in [Chapters 2-5]" and "Categorize the numerical and algorithmic characteristics, specifically noting those categories that cut across disciplines." Among other things, this chapter gives some indication of the mathematics, computer science, and computing infrastructure requirements, opportunities, and difficulties associated with the opportunities identified in Chapters 2-5. It identifies the prevailing computational demands in each of the four fields and suggests how the demands are likely to grow in the near and longer term. It also characterizes the rate-limiting mathematical parts of these calculations.

In this chapter the committee discusses two stimulating tasks facing the four fields of science and engineering covered in this report: (1) managing and exploiting massive (and growing) amounts of data and (2) preparing the next generation of people who will push the frontiers of computational science and engineering. Both tasks are pervasive across many other fields of science and engineering as well.

NUMERICAL AND ALGORITHMIC CHARACTERISTICS OF HECC FOR ASTROPHYSICS

Several tasks face HECC for astrophysics in the near term. One is to raise the overall performance of astrophysics codes to the point where the current generation of HECC platforms can be effectively used. At present, only a small fraction of the algorithms and codes in use scale to 10^3 to 10^4 processors.
Some algorithms (for example, grid-based fluid dynamics) scale very well to tens of thousands of processors, while others, such as solvers for elliptic partial differential equations like Poisson's equation, require global communication that can limit scaling. The availability of systems with 10^5 or more processors will enable these much larger calculations while also making it feasible to perform more complex calculations that couple different models. Algorithms, models, and software are needed to enhance scalability, especially for those computations that require adaptive mesh refinement (AMR) or multiscale methods. The scalability of software is limited by the performance of its least-scalable component. As we incorporate more models into one code (multiscale and/or multiphysics models) and perhaps integrate data management and other functions, it is more likely that overall performance will be held back by one weak link. Therefore, there will be an increased need to work toward balanced systems with components that are relatively similar in their parallelizability and scalability.

Another problem is that the computations must deal with an enormous number of particles (10^10 to 10^11 at present) and large grids (as big as 2048^3, or about 10^10 cells, currently, with larger calculations planned for the future). Complex methods can require 10^3 to 10^4 floating point operations per cell per time step and generate hundreds of gigabytes to multiple terabytes of data in a single snapshot. Systems that deal with data on this scale will usually need to access thousands, perhaps even hundreds of thousands, of disks in parallel to keep the system's performance from being limited by input/output (I/O) rates. Thus scalable parallel I/O software will be critical in these situations in the long term.

In all of the HECC-dependent areas of astrophysics reviewed in Chapter 2, there is a strong science case to use the increase in computational capability expected over the next 10 years to increase model fidelity. This would be done mainly by including the effects of a large number of physical processes and their interactions in a single simulation. For such a program to be successful, it will be necessary to make fundamental improvements in the models, algorithms, and software.

Models

Traditionally, large-scale simulation in astrophysics has taken the form of first-principles modeling, in which the equations to be solved were relatively well characterized.
Examples include three-dimensional calculations of inviscid compressible flow, flows of collisionless matter, effects of self-gravity, and passive radiative losses in optically thin media. The next generation of models will go in the direction of much more complicated coupling between different physical processes (multiphysics modeling) and the interaction of a much greater range of temporal and spatial scales (multiscale modeling). Some of these complex models include the following:

• In cosmology models, those that incorporate the feedback effects of star formation and of supermassive black holes on galaxy formation.
• In models of the early stages of star formation, those that incorporate a self-gravitating multicomponent medium that includes both collisional and collisionless components undergoing ionization, chemical reactions, heat and mass transfer within and among the components, and radiative transfer.
• In models of supernovae, those that include the ignition and propagation of a nuclear reaction front in a turbulent, gravitationally stratified medium, on the scale of an entire star.

In these and other cases, it is not feasible to use a first-principles model that resolves all of the length scales and timescales and all of the coupled physical processes, even under the most optimistic view of the growth of computational capabilities over the next decade. Instead, it will be necessary to resolve a narrower range of length scales, timescales, and physical processes, with the remaining effects represented approximately through models that can be represented on the resolved scales. The development of such models is a complex process involving a combination of mathematical analysis and physical reasoning; they must satisfy the mathematical requirement of well-posedness in order to be computable; and they must be validated.
Large-scale, detailed simulations will play a substantial role throughout this process by simulating unresolved scales that will provide input to the development of models and be the basis for validation.
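As a rough illustration of the problem sizes quoted in this section, the following sketch converts the grid dimension and per-cell operation counts given above into totals per time step and per snapshot. The function name `grid_cost` and the choice of ten double-precision values stored per cell are illustrative assumptions, not figures from the report.

```python
def grid_cost(cells_per_side, flops_per_cell, values_per_cell=10, bytes_per_value=8):
    """Return (cells, flop per time step, bytes per snapshot) for a cubic grid.

    values_per_cell and bytes_per_value are assumed for illustration only.
    """
    cells = cells_per_side ** 3
    flops_per_step = cells * flops_per_cell
    snapshot_bytes = cells * values_per_cell * bytes_per_value
    return cells, flops_per_step, snapshot_bytes


# The 2048^3 grid and the 10^3 to 10^4 operations per cell come from the text.
for flops_per_cell in (1_000, 10_000):
    cells, flops, nbytes = grid_cost(2048, flops_per_cell)
    print(f"{cells:.2e} cells, {flops:.2e} flop per step, "
          f"{nbytes / 1e9:.0f} GB per snapshot")
```

Under these assumptions a single snapshot of a 2048^3 grid occupies several hundred gigabytes, which is consistent with the range cited above; storing more variables per cell or using a larger grid pushes it toward the terabyte scale.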

Algorithms

For the simulation of large-scale phenomena, such as cosmology, star formation, or gravity wave formation and propagation, multiresolution methods for particles (hierarchical N-body methods) or fields (AMR methods) are core capabilities on which much of the current success is built. However, multiresolution methods are mature to varying degrees, depending on the level of model complexity. AMR for magnetohydrodynamics or for Einstein's equations of general relativity, for example, is currently undergoing rapid development, while for coupled radiation and matter, or general relativistic fluid dynamics, such methods are still in their infancy. Radiation is particularly difficult owing to the need to solve time-dependent problems in six-dimensional phase space.

For supernova simulations, an additional set of difficulties exists. For instance, while for much of the simulation the star is spherically symmetric on the largest scales, it has asymmetric three-dimensional motions on the small scales. The need to preserve that large-scale symmetry requires new gridding methodologies such as moving multiblock grids, combined with local refinement. A second difficulty is that there are stiffness issues due to the reaction kinetics of thermonuclear burning or due to low-Mach-number fluid flows. New algorithms will need to be developed to integrate over the fast timescales efficiently and without loss of accuracy or robustness.

Data Analysis and Management

For data-intensive fields like astronomy and astrophysics, the potential impact of HECC is felt not just in the power it can provide for simulations but also in the capabilities it provides for managing and making sense of data, irrespective of whether the data are generated by simulations or collected via observations. The amount, complexity, and rate of generation of scientific data are all increasing exponentially.
Major Challenges 1 and 2 in Chapter 2 (the nature of dark matter and dark energy) probably will be addressed most productively through observation. But whether data are collected through observation or generated with a computer, managing and exploiting data sets on this scale is critically dependent on HECC. The specific problems stemming from massive amounts of data are discussed in Chapter 2.

Software Infrastructure

There are three drivers for the development of software infrastructure for astrophysics in the long term. The first is the expected radical change in computer hardware. Gains in aggregate performance are expected to come mainly from increasing the level of concurrency rather than from a balanced combination of increases in clock speeds of the constituent processors and increases in the number of processors. Only a small fraction of the algorithms of importance to astrophysics have been shown to scale well to 10^3 to 10^4 processors. Thus high-end systems requiring the effective use of 10^8 processors and 10^9 threads represent an enormous challenge. The second driver is the nonincremental nature of the model and algorithm changes described above. Development of optimal codes will require an aggressive and nimble exploration of the design space. This exploration will involve a moving target because it will have to be done on very high-end systems, whose architectures are evolving simultaneously. The third driver is the problematic aspects of data management.

The response to these three drivers is roughly the same: Design high-level software tools that hide the low-level details of the problem from the developer and user, without foreclosing important design options. Such tools will include new programming environments to replace MPI/OpenMP for dealing with new hardware architectures. For new algorithm development, the tools are software frameworks and libraries of high-level parallel algorithmic components (for example, discretization libraries, data holders, solvers) from which new simulation capabilities can be built. Similar collections of visualization, data-analysis, and data-management components would provide capabilities for those tasks. Some methods and prototypes have already been developed by the mathematics and computer science research communities that would form a starting point for developing these high-level software tools. They would have to be customized in collaboration with the astrophysics community.

NUMERICAL AND ALGORITHMIC CHARACTERISTICS OF HECC FOR THE ATMOSPHERIC SCIENCES

In the atmospheric sciences, the requirements, opportunities, and challenges divide roughly into near term (1-5 years) and long term (5-10 years). In the near term, the main opportunities require the use of existing simulation capabilities on 10^3 to 10^4 processors to address scientific questions of immediate interest. In the longer term, the ability to exploit capabilities that use 10^4 to 10^5 or more processors per run on a routine basis will require the development of new ideas in mathematical models, numerical algorithms, and software infrastructure. The core computations are those associated with computational fluid dynamics, though many other models and algorithms (for example, gridding schemes, statistical models of subgrid-scale processes, models of chemical kinetics, statistical sampling across ensembles, and data-management models) are also essential.

Near-Term Requirements, Opportunities, and Challenges

Currently, simulation codes for climate and numerical weather prediction (NWP) are routinely run on 700-1,500 processors. Nevertheless, both efforts are still constrained by the insufficient aggregate computing capacity available to them.
In the case of climate, the atmospheric models are run at horizontal mesh spacing of about 100 km, which is insufficient to capture a number of key solution phenomena, such as orographically driven precipitation. To resolve such phenomena, it is necessary to increase the horizontal mesh resolution by a factor of four (25 km mesh spacing). In the case of NWP, the demand for increased capability is driven by requirements for improved prediction of severe weather and for better support of critical industries such as transportation, energy, and agriculture. In addition, new higher-resolution data streams are expected to soon be available for NWP, and the computational effort involved in data assimilation will increase accordingly.

Thus, for both climate modeling and NWP, there is a near-term requirement to increase computing capability by a factor of 10 or so. In the case of climate modeling, this increased capability could be used to increase the atmospheric grid resolution of a coupled atmosphere-ocean-sea ice climate model. Such simulation codes have already been demonstrated to scale up to as many as 7,700 processors, so no reimplementation would be required. In the case of NWP, the increase of computer power would need to be accompanied by changes in code architecture to improve the scalability to 10^4 processors and by the recalibration of model physics and overall forecast performance in response to the increased spatial resolution.

Long-Term Requirements, Opportunities, and Challenges

A number of opportunities will open up as computing at the petascale and beyond becomes available. For climate simulations, these include the prediction of global and regional precipitation patterns with an accuracy and robustness comparable to what we now have for temperature; the accurate prediction of effects that would have substantial human impacts, such as climate extremes; and decadal predictions on regional scales, which are of greatest interest currently to policy makers. In NWP, such models would enable "warn on forecast" for locally severe weather, improved prediction of hurricane intensity and landfall, and very high-resolution analysis products. In order to realize these opportunities, there will need to be major advances in models, algorithms, and software.

Models

For both climate modeling and NWP, one of the most prominent changes in the models that will enable the advances outlined above is the replacement of the current statistical models of convective and hydrological processes by much higher-fidelity explicit representations. This change is intimately tied to increased spatial resolution (horizontal mesh spacing of ~1 km), which we will discuss below. However, development of the models themselves is a substantial mathematical undertaking. For convective processes, even at 1 km resolution, there are smaller-scale motions that are not representable on the grid, for which large-eddy simulation models will need to be developed. For hydrological processes, the problem is even more complicated, with clouds having a geometrically and thermodynamically rich microphysical structure whose effects must be scaled up to the scale of cloud systems. Finally, for climate modeling, the need for high-fidelity representations of clouds in terms of their hydrology and their interaction with radiation introduces additional difficulty.

Addressing these problems will require a combination of mathematical and computational techniques to produce simulations at scales small enough to explicitly model the convective and hydrological processes of interest.
That effort must be strongly coupled to observations and experiments so that the simulations can be validated and the subgrid-scale models developed from those calculations can be constrained by experimental and observational data. This process will incorporate new ideas being developed in the multiscale mathematics community that will provide techniques for using such small-scale simulations to inform models at the larger scales.

Algorithms

One of the principal drivers of algorithm development in atmospheric models is the need for substantial increases in spatial resolution, both in climate modeling and NWP. There are two possible approaches to obtaining resolutions of a few kilometers. The first is to simply use a uniform grid at that resolution. Such a discretization of an atmospheric fluid dynamics model would strain or exceed the limits of a petascale system. A second approach is to use a multiresolution discretization, such as nested refinement, provided that the regions that require the finest resolution are a small fraction (10 percent or less) of the entire domain. In that case, the computational capability required could be reduced by an order of magnitude, which would make the goal of computing with such ultrahigh-resolution models more feasible. As shown by Figure 3-4, such multiresolution methods are already feasible in weather prediction; however, they are not yet in use in most climate models. In any case, a broad range of design issues would need to be addressed before such models could be used routinely in climate or NWP, including the choice of discretization methods, coupling between grids at different resolutions, and dependence of subgrid models on grid resolution.

In current atmospheric simulation codes, there is the additional difficulty that the time step must decrease in proportion to the spatial mesh size in order to achieve numerical stability.
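The cost argument above can be made concrete with a back-of-the-envelope sketch. It assumes an explicit scheme whose time step shrinks in proportion to the horizontal mesh spacing, a fixed number of vertical levels, and, for the nested case, a coarse region that advances with its own larger time step. The function names are hypothetical, and the results are simple scaling estimates rather than the committee's figures.

```python
def refinement_cost_factor(r, horizontal_dims=2):
    """Work multiplier when horizontal mesh spacing is reduced by a factor r.

    Cell count grows as r**horizontal_dims and the number of time steps
    grows as r (time step proportional to mesh spacing).
    """
    return r ** horizontal_dims * r


def nested_cost_fraction(fine_fraction, r, horizontal_dims=2):
    """Cost of a nested grid relative to a uniform grid at the fine spacing.

    fine_fraction is the share of the domain covered at the fine spacing;
    the remainder stays r times coarser (assumed to use its own time step).
    """
    coarse_part = (1.0 - fine_fraction) / refinement_cost_factor(r, horizontal_dims)
    return fine_fraction + coarse_part


# Halving the horizontal mesh spacing multiplies the work by 2**3.
print(refinement_cost_factor(2))
# Refining only 10 percent of the domain from 25 km to 1 km spacing.
print(nested_cost_fraction(0.10, 25))
```

With 10 percent of the domain refined, the estimate is close to one-tenth of the uniform-grid cost, in line with the order-of-magnitude reduction described above.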
This further increases the computing power needed for finer resolution and limits the potential gain from simply applying more powerful computers. This is a particularly pressing problem for climate modeling, because it calls for evolving the system for very long times. One possible solution would be to treat more of the time evolution implicitly, so that larger time steps can be taken. There have been many new ideas for designing implicit methods, such as matrix-free Newton-Krylov methods for building efficient solvers for the resulting systems of equations and deferred corrections methods for designing implicit and semi-implicit time discretizations.

A second class of algorithms that is limiting current capabilities in atmospheric science contains methods for data assimilation and management. While data assimilation techniques are a mature technology in atmospheric modeling, in climate modeling and NWP new problems are arising that call for a reconsideration of data assimilation. In climate modeling, new data on biogeochemistry and the carbon cycle are being collected for the first time. Both the nature of the data and the nature of the models will require a rethinking of how to integrate these data into the overall assimilation process for climate modeling. In the case of NWP, growth in the volume of data is the dominant change, and the required four-dimensional assimilation will require new, more efficient methods for managing, analyzing, mining, sharing, and querying data and considerably more computing power.

Software Infrastructure

The comments about software infrastructure for the long-term needs of astrophysics apply equally well to the atmospheric sciences. But the atmospheric sciences face some additional issues because some of their computational products are used operationally for NWP. For that reason, changes in algorithms will have to be evaluated at the high spatial and temporal resolutions that we expect to use in next-generation operational systems.
Higher-resolution validation, a common requirement in computational science and engineering, demands even more computing capability than operational predictions. Another difference between computing for the atmospheric sciences and for astrophysics is that data management decisions for the atmospheric sciences must take into account the needs of a much broader range of stakeholders than just the scientific community. Finally, any new methods and prototype software developed in the atmospheric sciences must be made robust enough to produce forecasts reliably on a fixed production schedule.

NUMERICAL AND ALGORITHMIC CHARACTERISTICS OF HECC FOR EVOLUTIONARY BIOLOGY

Computations of importance for evolutionary biology rely heavily on statistics and methods from discrete mathematics, such as tree searching, data mining, and pattern matching. The computational methods for searching and comparing genomes are increasingly important. A particular challenge is generating and validating phylogenetic trees as they grow to include thousands of species.

Algorithms for all such computations tend to scale poorly. However, that has not been a significant limitation to date because computational evolutionary biology is still young enough that important scientific results can be obtained by the study of modest systems. Such systems can be studied in many cases with desktop computing, and that is the prevailing scale of computing today in evolutionary biology, although there are exceptions. Working with one's own desktop system has clear advantages when a field is still exploring a diversity of models, algorithms, and software, because an investigator can customize and adjust the computing environment in many ways.

However, this era is ending. As the community gains confidence in particular models, as algorithms are improved, and as data are assembled, it is inevitable that researchers will strive to study larger and larger systems.
Adding more species and making more use of genomic data will quickly drive computational evolutionary biology into the realm of high-end computing. Because of scalability problems with many algorithms and the massive amounts of genomic data to be exploited, evolutionary biology will soon be limited by the insufficiency of HECC.

In the longer term, once the foundations for simulation as a mode of inquiry become well established, the numerical, algorithmic, and other related infrastructure requirements of evolutionary biology will be very similar to those of astrophysics and the atmospheric sciences. A particularly exciting longer-term impact of the use of HECC for evolutionary biology is that the dynamic interplay of ecological genetics, evolutionary genetics, and population genetics will be studied, whereas today ecological theory and evolutionary theory have most often been studied in isolation, as noted in Chapter 4. As noted above, evolutionary biologists are beginning to add population details. Access to high-end computation will enable increased coupling despite the different timescales of ecological and evolutionary processes. Applying HECC will allow making the models consistent with life processes, which couple ecological and evolutionary dynamics. Thus the impact will begin near term but extend into the indefinite future as this very important question, the evolutionary dynamics of the phenotype-environmental interface, is addressed.

Another powerful capability resulting from HECC will be the introduction of climate models into evolutionary trees, which will connect environmental and ecological modeling with evolutionary modeling. This would provide the basis for, among other topics, causal models that link Earth's physical history (including climate change) with its biotic evolution, necessitating extraordinary spatiotemporal scales. Models that link Earth's geosphere and its biosphere will inevitably have a large number of parameters.
Already, scientists across many disciplines couple climate and environmental models as well as ecosystem distribution (from satellite data) to simulate the course of Earth's environment and predict changes due to global change. Species distribution can be modeled as a function of environment or environmental models, which can lead to inferences about the environmental experiences of common ancestors down a phylogenetic tree. Thus these classes of environmental and evolutionary models can be linked and extended over historical time to look at the interplay of the physical world (geology, climate) and the biological world (species, populations, communities within the range of natural environments). The powerful advances beginning today will mature, along with related efforts to integrate and analyze disparate data sets. This will lead to very large requirements at this interface that will also push whatever state-of-the-art computing resources and infrastructure are available in the 5- to 15-year time frame.

Besides the near-term opportunities for high-end computing to provide insight into speciation, the continued access to advanced resources will open up new options in the longer term for refined analysis of the origin of individual species and the speciation process. The option of including demographic history will become viable. Sequence data yield multiple phylogenies, and complex population histories are often consistent with a number of different sequence trees, resulting in considerable uncertainty. Nor can analytical or closed-form solutions be realized. Approximations will continue to be used in the short term, including Markov chain Monte Carlo sampling. Increased computing power and the continued analysis of speciation will enable the statistical evaluation of the universe of potential trees, and then population models will need to be evaluated to see the best fit to the trees.
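To give a sense of the size of the "universe of potential trees," the sketch below counts the distinct unrooted, fully bifurcating tree topologies for a given number of species using the double-factorial formula (2n-5)!!. The formula is a standard combinatorial result rather than something stated in this chapter, and the function name is illustrative.

```python
def num_unrooted_trees(n_taxa):
    """Number of unrooted binary tree topologies on n_taxa labeled leaves.

    Uses the double factorial (2n - 5)!! = 1 * 3 * 5 * ... * (2n - 5).
    """
    if n_taxa < 3:
        raise ValueError("need at least three taxa for an unrooted binary tree")
    count = 1
    for odd in range(3, 2 * n_taxa - 4, 2):  # odd factors 3, 5, ..., 2n - 5
        count *= odd
    return count


for n in (4, 5, 10, 50):
    print(n, num_unrooted_trees(n))

# Python integers are arbitrary precision, so very large cases can be sized too.
print(len(str(num_unrooted_trees(1000))), "decimal digits for 1,000 taxa")
```

The count grows faster than exponentially with the number of species, which is why exhaustive evaluation gives way to sampling approaches such as Markov chain Monte Carlo as trees grow toward thousands of species.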
In sum, as theory advances and allows the exploration of alternatives, continued progress on the speciation problem will require access to whatever state-of-the-art capability platform and associated computing infrastructure become available.

At the heart of what biologists term twenty-first century biology is the characterization of the truly complex processes of multicellular organisms and their development. Today, this work is just beginning: An experimental basis, with some theoretical foundation, is being established. As always, the actual characterization must proceed from an evolutionary biology perspective. Understanding the evolution of cell processes and of development (how intracellular and intercellular (physiological) networks and cellular- to organismic-level developmental processes have evolved) will require implementing new algorithms as well as continuing the current, rapid experimental progress. These efforts are just beginning as biologists learn that simple extensions of engineering and physical models are wholly inadequate for capturing the complexity of any level of biology and life's processes. In the next 5-15 years, massive computational efforts will be involved in such modeling.

NUMERICAL AND ALGORITHMIC CHARACTERISTICS OF HECC FOR CHEMICAL SEPARATIONS

The key HECC requirements of chemical separations are those that will allow simulations of greater complexity to be performed more accurately so that their ability to guide experimentation can be exploited more readily. Current algorithmic formulations provide very useful physical insights, but they are not yet sufficiently accurate to provide stand-alone predictive power for phase-equilibria problems nor powerful enough to represent the complexity needed for separation process design. At the least they require a close interface with experimental validation studies. However, experimental approaches typically are limited to optimizing existing chemical separations solutions. Computational approaches, by contrast, offer low-cost explorations of radical changes that might optimize chemical separations. While current model and algorithmic capabilities allow the routine use of 10^3 to 10^4 processors per run, long-term requirements necessary for exploiting 10^5 to 10^6 or more processors will require the development of new ideas in mathematical models, numerical algorithms, software infrastructure, and education and training in the computational and mathematical sciences.
Given that some 80 percent of the chemical separations industry essentially relies on understanding phase equilibria, whether explicitly formulated as a thermal-based distillation problem or as the context for developing MSA materials, the current capabilities of computational chemistry must be significantly extended to address Major Challenges 1 and 2 in Chapter 5. Computational chemistry includes calculations at the molecular scale with algorithms based on quantum chemical theories and classical analogs that evaluate the energetics of molecular conformations, as well as statistical mechanical methods that sample those conformations consistent with thermodynamic variables such as temperature and pressure. The underlying theoretical framework of quantum mechanics is used to define a model potential energy surface for the materials system of interest, and statistical mechanics is used to formulate the sampling protocols on this surface to evaluate equilibrium properties at a series of thermodynamic state points.

Simulation methods such as molecular dynamics involve calculating averaged properties from finite-length trajectories. The underlying molecular dynamics engine is a particle-based algorithm that solves Taylor expansion approximations to Newton's equation of motion. The algorithms are well-formulated as symplectic integrators, that is, discretizations that have a Hamiltonian structure analogous to that of the dynamics being approximated, which contributes to their ability to generate stable long trajectories. Molecular dynamics simulations involve two levels of problem granularity, which makes them well-suited for parallelism. The rate-limiting step for these simulations is the evaluation of empirical energy and forces for N particles. The most common forms of those energies and forces map onto a fine-grained parallelization (using spatial or force decomposition) that scales reasonably well (approximately 55 percent of linear) up to 128 processors.
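A minimal sketch of a symplectic integrator of the kind described above is the velocity Verlet scheme, shown here for a single particle in a one-dimensional harmonic potential. The choice of velocity Verlet, the harmonic force, and all parameter values are assumptions made for illustration; production molecular dynamics codes evaluate empirical forces over N interacting particles, which is the rate-limiting step.

```python
def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Advance position x and velocity v by n_steps of velocity Verlet."""
    f = force(x)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # one force evaluation per step
        v = v_half + 0.5 * dt * f / mass   # complete the velocity update
    return x, v


def harmonic_force(x, k=1.0):
    """Restoring force of an illustrative harmonic spring with stiffness k."""
    return -k * x


def total_energy(x, v, mass=1.0, k=1.0):
    """Kinetic plus potential energy for the harmonic test problem."""
    return 0.5 * mass * v * v + 0.5 * k * x * x


x0, v0 = 1.0, 0.0
x, v = velocity_verlet(x0, v0, harmonic_force, mass=1.0, dt=0.05, n_steps=10_000)
print("initial energy:", total_energy(x0, v0))
print("final energy:  ", total_energy(x, v))
```

In this test problem the energy error stays bounded over many oscillation periods instead of drifting, which illustrates why such discretizations can generate stable long trajectories.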
More sophisticated models include multibody polarization, or multipoles, which scale less well and will require reconsideration of problem granularity. Overlaid on this fine-grained parallelization is another layer of (trivial) coarse-grained parallelization involving a statistical mechanical sampling algorithm that runs N independent simulations, which may involve infrequent communication to swap state point information (6N real numbers). Both MPI and OpenMP are already effectively used to efficiently exploit distributed and shared memory architectures. Calculations involving 10^5 particles are currently feasible with the use of existing simulation capabilities on 10^3-10^4 processors. In addition, improved sampling methods that accelerate the convergence to equilibrium, characterize dynamical properties involving long timescales, and sample rare events have been advanced by impressive new mathematical models such as transition path theory formulations and the string method and its variants. The numerical algorithms are well understood; coupled with longer trajectory runs on larger systems, such algorithms will easily be deployed on massively parallel architectures involving 10^5-10^7 processors.

Currently, quantum electronic structure algorithms are used to develop empirical force fields by calculating conformational energies and geometries of small material fragments that are ultimately transferred to describe components of larger materials. For some materials, the physics of electron correlation is well-described by a mean field theory plus a second-order perturbation of electron correlations (MP2). MP2 is formulated as an algorithm that solves dense linear algebra equations dominated by matrix multiplications. Significant algorithmic improvements over the last 5 years allow for MP2 calculations of 40-50 atom systems on a single processor; recent modestly scaling parallel algorithms allow for MP2 calculations on larger chemical systems of up to a few hundred atoms. Computational hardware improvements would allow for the MP2 calculation of much larger fragments and model compounds, which would reduce the error in transforming the quantum mechanical data into parameters for empirical, classical potential energy surfaces of the chemical separations materials of interest. But for other materials, MP2 is inadequate or fails altogether.
Attaining the desired quantum model accuracy in those cases requires either using more sophisticated electron correlation schemes for wave-function methods or developing the promise of density functional theory (DFT). Higher-order correlation schemes for wave-function methods are well-formulated mathematical models, but they are severely limited by poor algorithmic scaling. By contrast, DFT methods scale much better but are currently limited by weaknesses in their theoretical formulation. Future capabilities in computing and algorithms would allow the use of the gold standard, coupled cluster algorithms. Because these are not yet parallelized, applying them to fragments with more than 10 atoms is not yet feasible in the absence of symmetry.

Major Challenge 3 concerns the design of overall separation systems. It is an instance of a mathematical optimization problem that aims to extremize a relevant cost function. For example, methods have been developed that use mathematical "superstructure optimization" to find optimal configurations and operating conditions for separations based on inputs about mass, heat, and momentum transfer. However, superstructure optimization is limited by the difficulty of formulating the full parameter space for the initial superstructure and the large size of the optimization problem necessary to determine the best solution within that space. The first of these limitations requires expert input to help set up the parameter space of the problem. The second can be addressed with new optimization approaches known as generalized disjunctive programming that can deal effectively with the discontinuous and nonconvex nature of the resulting mixed-integer nonlinear program. But such mathematical optimization problems quickly push the limits of HECC as more complexity is incorporated into the model.
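To make the size of such mixed discrete-continuous problems concrete, the following is a toy sketch, not generalized disjunctive programming and not a real process model. It assumes a hypothetical set of candidate units, each with a fixed cost and a quadratic operating cost, that must jointly process a given demand. Every on/off configuration is enumerated, and for each one the continuous load split is obtained in closed form from the Lagrange conditions. All names and numbers are invented for illustration.

```python
from itertools import combinations


def best_configuration(fixed_costs, op_coeffs, demand):
    """Brute-force search over which units are switched on.

    For an active set S, the continuous subproblem
        minimize sum(op_coeffs[i] * load_i**2)  subject to  sum(load_i) == demand
    has the closed-form optimum load_i = demand / (op_coeffs[i] * inv_sum),
    where inv_sum = sum(1 / op_coeffs[i] for i in S).
    """
    n = len(fixed_costs)
    best = None
    n_evaluated = 0
    for size in range(1, n + 1):
        for subset in combinations(range(n), size):
            inv_sum = sum(1.0 / op_coeffs[i] for i in subset)
            cost = sum(fixed_costs[i] for i in subset) + demand ** 2 / inv_sum
            n_evaluated += 1
            if best is None or cost < best[0]:
                loads = {i: demand / (op_coeffs[i] * inv_sum) for i in subset}
                best = (cost, subset, loads)
    return best, n_evaluated


# Hypothetical data for four candidate units.
fixed = [4.0, 3.0, 5.0, 2.0]   # cost of installing each unit
op = [1.0, 2.0, 0.5, 4.0]      # quadratic operating-cost coefficients
(best_cost, best_units, best_loads), n_evaluated = best_configuration(
    fixed, op, demand=6.0)
print(best_units, best_cost, n_evaluated)
```

With n candidate units there are 2^n - 1 nonempty configurations, so exhaustive enumeration of this kind becomes impractical quickly, and realistic superstructures add nonconvex continuous subproblems that have no closed-form solution.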
In summary, computational chemistry is adequate at present to address, usually only qualitatively, the most standard chemical separations processes of immediate interest to industry, although it could also serve to screen candidate processes for nonstandard phase-equilibria chemical separations. The primary benefit of future massively parallel computers would be to increase by one or two orders of magnitude the size of chemical systems that can be simulated and to lengthen sampling timescales significantly, leading to better convergence of phase data without any significant changes in algorithmic structure. Enhanced hardware capabilities will allow superstructure optimization to be applied to usefully address complex synthesis problems. That having been said, far more issues remain to be resolved (model accuracy in computational chemistry, model development for the superstructure parameter space, and workforce training in computational and mathematical sciences) to enable moving ahead on the biggest challenges facing chemical separations.

CATEGORIZATION OF NUMERICAL AND ALGORITHMIC CHARACTERISTICS OF HECC NEEDED IN THE FOUR SELECTED FIELDS

Models

A common thread emerging from this study of four fields is the need to develop models whose detailed mathematical structure is still not completely specified, much less understood. In astrophysics and the atmospheric sciences, the need for new models arises from the attempt to represent complex combinations of physical processes and the effect of multiple scales in a setting where many of the constituent processes (fluid dynamics, radiation, particle dynamics) have well-defined mathematical descriptions whose ranges of validity are well understood. In the field of evolutionary biology, comprehensive frameworks must be developed for mathematical modeling at all scales, from populations and ecosystems down to the cellular level, as well as connections to various levels of description and to experimental and observational data. In chemical separations, the complete parameter space for process optimization (that is, the space that contains the best solution) remains ill-defined and is developed in mixed discrete and continuous variables that do not allow for the ready use of off-the-shelf algorithms for mathematical optimization. In all four fields, the resulting models must be mathematically well-posed (in order for them to be reliably computable) and susceptible to validation (in order to obtain well-characterized ranges for the fidelity and applicability of the models).
Although the detailed requirements for models in these four fields are quite different, there are commonalities that suggest overlap in the mathematical infrastructure. One is the extent to which models will be bootstrapped from data. None of the new models will be first-principles models in the sense that some of the fundamental models in the physical sciences from the nineteenth and early twentieth centuries are. Those early models are broadly applicable and well-characterized mathematically and physically, and only a small number of parameters for them must be determined from experimental data. In contrast, the new models will probably be more domain-specific, with larger sets of parameters that will have to be determined from combinations of experimental and observational data and auxiliary computer simulations. This suggests tight coupling between model formulation and model validation, requiring optimization and sensitivity analysis. A second commonality is the extent to which the next generation of models is likely to consist of hybrids that have both deterministic and stochastic components. Establishing the well-posedness of such hybrids is much more difficult, and the theory far less complete, than for pure deterministic or pure stochastic models, particularly when they are expressed in the form required to ensure well-behaved numerical simulations.

Algorithms

Several requirements occur repeatedly among applications in the physical sciences. One is the need for handling stiffness in time-dependent problems so that the time step can be set purely by accuracy considerations rather than by having to resolve the detailed dynamics of rapidly decaying processes as they come to equilibrium. In ordinary differential equations, this problem is dealt with using implicit methods, which lead to large linear systems of equations. Such an approach, when applied naively to the high-resolution spatial discretizations that arise in astrophysics and atmospheric modeling, leads to linear systems whose direct solution is prohibitively expensive in terms of central processing unit time and memory. The alternatives are to use reduced models that eliminate the fast scales analytically or to use iterative methods based on approximate inverses ("preconditioners") constructed to be computationally tractable and to take into account the analytic understanding of the fast scales. In both cases, a successful attack on this problem involves understanding the interplay between the mathematical structure of the slow and fast scales and the design of efficient solvers.

There is also a need for continuing advances in multiresolution and adaptive discretization methods, particularly in astrophysics and atmospheric modeling. Topics that require further development include the extension to complex multiphysics applications, the need for greater geometric flexibility, the development of higher-order accurate methods, and scalability to large numbers of processors. The construction of such methods, particularly in conjunction with the approaches to stiff timescales described above, will require a deep understanding of the well-posedness of the problems being solved as initial-value or boundary-value problems in order to obtain matching conditions across spatial regions with different resolutions.

Finally, the need for high-performance particle methods arises in both astrophysics and chemical separations. The issues here are the development of accurate and efficient methods for evaluating long-range potentials that scale to large numbers of particles and processors and of stiff integration methods for large systems of particles.

Software

The two key issues that confront simulation in all four fields relate to the effective use of the increased computing capability expected over the coming decade.
The first is that much of the increased capability will be used to substantially increase the complexity of simulation codes in order to achieve the fidelity required to carry out the science investigations described in Chapters 2-5. The second issue is that much of the increase in capability is likely to come from disruptive changes in the computer architectures. Over the last 15 years, the increase in computing capability has come from increases in the number of processors, improved performance of the interconnect hardware, and improved performance of the component processors, following Moore's law, as clock speeds increased and feature sizes on chips decreased. As seen by the applications developer, these improvements were such that the computing environment remained relatively stable. The difficulty we face now is that the continuous increase in single-processor performance from increased clock speeds is about to come to an end, because the power requirements to drive such systems are prohibitive. We expect that the aggregate performance of high-end systems will continue to increase at the same rate as before. However, the mechanisms for achieving that increase that are currently under discussion (hundreds of processors on a single chip, heterogeneous processors, hardware that can be reconfigured at run time) represent radical departures from current hardware and may well require an entirely new programming model and new algorithms. The potential rate of change here is overwhelmingly greater than what we have experienced over the last 15 years. Initially, HECC multiprocessor systems had ~100 processors, and current HECC systems have ~10^4 processors. By comparison, over the next decade we could see as many as 10^8 processors and 10^9 threads in order to see the same rate of increase in capability.
It is not known how to effectively manage that degree of parallelism in scientific applications, but it is known that many of the methods used now do not scale to that level of concurrency.

The increased complexity of simulation codes that will be brought about by these two issues not only poses an economic problem but also could be a barrier to progress, greatly delaying the development of new capabilities. The development of simulation and modeling software, particularly for nonlinear problems, is a combination of mathematical algorithm design, application-specific reasoning, and numerical experimentation. These new sources of software complexity and uncertainty about the programming model threaten to stretch out the design-development-test cycle that is central to the process of developing new simulation capabilities.

Both issues suggest that it may be necessary to rethink the way HECC scientific applications are developed. Currently, there are two approaches to HECC code development. One is end-to-end development by a single developer or a small team of developers. Codes developed in this fashion are sometimes made freely available to the scientific community, but continuing support of the codes is not formally funded. The second approach is community codes, in which software representing complete simulation capabilities is used by a large scientific community. Such codes require a substantial investment for development, followed by funding for ongoing enhancement and support of the software. The single-developer approach is used to various extents by all four fields, with community codes used mainly in the atmospheric sciences and chemical separations. There are limits to both approaches relative to the issues described above. The single-developer approach is resource-limited: Building a state-of-the-art simulation code from scratch requires more effort and continuity than can be expected from the typical academic team of graduate students and postdoctoral researchers. The limitation of the community code approach as it is currently practiced is that the core capabilities of such codes are fixed for long periods of time, with a user able to make only limited modifications in some of the submodels. Such an approach may not be sufficiently nimble to allow for the experimentation in models and algorithms required to solve the most challenging problems.
An intermediate approach would be to develop libraries that implement a common set of core algorithms for a given science domain, out of which simulation capabilities for a variety of models could be assembled. If such a sufficiently capable collection of core algorithms could be identified, it would greatly expand the range of models and algorithmic approaches that could be explored for a given level of effort. It would also insulate the science application developer from changes in the programming model and the machine architecture, since the interfaces to the library software would remain fixed, with the library developers responsible for responding to changes in the programming environment. Such an approach has been successful in the LAPACK/ScaLAPACK family of libraries for dense linear algebra. Essential to the success of such an endeavor would be developing a consensus within a science domain on the appropriate core algorithms and the interfaces to those algorithms. The discussion above suggests that all four fields represented in this study would be amenable to such a consensus. Two of them already make use of community codes, and the other two have indicated the need for standard tool sets.

More generally, the prospect of radical changes in all aspects of the HECC enterprise (hardware, programming models, algorithms, and applications) makes it essential that the computational science community, including both applications developers and library developers, work closely with the computer scientists who are designing the hardware and software for HECC systems.

CROSSCUTTING CHALLENGES FROM MASSIVE AMOUNTS OF DATA

Of the four fields examined in this study, three (astrophysics, atmospheric science, and evolutionary biology) can be characterized as very data intensive. The nature of their data, the types of processing, and the interaction between models and simulations vary across these three fields, with evolutionary biology being the most distinct.
The data intensity can be characterized using various dimensions:

• Size,
• Scaling,
• Complexity,
• Types of data,
• Processing requirements,
• Types of algorithms and models to discover knowledge from the data,
• Archiving information,
• Sharing patterns among scientists,
• Data movement requirements, and
• Impact of discovery and results in driving experiments, simulations, and control of instruments.

Data sets in all three of these fields, whether produced by simulations or gathered via instruments or experiments, are approaching the petascale and are likely to increase following an analog of Moore's law. First, new data from atmospheric and environmental observations are added every day as satellite and other sensor-based data are gathered at ever-increasing resolutions. This data resource is more than just a static archive: Continuing progress in scientific understanding and technological capability makes it possible to upgrade the quality and applicability of the stored data using analysis and simulations, which further increases the demands on storage and analysis. Second, as more instruments are used with higher resolution, the amount of observed data is continuously increasing at an exponential rate. Third, data produced by simulations are increasing as finer resolutions are used and larger simulations are performed.

The most challenging data-related aspects in these fields are the following:

• Discovery of knowledge from data in timely ways,
• Sharing and querying of data among scientists,
• Statistical analysis and mining of data, and
• Storage and high-performance I/O.

Advances in these fields require the development of scalable tools and algorithms that can handle all of these tasks for petascale data.
To improve the productivity of scientists and engineers, as well as that of systems, data management techniques need to be developed that facilitate the asking of questions against data and derived data, whether produced by simulations or observations, in such a way that simulations do not need to be repeated. Technologies that create these data repositories and provide scalable analytics and query tools on top of them are necessary, as is the development of ontologies and common definitions across these fields.

One of the most challenging needs for astrophysics and atmospheric science is the ability to share observational data and simulation data along with derived data sets and information. The atmospheric sciences are further along in providing common formats and sharing of data, but a tremendous amount of work remains to be done. One could think of this entire process as developing scientific data warehouses with analytical capabilities. Another very important aspect of data management and analysis in these domains is the development of paradigms and techniques that permit user-defined processing and algorithmic tasks no matter where the data reside, thereby avoiding or reducing the need for transporting raw tera- and petascale data. Acceleration hardware and software may both be useful here.

In evolutionary biology, the type, collection, and processing of data are somewhat different from what is encountered in astrophysics and atmospheric sciences and, in their own ways, very challenging. A large amount of data processing in evolutionary biology entails manipulating strings and complex tree structures. In the analysis of genomic data, the underlying algorithms involve complex searches, optimizations, mining, statistical analysis, data querying, and issues of data derivation. Data produced by large-throughput processes such as those involving microarrays or proteomics instruments require statistical mining and different forms of signal and image processing algorithms (mainly to determine clusters of different expressions). One complex aspect of data management and processing for evolutionary biology is the ability to continually curate data as new discoveries are made and to update databases based on discoveries. In evolutionary biology, unlike the other fields considered in this report, there is still no consensus on many of the mathematical models to be used, so that researchers often explore data in different ways, based on different assumptions, observations, and, importantly, the questions they are asking. Moreover, at its core, evolutionary science is founded on biological comparisons, and that means solutions to many computational problems, such as tree analyses among species or among individual organisms within and among populations, will scale in nonpolynomial time as samples increase, compounding the task of data analysis.

A two-step data-management approach seems likely in the near term for evolutionary biology, whereby individual investigators (or small teams) analyze comparatively small amounts of data, with their results and data being federated into massive data warehouses to be shared and analyzed by a larger community. The federated data will require HECC resources. (Some individual investigators will also work with massive amounts of data and HECC-enabled research, as in astrophysics and atmospheric science, but that will probably not be a common investigative paradigm for some time.)
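The nonpolynomial growth of tree analyses mentioned above can be made concrete with a short sketch. It relies on the standard combinatorial result that n labeled taxa admit (2n - 5)!! distinct unrooted bifurcating trees. The function name is hypothetical, and treating exhaustive enumeration as the search strategy is an assumption made only for illustration; practical tree-search methods rely on heuristics precisely because this count is so large.

```python
def num_unrooted_binary_trees(n_taxa):
    """Count unrooted bifurcating trees on n_taxa labeled tips: (2n - 5)!!."""
    if n_taxa < 3:
        raise ValueError("an unrooted bifurcating tree needs at least 3 taxa")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n - 5 (the factor 1 is implicit).
    for k in range(3, 2 * n_taxa - 4, 2):
        count *= k
    return count


for n in (5, 10, 20, 50):
    print(n, num_unrooted_binary_trees(n))
```

Each added taxon multiplies the number of candidate topologies by a factor that itself grows with the sample size, which is why larger samples compound the analysis task rather than merely extending it.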
A serious concern is that the lack of tools and infrastructure will keep us from managing and capitalizing on these massive data warehouses, thus limiting progress in evolutionary biology. In particular, real-time interaction with HECC systems is required in order for biologists to discover knowledge from massive amounts of data and then use the results to guide improvements to the models. Special-purpose configurable hardware such as field-programmable gate arrays (FPGAs) might be very useful for enhancing the performance of these tasks. Finally, because evolutionary biologists are not concentrated in a few research centers, Web-based tools that allow the use of HECC systems for data analysis would be very useful.

CROSSCUTTING CHALLENGES RELATED TO EDUCATION AND TRAINING

Computational modeling and simulation are increasingly important activities in modern science and engineering, as illustrated by our four fields. As highlighted in the President's Information Technology Advisory Committee report, "computational science, the use of advanced computing capabilities to understand and solve complex problems, has become critical to scientific leadership, economic competitiveness, and national security." As such, the readiness of the workforce to use HECC is seen as a limiting factor, and further investment in education and training programs in computational science will be needed in all core science disciplines where HECC currently plays, or will increasingly play, a significant role in meeting the major challenges identified in this report.

Areas that warrant education and training can be identified by considering some of the early successes of the four fields examined in this report. The atmospheric science community has evolved in the direction of well-supported community codes largely because agencies and institutions recognized that computing for both research and operations was increasingly converging toward shared goals and strategies.
One consequence is that the atmospheric sciences field has offered proportionally greater opportunities for training workshops, internships, and fellowships in the computational sciences than the other three fields. The chemical separations and astrophysics communities are largely self-sufficient in computational science, as illustrated by the development and maintenance of robust academic codes. This stems in part from the broad training in mathematics and physical sciences received by scientists in those fields and the strong reward system (career positions in academia, government, and industry) that allows theory and computational science to thrive within the fields. Evolutionary biology has successfully collaborated with statisticians, physicists, and computer scientists, but only since the 1990s, to address the inevitable computational issues surrounding an explosion of genomic-scale data. This has pushed evolutionary biology into quantitative directions as never before, through new training grants and genuinely interdisciplinary programs in computational biology.

[Footnote to the President's Information Technology Advisory Committee report cited above: Computational Science: Ensuring America's Competitiveness, June 2005, p. iii. Available online at http://www.nitrd.gov/pitac/reports/20050609_computational/computational.pdf.]

These early successes also show that the four fields examined here are developing at different paces toward the solution of their HECC-dependent major challenges. Evolutionary biology is increasingly moving to quantitative models that organize and query a flood of information on species, in formats ranging from flat files of DNA sequences to complex visual imagery data. To approach these problems in the future, students will need stronger foundations in discrete mathematics and statistics, familiarity with tree-search and combinatorial optimization algorithms, and better understanding of the data mining, warehousing, and visualization techniques that have developed in computer science and information technology. Evolutionary biology is currently limited by the artificial separation of the biological sciences from the quantitative preparation in mathematics and physics that is standard in the so-called "hard" sciences and in engineering disciplines. Thus, the key change that must take place in evolutionary biology is greater reliance on HECC for overcoming the major challenges.
Advances in chemical separations rely heavily on compute-bound algorithms of electronic structure theory solved by advanced linear algebra techniques, as well as on advanced sampling methods founded on statistical mechanics, to do large particle simulations of materials at relevant thermodynamic state points. Chemistry and chemical engineering departments at universities traditionally employ theoretical/computational scientists who focus broadly on materials science applications but less on chemical separations problems, which are more strongly centered in the industrial sector. Moreover, these academic departments have emphasized coursework on chemical fundamentals and analytic models, and they need to better integrate numerical approaches to solving chemically complex problems into their undergraduate and graduate curricula. The chemical separations industry should consider sponsoring workshops, internships, and master's degree programs in computational science to support R&D in this economically important field.

Astrophysics and the atmospheric sciences are the most HECC-ready of the four fields, evidenced in part by their consensus on many models and algorithms. In the case of the atmospheric sciences, the community has even evolved standardized community codes. However, the chemistry and physics knowledge in these fields continues to increase in complexity, and algorithms to incorporate all of that complexity are either unknown or limited by software deployment on advanced hardware architectures. Even though these disciplines have a strong tradition as consumers of HECC hardware resources, preparation in basic computational science practices (algorithms, software, hardware) is not specifically addressed in the graduate curriculum.

Astrophysics, chemical separations, evolutionary biology, and (to a lesser extent) the atmospheric sciences typify the academic model for the development of a large-scale and complex software project in which Ph.D.
students integrate a new physics model or algorithm into a larger existing software infrastructure. The emphasis in the science disciplinary approach is a proof-of-principle piece of software, with less emphasis on "hardening" the software or making it extensible to larger problems or to alternative computing architectures. This is a practical outcome of two factors: (1) limitations in training in computational science and engineering and (2) the finite time of a Ph.D. track. While computer time at the large supercomputing resource centers is readily available and no great effort is needed to obtain a block of time, software developed in-house often translates poorly to evolving chip designs on the massively parallel architectures of today's national resource machines. In response, a field typically develops its own computational leaders who, purely as a service to their communities, create more robust software and reliable implementations that are tailored to their needs. Those leaders will emerge only if there are appropriate reward mechanisms of career advancement, starting with the support for education and training to learn how to best advance their particular areas of computational science, especially in those science fields where computational readiness is still emerging.

Two models for education and training are being used to advance the computational capabilities of our future workforce: the expansion of computational sciences within existing core disciplines and the development of a distinct undergraduate major or graduate training as a "computational technologist." The first of these models faces the problem of expanding training and educational opportunities at the interface of the science field with computational and mathematical approaches, and to do this coherently within the time frame typical of a B.S. or Ph.D. degree. It would require integrating computational science topics into existing courses in core disciplines and creating new computational science courses that address current gaps in coverage in degree programs, which in turn call for flexibility in curricula and appropriate faculty incentives. With this approach, the rate at which standardized algorithms and improved software strategies developed by numerical analysts and computer scientists filter into the particular science is likely to be slowed.
If education and training exist in undergraduate and graduate programs in a given science and engineering discipline, this is the most common model because it leads to a well-defined career path for the computational scientist within the discipline.

The second model is to develop new academic programs in computational science and engineering that emphasize the concepts involved in starting from a physical problem in science and engineering and developing successful approximations for a physical, mathematical, analytic, discrete, or object model. The student would become robustly trained in linear algebra and partial differential equations, finite difference and finite element methods, particle methods (Monte Carlo and molecular dynamics), and other numerical areas of contemporary interest. A possible limitation is that standardized algorithms and software that are routinely available may need tailoring to suit the needs of the particular science field, which only a field expert can envision. This model is the less common since the reward system for the excellent work of a computational scientist is diluted across multiple scientific/engineering disciplines and because of inherent prejudices in some scientific fields that favor insight over implementation. The primary challenge is to define a career track for a computational generalist who can move smoothly into and out of science domains as the need arises for his or her expertise and have those contributions integrated into a departmental home that recognizes their value.

Astrophysics, atmospheric sciences, and chemical separations are most ready for the first model (direct integration of computational science into the core discipline curriculum), while evolutionary biology has received the attention of statisticians, physicists, and computer scientists to develop something closer to the second model.
Both models require expertise in large-scale simulation, such as efficient algorithms, data mining and visualization, parallel computing, and coding and hardware implementation. Ultimately both models will benefit any science field since both are drivers for the cross-disciplinary activity that is to be encouraged for the growing interdisciplinarity of science and engineering.