Charting a Path in a
Shifting Technical and
Geopolitical Landscape
Post-Exascale Computing for the
National Nuclear Security Administration
Committee on Post-Exascale Computing for
the National Nuclear Security Administration
Computer Science and Telecommunications Board
Division on Engineering and Physical Sciences
Consensus Study Report
NATIONAL ACADEMIES PRESS 500 Fifth Street, NW Washington, DC 20001
This activity was supported by the National Nuclear Security Administration under award number DE-EP0000026/89233121FNA400371. Any opinions, findings, conclusions, or recommendations expressed in this publication do not necessarily reflect the views of any organization or agency that provided support for the project.
International Standard Book Number-13: 978-0-309-70108-2
International Standard Book Number-10: 0-309-70108-2
Digital Object Identifier: https://doi.org/10.17226/26916
This publication is available from the National Academies Press, 500 Fifth Street, NW, Keck 360, Washington, DC 20001; (800) 624-6242 or (202) 334-3313; http://www.nap.edu.
Copyright 2023 by the National Academy of Sciences. National Academies of Sciences, Engineering, and Medicine and National Academies Press and the graphical logos for each are all trademarks of the National Academy of Sciences. All rights reserved.
Printed in the United States of America.
Suggested citation: National Academies of Sciences, Engineering, and Medicine. 2023. Charting a Path in a Shifting Technical and Geopolitical Landscape: Post-Exascale Computing for the National Nuclear Security Administration. Washington, DC: The National Academies Press. https://doi.org/10.17226/26916.
The National Academy of Sciences was established in 1863 by an Act of Congress, signed by President Lincoln, as a private, nongovernmental institution to advise the nation on issues related to science and technology. Members are elected by their peers for outstanding contributions to research. Dr. Marcia McNutt is president.
The National Academy of Engineering was established in 1964 under the charter of the National Academy of Sciences to bring the practices of engineering to advising the nation. Members are elected by their peers for extraordinary contributions to engineering. Dr. John L. Anderson is president.
The National Academy of Medicine (formerly the Institute of Medicine) was established in 1970 under the charter of the National Academy of Sciences to advise the nation on medical and health issues. Members are elected by their peers for distinguished contributions to medicine and health. Dr. Victor J. Dzau is president.
The three Academies work together as the National Academies of Sciences, Engineering, and Medicine to provide independent, objective analysis and advice to the nation and conduct other activities to solve complex problems and inform public policy decisions. The National Academies also encourage education and research, recognize outstanding contributions to knowledge, and increase public understanding in matters of science, engineering, and medicine.
Learn more about the National Academies of Sciences, Engineering, and Medicine at www.nationalacademies.org.
Consensus Study Reports published by the National Academies of Sciences, Engineering, and Medicine document the evidence-based consensus on the study’s statement of task by an authoring committee of experts. Reports typically include findings, conclusions, and recommendations based on information gathered by the committee and the committee’s deliberations. Each report has been subjected to a rigorous and independent peer-review process and it represents the position of the National Academies on the statement of task.
Proceedings published by the National Academies of Sciences, Engineering, and Medicine chronicle the presentations and discussions at a workshop, symposium, or other event convened by the National Academies. The statements and opinions contained in proceedings are those of the participants and are not endorsed by other participants, the planning committee, or the National Academies.
Rapid Expert Consultations published by the National Academies of Sciences, Engineering, and Medicine are authored by subject-matter experts on narrowly focused topics that can be supported by a body of evidence. The discussions contained in rapid expert consultations are considered those of the authors and do not contain policy recommendations. Rapid expert consultations are reviewed by the institution before release.
For information about other products and activities of the National Academies, please visit www.nationalacademies.org/about/whatwedo.
COMMITTEE ON POST-EXASCALE COMPUTING FOR THE NATIONAL NUCLEAR SECURITY ADMINISTRATION
KATHERINE A. YELICK (NAE), University of California, Berkeley, Chair
JOHN B. BELL (NAS), Lawrence Berkeley National Laboratory
WILLIAM W. CARLSON, IDA Center for Computing Sciences
FREDERIC T. CHONG, University of Chicago; Infleqtion
DONA L. CRAWFORD, Lawrence Livermore National Laboratory (retired)
MARK E. DEAN (NAE), University of Tennessee, Knoxville
JACK J. DONGARRA (NAE), University of Tennessee, Knoxville
IAN T. FOSTER, University of Chicago; Argonne National Laboratory
CHARLES F. McMILLAN, Los Alamos National Laboratory (retired)
DANIEL I. MEIRON, California Institute of Technology
DANIEL A. REED, The University of Utah
KAREN E. WILLCOX (NAE), The University of Texas at Austin
THƠ H. NGUYỄN, Senior Program Officer, Study Director
JON K. EISENBERG, Senior Board Director
GABRIELLE M. RISICA, Program Officer
SHENAE A. BRADLEY, Administrative Assistant
NOTE: See Appendix D, Disclosure of Unavoidable Conflicts of Interest.
COMPUTER SCIENCE AND TELECOMMUNICATIONS BOARD
LAURA HAAS (NAE), University of Massachusetts Amherst, Chair
DAVID CULLER (NAE), University of California, Berkeley
ERIC HORVITZ (NAE), Microsoft Research
CHARLES ISBELL, Georgia Institute of Technology
ELIZABETH MYNATT, Georgia Institute of Technology
CRAIG PARTRIDGE, Colorado State University
DANIELA RUS (NAE), Massachusetts Institute of Technology
MARGO SELTZER (NAE), University of British Columbia
NAMBIRAJAN SESHADRI (NAE), University of California, San Diego
MOSHE Y. VARDI (NAS/NAE), Rice University
JON K. EISENBERG, Senior Board Director
SHENAE A. BRADLEY, Administrative Assistant
RENEE HAWKINS, Finance Business Partner
THƠ H. NGUYỄN, Senior Program Officer
GABRIELLE M. RISICA, Program Officer
BRENDAN ROACH, Program Officer
NNEKA UDEAGBALA, Associate Program Officer
This Consensus Study Report was reviewed in draft form by individuals chosen for their diverse perspectives and technical expertise. The purpose of this independent review is to provide candid and critical comments that will assist the National Academies of Sciences, Engineering, and Medicine in making each published report as sound as possible and to ensure that it meets the institutional standards for quality, objectivity, evidence, and responsiveness to the study charge. The review comments and draft manuscript remain confidential to protect the integrity of the deliberative process.
We thank the following individuals for their review of this report:
RANDAL BRYANT (NAE), Carnegie Mellon University
DAVID CULLER (NAE), Google
WILLIAM DALLY (NAE), NVIDIA Corporation
ALAN EDELMAN, Massachusetts Institute of Technology
DENNIS GANNON, University of Indiana
MARK HOROWITZ (NAE), Stanford University
DANIEL KATZ, University of Illinois at Urbana-Champaign
CHERRY MURRAY (NAS/NAE), University of Arizona
JIM RATHKOPF, Los Alamos National Laboratory; Lawrence Livermore National Laboratory
ROBERT ROSNER, University of Chicago
VALERIE TAYLOR, Argonne National Laboratory
Although the reviewers listed above provided many constructive comments and suggestions, they were not asked to endorse the conclusions or recommendations of this report, nor did they see the final draft before its release. The review of this report was overseen by WILLIAM GROPP (NAE), University of Illinois at Urbana-Champaign, and SUSAN GRAHAM (NAE), University of California, Berkeley. They were responsible for making certain that an independent examination of this report was carried out in accordance with the standards of the National Academies and that all review comments were carefully considered. Responsibility for the final content rests entirely with the authoring committee and the National Academies.
The National Nuclear Security Administration (NNSA) relies on advanced computing capabilities for modeling and simulation to deliver its stockpile stewardship mission. Underpinning NNSA’s computing capabilities are leading-edge, high-performance computing (HPC) technologies, and a world-class scientific computing workforce. As the nuclear stockpile ages and evolves, the mission to accurately model and simulate weapons’ behavior becomes significantly more complex. Concurrently, the computing technology and market landscape is rapidly shifting on all fronts, further challenging NNSA’s ability to develop and deploy the kind of leadership computing capabilities needed to ensure the success of its mission. The committee believes that realizing post-exascale computing will require an integrated program that extends beyond hardware to include algorithms, software, and new operational models, as well as workforce development, among other considerations. This view informs the committee’s approach to this study. As this report describes, meeting these challenges will require a roadmap, sustained support and investment from the policy community, and visionary leadership at both NNSA and its three national laboratories to ensure that the computing platforms and talent are in place to meet NNSA requirements in a post-exascale era.1
1 This report uses “post-exascale era” as the 20-year period starting with the installation of the first DOE exascale system in 2022 and “post-exascale systems” as the leading-edge HPC systems that will follow the current exascale procurements. The committee has chosen not to describe these future systems as zettascale systems, because the focus is not on a particular floating-point rate but on time-to-solution for problems of the scale and accuracy needed for the future challenges associated with stockpile certification.
As mandated by Congress in the 2021 National Defense Authorization Act, a committee was established by the National Academies of Sciences, Engineering, and Medicine to review “the future of computing beyond exascale computing to meet national security needs at the National Nuclear Security Administration.” In the context of the NNSA mission needs, the committee was asked to evaluate future technology trajectories as well as the U.S. industrial base required to meet those needs. (See Appendix A for the complete statement of task.)
The committee engaged with leading high-performance computing (HPC) developers, international HPC operators (especially for nuclear energy and technology), and the NNSA/Advanced Simulation and Computing (ASC) program itself. The committee received briefings and reference material, both classified and unclassified; reviewed the results of related government working groups and studies; and deliberated extensively to assess the current state of the NNSA/ASC mission, future mission scope and needs, current HPC capabilities and technology directions, and scientific computing workforce needs. In response to review feedback on this report, the committee also asked for written inputs from NNSA on computing demands, computational patterns, and the role of artificial intelligence. A classified annex was deemed unnecessary, as it would not provide any additional information that would affect the report’s findings and recommendations.
Examples of future mission drivers include aging of nuclear materials, high-resolution simulations of subcritical experiments, and understanding hypersonic flows in reentry environments, all with the ever-present need for assessment of margins and uncertainties. These will require advances not just in hardware capability but in mathematical modeling, algorithms, and software that go well beyond simple scaling of problem size or resolution. These models must be implementable in classified software that will run on these future machines and will depend on a wide range of software from operating systems to libraries developed by others.
High-performance computing supporting the nuclear deterrent has faced challenges in the past, most notably 30 years ago, when the United States entered a moratorium on full-scale underground nuclear testing. The challenges that NNSA and its laboratories face today are different from, but as daunting as, those at the beginning of the stockpile stewardship era. Thirty years ago, the technical path forward for computing was clear, and NNSA was able to leverage the growth path of the computing industry. Neither of these conditions holds today. Furthermore, the historical limit on timely solutions to relevant problems, floating-point operations per second, is rarely the limiting factor today. As the laboratories articulated, memory access constraints often limit overall application performance. Thus, the committee advocates metrics such as solving in days the hero calculations that today require a year of machine time. These types of metrics are far more relevant than the “scale” of the machine. Realization of these types of goals would dramatically increase the effectiveness of the scientific and engineering staff of the laboratories in supporting the deterrent.
The committee believes that bold and transformative actions will be required for NNSA to continue to succeed in its evolving mission. These actions include the following:
- Development of an aggressive computing roadmap with relevant metrics on compelling application requirements;
- New organizational models that allow talent to focus on both short-term and long-term problems;
- Bold, sustained investments in research and engineering of hardware, software, and algorithms;
- Innovative partnership models with both traditional and nontraditional partners for acquisition and deployment;
- Organizational models to support focus and creativity; and
- Expanded government-wide collaborations.
Embracing these approaches will require key organizational leadership that is able to create vision, strategy, and advocacy to meet the post-exascale challenges. Simply put, NNSA needs to fundamentally rethink its advanced computing research, engineering, acquisition, deployment, and partnership strategy. As this report details, a simple extension of the strategy developed over the past 30 years will be insufficient for mission success; it must be reimagined to ensure success.