Opportunities and Challenges for Digital Twins in Atmospheric and Climate Sciences
Proceedings of a Workshop—in Brief
The digital twin is an emerging technology that builds on the convergence of computer science, mathematics, engineering, and the life sciences. Digital twins have the potential to revolutionize atmospheric and climate sciences in particular, as they could be used, for example, to create global-scale interactive models of Earth to predict future weather and climate conditions over longer timescales.
On February 1–2, 2023, the National Academies of Sciences, Engineering, and Medicine hosted a public, virtual workshop to discuss characterizations of digital twins within the context of atmospheric, climate, and sustainability sciences and to identify methods for their development and use. Workshop panelists presented varied definitions and taxonomies of digital twins and highlighted key challenges as well as opportunities to translate promising practices to other fields. The second in a three-part series, this evidence-gathering workshop will inform a National Academies consensus study on research gaps and future directions to advance the mathematical, statistical, and computational foundations of digital twins in applications across science, medicine, engineering, and society.1
PLENARY SESSION: DEFINITIONS OF AND VISIONS FOR THE DIGITAL TWIN
During the plenary session, workshop participants heard presentations on the challenges and opportunities for Earth system digital twins, the history of climate modeling and paths toward traceable model hierarchies, and the use of exascale systems for atmospheric digital twins.
Umberto Modigliani, European Centre for Medium-Range Weather Forecasts (ECMWF), the plenary session’s first speaker, provided an overview of the European Union’s Destination Earth (DestinE) initiative,2 which aims to create higher-resolution simulations of the Earth system based on more realistic models than in the past; better ways to combine observed and simulated information about the Earth system; and interactive and configurable access to data, models, and workflows. More realistic simulations at the global scale could translate to information at the regional scale that better supports decision-making for climate adaptation and mitigation through tight integration and interaction with impact sector models. Now in the first phase (2021–2024) of its 7- to 10-year program, DestinE is beginning to develop its first two digital twins, one focused on weather-induced extremes and one on climate change adaptation.
1 To learn more about the study and to watch videos of the workshop presentations, see https://www.nationalacademies.org/our-work/foundational-research-gaps-and-future-directions-for-digital-twins, accessed February 10, 2023.
2 The website for DestinE is https://digital-strategy.ec.europa.eu/en/policies/destination-earth, accessed March 9, 2023.
Modigliani explained that Earth system digital twins require unprecedented simulation capabilities—for example, ECMWF aspires to have a simulation on the order of 1–4 km at the global scale, which could enable the modeling of small scales of sea ice transformation and the development of more accurate forecasts. Earth system digital twins also demand increased access to observation capabilities, and efforts are under way to develop computing resources to leverage and integrate robust satellite information and other impact sector data. Furthermore, Earth system digital twins require exceptional digital technologies to address the opportunities and challenges associated with extreme-scale computing and big data. DestinE will have access to several pre-exascale systems via EuroHPC, although he pointed out that none of the available computing facilities are solely dedicated to climate change, climate extremes, or geophysical applications.
Modigliani indicated that improved data handling is also critical for the success of Earth system digital twins. The envisioned DestinE simulations could produce 1 PB of data per day, and he said that these data must be accessible to all DestinE users. While ECMWF will handle the modeling and the digital engine infrastructure for the digital twins, the European Organisation for the Exploitation of Meteorological Satellites will manage a data bridge to administer data access via a sophisticated application programming interface (for the computational work) and a data lake, which serves as a repository to store and process data that may be unstructured or structured. Policy makers could eventually access a platform operated by the European Space Agency to better understand how future events (e.g., heat waves) might affect the gross domestic product and to run adaptation scenarios. Current challenges include federating resource management across DestinE and existing infrastructures as well as collaborating across science, technology, and service programs. To be most successful, he emphasized that DestinE would benefit from international partnerships.
Venkatramani Balaji, Schmidt Futures, the plenary session’s second speaker, presented a brief overview of the history of climate modeling, starting with a one-dimensional model response to carbon dioxide doubling in 1967. This early work revealed that studying climate change requires conducting simulations over long periods of time. He explained that as the general circulation model evolved in subsequent decades, concerns arose that its columns (where processes that cannot be resolved, such as clouds, reside) place restrictions on model structures, leading to slow progress. Parameterizing clouds is difficult because of their variety and interdependence. Clouds are also sensitive to small-scale dynamics and details of microphysics; parameterizing turbulence is necessary to understand how the small scale interacts with the large scale. He asserted that no resolution exists at which all features can be resolved unless extreme scales of computational capability are reached.
Balaji next discussed how model resolution has evolved over time, noting that in one example it improved only about 10-fold over 50 years. He suggested that climate models should be capable of 100 simulations of at least 100 years each in 100 days. Uncertainty—including chaotic (i.e., internal variability), scenario (i.e., dependent on human and policy actions), and structural (i.e., imperfect understanding of a system)—is also a key consideration for model design. Furthermore, he stressed that strong scaling is not possible with today’s computers, which have become bigger but not faster.
Balaji noted that while various machine learning (ML) methods have been applied successfully for stationary problems (e.g., short forecasts), boundary conditions change over time in climate studies. One cannot use data for training into the future, because no observations of the future exist, he continued, but one could use short-running, high-resolution models to train simpler models to address the non-stationarity of climate. He observed that although digital twins were first introduced for
well-understood engineered systems such as aircraft engines, digital twins are now discussed in the context of imperfectly understood systems. Even if climate and weather diverge further as model-free methods become successful in weather forecasting, he anticipated that theory will still play an important role. Model calibration is needed at any resolution, he added, as are methods for fast sampling of uncertainty. In closing, Balaji remarked that because high-resolution models should not play a dominant role, a hierarchy of models is needed. He suggested that ML offers systematic methods for calibration and emulation in a formal mathematical way to achieve traceable model hierarchies.
Mark Taylor, Sandia National Laboratories, the plenary session’s final speaker, discussed the need for more credible cloud physics to achieve digital twins for Earth’s atmosphere. For example, a global cloud-resolving model (GCRM) aims to simulate as much as possible using first principles. Although GCRMs are the “backbone” of any digital twin system, they require exascale resources to obtain necessary throughput and confront challenges in ingesting and processing results. He pointed out that computers are becoming more powerful but not necessarily more efficient; although GCRMs are expected to run well on exascale systems, they are expensive, with 1 megawatt-hour per simulated day required.
Taylor described his experience porting the following two types of cloud-resolving models to the Department of Energy’s (DOE’s) upcoming exascale computers: (1) SCREAM (Simple Cloud-Resolving E3SM [Energy Exascale Earth System Model] Atmosphere Model), the GCRM, which runs at 3 km with a goal of running at 1 km; and (2) E3SM-Multiscale Modeling Framework (MMF), a low-resolution climate model with cloud-resolving “superparameterization.”
Taylor explained that SCREAM and E3SM-MMF are running well on graphics processing unit (GPU) systems. However, both use significant resources: GPU nodes consume 4–8 times more power than central processing unit (CPU) nodes, so they need to be 4–8 times faster to be worthwhile. Furthermore, GPU performance per watt has improved only about 5-fold relative to CPUs over the past 10 years. Until more powerful machines are developed, he anticipated that GCRMs will continue to run on GPUs, because GPUs are currently 1.5–3 times more efficient than CPUs on a per-watt basis.
Incorporating questions from workshop participants, Ruby Leung, Pacific Northwest National Laboratory, moderated a discussion among the plenary speakers. She posed a question about what distinguishes a digital twin from a simulation. Taylor replied that a digital twin is a simulation with more efficient approaches. Balaji posited that digital twins represent the modeling and simulation efforts under way for decades. Modigliani focused on the potential for digital twins to explore what-if scenarios and emphasized that communities seek the approach with the most realistic results. Balaji stressed that in addition to being used for forecasting, digital twins should be employed for experimentation to learn more about climate physics in general.
Leung asked about the statistical considerations and learning paradigms required to combine simulations and observations. Modigliani said that ECMWF uses four-dimensional variational data assimilation to study extreme weather, which leverages both the observations and the model to create an initial state for forecast simulations. Balaji advised using a model that imposes physical consistency on the observations to make sense of disparate observational streams. He added that because “model-free” methods are still trained on model output, more work remains to move directly to observational streams.
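The analysis step at the heart of such data assimilation schemes can be pictured with a toy example (a minimal sketch added for illustration, not ECMWF’s operational 4D-Var; all numbers are invented): the background forecast is corrected by the observation-minus-forecast innovation, weighted by the background and observation error covariances.

```python
import numpy as np

# Toy analysis step underlying variational data assimilation (illustrative only).
x_b = np.array([1.0, 2.0])       # background state (model forecast)
B = np.diag([0.5, 0.5])          # background error covariance
H = np.array([[1.0, 0.0]])       # observation operator: only the first component is observed
R = np.array([[0.25]])           # observation error covariance
y = np.array([1.6])              # observation

# Kalman gain K = B H^T (H B H^T + R)^{-1} weighs background against observation errors
K = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)
x_a = x_b + K @ (y - H @ x_b)    # analysis = background + weighted innovation
```

In a full 4D-Var system, corrections of this form are obtained by minimizing a cost function over an assimilation window, with the model dynamics linking the state to observations at different times.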
Leung inquired about the extent to which GPUs are used to perform physics-based atmospheric simulations. Taylor remarked that E3SM-MMF runs the full atmosphere simulation on GPUs. However, he said that modeling centers do not yet run the full coupled system on GPUs—DOE is ~3 years from achieving that goal with E3SM-MMF. Modigliani emphasized the need to develop a version of ECMWF’s model that performs better on GPUs than on CPUs; ECMWF aims to have such a version in a few years.
Leung wondered if a regional-scale model could function as a digital twin that bypasses exascale computing requirements. Modigliani noted that global-scale models are required for any weather forecasting of more than a few days into the future. Leung asked how global digital twin outputs could be interpolated to benefit local decision-making, and Modigliani replied that integrating more specific models into global-scale models guides local decisions—for example, DestinE’s extreme weather digital twin includes a global-scale model and a local-level simulation.
PANEL 1: CURRENT METHODS AND PRACTICES
During the first panel, workshop participants heard brief presentations on methods and practices for the use of digital twins.
Yuyu Zhou, Iowa State University, discussed efforts to develop a model to reduce energy use for sustainable urban planning. The current model estimates energy use at the single building level and for an entire city of buildings by integrating the building prototypes, the assessor’s parcel data, building footprint data, and building floor numbers, and it includes city-scale auto-calibration to improve the energy use estimate. The model enables the estimation of both the spatial pattern of energy use for each building and the hourly temporal pattern (e.g., the commercial area uses the least energy at midnight and the most at noon). He noted that the model could also be used to investigate the impacts of extreme events (e.g., heat waves) or human activities (e.g., building occupant behavior) on building energy use.
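The city-scale auto-calibration step can be pictured as a regression of modeled loads against metered totals. The sketch below uses hypothetical numbers and is not Zhou’s actual implementation; it fits one scale factor per building type so that the summed model output matches utility data:

```python
import numpy as np

# Hypothetical hourly model output per building type: [commercial, residential]
modeled = np.array([[120.0, 80.0],    # midnight
                    [300.0, 60.0],    # noon
                    [200.0, 70.0]])   # evening
metered_total = np.array([230.0, 420.0, 310.0])   # utility data (city-wide totals)

# Least-squares calibration factors alpha so that modeled @ alpha ≈ metered_total
alpha, *_ = np.linalg.lstsq(modeled, metered_total, rcond=None)
calibrated = modeled @ alpha          # calibrated city-scale estimate
```

A real calibration would involve many more hours, building types, and constraints, but the principle is the same: adjust model parameters until aggregate output matches measured data.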
Zhou explained that real-time data from sensors and Internet of Things7 devices could be assimilated into the building energy use model to develop a digital twin with a more comprehensive, dynamic, and interactive visual representation of a city’s buildings. Transportation data and social media data could also be integrated to enhance the representation of building occupant behavior. He asserted that a future digital twin could monitor real-time building energy use and optimize a city’s energy performance in real time.
Christiane Jablonowski, University of Michigan, observed that the modeling community has created digital representations of reality for years; however, digital twins could integrate new capabilities such as high-resolution representations of climate system processes as well as advancements in ML and artificial intelligence (AI). She provided a brief overview of the current state of modeling and possible future trajectories, beginning with a description of typical phenomena at temporal and spatial scales (Figure 1). She said that the microscale will never be captured by atmospheric models in any detail, and the current generation of weather models (1–3 km grid scale) has reached its limit at the mesoscale. Modelers are currently most comfortable in the synoptic regime, the site of daily weather. Climate system timescales introduce new uncertainties, as they are in the boundary value problem category, not the initial value problem category. She added that the complexity and resolution of climate models have advanced greatly over the past several decades and will continue to increase with hybrid approaches and ML.
Jablonowski emphasized that selecting the appropriate spatial and temporal scales for digital twins is critical to determine the following: the phenomena that can be represented in a model; the correct equation set for the fluid flow; the required physical parameterizations, conservation principles, and exchange processes (i.e., a good weather model today is not a good climate model tomorrow); the model complexity (e.g., ocean, ice, and chemistry components are often not well tuned); decisions about coupling and related timescales that are key to making trustworthy predictions; and whether AI and ML methods as well as observations could inform and speed up current models.
Jean-François Lamarque, consultant and formerly of the National Center for Atmospheric Research’s Climate and Global Dynamics Laboratory, shared the definition of a digital twin, as used in the digital twin Wikipedia entry: “a high-fidelity model of the system, which can
7 The Internet of Things is the “networking capability that allows information to be sent to and received from objects and devices using the Internet” (Merriam-Webster).
be used to emulate the actual system … the digital twin concept consists of three distinct parts: the physical object or process and its physical environment, the digital representation of the object or process, and the communication channel between the physical and virtual representations.” He emphasized that a research question should determine the tool selection; in the current generation and in the foreseeable future, no single tool exists to answer all relevant questions. Thus, he suggested evaluating the advantages and disadvantages of each tool’s approach as well as the timescales of interest. Three current tools include a coarse-resolution Earth system model, a high-resolution climate and weather model, and emulators and ML models. Key questions to evaluate the usefulness of each tool include the accuracy of the digital twin in representing the Earth system, the strength of the communication channel between the physical and digital representations, and the usefulness of the digital twin for climate research.
Gavin A. Schmidt, Goddard Institute for Space Studies of the National Aeronautics and Space Administration, explained that digital twins extend beyond the Earth system modeling that has occurred for decades. Digital twins leverage previously untapped data streams, although whether such streams exist for climate change remains to be seen. Digital twins should explore a full range of possible scenarios, he continued, but only a small number of scenarios can be run with the current technology. A more efficient means to tap into information for downstream uses (e.g., urban planning) would be beneficial, and processes to update information regularly are needed. He added that higher resolution does not necessarily lead to more accurate climate predictions. Furthermore, ML does not overcome systematic biases or reveal missing processes.
Schmidt highlighted the value of improving the skill and usability of climate projections but noted that improvements in initial value problem skill do not automatically enhance boundary value problem skill. He stressed that no “digital twin for climate” exists, but digital twin technology could be used to strengthen climate models; for example, systematic biases could be reduced via ML-driven calibration, new physics-constrained parameterizations could be leveraged, and data usability could be improved.
Incorporating questions from workshop participants, Xinyue Ye, Texas A&M University, and Colin Parris, General
Electric, moderated a discussion among the four panelists. Parris posed a question about how to ensure the quality of both data from varied sources and the output that supports decision-making. Zhou explained that the output from the building energy use model was validated using several sources (e.g., utility company data and survey data). When a digital twin of city buildings emerges in the future, he said that more data could be used to improve the model, and interactions between the physical and the virtual aspects of the digital twin could be enhanced to increase confidence in the data and the output. Schmidt emphasized that neither data nor models are perfect, and uncertainty models help evaluate how well they represent the real world. Lamarque urged researchers to evaluate multiple observations together, instead of independently, to better understand a system.
Ye inquired about strategies for working with decision-makers to define the requirements of digital twins to ensure that they will be useful. Schmidt remarked that policy makers want to know how the tools could help answer their specific questions but do not need all of the details about how they work. Lamarque mentioned that if too many details are hidden and decision-makers cannot understand the provenance of the information and the uncertainties, a lack of confidence in the results and thus in the decision-making could emerge. Jablonowski asserted that talking to stakeholders about their understanding and helping them learn to interpret uncertainty is critical. Zhou agreed that communication with and education for decision-makers is key.
Parris asked if a full range of scenarios has to be run to make projections for the next decade. Schmidt observed that although some things will not change much in the next 10 years, confidence is lacking in the accuracy of boundary value problems at that timescale. At the 20- to 30-year timescale, the range of scenarios separates, and updating with real-time changes in emissions, technology, and economics is important. He asserted that more bespoke, policy-specific scenarios (e.g., effects of an electric vehicle credit) are needed. Lamarque emphasized the need to find the interplay between the tools that are available and the questions that need to be answered.
Ye posed a question about the ability of the current technology to capture extreme behavior and reliable uncertainty analysis. Schmidt described success in capturing extremes for weather forecasting (e.g., hurricane tracks) but noted that climate change predictions are difficult to validate. Almost all extreme events are at the tail of the distribution; thus, he pointed out that the observational and conceptual challenges of assessing extremes still exist with higher-resolution digital twins.
Parris wondered how the computational challenges for sustainability science compare to those of climate science. Zhou explained that the challenges are similar; for example, many more simulations should be run when scaling from the building to the city level. Cities have thousands of buildings, each with varying conditions (e.g., evapotranspiration and microclimate) that impact energy use uniquely. To develop a more realistic digital twin for a city’s buildings, he stressed that improved computation (e.g., edge computing) is essential. Schmidt mentioned that different fields have different perspectives of a “heavy computational load”: economists might run a simulation in minutes, while climate scientists might need months or years. Computational capacity is often not used as effectively as possible, he continued, but increased access to high-performance computing could address this issue.
Ye asked about the role of computing capacity problems in the development of digital twins. Schmidt noted that a spread of models with different capabilities and different processes to estimate structural uncertainty and to sample a range of uncertainties will continue to be important. Jablonowski encouraged investments that improve access to extreme-scale computing resources and broaden community engagement in the digital twin endeavor.
PANEL 2: KEY TECHNICAL CHALLENGES AND OPPORTUNITIES
During the second panel, workshop participants heard brief presentations on challenges and opportunities for the use of digital twins.
Tapio Schneider, California Institute of Technology, provided an engineering-specific definition of a
digital twin: a digital reproduction of a system that is accurate in detail and is updated in real time with data from the system, allowing for rapid experimentation and prototyping as well as experimental verification and validation where necessary. However, he pointed out that Earth system models are very different: achieving an accurately detailed digital reproduction of the Earth system is not feasible for the purposes of climate prediction, and the use of data for climate prediction (i.e., improving the representation of uncertain processes) is fundamentally different from the use of data for weather prediction (i.e., state estimation)—so continuously updating with data from the real system is less relevant for climate prediction. He suggested a three-pronged approach to advance climate modeling: (1) theory to promote parametric sparsity and generalizability outside the distribution of observational data, (2) computing to achieve the highest feasible resolution and to generate training data, and (3) calibration and uncertainty quantification by learning from computationally generated and observational data. Learning about processes from diverse data that do not come in the form of input–output pairs of uncertain processes is a key challenge, but he advised that new algorithms that accelerate data assimilation with ML emulators could address this problem.
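One family of algorithms in this spirit treats calibration as ensemble-based inversion: an ensemble of parameter values is pushed through the forward model and nudged toward the data using sample covariances. The sketch below is a toy scalar problem with an invented forward model, not the software discussed at the workshop; it shows one such ensemble Kalman inversion loop:

```python
import numpy as np

rng = np.random.default_rng(0)

def forward(theta):
    # Hypothetical forward model mapping a parameter to an observable quantity
    return 2.0 * theta

y_obs = 3.0                 # observed value (so the "true" parameter is 1.5)
gamma = 1e-4                # observation noise variance
ens = rng.normal(0.0, 2.0, size=50)   # prior parameter ensemble

for _ in range(10):
    g = forward(ens)
    C = np.cov(ens, g)                # 2x2 sample covariance of (theta, g)
    y_pert = y_obs + rng.normal(0.0, np.sqrt(gamma), size=ens.size)
    ens = ens + C[0, 1] / (C[1, 1] + gamma) * (y_pert - g)   # Kalman-style update

theta_hat = ens.mean()      # converges near 1.5 for this linear toy problem
```

Replacing an expensive forward model with an ML emulator inside such a loop is one way to accelerate calibration, in line with the algorithms Schneider described.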
Mike Pritchard, NVIDIA/University of California, Irvine, offered another description of a digital twin, which he suggested is a surrogate for a deterministic weather prediction system that, once trained, allows for much larger ensembles. Challenges exist in understanding the credibility of these tools as surrogates for the systems on which they are trained, and tests to ensure accountability would be beneficial. He noted that digital twins could help overcome the latency and compression barrier that prevents stakeholders from exploiting the full detail of the Coupled Model Intercomparison Project (CMIP) Phase 6 library, and he described the potential value of pretraining digital twins to regenerate missing detail in between sparsely stored checkpoints at short intervals. Other challenges for the use of digital twins include the lack of available specialists to train ML models. Understanding the extrapolation boundary of current data-driven weather models is also important, as is finding ways to reproducibly and reliably achieve long-term stability despite inevitable imperfections. He concluded that digital twins offer a useful interface between predictions and the stakeholders who could leverage them.
Omar Ghattas, The University of Texas at Austin, described the continuous two-way flow of information between a physical system and a digital twin in the form of sensor, observational, and experimental data. These data are assimilated into the digital twin, and optimal decisions (i.e., control and experimental design) flow from the digital twin to the physical system. In such a tightly coupled system, if stable components are assembled incorrectly, an unstable procedure could emerge. He also explained that although real-time operation is not relevant to climate scales, it is relevant for climate-related issues (e.g., decisions about deploying firefighters to mitigate wildfires). Digital twins are built for high-consequence decision-making about critical systems, which demand uncertainty quantification (e.g., Bayesian data assimilation, stochastic optimal control), he continued, and real-time and uncertainty quantification settings demand reduced-order/surrogate models that are predictive over changing parameter, state, and decision spaces—all of which present massive challenges.
Abhinav Saxena, GE Research, depicted the three pillars of sustainability—environmental protection, economic viability, and social equity. Understanding how the environment and the climate are evolving in combination with how human behaviors are changing and how energy is being used impacts how critical infrastructure could be sustained. He noted that energy generation systems (e.g., gas and wind turbines) and other assets (e.g., engines that consume energy and produce gases) require detailed modeling to be operated more sustainably and efficiently; better weather and climate models are also needed to decarbonize the grid with carbon-free energy generation and microgrid optimization and to achieve energy efficient operations via energy optimization and resilient operation. He asserted that digital twins could
help make decades-old systems that experience severe degradation and multiple anomalies more resilient. In this context of life-cycle sustainment, digital twins have the potential to guarantee reliability, optimize maintenance and operations, reduce waste and maximize part life, and reduce costs. He summarized that because physical engineering systems and Earth systems interact with each other, their corresponding digital twins should work in conjunction to best reflect the behaviors of those physical systems and to optimize operations toward sustainability.
Elizabeth A. Barnes, Colorado State University, stated that because duplicating the complexity of the Earth system will never be possible, the term “digital twin” is misleading. With the explosion of ML, questions arise about the extent to which a digital twin would be composed of physical theory, numerical integration, and/or ML approximations. Although achieving a “true digital twin” is unlikely, she explained that any future Earth system model will have information (e.g., observations) as an input leading to a target output (e.g., prediction, detection, discovery). She asserted that model developers should be clear about a model’s intended purpose, and explainability (i.e., understanding the steps from input to output) is an essential component of that model’s success. Explainability influences trust in predictions, which affects whether and how these tools could be used more broadly, allows for fine tuning and optimization, and promotes learning new science. She expressed her excitement about ML approaches that could integrate complex human behavior into models of the Earth system.
Incorporating questions from workshop participants, Julianne Chung, Emory University, and Jim Kinter, George Mason University, moderated a discussion among the five panelists. Chung and Kinter asked about the reliability of digital twins in capturing processes and system interactions that vary across scales, as well as about how digital twin results should be communicated to different audiences given uncertainties for decision-making. Saxena replied that reliability depends on a model’s level of fidelity; the digital twin should continuously interact with the physical system, learn, and adapt. Trust and explainability are essential to use the prediction from the digital twin to operate the physical system, he added. Ghattas said that the community should not strive to achieve accuracy across all fields and scales. He urged researchers to consider what they want to predict, use a Bayesian framework to infer the uncertainties of the models, and equip the predictions with uncertainty systematically and rationally. Bayesian model selection enables one to attribute probabilities that different models are consistent with the data in meaningful quantities for prediction. Barnes noted that as models become more complex, considering how to assess whether the digital twin is reliable will be critical. Pritchard mentioned the opportunity to use scorecards from Numerical Weather Prediction Centers that provide clear metrics and encouraged further work to develop the right scorecard related to CMIP simulation details. Schneider concurred that developing metrics to assess the quality of a climate model is important; however, he explained that successful weather prediction does not guarantee successful climate prediction because the problems are very different. Because stakeholders’ needs vary, no “best” communication strategy exists: a hierarchy of models (e.g., hazard models, catastrophe models) that could better assess risk would be useful for decision-making.
Chung wondered whether uncertainty quantification and calibration are well-posed problems for Earth system simulation. Schneider described them as ill-posed problems because the number of degrees of freedom that are unresolved far exceeds the number of degrees of freedom available in the data. Therefore, additional prior information (e.g., governing equations of physics and conservation laws, domain-specific knowledge) is needed to reduce the demands on the data. Ghattas added that most inverse problems are ill-posed. The data are informative about the parameters in the low-dimensional manifold, and the other dimensions are handled via a regularization operator; for example, the Bayesian framework allows one to bring in prior knowledge to fill the gaps. He stressed that Bayesian inversion is extremely challenging on large-scale ill-posed problems and reduced-order models and surrogates would be useful.
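The role of the prior as a regularizer can be seen in a two-parameter toy problem with a single observation (an illustrative sketch with invented numbers, not a workshop example): the data constrain one direction in parameter space, and the Gaussian prior fills in the unconstrained direction.

```python
import numpy as np

# Underdetermined linear inverse problem: one observation, two unknowns.
A = np.array([[1.0, 1.0]])       # forward operator: observes only the sum x1 + x2
y = np.array([2.0])
sigma_obs = 0.1                  # observation noise standard deviation
sigma_pr = 1.0                   # prior standard deviation (the regularizer)

# Gaussian (conjugate) posterior:
#   cov  = (A^T A / sigma_obs^2 + I / sigma_pr^2)^{-1}
#   mean = cov @ A^T y / sigma_obs^2
post_cov = np.linalg.inv(A.T @ A / sigma_obs**2 + np.eye(2) / sigma_pr**2)
post_mean = post_cov @ (A.T @ y) / sigma_obs**2
```

The posterior mean splits the observed sum evenly between the two unknowns (the minimum-norm solution the prior prefers), while the posterior covariance remains wide in the direction the data do not constrain.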
Kinter asked the panelists to share examples of useful digital twins with current model fidelity and computational resources. Schneider mentioned that climate models have been useful in predicting the global mean temperature increase. A demand for zip code–level climate information exists, but current models are not fit for that purpose. Barnes described progress in identifying sources of predictability in the Earth system. She asserted that relevant information from imperfect climate models is vital for understanding how the real world will behave in the future. Saxena explained that when GE wants to install a new wind farm, models provide a sufficient level of detail to determine how to protect assets from weather and environmental impacts. Pritchard reiterated that digital twins enable massive ensembles, creating an opportunity to study the tail statistics of rare events and to assimilate new data streams.
PANEL 3: TRANSLATION OF PROMISING PRACTICES TO OTHER FIELDS
During the third panel, workshop participants heard brief presentations from experts in polar climate, AI algorithms, ocean science, carbon cycle science, and applied mathematics; they discussed how digital twins could be useful in their research areas and where digital twins could have the greatest future impacts.
Cecilia Bitz, University of Washington, championed the use of digital twins to understand Earth system components such as sea ice. Increasing the realism of the ocean and atmosphere components could yield corresponding improvements in downstream components such as sea ice. Highlighting opportunities for advances in sea ice components, she referenced ECMWF’s high-resolution simulation of sea ice, which resolves kilometer-scale horizontal flows of sea ice and large openings that occur dynamically, as an example of the progress enabled by moving to high resolution to view dynamic features. She expressed interest in broadening the kinds of physics represented in the digital twin framework as well as in making parameterizations more efficient, which is not only a significant challenge but also a great opportunity to expand the capabilities of the surface component.
Anima Anandkumar, California Institute of Technology/NVIDIA, discussed efforts to incorporate fine-scale features in climate simulations using neural operators. She explained that ML with standard frameworks captures only finite-dimensional information; however, neural operators (which are designed to learn mappings between function spaces) can resolve finer scales, be evaluated anywhere in the domain, and capture physics beyond the training data. Training and testing can be conducted at different resolutions, and constraints can be incorporated into the training. She encouraged using data-driven approaches to learn from a wealth of historical weather data and to incorporate extreme weather events into models—these approaches encompass a wide range of fields and new possibilities for generative AI in science and engineering.
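The resolution flexibility Anandkumar described can be sketched with a single spectral-convolution layer of the kind used in Fourier neural operators. The example below is a minimal illustration, not ECMWF or NVIDIA code; the function name is invented and the "learned" weights are random stand-ins. Because the weights act on Fourier modes rather than grid points, the same layer can be evaluated on coarse and fine discretizations of the same input function and returns consistent values.

```python
import numpy as np

def fourier_layer(u, weights):
    """One spectral-convolution layer: FFT the input, scale the lowest
    Fourier modes by (nominally learned) complex weights, transform back.
    The weights act on modes, not grid points, so the same layer can be
    evaluated on a discretization of any resolution."""
    n_modes = weights.shape[0]
    u_hat = np.fft.rfft(u)
    out_hat = np.zeros_like(u_hat)
    out_hat[:n_modes] = weights * u_hat[:n_modes]
    return np.fft.irfft(out_hat, n=u.size)

rng = np.random.default_rng(1)
W = rng.standard_normal(8) + 1j * rng.standard_normal(8)  # stand-in weights

# Evaluate the same layer on coarse and fine discretizations of sin(2*pi*x).
x64 = np.linspace(0.0, 1.0, 64, endpoint=False)
x256 = np.linspace(0.0, 1.0, 256, endpoint=False)
v64 = fourier_layer(np.sin(2 * np.pi * x64), W)
v256 = fourier_layer(np.sin(2 * np.pi * x256), W)
```

At the grid points the two discretizations share, the outputs agree to floating-point precision, which is the sense in which training and testing can occur at different resolutions.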
Emanuele Di Lorenzo, Brown University, described work to engage coastal stakeholders and researchers to co-design strategies for coastal adaptation and resilience in Georgia. Supported by a community-driven effort, the Coastal Equity and Resilience (CEAR) Hub’s10 initial modeling and forecasting capabilities provide water-level information at the scale where people live. A network of ~65 wirelessly interconnected sensors, distributed along the coast around critical infrastructure, provides actionable data for decision-makers; these data are streamed into dashboards that are continuously redesigned with decision-makers’ input. He explained that as the focus on equity and climate justice expands, new factors related to resilience are emerging (e.g., health and socioeconomic well-being) that demand new community-driven metrics. The CEAR Hub plans to expand its sensor network to measure air and water quality and urban heating and to combine these new sources of data with social data to develop the metrics.
Anna Michalak, Carnegie Institution for Science, explained that the carbon cycle science community focuses on questions around quantification (i.e., how greenhouse gas emissions and uptake vary geographically at different resolutions as well as their variability over time), attribution (i.e., constraining the processes that drive the variability seen in space and time, which requires a mechanistic-level understanding), and prediction (e.g., global system impact if emissions hold a particular trajectory or if climate changes in a particular way). These questions are difficult because carbon cycle scientists work in a data-poor environment—both in situ and remote sensing observations are sparse in space and time, and fluxes cannot be measured directly beyond the kilometer scale. The community has moved toward the use of ensembles of models as well as ensembles of ensembles to confront this problem. She said that the current best strategy to address the uncertainty associated with quantification, attribution, and prediction is to run multiple simulations of multiple models, with the whole community working on different incarnations of these models. She also suggested simplifying models before increasing their complexity to understand fundamental mechanistic relationships. The “holy grail” for digital twins in carbon cycle science, she continued, would be to use them for hypothesis testing, to diagnose extreme events, and for prediction.
John Harlim, The Pennsylvania State University, defined digital twins as a combination of data-driven computation and modeling that could involve data simulation, modeling from first principles, and ML algorithms. He emphasized that the fundamental success of digital twins depends on their ability to compensate for the modeling error that makes numerical weather prediction models and climate prediction models incompatible. If one could compensate appropriately for this modeling error, the same model proven to be optimal for state estimation could also accurately predict climatological statistics; however, he asserted that this is not achievable for real-world applications. Specific domain knowledge is critical to narrow the hypothesis space of models, which would help the ML algorithm find the underlying mechanisms. Using a model-free approach, he continued, is analogous to allowing the algorithm to find a solution from a very large hypothesis space of models. Such a practice could reduce the bias, but that reduction must be balanced against the accompanying variance error. He stressed that the success of digital twins in other fields depends on whether enough informative training data exist to conduct reliable estimations.
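Harlim’s bias–variance point can be made concrete with a toy regression experiment (an illustrative sketch; the cubic ground truth, noise level, sample sizes, and polynomial degrees are all invented). A hypothesis space matched to the true mechanism (degree 3) yields lower test error than a much larger, “model-free” space (degree 12), whose low bias is swamped by variance on limited data.

```python
import numpy as np

rng = np.random.default_rng(2)

def fit_and_test_error(degree, n_trials=200, n_train=15):
    """Average squared test error of degree-`degree` polynomial fits
    to small noisy samples of an underlying cubic."""
    x_test = np.linspace(-1.0, 1.0, 50)
    y_test = x_test**3                      # noiseless truth
    errors = []
    for _ in range(n_trials):
        x = rng.uniform(-1.0, 1.0, n_train)
        y = x**3 + 0.2 * rng.standard_normal(n_train)  # noisy training data
        coeffs = np.polyfit(x, y, degree)
        errors.append(np.mean((np.polyval(coeffs, x_test) - y_test) ** 2))
    return float(np.mean(errors))

err_small = fit_and_test_error(3)    # hypothesis space matching the mechanism
err_large = fit_and_test_error(12)   # "model-free": low bias, high variance
```

With only 15 noisy points, the degree-12 fits oscillate wildly between samples, so their average test error far exceeds that of the degree-3 fits, mirroring the argument that domain knowledge narrows the hypothesis space and controls variance.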
Incorporating questions from workshop participants, Leung and Chung moderated a discussion among the five panelists. Leung wondered how limited the field is by data when powerful new techniques could be leveraged to extend beyond training data. Michalak replied that a process-based, mechanistic understanding is needed to anticipate future climate system evolution. She said that new modeling techniques could be used to better leverage limited observations, which could assist with uncertainty quantification; however, these new approaches would not fundamentally change the information content for existing observations. She emphasized that new tools offer promise but are not a panacea across use cases and disciplines. Anandkumar elaborated on the ability of these new tools to extrapolate beyond training data. She said that neural operators are being used as surrogates for solving partial differential equations (PDEs) and other equations that can be embedded; at the same time, data could be combined with physics for nonlinear representations. Michalak added that this is feasible only if the challenge is on the PDE side, not if the challenge relates to the parametric and structural uncertainties in the models.
Chung asked how uncertainties could be better understood, quantified, and communicated. Anandkumar responded that ML has great potential, although it is still an emerging approach; with its increased speed, thousands of ensemble members could be created—having this many ensemble members and the ability to incorporate uncertainties is critical. The next step could be to use emerging frameworks such as diffusion models, as long as they incorporate uncertainty accurately. Harlim noted that developing a digital twin that predicts the response of second-order statistics would be very difficult, especially when the system is spatially extended and non-homogeneous. He noted that the ensemble mean is widely accepted to provide accurate predictions; however, a question remains as to whether the covariances provide accurate uncertainty estimates.
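The ensemble approach Anandkumar described can be sketched with a toy perturbed-initial-condition ensemble (a hypothetical stand-in for a model ensemble; the logistic map, perturbation size, and member count are invented). Running many perturbed members forward and summarizing their spread gives a simple first- and second-moment uncertainty estimate of the kind Harlim discussed.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy "digital twin" ensemble: each member perturbs the initial state and
# runs a simple nonlinear (chaotic) map forward; the spread across members
# summarizes forecast uncertainty.
n_members, n_steps = 1000, 20
state = 0.5 + 0.01 * rng.standard_normal(n_members)  # perturbed initial states

for _ in range(n_steps):
    state = 3.7 * state * (1.0 - state)   # logistic map as the stand-in model

ensemble_mean = state.mean()              # first-moment forecast
ensemble_std = state.std()                # second-moment uncertainty estimate
```

Because the map is chaotic, the tiny initial spread (0.01) grows by orders of magnitude over 20 steps; whether such an ensemble spread faithfully reflects the true forecast covariance is exactly the open question Harlim raised.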
Leung inquired about strategies to work with decision-makers to define the requirements of digital twins. Di Lorenzo advocated for co-designed, community-driven research projects that use a transdisciplinary approach. Meeting with project stakeholders and community leaders raises awareness, increases engagement, and creates ownership; the scientist’s role is to provide support to articulate the problem posed by the stakeholders. To initiate such projects, he said that scientists should identify boundary organizations with existing ties to the community. Bitz described her work with Indigenous communities in Alaska to better understand the threats of coastal erosion, which prioritizes listening to their concerns and building trusted relationships. She urged scientists to use their knowledge to engage in problems directly and to collaborate with scientists from other domains. Michalak suggested fostering relationships with the private sector to ensure that its investments in climate solutions have the maximum possible impact.
Leung posed a question about the difference between digital twins and the modeling that has been ongoing since the 1960s. Di Lorenzo noted that digital twins are tools with applications for decision-making in the broader community rather than just the scientific community. If the digital twin is meant to serve the broader community, however, he said that the term “digital twin” is too confusing. Anandkumar observed that digital twins are data-driven, whereas modeling over the past decades has primarily used data for calibration. Leung also wondered how social science data could inform digital twin studies. Bitz explained that ground truthing data with local community knowledge is a key part of model development, and social scientists could facilitate that process. Di Lorenzo urged researchers to include social dimensions in digital twin platforms, as thinking only about the physical world is an obsolete approach.
PANEL 4: DISCUSSION ON TRANSPARENCY, SOCIETAL BENEFIT, AND EQUITY
Incorporating questions from workshop participants, Ye and Parris moderated the workshop’s final discussion among three experts on transparency, societal benefit, and equity considerations for the use of digital twins: Amy McGovern, University of Oklahoma; Mike Goodchild, University of California, Santa Barbara; and Mark Asch, Université de Picardie Jules Verne. Ye invited the panelists to define digital twins. McGovern described digital twins as high-fidelity copies such that the digital twin and the original observations are as indistinguishable as possible and allow exploration of what-if scenarios. She added that a human dimension is crucial for the digital twin to enable decision-making. Focusing on the capacity for “high resolution,” Goodchild commented that, as resolution has become finer and will continue to become finer, a question arises about what threshold has been passed. However, he noted that the human, decision-making, and sustainability components are defining characteristics of digital twins. Asch underscored the need to educate decision-makers in how to use the tools that scientists develop; much work remains to communicate that digital twins are decision-making tools, not “magic wands.” Parris inquired about how a digital twin extends beyond modeling. Goodchild highlighted the use of digital twins to visualize fine-resolution geospatial data, which will appeal to a broad audience, although visualizing uncertainty in these data is very difficult. McGovern explained that today’s modeling world includes data in different formats and scales, with and without documentation—a digital twin could provide a consistent form to access these data, which could enable AI and visualization. Asch urged more attention toward improving the modeling of uncertainty, especially in light of recent advances in computational power.
He emphasized that decisions derived from digital twins are probabilistic, based on the relationship between value and risk, not deterministic. In response to a question from Ye about unique approaches to visualize uncertainty, McGovern described an initiative where those creating visualizations are conducting interviews with end users to understand how uncertainty affects their trust in the model. A next step in the project is allowing the end users to manipulate the underlying data to better understand uncertainty.
Parris asked the panelists to share examples of digital twins that provide societal benefit. Goodchild described the late 1990s concept of the Digital Earth as a “prime mover” in this space and noted that the literature over the past 30 years includes many examples of interfaces between “digital twins” and the decision-making process, especially in industry. McGovern said that DestinE could allow people to explore how climate and weather will directly impact them; however, she stressed that more work remains for DestinE to reach its full potential. Asch suggested drawing from the social sciences, humanities, and political sciences to help quantify qualitative information. Integrating more diverse beliefs and values into science is critical, he continued, although enabling cross-disciplinary collaboration is difficult within existing funding streams. In response to a question from Parris about how digital twins integrate natural and human systems, Asch described work in the Philippines to model the spread of viral epidemics. He noted that creating dashboards is an effective way for end users to interact with a complicated problem; however, more work remains to model social and psychological phenomena. Goodchild highlighted the value of understanding interactions between humans and their environment in terms of attitudes and perceptions, and he referenced the National Science Foundation’s coupled natural and human systems program and others’ successes with agent-based modeling, where the behavior of humans is modeled through a set of rules.
Ye posed a question about the trade-offs of using data in a digital twin and maintaining privacy. Goodchild remarked that location privacy is becoming problematic in the United States. Location data are of commercial interest, which makes it even more difficult to impose regulations. He posited that the issue of buying and selling location data without individuals’ awareness should be given more attention. With finer resolution, these ethical issues become critical for digital twins, and he suggested implementing a regulation similar to the Health Insurance Portability and Accountability Act (HIPAA). McGovern acknowledged the disadvantages of location data but also highlighted the advantages. She noted the need to evaluate trade-offs carefully to improve models while protecting privacy; a HIPAA-like regulation could make it difficult to obtain useful information for digital twins. Asch commented that the potential for abuse is significant, but high-level data-sharing agreements and data security technology could address this issue.
Parris inquired about the role of bias in digital twins. McGovern observed that although bias might be helpful, most often it is harmful and should be discussed more often in relation to weather and climate data. The first step is recognizing that bias exists at all stages of digital twin creation, and that systemic and historical biases affect which data are missing. Goodchild added that limiting the spatial extent of digital twins could help to address this issue, and Asch proposed that critical reasoning be used to detect bias. Given these issues, Ye asked how to improve confidence in digital twins. Asch stressed that transparency and reproducibility are key to increasing digital twin acceptance, and users should be able to follow a digital twin’s reasoning as well as understand how to use and exploit it. McGovern stated that co-development with the end user helps advance both confidence and trustworthiness. Goodchild explained that some uncertainty in what digital twins predict and how they operate will always exist. He said that ethical issues related to digital twin reusability could also arise, and enforcing fitness for use is essential; “repurposing” is a significant problem for the software industry to confront.
Parris posed a question about strategies to build a diverse community of researchers for digital twins. Goodchild suggested first identifying the subsets of problems for which a digital twin could be useful. Asch said that understanding the principles of modeling and the problem-solving process for digital twins is key. He reiterated the value of bringing philosophy and science together to better develop and use these tools, emphasizing reasoning as a means to help democratize them. He also encouraged increasing engagement with students in developing countries. McGovern stressed that, instead of waiting for students to enter the pipeline, the current workforce should be given meaningful problems to solve as well as training on AI methods; furthermore, offering AI certificate programs at community colleges plays an important role in creating a more diverse workforce.
DISCLAIMER This Proceedings of a Workshop—in Brief was prepared by Linda Casola as a factual summary of what occurred at the workshop. The statements made are those of the rapporteur or individual workshop participants and do not necessarily represent the views of all workshop participants; the planning committee; or the National Academies of Sciences, Engineering, and Medicine.
COMMITTEE ON FOUNDATIONAL RESEARCH GAPS AND FUTURE DIRECTIONS FOR DIGITAL TWINS Karen Willcox (Chair), Oden Institute for Computational Engineering and Sciences, The University of Texas at Austin; Derek Bingham, Simon Fraser University; Caroline Chung, MD Anderson Cancer Center; Julianne Chung, Emory University; Carolina Cruz-Neira, University of Central Florida; Conrad J. Grant, Johns Hopkins University Applied Physics Laboratory; James L. Kinter III, George Mason University; Ruby Leung, Pacific Northwest National Laboratory; Parviz Moin, Stanford University; Lucila Ohno-Machado, Yale University; Colin J. Parris, General Electric; Irene Qualters, Los Alamos National Laboratory; Ines Thiele, National University of Ireland, Galway; Conrad Tucker, Carnegie Mellon University; Rebecca Willett, University of Chicago; and Xinyue Ye, Texas A&M University–College Station. * Italic indicates workshop planning committee member.
REVIEWERS To ensure that it meets institutional standards for quality and objectivity, this Proceedings of a Workshop—in Brief was reviewed by Jeffrey Anderson, National Center for Atmospheric Research; Bryan Bunnell, National Academies of Sciences, Engineering, and Medicine; and Xinyue Ye, Texas A&M University–College Station. Katiria Ortiz, National Academies of Sciences, Engineering, and Medicine, served as the review coordinator.
STAFF Patricia Razafindrambinina, Associate Program Officer, Board on Atmospheric Sciences and Climate, Workshop Director; Brittany Segundo, Program Officer, Board on Mathematical Sciences and Analytics (BMSA), Study Director; Kavita Berger, Director, Board on Life Sciences; Beth Cady, Senior Program Officer, National Academy of Engineering; Jon Eisenberg, Director, Computer Science and Telecommunications Board (CSTB); Samantha Koretsky, Research Assistant, BMSA; Tho Nguyen, Senior Program Officer, CSTB; Michelle Schwalbe, Director, National Materials and Manufacturing Board (NMMB) and BMSA; Erik B. Svedberg, Senior Program Officer, NMMB; and Nneka Udeagbala, Associate Program Officer, CSTB.
SPONSORS This project was supported by Contract FA9550-22-1-0535 with the Department of Defense (Air Force Office of Scientific Research and Defense Advanced Research Projects Agency), Award Number DE-SC0022878 with the Department of Energy, Award HHSN263201800029I with the National Institutes of Health, and Award AWD-001543 with the National Science Foundation.
This material is based on work supported by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research, and Office of Biological and Environmental Research.
This project has been funded in part with federal funds from the National Cancer Institute, National Institute of Biomedical Imaging and Bioengineering, National Library of Medicine, and Office of Data Science Strategy from the National Institutes of Health, Department of Health and Human Services.
Any opinions, findings, conclusions, or recommendations expressed do not necessarily reflect the views of the National Science Foundation.
This proceedings was prepared as an account of work sponsored by an agency of the United States Government. Neither the United States Government nor any agency thereof, nor any of their employees, makes any warranty, express or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States Government or any agency thereof. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Government or any agency thereof.
SUGGESTED CITATION National Academies of Sciences, Engineering, and Medicine. 2023. Opportunities and Challenges for Digital Twins in Atmospheric and Climate Sciences: Proceedings of a Workshop—in Brief. Washington, DC: The National Academies Press. https://doi.org/10.17226/26921.