Suggested Citation:"5 Modeling, Validation, and Data Science." National Academies of Sciences, Engineering, and Medicine. 2022. Planning the Future Space Weather Operations and Research Infrastructure: Proceedings of the Phase II Workshop. Washington, DC: The National Academies Press. doi: 10.17226/26712.

5

Modeling, Validation, and Data Science

Key Themes

In the sessions devoted to modeling, validation, and related topics in data science and analytics, the following key themes emerged from the presentations and discussions:

The space weather field is simultaneously data starved and not efficiently using available data.

– Existing data are often not well matched to current modeling needs because of the data format, latency, lack of metadata, lack of calibration or intercalibration, or limited knowledge of data errors. In many cases, models need additional data that are not currently available.

Data assimilation (DA) is a tool that holds significant promise for space weather research and forecasting. DA is not a “one-size-fits-all” approach; rather, its use will require tuning across domains that cover disparate physics and vast ranges of temporal and spatial scales.

– The current potential for applying DA to space weather models and predictions is limited by a lack of (1) suitable data; (2) characterization of data uncertainty; and (3) computational capacity. Commercial data buys may be part of the data solution; however, the error sources in such data will need to be well documented. Cloud computing capacity is growing but may turn out to be an expensive solution.

Ensemble modeling, particularly with multi-model ensembles, has the potential to significantly improve space weather predictions; however, the widespread use of ensemble models will require additional research and development as well as additional resources, such as greatly increased data storage and computational capacities.

Machine learning (ML) is a promising approach for understanding space weather, but its use is currently limited by data quality and data quantity. Furthermore, the space weather community does not yet have the data ecosystem needed to create ML-ready data sets. All domains lack data related to extreme events. In ML space weather applications, a greater emphasis is needed for understanding the underlying physical phenomena, as opposed to typical ML applications treating the model as a “black box” and the prediction as the only final outcome.

Robust uncertainty quantification methodologies will be important for approaching space weather as a system science. Model uncertainty and data-representativeness uncertainty will need to be quantified in a systematic way across the different models.

Observing system simulation experiments (OSSEs) can be valuable in cost–benefit evaluations of particular data sets to be used in ML and DA approaches, but not all domains are mature enough for the OSSEs to be useful.

Space weather data science would benefit from further cross-agency effort to coordinate data-archival standards, promote data fusion and reuse, and support data revitalization for ML, ensemble modeling, and data assimilation.

The committee’s statement of task requested it to “consider needs, gaps, and opportunities in space weather modeling and validation, including a review of the status of data assimilation and ensemble approaches.” These topics were addressed in the workshop’s Data Science and Analytics session, which was spread over days 3 and 4 of the workshop. This chapter summarizes the contents of two keynote talks and four panels from that session, which addressed different aspects of modeling and validation challenges in the context of advances in computational resources, tools, and ongoing needs, especially in terms of data curation.

The background against which the panel’s discussions took place is the profound contradiction that exists in space weather research; that is, it is in a “big data” regime, yet not enough data are available for data science applications. This discipline is unique in the need to understand (and compute) relevant physics on scales that span meters to megameters and sub-seconds to decades as well as across vast density, temperature, and electromagnetic field scales. Thus the modeling and data challenges facing space weather scientists are significant.

The 2016 National Science Foundation (NSF) Portfolio Review of the Geospace Section of the Division of Atmospheric and Geospace Science briefly mentioned aspects of data science in the recommendations for data exploitation tied to the previous Heliophysics Decadal Survey. The gap analysis for the National Aeronautics and Space Administration’s (NASA’s) Space Weather Science Application Program focused on at-risk and needed-but-not-yet-available observations. Within the gap analysis the topics of data science and analytics were not directly addressed, although a few of the sub-elements such as data assimilation (DA) and ensemble modeling were mentioned. Data science and analytics in support of space weather system science was not addressed specifically in either report.

KEYNOTES: DATA ASSIMILATION AND MACHINE LEARNING

The workshop’s data science and analytics sessions were kicked off by two keynote speakers, Ricardo Todling of NASA’s Goddard Space Flight Center and Enrico Camporeale of the University of Colorado Boulder and the National Oceanic and Atmospheric Administration’s (NOAA’s) Space Weather Prediction Center. The keynote speakers were asked to address two questions:

Compared with “Earth system science,” space science is usually data starved. How do we leverage data assimilation and machine learning to overcome this?
What is the role of testbeds in paving the way for data assimilation and machine learning from research to operations?

From Terrestrial Weather to Space Weather

Todling’s presentation began with a brief overview of the use of DA in terrestrial weather forecasting. DA is a mathematical method in which observations and numerical model data are combined to create an optimal representation of the state of a system. In the case of terrestrial weather prediction, a typical DA approach is to compare data from various satellites and ground-based observations with a previous numerical model forecast and then update the model state to better match those new data. This is an iterative process in which model states are constantly being revised to reflect observations of the present state to produce a new forecast for the next time step, which is then used as a new basis of data comparison.
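The forecast-and-update cycle described above can be illustrated with a minimal sketch. It assumes a single scalar state, Gaussian errors, and a variance-weighted (Kalman-type) update; the toy model in `forecast_step`, the function names, and all numbers are hypothetical and are not drawn from the presentation.

```python
def analysis_update(forecast, forecast_var, obs, obs_var):
    """Blend a model forecast with one observation, weighting by error variance."""
    gain = forecast_var / (forecast_var + obs_var)   # Kalman gain for a scalar state
    analysis = forecast + gain * (obs - forecast)    # pull the forecast toward the observation
    analysis_var = (1.0 - gain) * forecast_var       # the analysis is more certain than the forecast
    return analysis, analysis_var


def forecast_step(state, state_var, decay=0.9, model_error_var=0.5):
    """Hypothetical toy model: the state relaxes toward zero and its uncertainty grows."""
    return decay * state, decay ** 2 * state_var + model_error_var


state, var = 10.0, 4.0            # initial guess and its error variance (made up)
observations = [8.0, 7.5, 6.0]    # made-up observations, one per cycle
obs_var = 1.0

history = []
for obs in observations:
    state, var = forecast_step(state, var)                   # forecast to the next time step
    state, var = analysis_update(state, var, obs, obs_var)   # correct with the new observation
    history.append((state, var))
```

Each pass through the loop is one assimilation cycle: the previous analysis is propagated forward, compared with the new observation, and revised. Operational systems apply the same idea to very large state vectors with variational or ensemble methods.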

Terrestrial weather prediction has steadily improved because of several factors, Todling said: increased model resolution, improved representation of physical processes, and the increasingly large amounts of data from observations that are assimilated into the models through advanced DA techniques. Over time, a wide variety of different DA techniques have been developed. There are two main DA types—variational DA and sequential DA—each with its own variations, and many of these have been combined with ensemble models to create hybrid DA models.

Similarly, he said, there are a large number of different machine learning (ML) approaches to learning from data without guidance from models. In essence, ML looks for patterns in large collections of data that are used as the basis for making predictions. Compared to DA, in ML there are no a priori models that the data build on or are compared with. There are also a large number of hybrid models in which ML is used to assist DA or DA procedures are used to aid ML strategies (Figure 5-1).

With that background, Todling turned to a discussion of DA and ML in space weather and to comparing that use with their application in terrestrial weather forecasting. DA is used in a large number of space weather models covering the photosphere, the corona, flares, coronal mass ejections and solar wind, the magnetosphere, the ionosphere, and the thermosphere–ionosphere system. One difference between terrestrial and space weather systems, he noted, is that the spatial scales are greater and the particles and electromagnetic disturbances travel across the system on time scales much shorter than in the terrestrial atmosphere. The two types of methods, ML and DA, can play different roles in providing a hybrid tool for forecasting, with DA being used both within a domain and to bridge domains, and ML being used, for example, to provide process emulation, data down-scaling, and homogenization. Among the different DA and ML models used in both terrestrial and space weather forecasting, the models may be intercoupled to a varying degree, depending on the application.

FIGURE 5-1 Combining data assimilation (DA) and machine learning (ML) to tackle challenges of space weather forecasting. In tandem, these two methods provide a large toolkit with which to tackle space weather forecasting challenges.
SOURCE: Ricardo Todling, NASA Goddard Space Flight Center, presentation to workshop, April 13, 2022.

The many different facilities and methods based on DA and ML have been key in terrestrial weather research for bridging temporal and spatial scales of relevance to the specific domain. As the role of uncertainties is becoming more central to evaluating forecasts and to providing actionable predictions, Todling said, it is important to understand that DA and ML do not, by themselves, provide error estimates, but they can be used, for example, in ensemble or Monte Carlo–type applications, which can provide such estimates.
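As a rough illustration of how an ensemble or Monte Carlo–type application can supply the error estimate that a single run lacks, the sketch below perturbs the initial state of a toy model and reports the ensemble mean and spread. The model `toy_forecast`, the Gaussian perturbations, and all parameter values are assumptions made for illustration only.

```python
import random
import statistics


def toy_forecast(initial_state, growth=1.05, steps=10):
    """Hypothetical stand-in for a forecast model: simple multiplicative growth."""
    state = initial_state
    for _ in range(steps):
        state *= growth
    return state


def ensemble_forecast(best_guess, initial_sigma, members=200, seed=0):
    """Run the model from perturbed initial states; return ensemble mean and spread."""
    rng = random.Random(seed)
    outcomes = [toy_forecast(rng.gauss(best_guess, initial_sigma))
                for _ in range(members)]
    return statistics.mean(outcomes), statistics.stdev(outcomes)


mean, spread = ensemble_forecast(best_guess=1.0, initial_sigma=0.1)
single = toy_forecast(1.0)   # one deterministic run: a value with no error estimate
```

The spread of the ensemble members serves as the uncertainty estimate. The cost is that the model is run once per member, which is one reason ensembles raise computational and storage demands.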

The terrestrial weather modeling community has adopted a few standard modeling frameworks within which to develop further capabilities, although, according to Todling, that community has largely decided that modeling frameworks are not adequate for application of or research on DA techniques. In the area of space weather forecasting, DA and data-informed “nudging” have been used in modeling across solar, heliosphere, magnetosphere and thermosphere/ionosphere domains but mostly in “proof of concept” studies, not yet reaching an operational state.

The space weather community has begun to establish frameworks within which models—especially cross-domain, cross-system models—can be developed, but it is not clear that there is a need (yet) for a single space-weather-focused DA/ML framework. Frameworks in general allow flexibility via module-based units at each step. The Joint Effort for Data-Assimilation Integration (JEDI) framework has been adopted across some research programs for use in terrestrial weather modeling, Todling said, and it could possibly be appropriate for the space-weather regime as well.

A key message from Todling’s presentation was that there is not a one-size-fits-all approach that will work across all physics-based domains or disciplines. The DA experience from terrestrial weather modeling can and should provide lessons learned for application to space weather research, but DA is still a relatively new tool for space weather, and its use in that area can be expected to have challenges requiring unique solutions.

Machine Learning in Space Weather Forecasting

In the next presentation Camporeale began by claiming that ML is “reinventing space weather.” To back up this contention he provided an extensive list of topics in space weather to which ML has been applied, including the forecasting of global or average indices, such as the disturbance storm time (Dst) index, solar wind classification, solar wind speed forecasting, and predicting the arrival time of coronal mass ejections (CMEs). The potential to transform space weather, he said, derives from the fact that ML holistically estimates a system’s behavior, while physics-based models are deterministic, describing only processes that are included in the governing equations.

In explaining why ML is so well suited to space physics problems, Camporeale said that the basic reason is that physical properties, such as invariance and symmetry, along with conservation laws, drastically reduce the search space of the ML parameters. He added that ML should be able to describe any system that follows the laws of physics. The major challenges to the use of ML in space weather are data quality and data quantity.

In examining the path forward for ML in space weather, Camporeale offered a series of questions and challenges that must be addressed. For example, a pervasive question in research on ML-based space weather forecasts is what he called “the information problem,” the challenge of determining the minimum amount of physical information required to make a forecast. The “gray-box problem” centers on the question of the best way to optimally use physics understanding and the large data set covering the Sun–Earth system. For example, in the case of probabilistic estimates of regional ground magnetic perturbations, it was shown that estimates provided both by physical models and by pure ML were inferior to approaches combining the two.
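One common way to combine a physical model with a data-driven component is to fit the data-driven part only to the residual that the physical model leaves behind. The sketch below illustrates that idea and is not the method used in the study mentioned above; the `physics_model`, the linear residual fit, and the data are all hypothetical.

```python
def physics_model(driver):
    """Hypothetical first-principles estimate: response proportional to the driver."""
    return 2.0 * driver


def fit_line(xs, ys):
    """Ordinary least-squares fit ys ~ slope * xs + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rmse(predicted, observed):
    """Root-mean-square error between two equal-length sequences."""
    return (sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)) ** 0.5


# Made-up training data: the "true" response has terms the physics model lacks.
drivers = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
observed = [2.0 * d + 0.5 * d + 1.0 for d in drivers]

# Gray-box step: learn only the residual that the physics model leaves behind.
residuals = [o - physics_model(d) for d, o in zip(drivers, observed)]
slope, intercept = fit_line(drivers, residuals)


def gray_box_model(driver):
    """Physics estimate plus the learned residual correction."""
    return physics_model(driver) + slope * driver + intercept


physics_error = rmse([physics_model(d) for d in drivers], observed)
gray_box_error = rmse([gray_box_model(d) for d in drivers], observed)
```

In this design the physics carries the known structure and the fitted term absorbs what is missing. A realistic application would replace the straight-line fit with a more flexible learner and evaluate on held-out data.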

The “surrogate problem” focuses on determining which components in the “Sun-to-mud” chain can be replaced by an approximated black-box surrogate model with an acceptable trade-off between computational speed and decreased accuracy. The “uncertainty problem” involves determining how to incorporate the uncertainties in the data throughout the model outcomes and ML-based forecasts. Camporeale said that propagating uncertainties through the space weather chain from solar images to magnetospheric and ground-based observations to a single-point prediction is a complex and computationally demanding task.

The “too often too quiet problem” arises from the fact that as the geomagnetic storms are rare, the space weather data sets are imbalanced, being dominated by quiet conditions. This creates a serious problem for the ML algorithms, he said, and it also poses challenges for defining meaningful metrics that assess the ability of a model to predict interesting but rare events.
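The metrics difficulty can be made concrete with a small sketch. It compares plain accuracy with the true skill statistic (TSS, the hit rate minus the false-alarm rate) on a made-up, heavily imbalanced record; the event counts and the two hypothetical forecasters are illustrative assumptions only.

```python
def contingency(observed, predicted):
    """Count hits, misses, false alarms, and correct negatives for binary events."""
    pairs = list(zip(observed, predicted))
    hits = sum(1 for o, p in pairs if o and p)
    misses = sum(1 for o, p in pairs if o and not p)
    false_alarms = sum(1 for o, p in pairs if not o and p)
    correct_negatives = sum(1 for o, p in pairs if not o and not p)
    return hits, misses, false_alarms, correct_negatives


def accuracy(observed, predicted):
    """Fraction of intervals classified correctly."""
    h, m, f, c = contingency(observed, predicted)
    return (h + c) / (h + m + f + c)


def true_skill_statistic(observed, predicted):
    """TSS = hit rate minus false-alarm rate."""
    h, m, f, c = contingency(observed, predicted)
    hit_rate = h / (h + m) if (h + m) else 0.0
    false_alarm_rate = f / (f + c) if (f + c) else 0.0
    return hit_rate - false_alarm_rate


# Made-up record: 5 storm intervals among 100, so quiet conditions dominate.
observed = [True] * 5 + [False] * 95
always_quiet = [False] * 100                                   # never predicts a storm
some_skill = [True] * 4 + [False] * 1 + [True] * 5 + [False] * 90

quiet_scores = (accuracy(observed, always_quiet),
                true_skill_statistic(observed, always_quiet))
skill_scores = (accuracy(observed, some_skill),
                true_skill_statistic(observed, some_skill))
```

Here the forecaster that never predicts a storm has the higher accuracy but a TSS of zero, while the forecaster that catches four of the five storms has slightly lower accuracy and a much higher TSS. This is why accuracy alone is a poor guide for rare events.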

The final challenge on the path forward is the “knowledge discovery and explainability problem,” which centers on distilling some understanding of the physical mechanisms from the ML-based black-box predictions.

In closing, Camporeale said that these six problems (the information problem, the gray-box problem, the surrogate problem, the uncertainty problem, the “too often too quiet” or rare events problem, and the knowledge discovery and explainability problem) are not specific to space weather but also pose fundamental challenges in the fields of artificial intelligence (AI) and uncertainty quantification.

Discussion

The discussion following the session addressed a number of questions from the Zoom chat. Many of the two speakers’ comments were focused on the possible roles of ML, interpretable ML, training intervals for ML-based models, and the lack of ML-based models in scoreboards. The role of ML as compared with physics-based models was also discussed, with comments such as “ML cannot learn what has not yet [been] seen” and questions such as “Can ML help inform which measurements we will need in the future to improve forecasts?”

Camporeale said that ML can have a clear role in populating a well-delineated “state space” that is otherwise sparsely observed. Furthermore, ML may play a role in identifying less than catastrophic but still very rare disturbances that have major effects on society, such as the Starlink failures that occurred during a time of minor space weather activity.¹ Overall, participants’ comments indicated that the useful role of ML processes is not yet fully understood or appreciated.

Finally, there was some discussion as to whether the space weather DA applications are sufficiently similar to those used in terrestrial weather applications to make JEDI a useful tool for space weather.

___________________

¹ T. Malik, 2022, “SpaceX Says a Geomagnetic Storm Just Doomed 40 Starlink Internet Satellites,” Space.com, February 9, https://www.space.com/spacex-starlink-satellites-lost-geomagnetic-storm.

MACHINE LEARNING AND VALIDATION

The Data Science and Analytics: Machine Learning and Validation panel, moderated by committee member KD Leka, had presentations from Jacob Bortnik of the University of California, Los Angeles; Asti Bhatt of SRI; Shasha Zou of the University of Michigan; Morris Cohen of the Georgia Institute of Technology; David Fouhey of the University of Michigan; and Hannah Marlowe of Amazon Web Services. The panelists were asked to address two questions:

What investments are needed to produce physics-informed machine learning for space weather?
Are there adequate data sources and curation to implement machine learning for research, validation, and the determination of uncertainty?

A panel discussion followed the presentations.

Bortnik began the presentations by providing an overview of how ML can be used to extract knowledge from data. In general ML applications the goal is to build a black-box model that can make forecasts and predictions, which do not offer any insights into the underlying physical mechanisms. He said that there should be dual goals of both building a model that predicts well and extracting physical insight and understanding. There are various ways to use ML that allow developing these physical insights, including requiring ML systems to obey physical laws (e.g., ensuring symmetries, invariants, and conserved quantities), using interpretable models (transparency into which inputs control which outputs), using models that show how information gets transmitted in the system (i.e., causality flow), and extracting governing equations from the models to gain understanding of the physics of the system. Bortnik noted that it will be important to incorporate these elements into space weather ML systems, even if such approaches are not yet well developed.

In the future, Bortnik said, all space scientists will need a working knowledge of ML both because the rapidly growing data volumes cannot be analyzed in traditional ways and because ML supersedes physics models in many cases. Thus, space science education will need to include ML principles as well as provide experience in building ML models. This will require new books, classes, and curricula; more workshops and meetings devoted to the topic; and the development of new and more powerful algorithms for analyzing physical systems. Figure 5-2 summarizes ideas for the application of ML in the Earth and space sciences.

FIGURE 5-2 Ten ideas for applying machine learning in the Earth and space sciences.
SOURCE: Jacob Bortnik, University of California, Los Angeles, presentation to the workshop, April 13, 2022.

Next Bhatt spoke about ML in space science from the perspective of a data provider. She said that while ML methods have been around longer than most large data sets, it was investment in the creation of ML-ready data sets that led to the expansion of their use in ML applications. However, she said that, generally speaking, the space weather community does not yet have the data ecosystem needed to create ML-ready data sets. The space physics data infrastructure, especially the ground-based observations, is severely lacking in its ability to provide standardized data products. As a result, ML today is being applied to only a small subset of available space weather data.

Bhatt argued that space-weather data providers need to adopt standardized metadata as well as the FAIR practices, that is, ensuring that data are findable, accessible, interoperable, and reusable. At this point, most space-weather data do not fulfill those requirements. To ensure that FAIR practices are universally applied to both data and model output, the space-weather community and the funding agencies need to support the adoption of FAIR data practices.

In the next presentation, Zou highlighted “melting” boundaries between disciplines and answered the session’s two questions with that focus in mind. Concerning the investments needed to produce physics-informed ML for space weather, she emphasized the need for “sustained yet agile funding programs” to support interdisciplinary teams and interdisciplinary academic courses and programs designed to train the next generation of space physicists and, in particular, to equip them with the skills and knowledge they need to apply ML in their research. On the question of adequate data sources for ML applications, Zou said that data are still sparse in large part due to the disparate spatial and temporal scales relevant to space weather phenomena. She suggested establishing satellite constellations that would provide multi-point observations in diverse space plasma regimes. She also said that closer collaboration is needed between government agencies and the private sector to harness the boom of the private space industry and AI technologies.

Cohen, considering the question on the adequacy of data, highlighted the fact that while space science data are “big data,” they are also misbehaving data, rife with missing and inconsistent information. Space weather ML applications aim to produce interpretable outcomes from a fusion of theory, models, and multiple data sets that span decades, often without overlap for intercalibration. As such, these problems are of interest also to the data science community, and he recommended marketing the potential for new advances applicable also to other research fields to data scientists. To encourage advances in this area, Cohen suggested incentivizing space-based data science course development at computing programs and departments, funding new space scientist faculty hires in computer science departments, and specifically tackling the challenge of “misbehaved data” in data science using the space weather data as an example.

Fouhey, an ML and computer vision (CV) expert involved in space weather problems, spoke about how to get more ML and CV people involved in space research. First, it is important to provide easy entry points for ML experts. Second, space weather problems that are interesting and have research benefits and societal impacts should be on offer. Beyond this, Fouhey noted that it takes time to become proficient in a new field; therefore, ML experts would benefit from discipline-scientist-led “welcome mats” (such as the NASA-established frontier development labs) as well as sustained cross-discipline funding. Looking to the future, Fouhey said that nurturing strong collaborations between space-weather research and ML-centered research could lead to (1) data pipelines and instruments that let advances in ML aid space weather research, (2) work that advances both ML and space weather, and (3) the co-design of future instruments optimized for ML-based support and investigations.

Finally, Marlowe offered a high-level view of what is important in a successful use of ML: (1) Make sure that the data are ML-ready, in particular, that they are available and optimized for consumption. (2) “Get technology out of the way” so that researchers can focus on the most important parts of the problem. (3) Make sure that the right people are available through collaborative teams that have a blend of technical and domain experts. (4) Ensure appropriate training and knowledge sharing so that workers can effectively ask and answer questions of when or whether to apply ML. Finally, she concluded, it is important to encourage and lean in to new and innovative approaches and learn from failures across the disciplines.

Discussion

The first topics addressed in the panel discussion were how to engage ML experts in solving space weather questions and, in particular, how to establish collaborations by building on the “low-hanging fruit.” One suggestion for an entry point was using ML to classify data in various ways, a process that would require both physical understanding and ML techniques. But there does need to be an acknowledgment that gray-box/black-box approaches can make physicists inherently uncomfortable.

The moderator posed two questions to the panel: Are there sufficient data and curation available to use ML for research, validation, and uncertainty quantification? And how does one go from a curated data set being used to test an ML application to using ML in operations, where the data are not nearly as high quality, continuous, or validated?

Bortnik said that the answer will probably always be “no” to having enough data, but that should not prevent the pursuit of these approaches. Fouhey continued that the handling of operational (lower-quality) data is already a data science research topic, which reinforces the point that space physics provides research topics that are of interest to ML and data scientists as well. Marlowe emphasized the importance of the “unglamorous” but time-consuming data preparation and curation tasks that can take more than half of the time of any ML project. Tomoko Matsuo spoke about the role of observing system simulation experiments (OSSEs) in ML validation, in particular for evaluating model performance in different data ecosystems.

Crowd-sourcing through “coding challenges” has been successfully used in many fields to attract broad participation from the public and from new research communities. Jack Ireland noted that these require clear evaluation benchmarks and well-posed questions. Defining evaluation benchmarks for ML space weather applications could aid their development, support crowd-sourcing, and attract broader awareness (e.g., through prizes) of these topics. Public engagement could be enhanced through prize-offering challenges, such as the recent TopCoder challenge to develop comet-detecting algorithms, for the dual purpose of advancing space science and bringing the excitement of space physics to a broad group of coding experts.

Lastly, classroom education in geosciences programs is needed at both the undergraduate and graduate level; topics would include ML pre-curated data sets and benchmarks as well as space weather domain knowledge.

Zou made the comment that it might be possible to encourage cross-discipline training by offering various incentives, such as certificate programs for students in other fields. While some certificate programs in ML and statistics exist, there are only a few opportunities for ML and statistics experts to get certificates in space weather subdisciplines.

The session also inspired a spirited discussion on the Zoom chat concurrent with the presentations. Strong arguments were made that all scientists should have ML expertise, and there was strong support for choreographed cross-disciplinary programs (e.g., the Frontier Development Lab developed by NASA/JPL). In developing these collaborations, it will be important to ensure that the problems and ML approaches are interesting to both parties, offering analysis-ready challenging data sets to the ML experts and novel methodologies to the physicists. For example, the Earth Science Information Partners (ESIP) Data Readiness Cluster provides standards for developing AI-ready data sets.

Two questions were posed concerning the potential power of OSSEs: (1) Are the models sufficiently well-defined that the OSSE methodology makes sense? and (2) How does one carry out an OSSE-like analysis for ML models? Replies to these questions suggested that very high-quality models need to be at the center of the test, but that the answers depend on the domain—some domains are mature enough for OSSEs while others might not be.
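The logic of an OSSE can be sketched in a few lines: a “nature run” stands in for the true system, synthetic observations are drawn from it as a candidate observing system would see it, the observations are assimilated, and the result is scored against the known truth. Everything below is a toy assumption for illustration, including the sinusoidal nature run, the persistence forecast, the scalar update, and the two candidate observing systems.

```python
import math
import random


def nature_run(steps):
    """Hypothetical 'truth' that the experiment treats as the real system."""
    return [math.sin(0.2 * t) for t in range(steps)]


def simulate_observations(truth, every, noise_sigma, rng):
    """Sample the nature run as a candidate observing system would see it."""
    return {t: truth[t] + rng.gauss(0.0, noise_sigma)
            for t in range(0, len(truth), every)}


def assimilate(observations, steps, obs_var, model_error_var=0.05):
    """Persistence forecast plus a scalar variance-weighted update when data exist."""
    state, var, analysis = 0.0, 1.0, []
    for t in range(steps):
        var += model_error_var                 # forecast uncertainty grows each step
        if t in observations:
            gain = var / (var + obs_var)
            state += gain * (observations[t] - state)
            var *= (1.0 - gain)
        analysis.append(state)
    return analysis


def rmse(estimate, truth):
    """Root-mean-square error of the analysis against the nature run."""
    return math.sqrt(sum((e - t) ** 2 for e, t in zip(estimate, truth)) / len(truth))


def osse(every, noise_sigma, steps=200, seed=1):
    """Score one candidate observing system against the known nature run."""
    rng = random.Random(seed)
    truth = nature_run(steps)
    obs = simulate_observations(truth, every, noise_sigma, rng)
    return rmse(assimilate(obs, steps, noise_sigma ** 2), truth)


# Two made-up candidate observing systems, scored against the same nature run.
frequent_noisy = osse(every=1, noise_sigma=0.3)
sparse_precise = osse(every=10, noise_sigma=0.05)
```

Comparing the two scores indicates which candidate data set would add more value in this toy setting. The comparison is only as meaningful as the nature run and forecast model at its center, which is the maturity concern raised in the discussion.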

DATA FUSION AND ASSIMILATION

The Data Fusion and Assimilation Panel, moderated by Charles Norton, had six panelists: Mark Cheung of Lockheed Martin; Bernie Jackson of the University of California, San Diego; Slava Merkin of the Applied Physics Laboratory (APL) of Johns Hopkins University; Alex Chartier, also of APL; Tomoko Matsuo of the University of Colorado Boulder; and Eric Blasch of the U.S. Air Force Office of Scientific Research. The panelists were asked to address three questions:

What are the new data assimilation/fusion approaches that will likely lead to improved space weather forecasting performance?
Do you anticipate adequate data resources for these schemes? If not, how can data buys or other investments alleviate shortcomings?
How can we quantify uncertainty in data assimilation schemes that use multi-source observations?

A discussion followed the panelists’ presentations.

In the first presentation, Cheung addressed each of the three questions in turn. Concerning promising new approaches, he said that the large dimensionality of space weather problems requires efficient, scalable DA techniques, such as density estimation and data generation (e.g., normalizing flows), physics-informed neural networks, and neural network–based surrogate models. He said that the data resources are not adequate—either now or in the future—for modeling solar magnetic activity and its space weather impacts in the heliosphere; an adequate result would require multi-point measurements, potentially including solar polar magnetic field measurements. He added that the Solar Dynamics Observatory, the key data source today, does not have redundancy or continuity, and he said that it is important to now invest in future infrastructure, such as ngGONG or space-borne vector magnetographs. Data buys could help solve the problem if they would lead to lower costs of the magnetograph data.

On quantifying uncertainty, Cheung pointed to the power of approximate Bayesian computation methods and said that the model quality (or appropriateness) could be assessed by developing appropriate cross-domain, cross-source cost functions.
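As a rough illustration of the approximate Bayesian computation idea mentioned above, the following minimal sketch runs a rejection sampler on a toy problem. Everything in it is an assumption made for illustration and not something presented at the workshop: the Gaussian forward model, the flat prior, the sample mean as summary statistic, and the tolerance value.

```python
import random
import statistics

def simulate(theta, n, rng):
    """Toy forward model (assumed): n noisy measurements scattered around theta."""
    return [rng.gauss(theta, 1.0) for _ in range(n)]

def abc_rejection(observed, n_draws, tolerance, rng):
    """Keep prior draws whose simulated summary statistic is close to the data's."""
    obs_summary = statistics.fmean(observed)
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(-5.0, 5.0)  # draw from a flat prior (assumed)
        simulated = simulate(theta, len(observed), rng)
        if abs(statistics.fmean(simulated) - obs_summary) < tolerance:
            accepted.append(theta)
    return accepted

rng = random.Random(0)
observed = simulate(2.0, 50, rng)  # synthetic "observations" with true theta = 2
posterior = abc_rejection(observed, n_draws=5000, tolerance=0.1, rng=rng)
posterior_mean = statistics.fmean(posterior)
posterior_spread = statistics.stdev(posterior)
print(len(posterior), round(posterior_mean, 2), round(posterior_spread, 2))
```

The spread of the accepted draws serves as the uncertainty estimate. No likelihood function is ever written down, which is the appeal when the forward model is a simulation; the cost is that many simulations are discarded.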

Jackson spoke about the short-term opportunity to improve global heliospheric analyses by taking full advantage of all (potentially) available data, including the Worldwide Interplanetary Scintillation Stations (WIPSS). A longer-term solution should combine ground-based remote-sensing data with space-borne heliospheric imagers.

Merkin talked about combining first-principles and data-derived approaches in global geospace modeling. He said that the types of solutions will depend on the data types and availability and the physics-based models under consideration. Within geospace, the data are not only sparse in spatial sampling but also unevenly sampled, with some variables measured in situ, others remotely sensed, and some not at all. He said that uncertainties arise both from missing physics and from the challenges of modeling the vast range of spatial and temporal scales. Merkin expressed caution about uncertainty estimation through single-point data-model comparisons, saying that uncertainty should instead be assessed using accuracy over a time window and metrics that weight physical relevance.

Alex Chartier asserted that the largest source of uncertainty comes from the electromagnetic and kinetic energy input into the ionosphere and upper atmosphere, which are crucial drivers of plasma convection and other (high latitude) space weather phenomena. Because these inputs have a high degree of temporal and spatial variability, they cannot be observed from a single vantage point; the best way to observe those parameters would be a low-Earth-orbit satellite constellation. While the 66-satellite Iridium constellation provides Active Magnetosphere and Planetary Electrodynamics Response Experiment (AMPERE) magnetometer data, other types of data needed include electric fields, particle precipitation, and plasma conductivity, ideally at high temporal and spatial resolution and well curated to remove biases and characterize uncertainties.

Matsuo advocated using systems approaches to unify observational and modeling capabilities, building on existing infrastructure (e.g., from the National Weather Service DA and ensemble weather forecasting facilities). While OSSEs can be useful for testing such systems, physics missing from the system description can give rise to degeneracies that make it difficult to determine whether the errors arise from the data or from the missing physics.

It is important to quantify not just model errors, Matsuo said, but also the “representativeness errors” that indicate systemic problems in data interpretation and handling (such as retrieval process uncertainties related to how sensor data are transformed into geophysical data products). Furthermore, it was pointed out that the space science community faces some self-imposed conflicts between the interoperability requirement of DA and the stand-alone science justification requirement of NASA missions.

In the final presentation, Blasch identified nonlinear, non-Gaussian evidential reasoning as a new technique that can be used to improve space weather forecasting. He suggested that the best way would be to use evidential reasoning as the last step in a process that otherwise relies on ML, which would speed up the system and also capture the uncertainty. Finally, data fusion with nonlinear, non-Gaussian DA (e.g., the unscented Kalman filter) can be used to make estimates of the states of nonlinear systems.
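The unscented Kalman filter mentioned above is built around the unscented transform, which pushes a small set of deterministically chosen "sigma points" through a nonlinear function instead of linearizing it. The sketch below shows only that transform, not a full filter, and is not taken from Blasch's presentation; the function name, the `kappa` scaling parameter, and the polar-to-Cartesian example are illustrative assumptions.

```python
import numpy as np

def unscented_transform(mean, cov, f, kappa=1.0):
    """Propagate a Gaussian (mean, cov) through f using 2n + 1 sigma points."""
    n = mean.size
    # Columns of the scaled Cholesky factor give the sigma-point offsets.
    offsets = np.linalg.cholesky((n + kappa) * cov)
    points = [mean]
    for j in range(n):
        points.append(mean + offsets[:, j])
        points.append(mean - offsets[:, j])
    weights = np.full(2 * n + 1, 1.0 / (2.0 * (n + kappa)))
    weights[0] = kappa / (n + kappa)
    propagated = np.array([f(p) for p in points])
    new_mean = weights @ propagated
    deviations = propagated - new_mean
    new_cov = (weights[:, None] * deviations).T @ deviations
    return new_mean, new_cov

# Hypothetical example: an uncertain (range, bearing) pair mapped to x, y.
mean = np.array([1.0, np.pi / 4])
cov = np.diag([0.01, 0.05])

def to_cartesian(p):
    return np.array([p[0] * np.cos(p[1]), p[0] * np.sin(p[1])])

xy_mean, xy_cov = unscented_transform(mean, cov, to_cartesian)
print(xy_mean)
print(xy_cov)
```

For a linear function the transform reproduces the exact mean and covariance, which makes a convenient sanity check. A full filter would add a measurement update step built on the same sigma points.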

Blasch commented that the challenge in quantifying uncertainty is that there are so many different types of it—an ontology of uncertainty in the data fusion community identifies more than 50 sources of uncertainty. It is important to identify the types of uncertainty, be they absolute or relative, forward or inverse, epistemic or aleatoric.

Discussion

In the discussion, Norton asked about data fusion products that are not currently available and that are not directly observable. Generally speaking, even with its limitations, data fusion opens possibilities; in particular, multi-sensor approaches can help resolve new physical parameters. Successful examples include deducing the heliospheric magnetic field using ground-based Faraday rotation measurements of galactic sources combined with models of ionospheric total electron content, or constructing an extreme ultraviolet irradiance emulator from the Solar Dynamics Observatory (SDO) Atmospheric Imaging Assembly instrument using training data (fusion) from the SDO’s Extreme Ultraviolet Variability Experiment.

Next Norton passed along the question of which are the most effective data with which to improve DA-based models. Matsuo noted that the ionospheric plasma eddy diffusion and conductance could both be useful additions, but anything that can constrain the models is helpful.

Chartier commented that data buys for space weather purposes are only possible if someone builds the instruments. To that end, he urged the community to “think big” and to communicate the need for specific types of sensors and instruments on constellations to the space companies, proposing agreements that are mutually beneficial.

An extensive Zoom chat took place during the session. The discussion initially centered on solar data, with (unanswered) questions regarding data buys for solar magnetograms, and whether L4 and L5 views would be adequate (which is not yet clear). Ongoing efforts include combining modeling, DA, and available data to improve boundary inputs to global models, and DA with multi-view solar wind data, which relies on data that are at this point not operationally available. ML (with past measurements as input) could augment sparse data that could then be assimilated; however, it is not clear how independent the resulting data would be for statistical purposes.

For measurements of particles and fields, networks of small or nanosatellites might be capable of covering the spatial and temporal scales without the cost becoming prohibitive. However, as the correlation lengths and time-scales of interest are still open questions, the total number of necessary spacecraft remains unknown.

In any case, the challenge remains of enticing companies to build commercial off-the-shelf (COTS) science-grade instruments. OSSEs were once again mentioned as means to constrain the options.

In contrast with a statement by an earlier panelist, one chat participant expressed support for using JEDI as a framework, stating that it is interoperable with different forecast models. Another participant said that using the Faraday-rotation analysis of fields and electron density in the heliosphere and the ionosphere might suffer from noise created by near-Earth auroral kilometric radiation.

The Zoom chat discussions also covered the recent SpaceX incident in which more than 40 satellites were lost due to orbit maneuvers during a G1-level geomagnetic storm. Later analysis showed that while the storm itself was not remarkable, these conditions led to significant increases in the atmospheric neutral density. This complex event revealed problems in the operational Mass Spectrometer and Incoherent Scatter radar atmosphere model, highlighting the potential for problems during larger G5-level storms, given the challenges presented by this G1 storm. Given the data availability, the incident was seen as a good testbed for ML approaches.

ENSEMBLE MODELING

The Ensemble Modeling Panel, moderated by Mary Hudson, had six panelists: Sean Elvidge of the University of Birmingham, Eric Adamson of the Space Weather Prediction Center (SWPC) of NOAA, Dan Welling of the University of Texas at Arlington, Nick Pedatella of the University Corporation for Atmospheric Research, Kent Tobiska of Space Environment Technologies, and Edmund Henley of the UK Meteorological Office. The following questions were posed to this panel:

The terrestrial weather community does multiple-model ensemble modeling. Is that practical for space weather in the near term?
What data sources are needed but unavailable (proprietary, classified, etc.) that are hampering next steps? Do we work to get them available or can other methods (ML, data curation, fusion, etc.) suffice?

A discussion period followed the presentations.

Elvidge’s general overview noted that ensembles can be used in a variety of ways: for DA, with multi-model ensembles (MMEs), or in uncertainty quantification. While the use of MMEs is in its infancy, it has demonstrated an ability to reduce errors, which has led to its use especially to reduce model-propagation uncertainty in DA. A key question in using MMEs is how to generate independent (orthogonal) ensemble members that are needed to estimate covariance matrices, to reduce propagation errors, and to ascertain that errors in individual contributions will effectively cancel. In closing, Elvidge noted that ensemble modeling does require an investment in computational resources and data storage, as the amount of data increases linearly with the number of ensemble members and with the number of different models.
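The two points above, estimating covariance matrices from ensemble members and the linear growth of data volume, can be sketched in a few lines. This is a minimal illustration with made-up sizes; the function names, the 12-member ensemble, the grid size, and the 72 output times are assumptions and do not describe any operational system.

```python
import numpy as np

def ensemble_statistics(members):
    """Sample mean and covariance; members has shape (n_members, n_state)."""
    mean = members.mean(axis=0)
    anomalies = members - mean
    covariance = anomalies.T @ anomalies / (members.shape[0] - 1)
    return mean, covariance

def archive_bytes(n_models, n_members, n_state, n_times, bytes_per_value=8):
    """Storage for an archive that keeps every member of every model."""
    return n_models * n_members * n_state * n_times * bytes_per_value

rng = np.random.default_rng(0)
members = rng.normal(size=(12, 4))  # 12 hypothetical members, 4 state variables
mean, covariance = ensemble_statistics(members)

one_model = archive_bytes(1, 12, 1_000_000, 72)
three_models = archive_bytes(3, 12, 1_000_000, 72)
print(covariance.shape, one_model / 1e9, three_models / 1e9)
```

One caveat relevant to the independence question: a sample covariance built from N members has rank at most N - 1, so when the state has far more variables than the ensemble has members, the estimate is rank-deficient and members that are nearly copies of one another add little information.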

Adamson described the state of ensemble model deployment at SWPC. At present, he said, while no ensembles are in operation, their potential to constrain uncertainties is recognized, and development work is ongoing in the solar/solar wind domain. The uncertainties in, for example, CME models are mainly observational and can be addressed by improved coronagraphs and new measurement vantage points.

Adamson returned to the two types of ensemble models, those comprising multiple realizations of the same model versus MMEs. As an example of the former, he showed a graph with predictions made by the Air Force Data Assimilative Photospheric Flux Transport (ADAPT) model, which is a 12-member ensemble with initial condition variation in supergranulation (Figure 5-3). Different initial conditions led to very different predictions for the arrival times of the CME.

FIGURE 5-3 Variation in the Air Force Data Assimilative Photospheric Flux Transport (ADAPT) results due to differing boundary conditions for the solar wind speed at Earth orbit. Blue and green lines are different model runs, while the black line is the actual observation.
SOURCE: Eric Adamson, NOAA Space Weather Prediction Center, presentation to workshop, April 14, 2022.

SWPC has done little with MMEs, Adamson said, in large part because so much of the uncertainty is due to a lack of data. Although MMEs can be resource-intensive, they can be of interest in some cases—for example, in understanding how to effectively prune the ensembles, especially in the context of coupled-model systems.

Welling spoke about the use of ensembles in the geospace regime, particularly global magnetohydrodynamics coupled to ionosphere and ring current models. At present, he said, the use of ensemble models is very limited in this area and is mostly of the proof-of-concept type. As the validation efforts are not well refined, the performance of ensemble-based approaches remains unclear. However, he continued, early work has shown that ensembles can increase the predictive skill of geospace models. In particular, studying model sensitivity to boundary conditions and missing or new physics, exploring the physics hidden in input parameters, and quantifying uncertainties across all inputs were mentioned as “low-hanging fruit.”

Maximizing the effectiveness of ensemble modeling requires a number of new resources, he said. Covering the driving with upstream L1 and near-Earth monitors and monitoring the impacts with a dense network of ground magnetometers and auroral imagers would make it possible to validate what Welling described as “one of our weakest points of modeling: auroral conductance and dynamics.” Finally, he commented on the challenges in geospace DA: the data sampling needs to cover large volumes, and the processes described cover a range from very local to global. This often leads to point-source sampling that perturbs DA locally with undesirable global effects for the models. Assimilation of ionospheric electrodynamics, auroral observations, and ground magnetometers may be possible, although the viability needs to be assessed.

Pedatella spoke about the need to develop and evaluate new approaches for generating ensembles to address issues such as the ensemble not reflecting the uncertainty in the input parameters or in the model itself, or sensitivity of the model to initial and boundary conditions caused by chaotic dynamics. MMEs are widely used in weather and climate applications, and Pedatella saw opportunities for their use also in space weather. However, implementing MMEs for ionosphere–thermosphere applications will require new research as well as investments in the computing infrastructure, running the simulations, and distributing the results.

Tobiska described ensemble modeling of the thermosphere at the 18th Space Control Squadron of the U.S. Space Force. The High-Accuracy Satellite Drag Model (HASDM) is run near-continuously to predict thermospheric density for the coming 72 hours and used to inform satellite operations. There is a recognized need to reduce the absolute error in the model, Tobiska said.

Henley offered an operational perspective on the quantification of uncertainties in ensemble modeling. He began by noting that the operational needs are uncertainties due to current conditions—weather-of-the-day uncertainties—rather than uncertainty in performance against climatological references. As the classic ensemble approach to uncertainty is expensive, he recommended cheaper options, such as models that use simpler physics, reduced-order models with simplified internal dynamics, surrogate/emulator approaches as used in climate science, or ML-based models of the uncertainties themselves. These cheaper options may suffice for operational forecast needs.

Henley also made the point that it is important to communicate with the users to understand their needs and educate them on new opportunities. Resources spent on quantifying uncertainties are wasted if they do not inform the users’ decision making, while providing that information in more helpful and targeted formats might increase their use. Terrestrial weather and climate forecasting rely on social science input to help identify users’ needs, he said, including how to interpret uncertainties. As an example he mentioned Decider, a multi-criterion group decision-making support system that can help forecasters supply uncertainties that will be informative to a particular set of users.

Discussion

Hudson began the discussion period by asking why ensembles of models should be preferred over the single model in the ensemble that is demonstrably the best. Pedatella responded that because it is rare that one model will always be the best, combining multiple models can ensure better results over a range of different situations. Elvidge added that not only is it often difficult to tell which model is best, but even if one is consistently better than the others, an ensemble of models will outperform the single best one.
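The claim that an ensemble can outperform its best member can be illustrated with synthetic data. The sketch below is a toy demonstration, not evidence from the workshop, and it leans on strong assumptions: the three hypothetical models have unbiased, mutually independent errors, and the error sizes are invented. With correlated errors, or with one model far better than the rest, an equally weighted mean may not beat the best member.

```python
import numpy as np

rng = np.random.default_rng(42)
truth = np.sin(np.linspace(0.0, 20.0, 2000))  # synthetic "true" signal

# Three hypothetical models with independent, unbiased errors (assumed).
error_std = [1.0, 1.2, 1.5]
forecasts = np.array([truth + rng.normal(0.0, s, truth.size) for s in error_std])

def rmse(forecast, reference):
    """Root-mean-square error of a forecast against a reference series."""
    return float(np.sqrt(np.mean((forecast - reference) ** 2)))

single_rmse = [rmse(f, truth) for f in forecasts]
mme_rmse = rmse(forecasts.mean(axis=0), truth)  # equally weighted multi-model mean
print([round(r, 2) for r in single_rmse], round(mme_rmse, 2))
```

Under these assumptions the error variance of the mean is the sum of the individual variances divided by the square of the number of models, which here is smaller than the variance of the best single model because the independent errors partially cancel.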

To Hudson’s request for a definition of ensemble models, Henley said it can be anything from changing something about the model (e.g., initial or boundary conditions) to altering the models being used. Tobiska said that the MME approach of using a spread of different models makes it possible to quantify the “uncertainty in our collective knowledge of the system.” In the case of individual models providing probabilistic forecasts, Elvidge said that the ensemble output represents a probability distribution from which the collective uncertainty in the understanding of the system can be estimated.

In the case of applying DA or ensemble modeling to the magnetosphere, Welling said that distributed multipoint measurements can provide localized nudging without an overall large impact, but the success of the process will depend on the details of the assimilated information.

In the Zoom chat, participants brought up the issue of MMEs versus ensemble DA versus a broader definition of a “perturbed physics, empirical, statistical, and machine-learning model.” The point was made that MMEs are preferred over a single “best-performing” model, both because the different models, despite their performance, can bring useful information and because a combination of multiple models can outperform even the best single model.

From the operational perspective, single models rarely perform well under all conditions and scenarios. MMEs are useful in operational settings because in these settings it is difficult to know a priori which regime is present. The MMEs provide information about variance across models but not always information on its causes. One example is a solar wind ensemble model where the differences in individual models of the solar wind are small compared with those arising from a CME, indicating that an MME for the solar wind component does not bring appreciable gains.

The Zoom chat touched on the lack of research on assimilating data (e.g., from ground magnetometers) into magnetospheric MHD models; several comments cited the inherent difficulties in imposing observational constraints from the ionosphere to the magnetosphere as well as the fact that there are easier and perhaps more fruitful research topics available.

The chat participants noted that winds and temperatures from lower-atmosphere weather models are used for the forcing from the lower thermosphere and that MMEs of these weather models can capture the uncertainty to improve thermosphere-model ensemble generation. For whole-atmosphere DA, the driving from the lower atmosphere is incorporated from assimilating observations in the lower atmosphere, with some constraint in the mesosphere and lower thermosphere (MLT) provided by research-satellite data (i.e., TIMED/SABER and Aura/MLS). Wind estimates from meteor radars have been tested but not yet shown to be effective.

DATA AND MODEL RESOURCES AND CURATION

Another panel was devoted to data and model resources and their curation, with a specific focus on R2O2R (research-to-operations and operations-to-research) efforts. Moderated by committee member Anthea Coster, the panel had six presenters: Carrie Black of NSF, William Schreiner of the University Corporation for Atmospheric Research, Jack Ireland of NASA, Rob Redmon of NOAA, Masha Kuznetsova of NASA’s Community Coordinated Modeling Center, and Alec Engell of NextGen Federal Systems. The following questions were asked of this panel:

Is a more coordinated/sustained data curation effort needed to support R2O2R?
What data sources are needed but unavailable (proprietary, classified, etc.) that are hampering next steps? Do we work to get them available or can machine learning, data curation, etc., take care of it, and how?

Black offered an overview of NSF data programs in space science, which are served by two NSF divisions, the Division of Astronomical Sciences and the Division of Atmospheric and Geospace Sciences, and noted that the agency is on the research end of research-to-operations. The unmet data infrastructure needs of that community include an easy-access, user-friendly data infrastructure, file and data standardization, documentation of data for record keeping and end users, and the development of one or more data repositories. NSF supports solar and space physics data systems, such as the Madrigal Database for archival and real-time data from upper atmospheric science instruments and the Community Coordinated Modeling Center for space research models. Although NSF does provide funding for the CubeSat program, Black said that most of its support is focused on ground-based data acquisition.

While NSF has the FAIR (findable, accessible, interoperable, reusable) data policy in place, Black said, data management plan requirements are not uniform, and the reporting of these topics lacks guidance or enforcement. Noting that the decadal survey strategy considers the “data and computing infrastructure needed to support the research strategy and the long-term utility, usability, and accessibility of acquired data,” she suggested that community participation in the decadal survey might be augmented by developing curricula for data resource use and open access and by forming working groups to address some of the topics.

Schreiner spoke about data curation and analytics issues specifically related to radio occultation data. He offered a historical overview of missions providing radio occultation data, beginning with GPS/MET (for GPS Meteorological experiment) in 1995 and culminating with COSMIC-2 (Constellation Observing System for Meteorology, Ionosphere, and Climate 2) and today’s commercial data buys from global navigation satellite systems. Commercial radio occultation data are an integral part of assessing the state of the ionosphere both now and in the foreseeable future; COSMIC-2 combined with the commercial buys can provide global coverage with some gaps poleward of ±40 degrees latitude. However, the radio occultation data need standard data and metadata formats, and the large data volumes and complex models require cloud-computing environments. In general, more coordinated and sustained data curation efforts as well as community-developed space weather assimilation models are needed to support R2O2R. Concerning data needs, Schreiner pointed to low-latency access, gridded products, standardization of data formats, and data-proximate cloud computing environments that would enable science applications and assimilative space weather models.

Ireland spoke about NASA’s Heliophysics Digital Resource Library (HDRL), which makes heliophysics data, tools, and services available to the broader community, with a focus on the research side of R2O. The high-level strategy behind HDRL can be broken down into four main categories of goals: preserve (i.e., maintain and improve existing archives), discover (support researchers’ efforts), explore further (enable big data research by bringing together high-end computing and large data sets), and share and publish (support collaborative research and publishing platforms). Recognizing the increasing volume and diversity of solar data, HDRL wishes to help users accelerate heliophysics research. To do so, NASA has set out four broad strategies: enabling open science, lowering current barriers to doing research, implementing new critical capabilities, and being responsive to changing community needs. The science infrastructure for these strategies has four components: the Heliophysics Data and Model Consortium, the Space Physics Data Facility, the Solar Data Analysis Center, and various collaborators, such as the Community Coordinated Modeling Center (CCMC) and the Center for HelioAnalytics. In other words, the path forward will involve both NASA internal deliberations and community inclusion and direction.

Redmon suggested that one way to support R2O2R would be to use new programs such as NASA’s HDRL (described earlier by Ireland) and NOAA’s Space Weather Follow On (SWFO) program to take advantage of interoperable data and metadata standards (e.g., SPASE, the Space Physics Archive Search and Extract) and existing interfaces (e.g., the Heliophysics Application Programming Interface and the Deep Space Climate Observatory [DSCOVR]) to develop benchmark data sets similar to, or augmenting, climate data records. Another strategy would be to take advantage of collaborations among government, academia, and industry, such as Earth Science Information Partners.

Redmon called for continuing efforts to bring existing but unavailable data to the research community to drive future R2O2R, for example, via the Space Weather Operations, Research, and Mitigation (SWORM) Subcommittee. Notably, he said, NOAA is working to migrate all its data holdings to the cloud and is embracing the FAIR policy. Further ideas included tackling modeling challenges with crowd-sourcing (with prize money) and applying ML or AI to anonymize proprietary or protected data for R2O2R use.

Kuznetsova described the Community Coordinated Modeling Center and its capabilities. She focused on the role that the “shared proving grounds” between CCMC and the SWPC play in R2O2R (Figure 5-4), providing, for example, researchers access to operational data streams, model inputs, and simulation outputs through the CCMC.

Kuznetsova listed a number of steps that CCMC is planning to take to improve its processes and products, such as establishing and following best practices for on-boarding and implementation, improving the quality of simulation archives, improving the robustness and speed of simulations, and a move toward “plug and play” models as part of open-source pilot projects. The CCMC also wishes to increase involvement of the modelers in transitioning models to CCMC and implementing GPU-ready code, but the funding for these items was not addressed.

FIGURE 5-4 A research-to-operations/operations-to-research (R2O2R) pipeline between the operational Space Weather Prediction Center (SWPC) and the Community Coordinated Modeling Center (CCMC).
SOURCE: Masha Kuznetsova, NASA Community Coordinated Modeling Center, presentation to workshop, April 13, 2022.

Engell promoted coordinated high-level infrastructure and execution as a way to support R2O2R, and he called for the definition of best engineering practices drafted by a focused team with outside expertise. He listed a number of specific challenges, including the diversity of data, lack of curation standards, limited availability of operational and research models, and the scarcity of data from both space- and ground-based instruments. He pointed out the many data curation and R2O2R efforts that are progressing in parallel (e.g., NASA’s HPDE/HPCLoud, the SWPC Testbed, the Space Force’s Space Domain Awareness Environmental Toolkit for Defense [SET4D], the Air Force’s Unified Data Library, NOAA AI efforts), and he characterized this as a positive development. The Earth sciences community is developing substantial infrastructure, specifically Pangeo,² as a community platform for big data geoscience, and the space weather community could learn from that effort and work toward building a “Panhelio” facility.

Discussion

The Zoom chat included significant discussion centered on concerns about guaranteeing the longevity of data curation and archiving. The potential role of the World Data Center was mentioned, as was its limited role in space weather data curation and archiving. Cloud resources were seen as preferable to facility-based storage systems, but some expressed concerns about the longevity of that solution.

___________________

² See Pangeo, “Homepage,” https://pangeo.io, accessed August 10, 2022.

The very definition of “long term” was discussed in light of 3- to 5-year planning and proposal funding cycles, NSF facility hardware divesting requirements (which do not include long-term data archival), and the challenges of archiving historical data, including data owned by principal investigators.

With regard to data buys, concerns were expressed about quality control, longevity, reliability, and user access. NOAA’s current radio occultation data buys are made available to the research community with a 24-hour delay, but the committee heard nothing about standards or practices regarding these data.

One commenter said that radio frequency data from the solar corona are key to space weather research and forecast operations, but their archiving, curation, and continued acquisition were not addressed during the workshop, and the NSF support focused on these observations is small.

Similarly, there was concern about the data gap that will be left when the Defense Meteorological Satellite Program (DMSP) and the Polar Operational Environmental Satellite (POES) programs are retired. This will especially affect the ability to monitor spacecraft charging in the auroral zone, forcing operators to resort to models for anomaly resolution.

Further discussion regarding the archiving of SWPC operational models and forecasting products revolved around NOAA’s National Centers for Environmental Information (NCEI), which is intended to archive all SWPC operational forecasts and models. It was unclear what exactly NCEI is mandated to archive, and that lack of clarity mirrors the current lack of resources, especially when coordinating across agencies.

The topic of interagency coordination becomes even more problematic when considering data sharing agreements with international missions. For NASA, such agreements would involve an office other than the science mission directorate responsible for the mission to direct the HDRL to act according to the agreement. The international aspect also points to the question of whether better and more sustained data curation would support the use of non-NASA data sets in NASA proposals; this question was viewed by the panelists as a high-level policy issue.
