
2 Context for Automated Research Workflows
Pages 19-36

The Chapter Skim interface presents what we've algorithmically identified as the most significant single chunk of text within every page in the chapter.


From page 19...
... Technological developments go hand in hand with scientific progress, and advances in computing and automation are no exception. Computing plays a central role throughout research workflows, from computerized models used for simulation and prediction, to control of equipment and data analysis, to publication.
From page 20...
... FIGURE 2-1  Knowledge discovery loop. NOTE: Automated research workflows can automate and close the loop of scientific discovery.
From page 21...
... To cast the discussion in modern machine learning (ML) terms, the closed-loop research workflow in Figure 2-1 encapsulates a form of reinforcement learning (Sutton and Barto, 2018).
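The closed loop described here can be illustrated with a minimal sketch. This is not a method from the report: it assumes a simple epsilon-greedy strategy as one elementary instance of reinforcement learning, and `run_experiment`, the condition names, and the yield values are hypothetical stand-ins for a real instrument or simulation.

```python
import random


def run_experiment(condition, rng):
    """Hypothetical stand-in for an instrument or simulation: returns a noisy yield."""
    true_yield = {"A": 0.2, "B": 0.5, "C": 0.8}  # assumed values, for illustration only
    return true_yield[condition] + rng.gauss(0, 0.05)


def closed_loop(conditions, n_rounds=200, epsilon=0.1, seed=0):
    """Repeat design -> execute -> learn, and return the estimates and trial counts."""
    rng = random.Random(seed)
    estimates = {c: 0.0 for c in conditions}
    counts = {c: 0 for c in conditions}
    for round_index in range(n_rounds):
        # Design: try every condition once, then mostly exploit the best estimate.
        if round_index < len(conditions):
            choice = conditions[round_index]
        elif rng.random() < epsilon:
            choice = rng.choice(conditions)
        else:
            choice = max(conditions, key=estimates.get)
        # Execute: run the chosen experiment.
        result = run_experiment(choice, rng)
        # Learn: update the running mean for the chosen condition.
        counts[choice] += 1
        estimates[choice] += (result - estimates[choice]) / counts[choice]
    return estimates, counts


estimates, counts = closed_loop(["A", "B", "C"])
best = max(estimates, key=estimates.get)
```

Each pass through the loop uses what has been learned so far to pick the next experiment, which is the sense in which the workflow "closes the loop." A real automated research workflow would replace the running mean with a domain model and the random exploration with a principled experimental design.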
From page 22...
... Reproducibility and Replicability
Reproducibility is obtaining consistent results using the same input data; computational steps, methods, and code; and conditions of analysis.
From page 23...
... AI and ML techniques deployed within ARWs not only can drive an experiment and mine the literature to suggest future experiments, but also may enhance research reliability and productivity by facilitating the reuse of workflows and improving the ability of researchers to monitor workflow execution and detect anomalies (Deelman et al., 2019).
From page 24...
... As the nature of research problems and the cyberinfrastructure platform for exploring them have become more powerful and complex, scientific workflow engines have played a crucial role in harnessing and coordinating distributed computing and data resources. Scientific workflow engines are software tools that capture the computational analysis pipeline of a research project, providing provenance tracking and other functions that facilitate automation, reproducibility, and reusability.
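As a rough illustration of what "capturing the pipeline with provenance tracking" can mean, the sketch below runs a list of steps in order and records a content hash of each step's input and output. It is a toy under stated assumptions, not the design of any actual workflow engine: `run_pipeline`, `fingerprint`, and the two example steps are hypothetical names, and the data are assumed to be JSON-serializable.

```python
import hashlib
import json


def fingerprint(obj):
    """Return a stable SHA-256 digest of a JSON-serializable object."""
    payload = json.dumps(obj, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def run_pipeline(steps, data):
    """Run the steps in order and return (final result, provenance log)."""
    provenance = []
    for step in steps:
        result = step(data)
        # Record which step ran and the hashes of what went in and came out.
        provenance.append({
            "step": step.__name__,
            "input": fingerprint(data),
            "output": fingerprint(result),
        })
        data = result
    return data, provenance


def drop_missing(values):
    """Example step: discard missing measurements."""
    return [v for v in values if v is not None]


def normalize(values):
    """Example step: scale values by the maximum."""
    top = max(values)
    return [v / top for v in values]


result, log = run_pipeline([drop_missing, normalize], [2, None, 4, 8])
```

Because each record's output hash matches the next record's input hash, the log forms a chain that can be compared between two runs. Identical logs indicate that the same data passed through the same steps, which is one simple way to check reproducibility and to decide whether a stored result can be reused.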
From page 25...
... There are also several distributed computing and automation frameworks with a narrower focus that capture specific execution patterns. These may also be intrinsically part of the "workflow." Examples include tools such as Spark or Hadoop that enable a large number of data processing tasks at scale, or cloud data stores such as BigTable that can execute queries across a large distributed data set.
From page 26...
... As the next generation of scientific workflow engines expands, automation of the scientific process can lead to a step change in the rate of discovery in many fields.
From page 27...
... Progress is being made in the number and diversity of domain-specific and general data repositories that support FAIR principles and provide archival functionality for long-term access to data and related research objects. Examples can be found in the Registry of Research Data Repositories.
Progress in Domain-Relevant Artificial Intelligence and Machine Learning
Another key factor in building ARWs is the continued advancement of learning algorithms for specific domains.
From page 28...
... Understanding and managing the interplay between models derived from domain knowledge, ML, and how the system iteratively drives experimental design constitute a continuing task for ARW development across domains.
IMPLEMENTING AUTOMATED RESEARCH WORKFLOWS: A CHANGING SCIENTIFIC PARADIGM
Over the past two decades, scientific workflow systems have matured as powerful tools, especially for "resource allocation, task scheduling, performance optimization, and static coordination of tasks on a potentially heterogeneous set of resources" (Altintas et al., 2019).
From page 29...
... Team science requires tools for managing, capturing, and advancing team collaboration, contribution, and communication as an open process, in addition to the discovery process and its reproducibility.
From page 30...
... Most workflow systems require that the collaboration adopt a specific set of tools and specific methodology for its research. That is, the workflow engines or other enabling tools may embody ways of conducting the work that need to be aligned with the human participants.
From page 31...
... However, the shift from individual workflow development to team science also creates the need for workflow systems to capture the process for validation, seamless integration, and repeatability of the team's activity. Figure 2-3 illustrates in lighter blue the system hierarchy supporting the discovery loop by which the research team interacts with the scientific workflow engine and other software tools to run ML or AI algorithms or methods in a computing infrastructure using data to learn about the model and then to design new experiments based on what is learned.
From page 32...
... POLICY AND INDUSTRY CONTEXT FOR AUTOMATED RESEARCH WORKFLOWS
Public Policy Readiness
Policy makers and funding agencies in the United States and Europe have articulated a research vision at a scale and complexity that implies robust support for the development and sustainability of ARWs. That is, while not explicitly singling out "support for ARWs," they point to the societal and economic benefits that AI and ML can bring about.
From page 33...
... It also recognizes the need to educate an AI-savvy scientific workforce. In June 2021, OSTP announced the formation of the National Artificial Intelligence Research Resource Task Force as part of implementing the NAIIA.
From page 34...
... In January 2020, the government allocated £300 million to UKRI to fund research infrastructure. Many UK research institutes and infrastructures are also playing key roles in, and providing pivotal input into, the EOSC and have led the initial computational development
5 See https://ec.europa.eu/digital-single-market/en/europe-investing-digital-digital-europe-programme.
From page 35...
... Industrial Use of Workflows
This discussion of industrial use of workflows focuses primarily on research applications. Industrial development and use of computational workflows extends beyond -- and predates -- the use of workflows in the realm of scientific research.
From page 36...
... Companies may also use proprietary workflow tools that store and manage data in nonstandard proprietary formats. Since there is little incentive for toolmakers to agree to standards among themselves, researchers may be unable to access or utilize data even if they are technically open.


This material may be derived from roughly machine-read images, and so is provided only to facilitate research.