Suggested Citation: "1 Introduction." National Academies of Sciences, Engineering, and Medicine. 2022. Automated Research Workflows For Accelerated Discovery: Closing the Knowledge Discovery Loop. Washington, DC: The National Academies Press. doi: 10.17226/26532.

Below is the uncorrected machine-read text of this chapter, intended to provide our own search engines and external engines with highly rich, chapter-representative searchable text of each book. Because it is UNCORRECTED material, please consider the following text as a useful but insufficient proxy for the authoritative book pages.

PREPUBLICATION COPY: Uncorrected Proofs

1 Introduction

The needs and demands placed on science to address a range of urgent problems are growing. The world is faced with complex, interrelated challenges in which the way forward lies hidden or dispersed across disciplines and organizations. Treatment for and immunization against COVID-19 are the most immediate examples at the time of this report, but the same is true of other disease areas such as cancers and Alzheimer’s disease, as well as climate change, natural disaster prevention and mitigation (earthquake risk assessment, hurricane forecasting), agriculture (feeding a growing world population with finite resources), and other critical areas.

For centuries, scientific research has progressed through iteration of a workflow built on experimentation or observation and analysis of the resulting data. While computers and automation technologies have played a central role in research workflows for decades to acquire, process, and analyze data, these same computing and automation technologies can now also control the acquisition of data, for example, through the design of new experiments or decision making about new observations.

The committee uses the term automated research workflows to describe scientific research processes that are emerging across a variety of disciplines and fields. Automated research workflows (ARWs) integrate computation, laboratory automation, and tools from artificial intelligence in the performance of tasks that make up the research process, such as designing experiments, observations, and simulations; collecting and analyzing data; and learning from the results to inform further experiments, observations, and simulations. While the specific tools and resources used and the tasks performed vary by field, the common goal of researchers implementing ARWs is to accelerate scientific knowledge generation, potentially by orders of magnitude, while achieving greater control and reproducibility in the scientific process. This enhanced capability, in turn, is enabling researchers to address qualitatively new questions and collaborate more effectively.

Artificial intelligence (AI) and machine learning (ML) techniques play an increasingly important role in ARWs, from uses in data exploration and analysis to the driving and directing of the larger research process as a closed-loop system where AI and ML analyses of results direct the next cycle of experimental design and planning.

The committee believes that ARWs constitute the next significant advance in the ongoing revolution in scientific research driven by advances in information technology and associated hardware infrastructure. Although the design of such ARWs remains in the hands of humans, execution can be automated and accelerated.

ARWs can help manage and exploit the exponentially expanding amount and availability of data. Within these data may well lie the solutions to many problems the world faces today, as well as to problems that we will confront tomorrow. Without technological assistance and automation, it would be impossible for humans to review, much less use, the enormous data resources that may prove pivotal to discovery. ARWs can increase the speed and quality of discovery. At the same time, ARWs provide a way to satisfy pressing demands across fields to increase interoperability, reproducibility, replicability, and trustworthiness by better tracking results, recording data, establishing provenance, and creating more consistent metadata than even the most dedicated researchers can provide by themselves. Thus, ARWs can also support more transparent and reliable science.
A growing number of research funders are encouraging or requiring that the data, methods (including analytical code), and other artifacts underlying the work that they support be openly available. Wide implementation of ARWs could encourage researchers to make more of their research data findable, accessible, interoperable, and reusable (FAIR) and facilitate data and software reuse and sharing in trusted repositories. Although open and FAIR research outputs are not a prerequisite to the use of ARWs, they are highly desirable and complementary.

WORK OF THE COMMITTEE

In 2019, the National Academies of Sciences, Engineering, and Medicine’s Board on Research Data and Information, in collaboration with the Board on Mathematical Sciences and Analytics and the Computer Science and Telecommunications Board, launched a study aimed at examining current efforts to develop advanced and automated workflows to accelerate research progress, including wider use of artificial intelligence (see Box 1-1 for the committee’s statement of task). An expert committee undertook the study with support from Schmidt Futures.

To accomplish its task, the committee held an initial meeting on August 13, 2019, in Washington, DC, to organize the study process, identify information needs, and develop the agenda and identify participants for a public workshop. As the primary information-gathering mechanism, a 2-day virtual workshop, “Opportunities for Accelerating Scientific Discovery: Realizing the Potential of Advanced and Automated Workflows,” was held March 16–17, 2020. The committee defined the subtopics for the study and identified more than 25 outstanding experts to participate in the workshop. Presentation and discussion topics included research use cases, mathematical and algorithmic barriers, trajectories for supporting tools and systems, standards and social context, policy and educational implications, communities and sustainable funding, and transparency and accountability (see Appendix A for the full agenda and the National Academies’ website for copies of the presentations¹). More than 20 virtual committee meetings using collaborative authoring tools were then held to discuss, draft, and finalize this consensus report.

BOX 1-1 Committee Statement of Task

An ad hoc committee of the National Academies of Sciences, Engineering, and Medicine will conduct a study that examines current efforts to develop advanced and automated workflows for scientific research. The study will also identify promising research approaches to accelerating progress in the effectiveness and utilization of workflow systems and tools. The committee’s primary information gathering will consist of a workshop that examines the status of research workflows in several example fields, key barriers and enablers, and emerging opportunities. The workshop will explore the role of open science, in the form of broad access to research articles, data, and analytical code, and other enabling factors. Based on insights from the workshop, a review of the literature, and other inputs, the committee will produce a consensus report that identifies research needs and priorities in the use of advanced and automated workflows for scientific research.

SCOPE OF THE STUDY AND ORGANIZATION OF THIS REPORT

In its deliberations, the committee recognized the multifaceted nature of its charge. Different disciplines of research have very different practices relative to ARWs, both in the specific tools and platforms they use and, more generally, in their propensity to incorporate workflows into their processes in the first place. In addition, several lines of thought that emerged from the March 2020 workshop are germane not just to the task at hand, but more broadly across the scientific enterprise.
These themes are (1) breaking down academic silos, (2) providing incentives for greater collaboration among researchers, (3) ensuring greater interoperability across technologies, (4) sharing of a broader range of research outputs, and (5) striking an appropriate balance between access to and protection of data. In this report, we filter these issues through the lens of ARWs, recognizing that other National Academies committees have explored many of these issues in greater depth.

1 Copies of the speaker presentations are available at: https://www.nationalacademies.org/event/03-16-2020/realizing-opportunities-for-advanced-and-automated-workflows-in-scientific-research-second-meeting.

Another topic that emerged at the March 2020 workshop and in recent literature is the role of scientific workflow engines as important enablers of effective development and implementation of ARWs. The committee recognizes this technology as critical for advancing the utilization of ARWs, and we refer to various tools and resources throughout this report. However, this more specific aspect of workflow management is distinct from our broader consideration of ARWs.

Our recommendations concern technical issues to be addressed through future research, as well as associated cultural, educational, and policy-related issues. The report is intended to create awareness, momentum, and synergies to realize the potential of ARWs in scholarly discovery. Issues and questions related to workflows for research that motivated our work included the following:

● How do ARWs affect the research process in various fields and disciplines?
● What are the barriers to implementing ARWs and how do they operate in different disciplines? These barriers include inadequate workflow literacy, resistance to adoption of scientific automation, insufficient appreciation of the dangers of p-hacking, selection bias, and model overfitting, and concerns about privacy protections of data and procedures (especially for cloud-based workflow systems).
● How does open research (encompassing open science and open scholarship), in the form of open availability of articles, data, code, and other research products, contribute to the utility and attractiveness of ARWs? And, conversely, how do ARWs contribute to the utility of open research?
● What technical and operational issues arise in implementing ARWs?
● What are the current regulatory enablers and barriers affecting the adoption of ARWs in various fields and disciplines?
● What considerations related to costs for equipment, software, staffing, and training come into play in the process of adopting ARWs?
● What are the implications of broader use of ARWs for educational approaches for students and faculty?
● Are there promising areas for investment and activity on the part of research funders and research institutions in the development and implementation of ARWs?
● And last, but critically, how do researchers need to evolve their practices to use ARWs in a manner that enables them to reap the benefits of automation while not losing the benefits of serendipity of discovery?

To address these questions, the report is organized as follows. Chapter 2 highlights the context for ARWs with a focus on the evolution and development of technologies and relevant policies that may facilitate or inhibit their greater use. Chapter 3 provides case studies of workflows across disciplines in the sciences and humanities; they are based on March 2020 workshop presentations, a review of the literature, and the committee’s own experience. Chapter 4 looks at crosscutting issues that ARWs can help address across disciplines: research integrity, reproducibility and replicability, and dissemination. This examination feeds into Chapter 5’s consideration of the barriers and opportunities across fields, to which the committee’s recommendations respond. Chapter 6 presents the committee’s findings and recommendations and offers concluding thoughts and potential next steps for researchers and institutions in both the public and private sectors, funders, and policy makers.



