Currently Skimming:

5 Overcoming Barriers to Wider Use of Automated Research Workflows
Pages 111-135

The Chapter Skim interface presents what we've algorithmically identified as the most significant single chunk of text within every page in the chapter.


From page 111...
... Beyond technical challenges, discussion at the March 2020 workshop and other information indicate that the same conditions that slow or prevent change in other aspects of the research enterprise are in play here as well. These conditions include the tendency to maintain academic silos and a focus of research funders on investigator-led projects rather than underlying infrastructure.
From page 112...
... Publishers themselves have made significant efforts to encourage data sharing over the past decade and have established community norms for rigor and transparency in data generation (see discussion in Chapter 4).
From page 113...
... Research culture and values are set by funders, academic institutions, publishers, regulators, and tooling platforms, and are often affected by national policies. These entities will play a major role in changing researcher incentives in ways that foster more rapid and effective adoption of ARWs and ensure that community standards for quality assurance, transparency, and reproducibility are upheld.
From page 114...
... ● Incentivizing collaboration and team science.
OVERCOMING BARRIERS IN THE RESEARCH CULTURE
Cultural changes in the research enterprise are necessary for effective adoption and use of ARWs.
From page 115...
... Automated workflows, community standards, and collaborative approaches are tools designed to support researchers in reliable scientific innovation. They should be used where appropriate but should not replace active and individual human oversight where needed to detect the unexpected.
From page 116...
... This will require integrating domain science training with data science training and relevant software engineering into academic programs across all disciplines at both the undergraduate and graduate levels. In addition, research teams will need additional specialized expertise from research software engineers, computational scientists, and data stewards.
From page 117...
... For example, research software engineers are key players in the development of ARWs and other research workflows, with their own career paths. Organizations such as the United States Research Software Engineer Association, the Society of Research Software Engineering, and the Campus Research Computing Consortium are working to build community among research computing and data professionals.
From page 118...
... Faculty will need to grapple with difficult questions about which existing domain science topics will need to be dropped to make time for workflow topics.
From page 119...
... Universities may lack the resources to support the major investments needed to revise domain science curricula to include advanced workflow training. These hurdles can be overcome with extramural financial support from foundations, federal agencies, and industry (Kusnezov, 2020).
From page 120...
... Investment Priorities to Advance ARWs
For several of the use cases discussed in Chapter 3, the development of tools and technologies constitutes a key enabler for accelerating progress. For example, materials researchers examined existing research workflow management systems and ended up building their own due to the need for a system that enables dynamic rerouting, facilitates constant communication among researchers, incorporates error management capability, and is flexible.
From page 121...
... A sustainable infrastructure requires software engineers, test engineers, and release engineers, but none of these roles is typically funded in research grants. Several institutions are working on software sustainability, such as the Software Sustainability Institute, WSSSPE (Working towards Sustainable Software for Science: Practice and Experiences)
From page 122...
... Leading domain repositories provide quality FAIR curation and simplify discoverability. They also help develop leading ... [PREPUBLICATION COPY -- Uncorrected Proofs]
From page 123...
... Many domain repositories are poorly or inconsistently funded and thus are forced to spend significant staff time on fundraising that could be spent on data services. Support is also needed for related organizations that provide important infrastructure for the data ecosystem, such as Crossref, DataCite, the Research Data Alliance (RDA)
From page 124...
... For example, there has been considerable progress in community efforts to develop standards in areas such as registries (Dockstore, an app store for bioinformatics; WorkflowHub, a registry for describing, sharing, and publishing scientific computational workflows), services for monitoring and testing (LifeMonitor, OpenEBench)
From page 125...
... federal government providing shared resources that have allowed research communities to harness information technologies to significantly advance their work. Examples include the establishment of national supercomputer centers in the 1980s by NSF in partnership with academic institutions, the development of GenBank and other digital data resources in the life sciences by NIH and National Library of Medicine starting in the 1980s, and NSF's advanced cyberinfrastructure program launched in the
From page 126...
... Relevant programs and efforts by DOE, DARPA, and international efforts on the part of the European Union and UK Research and Innovation were also discussed at the workshop and are highlighted in Chapter 2. It is difficult for individual institutions to view cyberinfrastructure investments in the same way as they view, for example, a mass spectrometer or other large physical "thing." One possible model is the Harvard Dataverse, with tens of thousands of data sets deposited for sharing and over 1.5 million downloads.
From page 127...
... As examples in Europe and the United States, two large consortia have sustained their systems through collaborations and shared resources: CERN, the European research program in advanced physics, and CERT, the Community Emergency Response Team in the United States. Another suggestion was to create a stable endowment, akin to the Smithsonian Institution, that can serve as a common resource.
From page 128...
... RDA was started in 2013 and aims to build "the social and technical infrastructure to enable open sharing and re-use of data." The Research Data Framework was initiated by the National Institute of Standards and Technology in 2019 and is aimed at increasing the supply of trustworthy research data across domains by developing a "strategy for various roles in the research data management ecosystem." Additionally, the FAIR for Research Software Working Group is convened as an RDA Working Group, FORCE11 Working Group, and Research Software Alliance Task Force.
From page 129...
... At the same time as a privacy-aware public shares personal information in unprecedented ways through social media and other avenues, many also express reservations about its use in research.
From page 130...
... Even areas of research that do not directly work with personal data must consider privacy issues. The goal of making the workflow itself transparent strengthens reproducibility but could impinge on privacy under certain circumstances, for example, by revealing personal information about specific researchers.
From page 131...
... The principles and guidelines espoused by these initiatives overlap to a significant degree, generally invoking the need for human review, protection of data privacy and security, the uncovering and addressing of bias, and support for transparency and reproducibility.
From page 132...
... Research and its associated data production and use or reuse are also international, making it challenging for a single national government to shape global policy effectively.
From page 133...
... As Oren Etzioni, the CEO of the Allen Institute for Artificial Intelligence, told attendees at a NASEM-convened workshop in 2018, "systems use data from the past to generate models to predict the future, so if society's past was racist and sexist, the models will carry that bias into the future and also, for technical reasons, exacerbate it" (NASEM, 2018c).
From page 134...
... The authors concluded, "The iREDS approach shifts the paradigm of research ethics training from merely telling researchers what is and is not ethical, to empowering them to incorporate ethical practices into their research workflow."


This material may be derived from roughly machine-read images, and so is provided only to facilitate research.