

4 Automated Research Workflows and Implications for Advancing Research Integrity, Reproducibility, and Dissemination
Pages 97-110



From page 97...
... They need to do this in structures that range from small, vertically structured labs involving supervision and mentorship to huge dispersed networks across institutions with no formal lines of authority. In this context comes another overlay: the need to conduct experiments in a manner that allows others not only to understand the findings, but also to have access to and make use of the data and methods used to arrive at those findings.
From page 98...
... As noted in the National Academies' 2017 report Fostering Integrity in Research: In theory, if not always in practice, all the data contributing to a research result can now be stored electronically and communicated to interested researchers. However, this trend toward greater transparency has created tasks and responsibilities for researchers and the research enterprise that did not previously exist, such as creating, documenting, storing, and sharing scientific software and immense databases and providing guidance in the use of these new digital objects (NASEM, 2017, pp.
From page 99...
... to adequately capture and report all elements of an experiment, and then imposes similar challenges for peer reviewers in adequately assessing all this detailed information, with few incentives to authors or reviewers to undertake this effort. ARWs provide a significant opportunity to address these issues and hence enhance research integrity by
● Enabling automated capture and retention of data and their associated metadata in cyberinfrastructure deployed across the research life cycle.
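The automated capture described in the bullet above can be made concrete with a short sketch. The Python fragment below is purely illustrative (the run_step and normalize names and the sidecar-file layout are assumptions, not anything specified in the report): it wraps a single workflow step so that a metadata record (parameters, output checksum, timestamp, and interpreter version) is written automatically alongside the output rather than relying on the researcher to log it by hand.

```python
# A minimal sketch of automated metadata capture around one workflow step.
# Function and file names are hypothetical, for illustration only.
import hashlib
import json
import platform
from datetime import datetime, timezone
from pathlib import Path


def run_step(step_fn, inputs: dict, out_path: Path):
    """Run one workflow step, write its output, and record a metadata sidecar."""
    result = step_fn(**inputs)
    out_path.write_text(json.dumps(result))

    metadata = {
        "step": step_fn.__name__,
        "parameters": {k: repr(v) for k, v in inputs.items()},
        "output_sha256": hashlib.sha256(out_path.read_bytes()).hexdigest(),
        "timestamp_utc": datetime.now(timezone.utc).isoformat(),
        "python_version": platform.python_version(),
    }
    out_path.with_suffix(".meta.json").write_text(json.dumps(metadata, indent=2))
    return result


def normalize(values, scale=1.0):
    """Toy analysis step standing in for a real experiment."""
    return [v * scale / max(values) for v in values]


run_step(normalize, {"values": [3, 5, 8], "scale": 100.0}, Path("normalized.json"))
```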
From page 100...
... It involves assessment of which data sets to prioritize for the considerable effort involved in curation, as well as training and incentives to prioritize that effort above other tasks such as conducting further experiments. (Footnote 1: HARKing, or "Hypothesizing After the Results are Known," was so named by psychologist Norbert Kerr in 1998. P-hacking refers to manipulating statistical analyses so that a result appears more significant than it is.)
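To see why the selective analysis described in the footnote inflates significance, consider a small simulation (illustrative only, using synthetic noise rather than any data from the report): when 20 independent outcomes are tested under a true null and only the best p-value is reported, well over half of the experiments appear "significant" at the nominal 5% level.

```python
# Simulation of selective reporting across many null experiments.
import random
from statistics import NormalDist

random.seed(0)
norm = NormalDist()
n_experiments, n_outcomes, n_samples = 2000, 20, 30
false_positives = 0

for _ in range(n_experiments):
    best_p = 1.0
    for _ in range(n_outcomes):
        # Sample mean of pure noise, converted to a two-sided z-test p-value.
        mean = sum(random.gauss(0, 1) for _ in range(n_samples)) / n_samples
        z = mean / (1 / n_samples ** 0.5)
        p = 2 * (1 - norm.cdf(abs(z)))
        best_p = min(best_p, p)
    if best_p < 0.05:
        false_positives += 1

# Roughly 1 - 0.95**20, i.e. about 64%, of null experiments look "significant".
print(f"Experiments with at least one p < 0.05: {false_positives / n_experiments:.0%}")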
From page 101...
... The growing ubiquity and complexity of computation in the research process across many disciplines presents additional challenges to independently reproducing results. Examples of these challenges include the use of nonpublic data and code in research, the costs of retrofitting long-standing research projects with tools that automatically capture logs of computational decisions, and incomplete information about the computing environment where the research was originally performed (NASEM, 2019b)
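One of the gaps noted above, incomplete information about the original computing environment, is something a workflow system can record mechanically. The sketch below is a minimal illustration (the file name and function are hypothetical, not an existing tool): it dumps the interpreter version, operating system, and installed package versions to a JSON file that can travel with the results.

```python
# A minimal sketch of capturing the computing environment alongside results.
import json
import platform
import sys
from importlib import metadata


def capture_environment(out_file="environment.json"):
    """Write interpreter, OS, and installed-package versions to a JSON file."""
    env = {
        "python": sys.version,
        "platform": platform.platform(),
        "packages": sorted(
            f"{dist.metadata['Name']}=={dist.version}"
            for dist in metadata.distributions()
        ),
    }
    with open(out_file, "w") as fh:
        json.dump(env, fh, indent=2)
    return env


capture_environment()
```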
From page 102...
... ● Increasing efficiencies and eliminating a potential source of errors in onboarding new research team members, and better supporting knowledge transfer as research teams change.
● Supporting broader and more stringent review and validation of findings, including through formal peer review during publication as well as by the broader research community when data and associated methods, materials, and code are published.
From page 103...
... . Blockchain can potentially lock in protocols and outputs so that it is clear that nothing has been interfered with, whether deliberately or through poor research practices.
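The tamper-evidence property attributed to blockchain here comes from chaining cryptographic hashes. The following sketch is a bare-bones illustration of that idea, not a description of any system mentioned in the report: each logged protocol or output record is hashed together with the hash of the previous record, so editing an earlier entry invalidates every entry after it.

```python
# A minimal hash-chain sketch showing why later tampering is detectable.
import hashlib
import json


def append_record(chain: list, record: dict) -> dict:
    """Link a new record to the previous one by hashing both together."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    payload = json.dumps(record, sort_keys=True)
    entry = {
        "record": record,
        "prev_hash": prev_hash,
        "hash": hashlib.sha256((prev_hash + payload).encode()).hexdigest(),
    }
    chain.append(entry)
    return entry


def verify(chain: list) -> bool:
    """Recompute every link; any edited record makes verification fail."""
    prev_hash = "0" * 64
    for entry in chain:
        payload = json.dumps(entry["record"], sort_keys=True)
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != hashlib.sha256((prev_hash + payload).encode()).hexdigest():
            return False
        prev_hash = entry["hash"]
    return True


chain = []
append_record(chain, {"step": "protocol", "version": "1.0"})
append_record(chain, {"step": "result", "value": 0.42})
assert verify(chain)
chain[0]["record"]["version"] = "2.0"   # tampering with an earlier entry...
assert not verify(chain)                # ...is detected downstream
```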
From page 104...
... Notable examples include efforts to facilitate data sharing, bring clarity to author contributions, and enable interdisciplinarity and more rapid utilization of research findings through "convergent"
From page 105...
... For example, StatReviewer aims to check that the statistics and methods in manuscripts are sound, and UNSILO's Evaluate tool uses advanced machine intelligence and natural language understanding to help authors, editors, reviewers, and publishers carry out evaluation and screening of submitted manuscripts. Many tools use text mining, ML, network analysis, and other methods to filter published research to support researchers in keeping on top of the relevant literature (e.g., Researcher app, most bibliographic reference manager apps)
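As a rough illustration of the text-mining approach such filtering tools build on, the sketch below (hypothetical data; it assumes scikit-learn is available and is not based on any named product) ranks a few candidate abstracts by TF-IDF similarity to a researcher's stated interests.

```python
# Toy literature filter: rank abstracts by similarity to stated interests.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

interests = "automated research workflows reproducibility provenance capture"
abstracts = [
    "A container-based system for capturing provenance in automated workflows.",
    "Field observations of migratory bird populations in coastal wetlands.",
    "Machine-assisted peer review screening for statistical reporting errors.",
]

vectorizer = TfidfVectorizer(stop_words="english")
matrix = vectorizer.fit_transform([interests] + abstracts)
scores = cosine_similarity(matrix[0:1], matrix[1:]).ravel()

# Surface the most relevant abstracts first, as a reading-list filter might.
for score, abstract in sorted(zip(scores, abstracts), reverse=True):
    print(f"{score:.2f}  {abstract}")
```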
From page 106...
... A shift in how research is conducted and produced is allowing the community to rethink publishing approaches: rather than simply automating existing practices, publishers can make better use of technologies aligned with the way research outputs are now being produced. For example, models are now available to rapidly publish (typically within a few days)
From page 107...
... Containers can be disseminated along with articles and enable reproducibility and broader sharing. Workflows can thus effectively be distributed with their entire runtime environment and versioning, allowing for preservation of provenance information.
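One simple way such packaging can support reproducibility checks is to ship a manifest of output checksums with the container and article. The sketch below is an assumption about how that check might look (the file names and functions are hypothetical): after re-running the workflow in the distributed environment, every regenerated output is hashed and compared against the published manifest.

```python
# A minimal sketch of verifying a re-run against a published output manifest.
import hashlib
import json
from pathlib import Path


def sha256(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()


def write_manifest(outputs: list, manifest: Path = Path("manifest.json")):
    """Record the digest of every output produced by the original run."""
    manifest.write_text(json.dumps({p.name: sha256(p) for p in outputs}, indent=2))


def check_reproduction(manifest: Path = Path("manifest.json")) -> bool:
    """After re-running the workflow, confirm every output matches the record."""
    expected = json.loads(manifest.read_text())
    return all(sha256(Path(name)) == digest for name, digest in expected.items())


# Toy usage: record one output, then confirm an (unchanged) re-run matches it.
Path("results.csv").write_text("run,score\n1,0.42\n")
write_manifest([Path("results.csv")])
assert check_reproduction()
```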
From page 108...
... As automated research is likely to increase the pace at which research updates are produced, we need to consider how to adequately review these outputs, especially given that peer reviewers are already overwhelmed. Publishers are developing article transfer mechanisms of various forms to minimize repeated review as a manuscript passes between journals looking for acceptance.
From page 109...
... A collaboration between Wellcome, the Sanger Institute, and F1000 Research produced the first such publications in 2021. However, in the race to keep up with the volume of outputs, it is important that peer review remains detailed enough to adequately assess workflow systems, and hence the impact they could have on research and on trust in the scholarly system.

