Pages 57-96

From page 57...
... It now comprises 10 professional societies worldwide, including the Association for Computers and the Humanities, based in the United States, and the European Association for Digital Humanities, among others. Its mission is to promote and support "digital research and teaching across all arts and humanities disciplines, acting as a community-based advisory force, and supporting excellence in research, publication, collaboration, and training." ADHO members publish peer-reviewed journals (such as DSH: Digital Scholarship in the Humanities)
From page 58...
... Similar to the sciences and engineering, ARWs in the humanities result in a hybrid environment that integrates human feedback and contributions with ongoing automated analysis of linguistic sources. The machines can mine data, but a human in the loop must provide the training material that drives the artificial intelligence systems and corrects raw material that gets fed back into the system.
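To make the human-in-the-loop pattern concrete, here is a minimal sketch (not drawn from the report) of how a linguistic-mining step might route its least confident predictions back to a person for correction before retraining. The `annotate` callback is a hypothetical stand-in for the human reviewer, and the scikit-learn text classifier is only a placeholder for whatever mining model a given ARW actually uses.

```python
# Minimal human-in-the-loop sketch: the machine mines the corpus, a human
# corrects its least confident calls, and the corrections are fed back in.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def human_in_the_loop(corpus, seed_texts, seed_labels, annotate, rounds=3, batch=20):
    """Iteratively retrain a classifier, asking a human (via `annotate`) to
    verify or correct the least confident predictions each round."""
    texts, labels = list(seed_texts), list(seed_labels)
    model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
    for _ in range(rounds):
        model.fit(texts, labels)                         # automated analysis
        probs = model.predict_proba(corpus)
        uncertainty = 1.0 - probs.max(axis=1)            # low confidence first
        for i in np.argsort(uncertainty)[::-1][:batch]:
            suggested = model.predict([corpus[i]])[0]
            corrected = annotate(corpus[i], suggested)   # human in the loop
            texts.append(corpus[i])
            labels.append(corrected)                     # corrected material fed back
    return model
```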
From page 59...
... Data sits at the core of what federal agencies, and state and local agencies, are asked to do." The following examples illustrate how new data resources and advanced analytics are being applied in the social and behavioral sciences.
From page 60...
... As in other disciplines, the amount of data becoming available to social and behavioral scientists presents challenges related to the size of the data sets, as well as to ensuring the integrity of identifiable personal information, the reproducibility of results, and archiving. According to Lane (2020)
From page 61...
... The social and behavioral sciences have several existing practices and institutions that can help facilitate the development and implementation of ARWs. For example, there are established organizations charged with data stewardship and related training such as the Inter-university Consortium for Political and Social Research and the National Opinion Research Center.
From page 63...
... During the workshop and in discussions, the committee considered how automated research workflows (ARWs) contribute to these crosscutting research issues.
From page 64...
... ARWs provide a significant opportunity to address these issues and hence enhance research integrity by
• Enabling automated capture and retention of data and their associated metadata in cyberinfrastructure deployed across the research life cycle (see the sketch after this list).
• Better documentation and reporting of the details of the methods, increasing the ability of other researchers to scrutinize the work and potentially reducing the possibility of data being falsified or results being selectively reported.
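As a rough illustration of the first point, the following sketch (an assumed design, not the committee's) shows a workflow step writing each output file together with a sidecar metadata record: a content hash, the parameters used, a timestamp, and the execution environment. Because the record is produced by the workflow itself rather than entered by hand, it is captured consistently for every run.

```python
# Sketch of automated capture of data plus metadata at each workflow step.
import hashlib
import json
import platform
import sys
from datetime import datetime, timezone
from pathlib import Path

def save_with_metadata(data: bytes, path: str, step: str, params: dict) -> dict:
    """Write an output file and a sidecar JSON record capturing its provenance."""
    out = Path(path)
    out.write_bytes(data)
    record = {
        "step": step,                                   # which workflow step produced it
        "parameters": params,                           # settings used for this run
        "sha256": hashlib.sha256(data).hexdigest(),     # content fingerprint
        "created": datetime.now(timezone.utc).isoformat(),
        "python": sys.version.split()[0],               # execution environment
        "platform": platform.platform(),
    }
    Path(f"{out}.meta.json").write_text(json.dumps(record, indent=2))
    return record
```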
From page 65...
... An overreliance on naive machine learning (ML) can in itself introduce p-hacking and other errors, warned Rebecca Nugent at the March 2020 workshop (Nugent, 2020)
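A small simulation makes Nugent's point concrete (the numbers below are illustrative, not from the workshop): when a naive pipeline screens many candidate features against an outcome that is pure noise, roughly 5 percent of the tests still appear "significant" at p < 0.05 unless the analysis corrects for multiple comparisons.

```python
# Why unconstrained screening invites p-hacking: pure noise still yields "hits".
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_samples, n_features = 100, 1000
X = rng.normal(size=(n_samples, n_features))    # pure-noise "features"
y = rng.normal(size=n_samples)                  # pure-noise "outcome"

p_values = np.array([stats.pearsonr(X[:, j], y)[1] for j in range(n_features)])
print(f"Nominally significant at p < 0.05: {np.sum(p_values < 0.05)} of {n_features}")
print(f"Significant after Bonferroni correction: {np.sum(p_values < 0.05 / n_features)}")
```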
From page 66...
... , reproducibility involves "obtaining consistent results using the same input data; computational steps, methods, and code; and conditions of analysis." Replicability involves "obtaining consistent results across studies aimed at answering the same scientific question, each of which has obtained its own data."
From page 67...
... Blockchain can potentially lock in protocols and outputs so that it is clear that nothing has been interfered with, whether deliberately or through poor research practices.
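The underlying mechanism can be sketched without any blockchain infrastructure at all: the essential idea is a hash chain, in which each record commits to its predecessor, so altering any earlier protocol or output invalidates everything that follows. The functions below are an illustrative sketch of that idea, not a description of any particular ledger system.

```python
# Hash-chain sketch of tamper-evident records for protocols and outputs.
import hashlib
import json
from datetime import datetime, timezone

def append_record(chain: list, payload: dict) -> dict:
    """Append a record whose hash covers both its payload and its predecessor."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    body = {
        "payload": payload,                         # e.g. a protocol or an output digest
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "prev_hash": prev_hash,
    }
    body["hash"] = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
    chain.append(body)
    return body

def verify(chain: list) -> bool:
    """Recompute every hash; any edited record or broken link is detected."""
    for i, rec in enumerate(chain):
        body = {k: v for k, v in rec.items() if k != "hash"}
        recomputed = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
        expected_prev = chain[i - 1]["hash"] if i else "0" * 64
        if recomputed != rec["hash"] or rec["prev_hash"] != expected_prev:
            return False
    return True
```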
From page 68...
... Notable examples include efforts to facilitate data sharing, bring clarity to author contributions, and enable interdisciplinarity and more rapid utilization of research findings through "convergent" approaches such as We Share Data's Data Sharing Seminar Series for Societies (McNutt, 2017; McNutt et al., 2018)
From page 69...
... Other platforms include the Gates Foundation's Gates Open Research and Open Research Europe. These models use transparent peer review and support in-article visualization of data and tools such as Code Ocean and Whole Tale so that readers and peer reviewers can assess the code, edit it, and reanalyze the data on the fly within the article without the labor-intensive need to set up the relevant computational environment. Some publishers are also exploring publication of electronic notebooks, and a few in chemistry are including these directly into publishing workflows (AGU, 2021)
From page 70...
... As the speed of research updates created by automated research is likely to increase, we need to consider how to adequately review these outputs, especially given that peer reviewers are already overwhelmed. Publishers are developing article transfer mechanisms of various forms to minimize subsequent review as a manuscript passes between journals looking for acceptance.
From page 71...
... , and the balance in seniority between those roles, as well as credit for work if the automated workflows are generating their own further research questions based on the previous data. With authorship in traditionally published outlets serving as a significant component of promotion, tenure, and funding decisions, the balance between the contribution of the workflow and that of the human needs to be thought through carefully.
From page 73...
... The committee identified five main challenges to wider use of ARWs, related to the incentive system, the current research culture, education and training needs, sustainability, and privacy and ethical concerns, and offers ideas to address them.

REIMAGINING INCENTIVES

There has been extensive discussion in recent years about the perverse or misaligned incentives for researchers that result from hypercompetition and the inappropriate use of bibliometric measures in evaluation (Teitelbaum, 2008; Casadevall and Fang, 2012; Stephan, 2012; DORA, 2013; Alberts et al., 2014; NASEM, 2017)
From page 74...
... Publishers themselves have made significant efforts to encourage data sharing over the past decade and have established community norms for rigor and transparency in data generation (see discussion in Chapter 4)
From page 75...
... OVERCOMING BARRIERS IN THE RESEARCH CULTURE

Cultural changes in the research enterprise are necessary for effective adoption and use of ARWs. It will be important to develop these processes in a way that promotes ARWs as tools that can support both reliability and innovation in discovery, rather than falling into the trope of "machines replacing humans." This inaccurate representation has been seen extensively with the advent of AI in medicine, including articles in the popular press about whether "AI will replace doctors," and it has hampered progress.
From page 76...
... This will require integrating domain science training with data science training and relevant software engineering into academic programs across all disciplines at both the undergraduate and graduate levels. In addition, research teams will need additional specialized expertise from research software engineers, computational scientists, and data stewards.
From page 77...
... For example, research software engineers are key players in the development of ARWs and other research workflows, with their own career paths. Organizations such as the United States Research Software Engineer Association, the Society of Research Software Engineering, and the Campus Research Computing Consortium are working to build community among research computing and data professionals.
From page 78...
... Although it is not necessary for all discipline experts to acquire expert proficiency in data science or coding, they should have enough background to critically assess the "black box" aspects of many workflow tools so they can understand and make adjustments for any likely biases inherent in the system.
From page 79...
... Laboratory automation technology is a major driver for advances in experimental domains such as chemical synthesis of pharmaceuticals and materials research. To realize the potential of ARWs, it is essential that software tools for aspects of workflows that transcend disciplines -- for example, those involving AI and ML methods for designing experiments and learning from data -- become interoperable and broadly applicable.
From page 80...
... One of the workshop speakers cited digital music as an analogy; to implement ARWs, communities need to move to shared data resources in the cloud that are available for a myriad of uses, similar to music streaming services. Creating and sustaining community data resources involves many challenges, including funding, deciding which data sets should be stored and maintained, and facilitating interoperability between them.
From page 81...
... For example, there has been considerable progress in community efforts to develop standards in areas such as registries (Dockstore, an app store for bioinformatics; WorkflowHub, a registry for describing, sharing, and publishing scientific computational workflows), services for monitoring and testing (LifeMonitor, OpenEBench)
From page 82...
... federal government providing shared resources that have allowed research communities to harness information technologies to significantly advance their work. Examples include the establishment of national supercomputer centers in the 1980s by NSF in partnership with academic institutions, the development of GenBank and other digital data resources in the life sciences by NIH and the National Library of Medicine starting in the 1980s, and NSF's advanced cyberinfrastructure program launched in the early 2000s.
From page 83...
... As discussed in Chapter 2, it is inherently more difficult to fund development and maintenance of production-quality software (workflow engines, automated tools, etc.)
From page 84...
... RDA was started in 2013 and aims to build "the social and technical infrastructure to enable open sharing and re-use of data." The Research Data Framework was initiated by the National Institute of Standards and Technology in 2019 and is aimed at increasing the supply of trustworthy research data across domains by developing a "strategy for various roles in the research data management ecosystem." Additionally, the FAIR for Research Software Working Group is convened as an RDA Working Group, FORCE11 Working Group, and Research Software Alliance Task Force.
From page 85...
... Even areas of research that do not directly work with personal data must consider privacy issues. The goal of making the workflow itself transparent strengthens reproducibility but could impinge on privacy under certain circumstances, for example, by revealing personal information about specific researchers.
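One mitigation, sketched below under assumed field names, is to pseudonymize direct identifiers in a workflow trace before it is shared: salted hashes keep runs by the same researcher linkable for reproducibility checks without exposing who that researcher is. Hashing alone is not a complete privacy guarantee; the sketch only illustrates the transparency-versus-privacy trade-off raised here.

```python
# Sketch: pseudonymize personal identifiers in a provenance record before sharing.
import hashlib

SENSITIVE_FIELDS = {"researcher", "email", "orcid"}   # assumed field names

def pseudonymize(record: dict, salt: str) -> dict:
    """Replace direct identifiers with salted hashes so runs by the same person
    stay linkable in the published trace without revealing who that person is."""
    cleaned = {}
    for key, value in record.items():
        if key in SENSITIVE_FIELDS:
            cleaned[key] = hashlib.sha256((salt + str(value)).encode()).hexdigest()[:12]
        else:
            cleaned[key] = value
    return cleaned
```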
From page 86...
... . Ideas proposed at the workshop included embedding compliance in the design of the software for open research data services and standards for the architecture of the sharing and access system (Burgelman, 2020)
From page 87...
... The authors concluded, "The iREDS approach shifts the paradigm of research ethics training from merely telling researchers what is and is not ethical, to empowering them to incorporate ethical practices into their research workflow."
From page 89...
... Yet new twists will need to be considered and addressed. Concerns about privacy, ethics, and trust arising in many domains of human activity become even more relevant to the entire research enterprise as we increase use of artificial intelligence (AI)
From page 90...
... Finding C and Recommendation 5 are also supported mainly in Chapter 5, again with points drawn from the use cases.

FINDINGS AND RECOMMENDATIONS

Finding A: Accelerating Discovery

In many disciplines, the emergence of automated research workflows (ARWs)
From page 91...
... In addition, incorporating emerging principles and guidelines for responsible artificial intelligence and machine learning advocated by various organizations, such as building in human review of algorithms, uncovering and addressing bias, and supporting transparency and reproducibility, will also help to secure the benefits of ARWs.

RECOMMENDATION 1: Design Principles

Organizations that fund, perform, and disseminate research, along with scientific societies, should support and enable automated research workflows (ARWs)
From page 92...
... Multidisciplinary, multirole collaboration is essential to realize the potential of ARWs.

RECOMMENDATION 2: Infrastructure, Code, and Data Sustainability

Research funders, working with other stakeholders such as societies, research institutions, and publishers, should place greater priority on approaches to ensuring the creation and sustainability of key systems, tools, platforms, and data archives for automated research workflows (ARWs)
From page 93...
... Finding D: Legal and Policy Issues

In addition to barriers to progress that exist within the research process itself, there are legal and policy issues that affect implementation of automated research workflows in specific domains that will require international multistakeholder efforts to address.

RECOMMENDATION 5: Preserving Privacy

Research enterprise funders, performers, publishers, and beneficiaries should work with governments, data privacy experts, and other entities to address the legal, policy, and associated technical barriers to implementing automated research workflows in use-inspired applications in specific domains and explore solutions to make the outputs available through privacy-preserving algorithms, federated learning approaches to using data, and other methods.
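As a concrete, if simplified, picture of the federated learning approaches mentioned in Recommendation 5, the sketch below (illustrative only, not from the report) fits a model separately at each data-holding site and shares only the resulting coefficients, which a coordinator averages; no raw, potentially identifiable records ever leave a site.

```python
# Federated-averaging sketch: sites share model coefficients, never raw data.
import numpy as np

def local_fit(X, y):
    """Ordinary least-squares fit on one site's private data; only the
    coefficient vector ever leaves the site."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def federated_average(site_data):
    """Combine per-site coefficient vectors, weighted by site sample size,
    without pooling any raw records."""
    weights, models = [], []
    for X, y in site_data:
        models.append(local_fit(X, y))
        weights.append(len(y))
    return np.average(np.vstack(models), axis=0, weights=np.asarray(weights, float))
```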
From page 95...
... 2018. Evolving Role of Scientific Workflows in a Highly Networked, Collaborative and Dynamic Data-Driven World.
From page 96...
... conditions for research data management. Presentation at the Workshop on Opportunities for Accelerating Scientific Discovery: Realizing the Potential of Advanced and Automated Workflows, March 16–17.

