3 Understanding Reproducibility and Replicability
Pages 39-54


From page 39...
... Because the terms reproducibility and replicability are used differently across different scientific disciplines, introducing confusion to a complicated set of challenges and solutions, the committee also details its definitions and highlights the scope and expression of the problems of non-reproducibility and non-replicability across science and engineering research.
THE EVOLVING PRACTICES OF SCIENCE
Scientific research has evolved from an activity mainly undertaken by individuals operating in a few locations to many teams, large communities, and complex organizations involving hundreds to thousands of individuals worldwide.
From page 40...
... For example, public health researchers mine large databases and social media, searching for patterns, while earth scientists run massive simulations of complex systems to learn about the past, which can offer insight into possible future events. Another change in science is an increased pressure to publish new scientific discoveries in prestigious and what some consider high-impact journals, such as Nature and Science.1 This pressure is felt worldwide, across disciplines, and by researchers at all levels but is perhaps most acute for researchers at the beginning of their scientific careers who are trying to establish a strong scientific record to increase their chances of obtaining tenure at an academic institution and grants for future work.
From page 41...
... In response to the threat, biomedical researchers developed a wide variety of approaches to address the concern, including an emphasis on randomized experiments with masking (also known as blinding), reliance on meta-analytic summaries over individual trial results, proper sizing and power of experiments, and the introduction of trial registration and detailed experimental protocols.
From page 42...
... These articles introduced new concerns about the availability of data and code and highlighted problems of publication bias, selective reporting, and misaligned incentives that cause positive results to be favored for publication over negative or nonconfirmatory results.3 Some news articles focused on issues in biomedical research and clinical trials, which were discussed in the general media partly as a result of lawsuits and settlements over widely used drugs (Fugh-Berman, 2010)
From page 43...
... B2: "Reproducibility" refers to independent researchers arriving at the same results using their own data and methods, while "replicability" refers to a different team arriving at the same results using the original author's artifacts. B1 and B2 are in opposition to each other with respect to which term involves reusing the original authors' digital artifacts of research ("research compendium")
From page 44...
... for computer science was published in 2016 as a system for badges attached to articles published by the society. The ACM declared that its definitions were inspired by the metrology vocabulary, and it associated using an original author's digital artifacts with "replicability," and developing completely new digital artifacts with "reproducibility." These terminological distinctions contradict the usage in computational science, where reproducibility is associated with transparency and access to the author's digital artifacts, and also the usage in social sciences, economics, clinical studies, and other domains, where replication studies collect new data to verify the original findings.
From page 45...
... Thus, reproducibility includes the act of a second researcher recomputing the original results, and it can be satisfied with the availability of data, code, and methods that makes that recomputation possible. This definition of reproducibility refers to the transparency and reproducibility of computations: that is, it is synonymous with "computational reproducibility," and we use the terms interchangeably in this report. When a new study is conducted and new data are collected, aimed at the same or a similar scientific question as a previous one, we define it as a replication.
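A minimal sketch of the recomputation idea described above, not the committee's procedure: a second run with the same data, code, and random seed should give the same result. The function names (`bootstrap_mean`, `fingerprint`), the toy data, and the seed value are all hypothetical and chosen only for illustration.

```python
import hashlib
import json
import random
import statistics


def bootstrap_mean(data, seed, n_resamples=200):
    """Hypothetical analysis step: a bootstrap estimate of the mean.

    The seed is an explicit input, so the random resampling can be repeated.
    """
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.choices(data, k=len(data)))
        for _ in range(n_resamples)
    ]
    return statistics.fmean(means)


def fingerprint(result):
    """Hash a result so two runs can be compared without sharing raw output."""
    return hashlib.sha256(json.dumps(result).encode("utf-8")).hexdigest()


# Toy data standing in for the original author's shared dataset (assumption).
data = [2.1, 2.4, 1.9, 2.2, 2.0]

original = bootstrap_mean(data, seed=42)    # the original author's run
recomputed = bootstrap_mean(data, seed=42)  # a second researcher's rerun

print("reproduced:", fingerprint(original) == fingerprint(recomputed))
```

The comparison only succeeds because the data, the code, and the seed are all available to the second researcher; withholding any one of them makes the recomputation impossible or nondeterministic.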
From page 46...
... In general, whenever new data are obtained that constitute the results of a study aimed at answering the same scientific question as another study, the degree of consistency of the results from the two studies constitutes their degree of replication. Two important constraints on the replicability of scientific results rest in limits to the precision of measurement and the potential for altered results due to sometimes subtle variation in the methods and steps performed in a scientific study.
From page 47...
... It is useful to note that precision is different from the accuracy of a measurement system, as shown in Figure 3-1, demonstrating the differences using an archery target containing three arrows. In Figure 3-1, A, the three arrows are in the outer ring, not close together and not close to the bull's eye, illustrating low accuracy and low precision (i.e., the shots have not been accurate and are not highly precise)
From page 48...
... If the exact location of the bull's eye is unknown, one must not presume that a more precise set of measures is necessarily more accurate; the results may simply be subject to a more consistent bias, moving them in a consistent way in a particular direction and distance from the true target. It is often useful in science to describe quantitatively the central tendency and degree of dispersion among a set of repeated measurements of the same entity and to compare one set of measurements with a second set.
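The distinction between accuracy and precision can be sketched numerically. The example below assumes the true value is known, which (as the text notes) is often not the case in practice; the measurement values and the helper name `bias_and_spread` are hypothetical.

```python
import statistics


def bias_and_spread(measurements, true_value):
    """Return (bias, spread) for a set of repeated measurements.

    bias   = mean minus true value (small magnitude means accurate)
    spread = sample standard deviation (small value means precise)
    """
    mean = statistics.fmean(measurements)
    return mean - true_value, statistics.stdev(measurements)


TRUE_VALUE = 10.0  # assumed known here, purely for illustration

# Tightly clustered but consistently off target: precise, not accurate.
precise_but_biased = [12.1, 12.0, 11.9]
# Centered on the target but widely scattered: accurate on average, not precise.
accurate_but_imprecise = [8.0, 10.0, 12.0]

for name, values in [
    ("precise but biased", precise_but_biased),
    ("accurate but imprecise", accurate_but_imprecise),
]:
    bias, spread = bias_and_spread(values, TRUE_VALUE)
    print(f"{name}: bias={bias:.2f}, spread={spread:.2f}")
```

Without `TRUE_VALUE`, only the spread can be computed, which is why a more precise set of measurements cannot be presumed to be more accurate.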
From page 49...
... When one is interested in comparing the degree to which the set of measurements obtained in one study is consistent with the set of measurements obtained in a second study, the committee characterizes this as a test of replicability because it entails the comparison of two studies aimed at the same scientific question where each obtained its own data.
From page 50...
... proximity of the mean value (central tendency) of the second set relative to the mean value of the first set, measured both in physical units and relative to the standard error of the estimate 2.
From page 51...
... A simple visual inspection of the means and standard errors for measurements obtained by different laboratories may be sufficient for a judgment about their replicability. For example, in Figure 3-2, it is evident that the bottom two measurement results have relatively tight precision and means that are nearly identical, so it seems reasonable these can be considered to have replicated one another.
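As a rough numerical companion to the visual inspection described above, the following sketch compares two sets of measurements by the difference in their means, both in physical units and relative to a standard error. Combining the two studies' standard errors in quadrature is an assumption made here for illustration, not a rule stated in the report, and the data and function names are hypothetical.

```python
import math
import statistics


def mean_and_se(measurements):
    """Return the mean and the standard error of the mean."""
    mean = statistics.fmean(measurements)
    se = statistics.stdev(measurements) / math.sqrt(len(measurements))
    return mean, se


def compare_studies(first, second):
    """Return (difference in means, difference in units of standard error).

    Assumption: the two studies are independent, so their standard errors
    are combined as sqrt(se1**2 + se2**2).
    """
    m1, se1 = mean_and_se(first)
    m2, se2 = mean_and_se(second)
    diff = m2 - m1
    return diff, diff / math.sqrt(se1**2 + se2**2)


# Hypothetical measurements of the same quantity from two laboratories.
lab_a = [9.8, 10.0, 10.2]
lab_b = [10.0, 10.2, 10.4]

diff, diff_in_se = compare_studies(lab_a, lab_b)
print(f"difference in means: {diff:.3f} (physical units)")
print(f"difference relative to standard error: {diff_in_se:.3f}")
```

How small the relative difference must be before two results count as having replicated one another is a judgment that depends on the field and the question; the sketch only computes the quantities, it does not set a threshold.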
From page 52...
... One of the assumptions of the scientific process is that rigorously conducted studies "and accurate reporting of the results will enable the soundest decisions" and that a series of rigorous studies aimed at the same research question "will offer successively ever-better approximations to the truth" (Wood et al., 2019, p.
From page 53...
... 338) in which nearly 30 independent research teams were given the same raw dataset and asked the same questions: "whether soccer referees are more likely to give red cards to dark skin toned players than light skin toned players and whether this relation is moderated by measures of explicit and implicit bias in the referees' country of origin." The results showed wide variation, with 69 percent of the teams reporting a significant positive effect and 31 percent not finding a significant relationship.
From page 54...
... FINDING 3-1: In general, when a researcher transparently reports a study and makes available the underlying digital artifacts, such as data and code, the results should be computationally reproducible. In contrast, even when a study was rigorously conducted according to best practices, correctly analyzed, and transparently reported, it may fail to be replicated.
