
2 Overview and Case Studies
Pages 8-34



From page 8...
... Victoria Stodden (University of Illinois, Urbana-Champaign) gave an overview of the statistical challenges of reproducibility, and Yoav Benjamini (Tel Aviv University)
From page 9...
... The issue crosses research areas but is especially relevant in preclinical research that uses animal models as a prelude to human research, according to Tabak. He noted that science is often viewed as self-correcting and is therefore assumed to be immune from reproducibility problems.
From page 10...
... , innovation, and grant support. The biomedical research ecosystem should have research integrity at its foundation and balance robust research training (including biostatistics, basic scientific coursework, and experimental design fundamentals)
From page 11...
... Qualters emphasized that understanding how data are generated, what methodologic approaches were used, and what tools were employed is crucial for reproducibility. There are powerful tools available to measure reliability and the confidence associated with statistical results, but they are based on assumptions about the underlying data and theory about relationships and causality.
From page 12...
... The Data Access and Research Transparency initiative has increased the number of journals committed to providing complete, publicly available replication materials for all published work, specifically by
• Requiring authors to ensure that cited data are available at the time of publication through a trusted digital repository (journals may specify which trusted digital repository shall be used);
• Requiring authors to delineate clearly the analytic procedures upon which their published claims rely and, where possible, to provide access to all relevant analytic materials;
• Maintaining a consistent data citation policy that increases the credit that data creators and suppliers receive for their work; and
• Ensuring that journal style guides, codes of ethics, publication manuals, and other forms of guidance are updated and expanded to include strong requirements for data access and research transparency.
From page 13...
... Improvements would allow researchers to advance technology more easily and practitioners to develop new products faster, and they would reduce the amount of "noise" in the research literature. Improving reproducibility could also benefit IEEE, according to Setti, by making the review process more reliable, making it more difficult to plagiarize a paper, and making it easier to discover false results and avoid retractions.
From page 14...
... • Reward positive efforts of authors who contribute to the reproducibility of research. -- A well-prepared reproducibility contribution requires time and effort, which currently may be a disincentive due to "publish or perish" pressure.
From page 15...
... offering more than 2,000 experimental services to help perform scientific replications. Science Exchange has utilized this network to undertake independent replication of preclinical research.
From page 16...
... Lomax said that this study clarifies what is needed to perform replications of preclinical research and illustrates how difficult it can be to replicate published research. Lomax concluded by listing some of Science Exchange's other current projects, including partnering with the Prostate Cancer Foundation and PeerJ to look at reproducibility of prostate cancer research, participating in the Reproducibility Initiative, and partnering with reagent companies to validate antibodies.
From page 17...
... NIH is emphasizing the importance of rigorous preclinical research that underlies key decisions before taking that research to human trials. A participant wondered how free software such as R is influencing current analysis.
From page 18...
... Empirical reproducibility can entail its own special constraints. Stodden shared the example of a 2014 workshop held by the National Academies of Sciences, Engineering, and Medicine's Institute for Laboratory Animal Research that discussed reproducibility issues in research with animals and animal models (NASEM, 2015)
From page 19...
... largely remain inaccessible in the scholarly record. With respect to the use of computation in conducting research, and what it would take for computation to meet the standard necessary to be accepted as a new branch of the scientific method, Stodden believes that the community needs to deal with reproducibility issues.
From page 20...
... , the file drawer problem, overuse and misuse of p-values, and lack of multiple-testing adjustments (see the sketch after this list);
• Low power, poor experimental design, and nonrandom sampling;
• Data preparation, treatment of outliers, recombination of data sets, and insufficient reporting and tracking practices;
• Inappropriate tests or models and model misspecification;
• Model robustness to parameter changes and data perturbations; and
• Investigator bias toward previous findings, and conflicts of interest.
Stodden noted that not all of these issues are inherently bad (e.g., having small samples)
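One item on Stodden's list, the lack of multiple-testing adjustments, has a standard remedy in the Benjamini-Hochberg step-up procedure (due to workshop speaker Yoav Benjamini). The sketch below is not from the workshop; it is a minimal illustration with made-up p-values:

    # Minimal sketch of the Benjamini-Hochberg step-up procedure for
    # controlling the false discovery rate (FDR). Illustrative only;
    # the p-values below are invented.

    def benjamini_hochberg(pvals, q=0.05):
        """Return indices of hypotheses rejected at FDR level q."""
        m = len(pvals)
        # Sort p-values while remembering their original positions.
        order = sorted(range(m), key=lambda i: pvals[i])
        # Find the largest rank k with p_(k) <= (k / m) * q.
        k_max = 0
        for rank, i in enumerate(order, start=1):
            if pvals[i] <= rank * q / m:
                k_max = rank
        # Reject the hypotheses with the k_max smallest p-values.
        return sorted(order[:k_max])

    pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
    print(benjamini_hochberg(pvals, q=0.05))  # -> [0, 1]

With these eight made-up p-values, naive testing at the 0.05 level would declare five discoveries; the step-up rule, controlling the false discovery rate at 5 percent, keeps only the two smallest.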
From page 21...
... held a workshop about what reproducibility means in the high-performance computing context, what the next steps might be, and how reproducibility might be improved in that context. She summarized three broad reproducibility issues, including the following:
From page 22...
... Case Studies

Animal Phenotyping
Yoav Benjamini, Tel Aviv University

Yoav Benjamini began by explaining that while reproducibility and replicability have only recently come to the forefront of many scientific disciplines, they have been prevalent issues in mouse phenotyping research for several decades (Mann, 1994; Lehrer, 2010)
From page 23...
... mixed-model interaction cannot be eliminated by design. Initially, the existence of significant GxL interaction was considered a lack of replicability, but Benjamini argued that it cannot be avoided, in part because the genotype-by-laboratory effect is unknown.
From page 24...
... To improve estimates of GxL variability, Benjamini proposed making use of large publicly available databases of mouse phenotyping results (e.g., the International Mouse Phenotyping Consortium)
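As a rough illustration of the adjustment Benjamini described, one can inflate the within-laboratory standard error of a genotype contrast by an interaction variance estimated from such multi-lab databases. This is a paraphrase of the idea, not code from the workshop, and every number and name in it is invented:

    # Hypothetical sketch: judge a genotype difference measured in a
    # single lab against a yardstick that includes genotype-by-lab
    # (GxL) interaction variance estimated from multi-lab databases.

    import math

    def gxl_adjusted_z(mean_a, mean_b, se_diff, var_gxl):
        """Test statistic for a genotype difference from one lab.

        se_diff -- standard error of (mean_a - mean_b) within the lab
        var_gxl -- GxL interaction variance, estimated externally
                   from a large multi-lab phenotyping database
        """
        # The interaction contributes variance to each genotype mean,
        # so 2 * var_gxl is added to the variance of the difference.
        return (mean_a - mean_b) / math.sqrt(se_diff**2 + 2.0 * var_gxl)

    # Naive within-lab z-statistic vs. the GxL-adjusted one:
    naive = (12.0 - 10.0) / 0.5                      # 4.0, looks highly significant
    adjusted = gxl_adjusted_z(12.0, 10.0, 0.5, 0.8)  # ~1.47, no longer convincing
    print(naive, adjusted)

The design choice is that var_gxl is estimated once, externally, from many laboratories, so a single-lab study can still benchmark its genotype differences against between-lab variability.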
From page 25...
... A participant questioned the generalizability of the approach, specifically whether the approaches to analyzing, modeling, and judging reproducibility of mouse data apply to other research areas or whether a different framework would need to be developed. Benjamini responded that this could be generalized by identifying the real uncertainty (Mosteller and Tukey, 1977)
From page 26...
... that allowed capital punishment to resume. In 1978, the National Research Council released a report that assessed the research cited by the Supreme Court and found that there was little evidence from social science to suggest that executions deterred homicide (NRC, 1978)
From page 27...
... The deterrent effect shown in the literature from this time period is outlined in Table 2.1. Wolfers noted that the broader context is one of robust debate in the policy world.
From page 28...
... . Wolfers noted that research intending to analyze the deterrent effect of the death penalty could be approached by measuring the causal effect of an experiment on subjects.
From page 29...
... [Figure 2.2 appears here in the original: funnel charts plotting estimated coefficient (vertical axis, roughly -10 to 10) against standard error (horizontal axis, 0 to 4), each dot representing a hypothetical study; panel titles include "If only significant results are reported" and "If only positive statistically significant results are reported." From Justin Wolfers's presentation, "Lessons from replicating death penalty research."]
FIGURE 2.2 Funnel charts showing estimated coefficients and standard error if (a) all hypothetical study experiments are being reported and (b)
From page 30...
... NOTE: H0, No reporting bias implies that estimated effects should be unrelated to the standard error; H1, Results are more likely to be reported if the effect is at least twice the standard error. SOURCE: Courtesy of Justin Wolfers, University of Michigan, presentation to the workshop.
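The H1 reporting rule in the note is easy to simulate. The sketch below is ours, not Wolfers's: it assumes a true effect of exactly zero, standard errors drawn uniformly between 0.5 and 4, and the one-sided variant of the rule from the figure's second panel (only positive, statistically significant results are reported):

    # Simulate selective reporting: estimates are unbiased, but only
    # positive results at least twice their standard error are reported.
    # All numbers here are assumptions for illustration.

    import random

    random.seed(0)
    true_effect = 0.0                            # assume no deterrent effect
    studies = []
    for _ in range(2000):
        se = random.uniform(0.5, 4.0)            # studies vary in precision
        estimate = random.gauss(true_effect, se) # unbiased estimate of 0
        studies.append((estimate, se))

    # One-sided reporting rule from the figure's second panel.
    reported = [(b, se) for (b, se) in studies if b >= 2 * se]

    mean_all = sum(b for b, _ in studies) / len(studies)
    mean_rep = sum(b for b, _ in reported) / len(reported)
    print(f"all {len(studies)} estimates average {mean_all:+.2f}")
    print(f"{len(reported)} reported estimates average {mean_rep:+.2f}")

Even though every individual estimate is unbiased, the reported subset averages well above zero, which is the asymmetry a funnel chart makes visible.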
From page 31...
... However, homicide rates also rose in those states. Wolfers stressed the importance of comparison groups when conducting analyses such as this with observational data.
From page 32...
... Donohue and Wolfers (2005) ultimately concluded that there are insufficient data to determine the influence of the death penalty, while noting that the existing literature reflects problems including publication bias; neglect of comparison groups; coding errors; highly selective choices of samples, functional forms, and regressors; and overstatements of statistical significance.
From page 33...
... For example, because code is often not shared, coding errors cannot be found. Wolfers suggested that the scientific process could be improved by requiring that every published paper include its archived data, but this does not address the possibility that archived code might fail to run or deliver the expected results.
From page 34...
... The success of this approach depends on the culture of each discipline. Institutionally, Wolfers noted that many funders are excited about developing a replicability standard, and much of the movement in the natural sciences is coming from funders insisting that data be in the public domain.

