2 Ensuring the Integrity of Research Data
Pages 33-58

The Chapter Skim interface presents what we've algorithmically identified as the most significant single chunk of text within every page in the chapter.


From page 33...
... Yet other practices may be employed only within specific fields, for instance, the use of double-blind trials, or the independent verification of important results in separate laboratories. Although the pervasive use of high-speed computing and communications in research has vastly expanded the capabilities of researchers, if used inappropriately or carelessly, digital technologies can lower the quality of data and compromise the integrity of research. Digitization may introduce spurious information into a representation, and complex digital analyses of data can yield misleading results if researchers are not scrupulously careful in monitoring and understanding the analysis process.
From page 34...
... As an example of the challenges posed by digital research data, Box 2-1 explores these issues in the context of particle physics research. Because digital data can be manipulated more easily than can other forms of data, digital data are particularly susceptible to distortion.
From page 35...
... The data processing system determines the momentum and energy of each particle radiated from a collision, and identifies how the particles are correlated in space and time. The thousands of detection devices, the magnetic field in which the collisions occur, and the properties of the complex digital data acquisition system must all be known accurately.
From page 36...
... Cases in which the manipulation affects the interpretation of the data will result in revocation of acceptance, and will be reported to the corresponding author's home institution or funding agency. (The Journal of Cell Biology, Instructions to Authors, http://www.jcb.org/misc/ifora.shtml) Having developed this policy, the editors at the Journal of Cell Biology began to screen all of the images in accepted articles for evidence of inappropriate manipulation.
From page 37...
... Initial inquiries from the journal emphasize that questions are being asked only about the presentation of data, not its integrity, and inquiries are kept strictly confidential between a journal and authors. The section on image manipulation in the White Paper on Promoting Integrity in Scientific Journal Publications by the Council of Science Editors, which was written by the editors at the Journal of Cell Biology, suggests that "journal editors should attempt to resolve the problem before a case is reported.
From page 38...
... Yes Yes Yes. Does the journal have a scientific misconduct investigation or reporting policy in place? Yes (g), Yes (h), Yes (i). KEY: PNAS = Proceedings of the National Academy of Sciences; JCB = Journal of Cell Biology and other Rockefeller University Press; NEJM = New England Journal of Medicine; ACS = American Chemical Society journals; AGU = American Geophysical Union journals; FASEB = Federation of American Societies for Experimental Biology journals; IEEE = Institute of Electrical and Electronics Engineers journals; ESA = Ecological Society of America journals; AER = American Economic Review. (a) FASEB is reviewing their policies as this goes to press.
From page 39...
... (h) Policies are "in place regarding reporting scientific misconduct, but these are internal and not listed externally." (i) "Cases of deliberate misrepresentation of data will result in rejection of the paper and will be reported to the corresponding author's home institution or funding agency." (j) "Cases in which the (image) manipulation affects the interpretation of the data will result in revocation of acceptance, and will be reported to the corresponding author's home institution or funding agency." SOURCES: Compiled from journal Web sites.
From page 40...
... For the purposes of this report, we have divided these individuals and groups into three categories -- data producers, data providers, and data users -- though it should be kept in mind that many individuals and organizations fall into more than one of these categories. Data producers are the scientists, engineers, students, and others who generate data, whether through observations, experiments, simulations, or the gathering of information from other sources.
From page 41...
... This is essential for science and engineering to progress, but it is not sufficient because progress in understanding the world requires that knowledge be shared. This process of submitting research data and results derived from those data to the scrutiny of others provides for a collective means of establishing and confirming data integrity.
From page 42...
... In contrast to field-specific methods, some methods used to ensure data integrity extend across most fields of research. Examples include the review of data within research groups, replication of previous observations and experiments, peer review, the sharing of data and research results, and the retention of raw data for possible future use.
From page 43...
... Most data cannot be properly interpreted without at least some -- and frequently detailed -- understanding of the procedures, instruments, and processing used to generate those data. Thus, data integrity depends critically on communicating to other researchers and to the public the context in which data are generated and processed.
From page 44...
... Changes in the economics of scholarly publishing may put pressure on editors and publishers to lessen the emphasis on peer review as they strive to cut costs and increase efficiency. At the same time, digital technologies can strengthen peer review by catalyzing and facilitating new ways of reviewing publications.
From page 45...
... Schön was fired from Bell Laboratories and later left the United States. In a letter to the committee, he wrote that "I admit I made various mistakes in my scientific work, which I deeply regret." Yet he maintained that he "observed experimentally the various physical effects reported in these publications." The committee concluded that Schön acted alone and that his 20 co-authors on the papers were not guilty of research misconduct.
From page 46...
... The emergence and growth of accessible databases such as GenBank and the Sloan Digital Sky Survey illustrate these opportunities in widely disparate disciplines.17 Many researchers post databases, draft papers, oral presentations, simulations, software packages, or other scholarly products on personal or institutional Web sites. Repositories, such as the Nature Precedings repository established by the Nature publishing group for the life sciences, allow researchers to share, discuss, and cite preliminary findings.18 The Web allows widespread dissemination of critiques, commentaries, blogs, and other communications.
From page 47...
... The emergence and growth of accessible databases such as GenBank and the Sloan Digital Sky Survey illustrate these opportunities in widely disparate disciplines.20 (Box 2-3 on clinical research in this chapter describes another example.) However, it can be difficult to verify the integrity of results based on large datasets that have undergone substantial processing.
From page 48...
... Although examples from many disciplines could be cited, a good example is the use of digital technologies in clinical research, including the conduct of clinical trials and plans to link clinical trial information with individuals' electronic health records. Access to the data behind the production of new drugs and other medical treatments is often a contentious issue because of the proprietary traditions of the pharmaceutical industry and concerns about the privacy and security of patients enrolled in clinical trials.
From page 49...
... This process presents daunting difficulties, including: • Health records include a broader range of terminology than clinical trials. For example, a myocardial infarction might be described in a medical record as coronary insufficiency, chest discomfort, or other terms that may be difficult to capture in an electronic system.
From page 50...
... Table 2-2 summarizes the policies of federal agencies regarding data integrity and data sharing.
DATA INTEGRITY IN THE DIGITAL AGE AND THE ROLE OF DATA PROFESSIONALS
In the digital age, the methods used to maintain data integrity are increasingly complex.
From page 51...
... They need to take steps to ensure that digital technologies enhance rather than detract from data integrity. These observations lead to the following general principle: Data Integrity Principle: Ensuring the integrity of research data is essential for advancing scientific, engineering, and medical knowledge and for maintaining public trust in the research enterprise.
From page 52...
... (c) Scientific misconduct training information available for the Jet Propulsion Lab, but not for other facilities. Extramural Grants (a): NIH (b), NSF, USDA (c), DOC. Are grantees required to share data with other researchers? NIH: Yes; NSF: Yes (f); USDA: No (g); DOC: No (g).
From page 53...
... Columns: AFOSR, ONR, DOEd, DOE, HHSd, EPA, NASA. Rows (row labels are not part of this excerpt):
Not stated; Not stated; No; Nog; Yesh; Yes; Yes
Not stated; Not stated; Not applicable; Nog; Yesh; No; Yes
Not stated; Not stated; No; Nog; Yesh; Yes; Not stated
Yes; Yes; Yes; Yes; Yes; Yes; Yes
SOURCES: Agency Web sites checked December 2008, and communications from agencies 2009.
From page 54...
... THE IMPORTANCE OF TRAINING
The integrity of research data can suffer if researchers inadvertently or willfully ignore the professional standards of their field. Data integrity also can be negatively affected if researchers are unaware of these standards or are unaware of their importance.
From page 55...
... Research leaders also have an obligation to set a standard for responsible behavior and to monitor and guide the actions of the members of their groups. Implementing institutional policies at the group level, holding regular meetings to discuss data issues, and providing careful supervision all help to create a research environment in which the integrity of data is understood, valued, and ensured.26 As described earlier, the need for training in the standards of research has been made more urgent by the advance of the digital age.
From page 56...
... New faculty members, postdoctoral fellows, and graduate students who are acting as principal investigators or otherwise have responsibility for the management of data are required to take the workshop, which takes about an hour to complete. The workshop is organized around four online case studies in the following areas: ensuring data reliability, controlling access to data, maintaining data integrity, and following retention guidelines.
From page 57...
... Instead, they may have to rely on collaborations with colleagues who have specialized training in applying digital technologies in research. Through their in-depth knowledge of digital technologies and how those technologies can advance ... (Footnote 27: The quality standards applied to microarray data in proteomics provide a good example of ongoing efforts to improve the data generated by a rapidly evolving technology.)
From page 58...
... Chapters 3 and 4 return to the roles of data professionals in enabling access to and preserving research data. The following recommendation reflects their importance in ensuring data integrity.


This material may be derived from roughly machine-read images, and so is provided only to facilitate research.