Chapter 3: Data Collection and Informatics
Pages 19-29

From page 19...
... which contains detailed information about the structures of proteins. Since 1971, when it opened, the PDB has grown from an initial seven protein structures to more than 9,000, said Helen Berman, the data bank's director; in the process, it has evolved into far more than just a way for protein crystallographers to make their structures available to other researchers.
From page 20...
... PROTEIN CRYSTALLOGRAPHY
To understand how proteins function, which is crucial, for example, for rational drug design and for investigating the etiology of various diseases, researchers must learn the proteins' structures: how the molecules' carbon, oxygen, nitrogen, hydrogen, and other atoms arrange themselves. To perform this mapping, researchers must crystallize a protein, expose the crystals to intense radiation, and measure the diffraction pattern formed when the radiation passes through the protein crystals.
From page 21...
... Indeed, after he applied "a little unconventional thinking" at a beam line at Argonne National Laboratory, he said, that beam line produced in 9 months as many protein structures as nine beam lines at Brookhaven National Laboratory turned out in a year. But good software for protein crystallography is not widely available, and Minor identified several reasons for that.
From page 22...
... THE PROTEIN DATA BANK
Once researchers determine the structure of a protein, they are required to deposit the structure with the PDB, which has recently been moved from the Brookhaven National Laboratory to Rutgers University. Input into and access to data in the PDB now take place over the Internet, which is convenient for researchers, but Helen Berman identified several unresolved issues affecting access to the protein structures and other information in the database.
From page 23...
... So we have a difficult time convincing people in the academic realm to produce the kinds of software that are required for structural biology." "How software is developed and how software developers are recognized have to change," Berman said, "and there has to be a way for people that have new algorithms, new software, or new tools, to get funded, even if it's not sexy. For the greater good of the community, we have to find a better way of handling software development for structural biology."
CULTURE COLLECTIONS
For researchers who study bacteria and other microorganisms, culture collections are the only way to preserve a record of the creatures they have studied.
From page 24...
... As a result, the scientific community often must rely on places other than the major culture collections for its research materials; therefore, Cypess said, "80% of the materials that are currently used in the science establishment are undocumented and unstandardized."
MUSEUMS AND BOTANIC GARDENS
One often-overlooked source of research materials is the world's museums, said Leonard Krishtalka, director of the Natural History Museum at the University of Kansas. "I like to say that the massive amount of data housed in museums is really a stealth dataset.
From page 25...
... "The convention calls for the tropical countries of the world, which are roughly equal to the developing countries of the world and are home to the vast majority of the world's species, to promote access to and study of the biologic resources that are held within their international borders. But at the same time, it calls for those countries to regulate that access.
From page 26...
... The value comes from the packaging of the data, from understanding broad patterns in time and space; so the informatics element is much more important than simply knowing the name of a species or knowing that a particular specimen occurred in a particular place." Finally, the culture of ecology needs to change. "We tend to have a mystique about the ecologist who goes to a new place, sleeps on the ground for a half-year, collects a lot of data, and stumbles back into the laboratory with some new results." But if ecology is to benefit from the new databases, the field will
From page 27...
... The idea was that the data set was paid for with public funds and that giving other investigators access to the data would lead to an increased scientific return on the investment in the study, which was a large investment." The data had never been intended to be put into the public domain, Friedman said, and that has caused the investigators several problems. The consent forms signed by the study participants, for instance, did not mention putting the results in the public domain.
From page 28...
... But collecting data on people, particularly genetic data or detailed information about health and habits, is fraught with difficulties that researchers dealing with, say, protein structures or plants, do not face. Consider, for instance, the National Longitudinal Study of Adolescent Health.
From page 29...
... The collection and analysis of human DNA samples can shed light on important questions in human evolution and genetic variability, but progress in this area of research has been slowed by misunderstandings and concerns about the way this information will be used. And if data are collected on people with different cultural beliefs and practices, a whole new set of considerations arises, said Lynn Jorde, of the Department of Human Genetics at the University of Utah School of Medicine.