Currently Skimming:

4 Designing a Successful Metagenomics Project: Best Practices and Future Needs
Pages 60-84

The Chapter Skim interface presents what we've algorithmically identified as the most significant single chunk of text within every page in the chapter.


From page 60...
... which, depending on the goals of a particular project, can be applied individually or together to obtain a new understanding of the numbers and abundance of microbial community members, their metabolic capabilities, and how these parameters change in response to external stimuli. This chapter identifies the advantages and limitations of each approach and explores the research needed to overcome barriers to understanding microbial communities.
From page 61...
... " in addition to assembling genomes. The development, about 15 years ago, of methods for rapid and efficient sequencing and assembly of large segments of DNA was critical for the revolution in microbial genomics and has led to the completion of more than 460 bacterial and archaeal genome sequences by January 2007 (see http://www.genomesonline.org).
From page 62...
... In some projects, sample collection may be confounded by the presence of limited amounts of DNA or the presence of contaminating DNA or other compounds that interfere with DNA extraction. These factors make it much more challenging to think about the generation of complete or nearly complete genome sequences from metagenomics projects.
From page 63...
... Finally, function-driven metagenomic analysis, like the soil resistome project described in Chapter 3, which starts with functional expression of an activity in a surrogate host, followed by sequencing and phylogenetic analysis, provides another measure of community potential. Regardless of the methods employed to answer questions about community structure and function, the composition of any microbial community is likely to be profoundly affected by the habitat from which the sample was obtained.
From page 64...
... Each decision about the type, size, scale, number, and timing of sampling shapes the conclusions and inferences that can be drawn. The labor intensity of producing and analyzing metagenomic libraries aggravates sampling issues that are inherent in all ecological studies.
From page 65...
... As biological and computational methods become more efficient, it will be possible to draw more robust conclusions from more complex communities in more variable habitats. No matter the power of the methods now or in the future, it is essential to consider sampling issues and limitations at the beginning and throughout any metagenomics study of a complex community, and the sampling scheme must inform the interpretation of results.
From page 66...
... It also reduces the chances of recovering low-abundance members of the bacterial community. There appear to be no published reports comparing methods for removing host DNA for bacterial metagenomic analysis.
From page 67...
... 16S rRNA-Based Surveys
The first category includes a set of methods based on analysis of 16S rRNA genes, which offer a relatively rapid and cost-effective way to assess bacterial diversity and abundance. These assays are often used as a first step in larger metagenomics projects to evaluate bacterial diversity in potential samples of interest (soil samples from dif
From page 68...
... PCR amplification with primers that hybridize to highly conserved regions in bacterial or archaeal 16S rRNA genes (or eukaryotic microbial 18S rRNA genes) followed by cloning and sequencing yields an initial description of a microbial community.
From page 69...
... Indeed, in several published metagenomics studies there have been discrepancies between estimates of community diversity derived from PCR-based 16S rRNA gene surveys and those derived from whole-genome shotgun data, although in some studies the estimates are remarkably similar (Liles et al. 2003; Tyson et al.
From page 70...
... , but the combination of 16S rRNA gene sequencing and arrays is a unique and powerful tool for the characterization of any microbial community because it allows both the discovery of novel phyla and extensive cataloguing of each taxonomic unit present in a given environment.
16S rRNA Phylogenetic and Functional Anchors: A Hybrid Approach
Metagenomic clones can be given a context, or "anchored," by looking for a gene that characterizes the clone or the organism that it came from.
From page 71...
... It is important to take differential species representation into account in selecting assembly strategies for metagenomic data to avoid classifying sequences from the most abundant species as repeats and throwing them out of assembly algorithms. In highly diverse microbial communities, even when very large amounts of DNA sequence data are generated (several billion to a trillion base pairs of DNA)
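The effect of differential species representation on assembly can be illustrated with a rough coverage calculation. The sketch below is only an illustration under stated assumptions: reads are taken to be sampled in proportion to each member's share of the community DNA, and the community, DNA fractions, genome sizes, and the function name `expected_coverage` are all hypothetical, not values from the chapter.

```python
def expected_coverage(total_bases, dna_fraction, genome_size):
    """Expected fold coverage of one genome in a shotgun data set.

    Assumes (hypothetically) that reads are sampled in proportion to the
    member's share of the community DNA, with no cloning or extraction bias.
    """
    if not 0.0 <= dna_fraction <= 1.0:
        raise ValueError("dna_fraction must be between 0 and 1")
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases * dna_fraction / genome_size


# Hypothetical two-member community: (share of community DNA, genome size in bp).
community = {
    "abundant_member": (0.50, 4_000_000),
    "rare_member": (0.0001, 4_000_000),
}
total_bases = 1_000_000_000  # one billion base pairs of sequence (illustrative)

coverage = {
    name: expected_coverage(total_bases, fraction, size)
    for name, (fraction, size) in community.items()
}
for name, depth in coverage.items():
    print(f"{name}: about {depth:g}x expected coverage")
```

Under these assumptions the abundant member is covered thousands of times more deeply than the rare one, which is the kind of disparity that could lead an assembler tuned for a single genome to treat the abundant member's sequence as repeats.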
From page 72...
...
Libraries subject to cloning bias | No library cloning bias | No library cloning bias | Libraries may show cloning bias
Can resolve homopolymers | Cannot easily resolve homopolymers | Can resolve homopolymers | Can resolve homopolymers
a Mate pairs are two sequencing reads derived from the same clone, or molecule, one from each end. If the length of the clone, or molecule, is known, mate pair information constrains where these sequencing reads can be placed in an assembly.
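The mate pair constraint described in the note above can be sketched as a simple distance check. This is a minimal illustration, not the method of any particular assembler: it assumes both reads have the same length, that both have been placed on the same contig, and that the first read lies upstream of the second. The function name, parameters, and numbers are hypothetical.

```python
def mate_pair_consistent(pos_first, pos_second, read_len, insert_size, tolerance):
    """Check whether two placed reads agree with the known clone length.

    pos_first, pos_second: leftmost contig coordinates of the two reads
    read_len: length of each read (assumed equal for both reads)
    insert_size: known length of the clone or molecule
    tolerance: allowed deviation from insert_size, in base pairs
    """
    # Distance from the start of the first read to the end of the second read.
    span = (pos_second + read_len) - pos_first
    return abs(span - insert_size) <= tolerance


# Hypothetical placements for reads from a 2,000 bp clone.
print(mate_pair_consistent(1_000, 2_400, 600, 2_000, 200))  # placement fits the clone length
print(mate_pair_consistent(1_000, 9_000, 600, 2_000, 200))  # placement is too far apart
```

A placement that fails the check suggests that at least one of the two reads has been put in the wrong location, or that the contig has been joined incorrectly.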
From page 73...
... In later human metagenomics projects, sequence data generated from microbial-community DNA can be readily aligned with these 100 microbial genome scaffolds to help to validate metagenomics assemblies, answer questions related to phylogeny and metabolism (what species are contributing what genes to the community genome) and assist in the evaluation of gene flow between community members (by providing evidence of lateral gene transfer among community members)
From page 74...
... • Improvements in bioinformatics tools, improvements in the ability to deduce function from sequence, and completion of more reference microbial genome sequences are needed.
Hybridization- and Array-Based Analyses
Specific functional gene arrays have also been designed with probes corresponding to genes of interest in an environment (see Figure 4-3)
From page 75...
... • Enhance the database of annotated sequences of genes that have important environmental functions, and provide software for easy use in the analysis of metagenomic data and for probe and primer development.
From page 76...
... Furthermore, structural genomics projects that aim to improve the linking of sequence to function face a bottleneck: analysis can be done only on proteins that can be produced in large quantities, purified, and even crystallized. If a newly identified gene has only weak similarity to a gene whose product has been studied biochemically, if a similarity in sequence does not reflect a functional relationship, or if a particular gene can carry out multiple functions in the cell, sequence comparisons may lead to incorrect conclusions about function.
From page 77...
... With more complex communities, enormous amounts of DNA-sequence data will be required for assembly of even the most abundant members. Although advances leading to higher throughput and decreased costs for Sanger-based sequencing have occurred in the last 10 years, metagenomics projects will require new, higher-throughput, lower-cost sequencing technologies.
From page 78...
... Inadequate information about minor members of communities, which is needed, for example, to identify keystone species | Development of improved methods for isolating single cells by microfluidics or cell sorting and for amplifying DNA and RNA from single cells; development of methods for subtraction and/or normalization of community DNA samples to facilitate the study of rare community members
Sequencer FLX eliminates the need for library construction (in other words, the community DNA can be sequenced directly, without first being cloned into a laboratory host) and can generate more than 100 million base pairs of DNA sequence in a single run.
From page 79...
... Additional studies to validate the utility of the short reads clearly are warranted, but the initial data support the role of new sequencing technologies in future metagenomics studies because they will facilitate deeper sampling of environmental samples than is currently possible. At the same time, it is important that alternative strategies for enrichment of the less abundant members of communities, such as suppressive subtraction hybridization or flow sorting of cells, continue to be developed and implemented.
From page 80...
... The abundance of species varies so widely that it is unlikely that the least abundant members will be captured in a given metagenomic analysis. Their DNA may well be in the libraries, but the probability of identifying, out of the millions of sequence fragments, the relatively few that came from the same low abundance community member, is low.
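The low probability of recovering enough fragments from a rare member can be made concrete with a binomial calculation. The sketch below is an illustration under simplifying assumptions: each sequence fragment is treated as an independent draw, and a member is drawn with probability equal to its share of the community DNA. The function name and the numbers used are hypothetical.

```python
import math


def prob_at_least_k_reads(n_reads, dna_fraction, k=1):
    """Probability that at least k of n_reads come from one community member.

    Assumes independent reads, each drawn from the member with probability
    dna_fraction (its share of the community DNA); sampling bias is ignored.
    """
    if not 0.0 < dna_fraction < 1.0:
        raise ValueError("dna_fraction must be strictly between 0 and 1")
    if n_reads < 1 or k < 1:
        raise ValueError("n_reads and k must be at least 1")
    p_fewer = 0.0
    for i in range(min(k, n_reads + 1)):
        # Binomial term computed in log space to avoid overflow for large n_reads.
        log_term = (
            math.lgamma(n_reads + 1)
            - math.lgamma(i + 1)
            - math.lgamma(n_reads - i + 1)
            + i * math.log(dna_fraction)
            + (n_reads - i) * math.log1p(-dna_fraction)
        )
        p_fewer += math.exp(log_term)
    return 1.0 - p_fewer


# Hypothetical case: one million fragments, member makes up one millionth of the DNA.
p_one = prob_at_least_k_reads(1_000_000, 1e-6, k=1)
p_ten = prob_at_least_k_reads(1_000_000, 1e-6, k=10)
print(f"at least 1 fragment:   {p_one:.3f}")
print(f"at least 10 fragments: {p_ten:.2e}")
```

In this hypothetical case a single fragment from the rare member is likely to be present, but the handful of fragments needed to say anything about that member is very unlikely, which matches the point that the DNA may be in the library yet remain effectively unidentifiable.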
From page 81...
... From a functional standpoint, such variability may be critically important in the overall metabolic potential of one community as compared to another. At first it may seem paradoxical to study single cells in order to understand communities, but in fact the function of any given community reflects the contributions of each of its members.
From page 82...
... Methods for Culturing Uncultured Species
Because the assembly of complete genome sequences is one of the major current limitations in metagenomics research, microbiologists are displaying renewed interest in the art of microbial cultivation.
From page 83...
... Basic understanding of genetics, metabolism, gene regulation, cell structure, and responses to the environment needs to advance to aid in the design of metagenomic research strategies and the interpretation of metagenomic data.
Understanding Microbial Habitats and Collecting Metadata
Although seemingly a small component of Figure 4-1, the "Describe environment" box is perhaps the cornerstone of metagenomics studies.
From page 84...
... DOWNSTREAM DEVELOPMENT OF METAGENOMICS
Currently, metagenomics is heavily biased toward sequencing and its associated computational analyses, and pioneering functional analyses. While the current distribution of effort is appropriate for this initial exploratory phase, it will not be sufficient for the next phase of metagenomics, when more value will be desired from the sequence and its metadata.


This material may be derived from roughly machine-read images, and so is provided only to facilitate research.