6 The Probability of Monophyly of a Sample of Gene Lineages on a Species Tree - Rohan S. Mehta, David Bryant, and Noah A. Rosenberg
Pages 113-136

The Chapter Skim interface presents what we've algorithmically identified as the most significant single chunk of text within every page in the chapter.


From page 113...
... We study the effects of species tree topology and branch lengths on the monophyly probability. These analyses reveal new behavior, including the maintenance of nontrivial monophyly probabilities for gene lineage samples that span multiple species and even for lineages that do not derive from a monophyletic species group.
From page 114...
... Mathematical computations under coalescent models have been central in developing a modern view of the descent of gene lineages along the branches of species phylogenies. Since early in the development of coalescent theory and phylogeography, coalescent formulas and related simulations have contributed to a probabilistic understanding of the shapes of multispecies gene trees (Tajima, 1983; Takahata and Nei, 1985; Neigel and Avise, 1986)
From page 115...
... computed probabilities of four different genealogical shapes: reciprocal monophyly of both species, monophyly of only one of the species, monophyly of only the other species, and monophyly of neither species. The computation permitted arbitrary species divergence times and sample sizes -- generalizing earlier small-sample computations (Tajima, 1983; Takahata and Nei, 1985; Neigel and Avise, 1986; Takahata and Slatkin, 1990; Wakeley, 2000)
From page 116...
... , which considered probability distributions for gene tree topologies under the multispecies coalescent model, our work generalizes a coalescent computation known only for small trees (Rosenberg, 2002, 2003) to arbitrary species trees.
From page 117...
... For convenience, we aggregate the Si and Ci with the species tree into a parameter collection SC that we call the initialized species tree.
Monophyly Events
A monophyly event Ei is an assignment of labels to lineage classes S and C
From page 118...
... TABLE 6.2  Possible Monophyly Events for Two Disjoint Lineage Classes, S and C

Monophyletic Groups    Description             Notation
S                      Monophyly of S          ES
C                      Monophyly of C          EC
Only S                 Paraphyly of C          ESC′
Only C                 Paraphyly of S          ES′C
Both S and C           Reciprocal monophyly    ESC
Neither S nor C        Polyphyly               ES′C′

... one branch, leading from node x to its immediate predecessor on the species tree. We refer to this branch with the shared label x.
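The joint events in Table 6.2 can be checked mechanically on a single gene tree. The following is a minimal sketch, not the chapter's method: it assumes a rooted gene tree encoded as nested Python tuples with lineage names as leaves, and it treats a class as monophyletic when some clade contains exactly the lineages of that class. The function names `leaf_sets`, `is_monophyletic`, and `classify` are hypothetical, as is the `classes` mapping from lineage name to "S", "C", or "M".

```python
def leaf_sets(tree):
    """Return (leaves under this node, leaf sets of every clade in the subtree)."""
    if isinstance(tree, str):              # a leaf is a lineage name
        leaves = frozenset([tree])
        return leaves, [leaves]
    leaves, clades = frozenset(), []
    for child in tree:                     # an internal node is a tuple of subtrees
        child_leaves, child_clades = leaf_sets(child)
        leaves |= child_leaves
        clades.extend(child_clades)
    clades.append(leaves)                  # the clade rooted at this node
    return leaves, clades


def is_monophyletic(tree, lineages):
    """True if some clade of the tree contains exactly the given lineages."""
    _, clades = leaf_sets(tree)
    return frozenset(lineages) in clades


def classify(tree, classes):
    """Map a gene tree to one of the four joint events described in Table 6.2."""
    s = [name for name, label in classes.items() if label == "S"]
    c = [name for name, label in classes.items() if label == "C"]
    mono_s = is_monophyletic(tree, s)
    mono_c = is_monophyletic(tree, c)
    if mono_s and mono_c:
        return "reciprocal monophyly"      # both S and C
    if mono_s:
        return "paraphyly of C"            # only S
    if mono_c:
        return "paraphyly of S"            # only C
    return "polyphyly"                     # neither S nor C


# Hypothetical example: three S lineages, two C lineages, one M lineage.
classes = {"s1": "S", "s2": "S", "s3": "S", "c1": "C", "c2": "C", "m1": "M"}
tree = ((("s1", "s2"), "s3"), (("c1", "c2"), "m1"))
print(classify(tree, classes))
```

In this example the S lineages and the C lineages each form their own clade, so the tree falls under the "Both S and C" row of the table. The sketch only classifies one observed tree; it does not compute any probability.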
From page 119...
... S lineages appear in blue, C lineages in orange, and M lineages in green. The figure illustrates reciprocal monophyly.
From page 120...
... The Central Recursion
Overview
We develop a recursion for the probability of a particular output state n_x^O and monophyly event E_x for a branch x given the initialized species subtree at x. We use the law of total probability to write the desired probability as a sum over all possible input states n_x^I of the probability of the input state multiplied by the conditional probability of the output given the input.
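Written out, the law-of-total-probability step has the following shape. This is a schematic sketch only: the symbols $n_x^{O}$ and $n_x^{I}$ for the output and input states of branch $x$, and $\mathcal{T}^{x}_{SC}$ for the initialized species subtree, are assumed notation reconstructed from the description above, not the chapter's numbered equations.

```latex
\[
P\left(n_x^{O}, E_x \mid \mathcal{T}^{x}_{SC}\right)
  = \sum_{n_x^{I}}
    P\left(n_x^{I} \mid \mathcal{T}^{x}_{SC}\right)\,
    P\left(n_x^{O}, E_x \mid n_x^{I}, \mathcal{T}^{x}_{SC}\right)
\]
```

The first factor is the probability of the input state and the second is the conditional probability of the output given the input, matching the two terms named in the text.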
From page 121...
... , using them to obtain probabilities for the remaining events. These three events require complete intraclass coalescence separately in the appropriate classes before interclass coalescences are possible.
From page 122...
... 6.3) describes the probability of an output state and monophyly event given an input state and the initialized species tree.
From page 123...
... (H) A case for reciprocal monophyly (Eq.
From page 124...
... This violation yields an output probability of KS = 0.
Reciprocal Monophyly
Monophyly events ESC and ES differ in that for ESC, unlike for ES, C and M lineages cannot coexist.
From page 125...
... 6.4, KSC = KS.
Completing the Calculation
Having obtained a recursion that propagates monophyly probabilities through a species tree, we apply Eq.
From page 126...
... Effect of Species Tree Height T
To illustrate the features of monophyly probabilities, we now examine the effects of model parameters on the probabilities. First, we vary the tree height T and preserve relative branch length proportions, studying the limiting cases of T = 0 and T → ∞.
From page 127...
... As we will see in numerical examples, however, monotonicity of the monophyly probability with T is not guaranteed, and different initial sample sizes on the same species tree can generate different behavior.
Effect of Relative Branch Lengths
Next, to investigate the behavior of the monophyly probability as T increases, we devise a simple three-species, two-parameter scenario, subdividing the tree height T by a parameter r.
From page 128...
... If the branch length coefficient r is 0, then the tree has a polytomy, and if r = 1, then the tree reduces to a two-species tree.
From page 129...
... , it approaches a positive value strictly within the unit interval. These scenarios highlight the fact that depending on the relative branch lengths and distribution of lineage classes across species, the monophyly probability can be monotonically increasing in T, monotonically decreasing, or not monotonic at all.
From page 130...
... (E–J) Probabilities of monophyly events.
From page 131...
... 6.7 and to test if our theoretical results reasonably replicate patterns in real data, we perform an analysis of monophyly frequencies using Zea mays maize and teosinte genomic data (Chia et al., 2012)
From page 132...
... Eq. 6.7 relies on a model with selectively neutral loci and constant population size; a deviation from theoretical probabilities could suggest a violation of one of the model assumptions.
From page 133...
... The results also provide a step toward computations for monophyly events on three or more lineage groups considered jointly. As an empirical demonstration, we analyzed data from maize and teosinte, calculating theoretical and observed monophyly frequencies in four groups.
From page 134...
... The maize analysis illustrates how our framework can be used to study monophyly in multispecies genomic data. The formulas derived here allow for greater flexibility in studies of monophyly and its relationship to species trees, contributing to a more comprehensive toolkit for phylogeographic, systematic, and evolutionary studies.
From page 135...
... We concatenated SNPs within blocks, computed blockwise Hamming distance matrices, and obtained gene trees using the hclust UPGMA (unweighted pair group method with arithmetic mean) clustering function in the R stats package.
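The block-to-gene-tree step described above can be illustrated with a small pure-Python stand-in. This is a sketch under stated assumptions, not the authors' pipeline: the chapter used R's hclust, whereas the code below reimplements average linkage directly, and the function names `hamming` and `upgma` and the toy sequences are hypothetical. It returns only a nested-tuple topology; it does not reproduce hclust's merge heights or its tie-breaking rules.

```python
from itertools import combinations


def hamming(a, b):
    """Number of sites at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def upgma(seqs):
    """Cluster named sequences by average linkage (UPGMA); return a nested-tuple tree."""
    names = sorted(seqs)
    # Pairwise Hamming distances between the original sequences.
    dist = {frozenset(pair): hamming(seqs[pair[0]], seqs[pair[1]])
            for pair in combinations(names, 2)}

    def average(m1, m2):
        # UPGMA distance: mean of all leaf-to-leaf distances between two clusters.
        total = sum(dist[frozenset((a, b))] for a in m1 for b in m2)
        return total / (len(m1) * len(m2))

    # Each cluster is (subtree, member names); start with one cluster per sequence.
    clusters = [(name, [name]) for name in names]
    while len(clusters) > 1:
        # Pick the closest pair of clusters; ties go to the first pair encountered.
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: average(clusters[ij[0]][1], clusters[ij[1]][1]))
        (t1, m1), (t2, m2) = clusters[i], clusters[j]
        rest = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters = rest + [((t1, t2), m1 + m2)]
    return clusters[0][0]


# Hypothetical concatenated SNP block for four samples.
block = {"a": "AAAA", "b": "AAAT", "c": "TTTT", "d": "TTTA"}
print(upgma(block))
```

Recomputing average distances from the leaf-level matrix at every merge is slow for large samples but keeps the definition of average linkage explicit; a production analysis would more plausibly rely on an established clustering library.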


This material may be derived from roughly machine-read images, and so is provided only to facilitate research.