
Genealogy-Based Methods for Inference of Historical Recombination and Gene Flow and Their Application in Saccharomyces cerevisiae

1 , 1 , 2 , 3 , *


Public Library of Science



      Genetic exchange between isolated populations, or introgression between species, serves as a key source of novel genetic material on which natural selection can act. While detecting historical gene flow from DNA sequence data is of much interest, many existing methods can be limited by requirements for deep population genomic sampling. In this paper, we develop a scalable genealogy-based method to detect candidate signatures of gene flow into a given population when the source of the alleles is unknown. Our method does not require sequenced samples from the source population, provided that the alleles have not reached fixation in the sampled recipient population. The method utilizes recent advances in algorithms for the efficient reconstruction of ancestral recombination graphs, which encode genealogical histories of DNA sequence data at each site, and is capable of detecting the signatures of gene flow whose footprints are of length up to single genes. Further, we employ a theoretical framework based on coalescent theory to test for statistical significance of certain recombination patterns consistent with gene flow from divergent sources. Implementing these methods for application to whole-genome sequences of environmental yeast isolates, we illustrate the power of our approach to highlight loci with unusual recombination histories. By developing innovative theory and methods to analyze signatures of gene flow from population sequence data, our work establishes a foundation for the continued study of introgression and its evolutionary relevance.
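      The idea of testing for significance against coalescent expectations can be illustrated with a minimal sketch. This is not the test developed in the article, which concerns recombination patterns in ancestral recombination graphs. It only shows, under assumed simplifications (a single panmictic population of constant size, no recombination, time measured in coalescent units), how a null distribution of coalescence depth can be simulated and how an unusually deep observed value receives a small empirical tail probability. The function names `simulate_tmrca`, `empirical_tail_probability`, and `expected_tmrca` are hypothetical and chosen for illustration.

```python
import math
import random


def simulate_tmrca(n, rng):
    """Draw one time to the most recent common ancestor for n lineages
    under the standard (Kingman) coalescent, in coalescent time units.

    Assumption: constant population size, no recombination, no structure.
    """
    t = 0.0
    for k in range(n, 1, -1):
        # With k lineages, each of the k*(k-1)/2 pairs coalesces at rate 1,
        # so the waiting time to the next coalescence is exponential.
        rate = k * (k - 1) / 2.0
        t += rng.expovariate(rate)
    return t


def empirical_tail_probability(observed, n, replicates=20000, seed=0):
    """Monte Carlo estimate of P(TMRCA >= observed) under the neutral null."""
    rng = random.Random(seed)
    hits = sum(simulate_tmrca(n, rng) >= observed for _ in range(replicates))
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (replicates + 1)


def expected_tmrca(n):
    """Analytic expectation of the TMRCA for n lineages: 2 * (1 - 1/n)."""
    return 2.0 * (1.0 - 1.0 / n)


if __name__ == "__main__":
    n = 10
    print("analytic E[TMRCA]:", expected_tmrca(n))
    print("tail probability of a depth of 5.0:",
          empirical_tail_probability(5.0, n))
```

      For two lineages the tail probability has the closed form exp(-t), which gives a quick check on the simulation. Lineages introduced from a divergent source would be expected to coalesce with the rest of the sample far deeper than this neutral null predicts, which is the intuition the sketch is meant to convey.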


      Most cited references 31


      Application of phylogenetic networks in evolutionary studies.

      The evolutionary history of a set of taxa is usually represented by a phylogenetic tree, and this model has greatly facilitated the discussion and testing of hypotheses. However, it is well known that more complex evolutionary scenarios are poorly described by such models. Further, even when evolution proceeds in a tree-like manner, analysis of the data may not be best served by using methods that enforce a tree structure but rather by a richer visualization of the data to evaluate its properties, at least as an essential first step. Thus, phylogenetic networks should be employed when reticulate events such as hybridization, horizontal gene transfer, recombination, or gene duplication and loss are believed to be involved, and, even in the absence of such events, phylogenetic networks have a useful role to play. This article reviews the terminology used for phylogenetic networks and covers both split networks and reticulate networks, how they are defined, and how they can be interpreted. Additionally, the article outlines the beginnings of a comprehensive statistical framework for applying split network methods. We show how split networks can represent confidence sets of trees and introduce a conservative statistical test for whether the conflicting signal in a network is treelike. Finally, this article describes a new program, SplitsTree4, an interactive and comprehensive tool for inferring different types of phylogenetic networks from sequences, distances, and trees.

        Generating samples under a Wright-Fisher neutral model of genetic variation.

         R Hudson (2002)
        A Monte Carlo computer program is available to generate samples drawn from a population evolving according to a Wright-Fisher neutral model. The program assumes an infinite-sites model of mutation, and allows recombination, gene conversion, symmetric migration among subpopulations, and a variety of demographic histories. The samples produced can be used to investigate the sampling properties of any sample statistic under these neutral models.

          Population genomics of domestic and wild yeasts

          Since the completion of the genome sequence of Saccharomyces cerevisiae in 1996 [1,2], there has been an exponential increase in complete genome sequences accompanied by great advances in our understanding of genome evolution. Although little is known about the natural and life histories of yeasts in the wild, there are an increasing number of studies looking at ecological and geographic distributions [3,4], population structure [5-8], and sexual versus asexual reproduction [9,10]. Less well understood at the whole genome level are the evolutionary processes acting within populations and species leading to adaptation to different environments, phenotypic differences and reproductive isolation. Here we present one- to four-fold or more coverage of the genome sequences of over seventy isolates of the baker's yeast, S. cerevisiae, and its closest relative, S. paradoxus. We examine variation in gene content, SNPs, indels, copy numbers and transposable elements. We find that phenotypic variation broadly correlates with global genome-wide phylogenetic relationships. Interestingly, S. paradoxus populations are well delineated along geographic boundaries while the variation among worldwide S. cerevisiae isolates shows less differentiation and is comparable to a single S. paradoxus population. Rather than one or two domestication events leading to the extant baker's yeasts, the population structure of S. cerevisiae consists of a few well-defined geographically isolated lineages and many different mosaics of these lineages, supporting the idea that human influence provided the opportunity for cross-breeding and production of new combinations of pre-existing variation.

            Author and article information

            [1] Computer Science Division, University of California, Berkeley, California, United States of America
            [2] Department of Statistics, University of California, Berkeley, California, United States of America
            [3] Department of Molecular and Cell Biology, University of California, Berkeley, California, United States of America
            University of Cambridge, United Kingdom
            Author notes

            Competing Interests: The authors have read the journal's policy and have the following conflict: P.J. is the spouse of an Associate Editor at PLOS ONE.

            Conceived and designed the experiments: PJ YS RB. Performed the experiments: PJ RB. Analyzed the data: PJ YS RB. Contributed reagents/materials/analysis tools: PJ YS RB. Wrote the paper: PJ YS RB.

            Role: Editor
            PLoS One
            Public Library of Science (San Francisco, USA)
            30 November 2012
            Volume: 7
            Issue: 11
            PMID: 23226196
            PMCID: PMC3511476
            Manuscript: PONE-D-12-19295
            DOI: 10.1371/journal.pone.0046947

            This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

            Pages: 13
            This work was supported in part by a National Science Foundation CAREER Grant (DBI-0846015) and a Packard Fellowship for Science and Engineering to Y.S.S., and an Ellison Medical Foundation New Scholar Award in Aging to R.B.B. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
            Research Article
            Computational Biology
            Genome Evolution
            Population Genetics
            Gene Flow
            Evolutionary Modeling
            Population Modeling
            Sequence Analysis
            Evolutionary Biology
            Evolutionary Processes
            Model Organisms
            Yeast and Fungal Models
            Saccharomyces Cerevisiae
            Population Biology
            Theoretical Biology
            Probability Theory
            Stochastic Processes


