
      Robust Demographic Inference from Genomic and SNP Data

      research-article


          Abstract

          We introduce a flexible and robust simulation-based framework to infer demographic parameters from the site frequency spectrum (SFS) computed on large genomic datasets. We show that our composite-likelihood approach allows one to study evolutionary models of arbitrary complexity, which cannot be tackled by other current likelihood-based methods. For simple scenarios, our approach compares favorably in terms of accuracy and speed with ∂a∂i, the current reference in the field, while showing better convergence properties for complex models. We first apply our methodology to non-coding genomic SNP data from four human populations. To infer their demographic history, we compare neutral evolutionary models of increasing complexity, including unsampled populations. We further show the versatility of our framework by extending it to the inference of demographic parameters from SNP chips with known ascertainment, such as that recently released by Affymetrix to study human origins. Whereas previous ways of handling ascertained SNPs were either restricted to a single population or only allowed the inference of divergence time between a pair of populations, our framework can correctly infer parameters of more complex models including the divergence of several populations, bottlenecks and migration. We apply this approach to the reconstruction of African demography using two distinct ascertained human SNP panels studied under two evolutionary models. The two SNP panels lead to globally very similar estimates and confidence intervals, and suggest an ancient divergence (>110 Ky) between Yoruba and San populations. Our methodology appears well suited to the study of complex scenarios from large genomic data sets.
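To make the two central objects of the abstract concrete, the sketch below computes an unfolded single-population SFS from a small 0/1 genotype matrix and scores it against a set of expected SFS proportions with a multinomial composite log-likelihood (sites are treated as independent, which is what makes the likelihood "composite"). This is only an illustration under stated assumptions: the function names, the toy data, and the use of the standard neutral constant-size expectation (proportional to 1/i) are choices made here, not the authors' implementation.

```python
import math


def site_frequency_spectrum(genotypes, n):
    """Unfolded SFS for n sampled chromosomes.

    genotypes: list of sites, each a list of n values (0 = ancestral, 1 = derived).
    Returns a list where entry i-1 counts the sites whose derived allele is
    carried by exactly i chromosomes (i = 1..n-1). Monomorphic sites are skipped.
    """
    sfs = [0] * (n - 1)
    for site in genotypes:
        derived = sum(site)
        if 0 < derived < n:
            sfs[derived - 1] += 1
    return sfs


def composite_log_likelihood(observed_sfs, expected_probs):
    """Multinomial log-likelihood kernel: sum over entries of m_i * log(p_i).

    The constant multinomial coefficient is omitted because it does not
    depend on the model parameters.
    """
    total = 0.0
    for count, prob in zip(observed_sfs, expected_probs):
        if count == 0:
            continue
        if prob <= 0.0:
            return float("-inf")
        total += count * math.log(prob)
    return total


def neutral_expected_probs(n):
    """Expected SFS proportions under the standard neutral constant-size model."""
    weights = [1.0 / i for i in range(1, n)]
    norm = sum(weights)
    return [w / norm for w in weights]


# Toy data (hypothetical): 6 sites typed in 4 chromosomes.
toy_sites = [
    [1, 0, 0, 0],
    [0, 1, 1, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 0],
    [1, 1, 1, 1],  # fixed derived, ignored
    [0, 0, 0, 0],  # fixed ancestral, ignored
]
toy_sfs = site_frequency_spectrum(toy_sites, 4)
print(toy_sfs)  # [2, 1, 1]
print(composite_log_likelihood(toy_sfs, neutral_expected_probs(4)))
```

In an inference setting, the expected proportions would come from the demographic model under evaluation, and the parameters maximising this score would be retained.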

          Author Summary

          We present a new likelihood-based method to infer the past demography of a set of populations from large genomic datasets. Our method can be applied to arbitrarily complex models as the likelihood is estimated by coalescent simulations. Under simple scenarios, our method behaves similarly to a widely used diffusion-based method while showing better convergence properties. In addition, our approach can be applied to very complex models including as many as a dozen populations, and still retrieve parameters very accurately in a reasonable time. We apply our approach to estimate the past demography of four human populations for which non-coding whole genome diversity is available, estimating the degree of European admixture of a southwest African American population and that of a Kenyan population with an unsampled East African population. We also show the versatility of our framework by inferring the demographic history of African populations from SNP chip data with known ascertainment bias, and find a very old divergence time (>110 Ky) between Yorubas from Western Africa and Sans from Southern Africa.
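The statement that the likelihood is estimated by coalescent simulations can be illustrated with a deliberately minimal case: a single constant-size population, for which the expected SFS is approximated by averaging branch lengths over simulated coalescent trees. The sketch below is an assumption-laden toy, not the authors' software; the function names, the number of simulations, and the restriction to one population without migration or size changes are all choices made here for illustration.

```python
import random


def simulate_branch_lengths(n, rng):
    """Simulate one Kingman coalescent tree for n lineages.

    Returns a list where entry i-1 is the total length of branches that
    have exactly i sampled descendants (time in coalescent units).
    """
    descendants = [1] * n          # sampled descendants below each active lineage
    lengths = [0.0] * (n - 1)
    while len(descendants) > 1:
        k = len(descendants)
        # Waiting time until the next coalescence among k lineages.
        wait = rng.expovariate(k * (k - 1) / 2.0)
        for size in descendants:
            lengths[size - 1] += wait
        # Merge two lineages chosen uniformly at random.
        i, j = rng.sample(range(k), 2)
        merged = descendants[i] + descendants[j]
        descendants = [s for idx, s in enumerate(descendants) if idx not in (i, j)]
        descendants.append(merged)
    return lengths


def estimate_expected_sfs(n, num_sims, seed=0):
    """Monte Carlo estimate of expected SFS proportions from simulated trees."""
    rng = random.Random(seed)
    totals = [0.0] * (n - 1)
    for _ in range(num_sims):
        for idx, length in enumerate(simulate_branch_lengths(n, rng)):
            totals[idx] += length
    norm = sum(totals)
    return [t / norm for t in totals]


# For a constant-size population the estimates should approach
# proportions proportional to 1/i as num_sims grows.
print(estimate_expected_sfs(4, 20000, seed=1))
```

Because mutations fall on branches in proportion to their length, the averaged branch lengths give the expected share of SNPs in each SFS entry. A more complex model changes only how the trees are simulated, which is why a simulation-based likelihood extends naturally to scenarios with many populations.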


                Author and article information

                Contributors
                Role: Editor
                Journal
                PLoS Genet
                PLoS Genetics
                Public Library of Science (San Francisco, USA)
                1553-7390
                1553-7404
                October 2013
                24 October 2013
                Volume: 9
                Issue: 10
                Article: e1003905
                Affiliations
                [1] CMPG, Institute of Ecology and Evolution, Berne, Switzerland
                [2] Swiss Institute of Bioinformatics, Lausanne, Switzerland
                [3] Center for Theoretical Evolutionary Genomics, Department of Integrative Biology, University of California, Berkeley, Berkeley, California, United States of America
                [4] School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
                University of Washington, United States of America
                Author notes

                The authors have declared that no competing interests exist.

                Conceived and designed the experiments: LE MF. Performed the experiments: LE ID EHS VCS. Analyzed the data: LE ID EHS MF VCS. Contributed reagents/materials/analysis tools: LE ID EHS MF VCS. Wrote the paper: LE ID EHS MF VCS.

                Article
                Manuscript: PGENETICS-D-13-00576
                DOI: 10.1371/journal.pgen.1003905
                PMCID: 3812088
                PMID: 24204310
                c3c982e9-65ce-49ce-b051-a64ecd17870f
                Copyright © 2013

                This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

                History
                Received: 28 February 2013
                Accepted: 11 September 2013
                Page count
                Pages: 17
                Funding
                This work was supported by Swiss NSF grants No 3100-126074, 31003A-143393, and CRSII3_141940 to LE. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
                Categories
                Research Article

                Genetics
