Robust Demographic Inference from Genomic and SNP Data

Abstract

We introduce a flexible and robust simulation-based framework to infer demographic parameters from the site frequency spectrum (SFS) computed on large genomic datasets. We show that our composite-likelihood approach allows one to study evolutionary models of arbitrary complexity, which cannot be tackled by other current likelihood-based methods. For simple scenarios, our approach compares favorably in terms of accuracy and speed with ∂a∂i, the current reference in the field, while showing better convergence properties for complex models. We first apply our methodology to non-coding genomic SNP data from four human populations. To infer their demographic history, we compare neutral evolutionary models of increasing complexity, including unsampled populations. We further show the versatility of our framework by extending it to the inference of demographic parameters from SNP chips with known ascertainment, such as that recently released by Affymetrix to study human origins. Whereas previous ways of handling ascertained SNPs were either restricted to a single population or only allowed the inference of divergence time between a pair of populations, our framework can correctly infer parameters of more complex models including the divergence of several populations, bottlenecks and migration. We apply this approach to the reconstruction of African demography using two distinct ascertained human SNP panels studied under two evolutionary models. The two SNP panels lead to globally very similar estimates and confidence intervals, and suggest an ancient divergence (>110 Ky) between Yoruba and San populations. Our methodology appears well suited to the study of complex scenarios from large genomic data sets.

Author Summary

We present a new likelihood-based method to infer the past demography of a set of populations from large genomic datasets. Our method can be applied to arbitrarily complex models as the likelihood is estimated by coalescent simulations. Under simple scenarios, our method behaves similarly to a widely used diffusion-based method while showing better convergence properties. In addition, our approach can be applied to very complex models including as many as a dozen populations, and still retrieve parameters very accurately in a reasonable time. We apply our approach to estimate the past demography of four human populations for which non-coding whole genome diversity is available, estimating the degree of European admixture of a southwest African American population and that of a Kenyan population with an unsampled East African population. We also show the versatility of our framework by inferring the demographic history of African populations from SNP chip data with known ascertainment bias, and find a very old divergence time (>110 Ky) between Yorubas from Western Africa and Sans from Southern Africa.
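
To make the two central objects above concrete, the following is a minimal Python sketch of a site frequency spectrum and of a multinomial composite log-likelihood computed from it. It is an illustration under stated assumptions, not the authors' implementation: the function names, the toy genotype data, and the choice of expected probabilities are all hypothetical. In the framework described above the expected SFS would be estimated by coalescent simulations under a demographic model; here the textbook neutral expectation for a constant-size population (entry i proportional to 1/i) stands in for it.

```python
import math
from collections import Counter


def site_frequency_spectrum(sites, n_chromosomes):
    """Unfolded SFS: entry i-1 counts sites where the derived allele occurs i times.

    `sites` is a list of sites, each a list of 0/1 alleles (1 = derived)
    across the sampled chromosomes. Only polymorphic classes 1..n-1 are kept.
    """
    counts = Counter(sum(site) for site in sites)
    return [counts.get(i, 0) for i in range(1, n_chromosomes)]


def composite_log_likelihood(observed_sfs, expected_probs):
    """Multinomial log-likelihood of the SFS counts, up to an additive constant.

    Sites are treated as independent, which is what makes this a composite
    likelihood when sites are in fact linked.
    """
    total = 0.0
    for count, prob in zip(observed_sfs, expected_probs):
        if count > 0:
            if prob <= 0.0:
                return float("-inf")  # model cannot produce an observed class
            total += count * math.log(prob)
    return total


# Toy data (hypothetical): 5 SNPs typed on 4 chromosomes.
sites = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 0],
]
n = 4
sfs = site_frequency_spectrum(sites, n)

# Stand-in expectation: neutral constant-size model, entries proportional to 1/i.
weights = [1.0 / i for i in range(1, n)]
neutral_probs = [w / sum(weights) for w in weights]

# A deliberately poor alternative: all frequency classes equally likely.
flat_probs = [1.0 / (n - 1)] * (n - 1)

ll_neutral = composite_log_likelihood(sfs, neutral_probs)
ll_flat = composite_log_likelihood(sfs, flat_probs)
```

On this toy sample the singleton-rich spectrum is better explained by the neutral expectation than by the flat one. Parameter inference would then amount to searching for the demographic parameters whose simulated expected SFS maximizes this quantity.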

Author and article information

Contributors

Joshua M. Akey: Role: Editor

Journal

Journal ID (nlm-ta): PLoS Genet

Journal ID (iso-abbrev): PLoS Genet

Journal ID (publisher-id): plos

Journal ID (pmc): plosgen

Title: PLoS Genetics

Publisher: Public Library of Science (San Francisco, USA)

ISSN (Print): 1553-7390

ISSN (Electronic): 1553-7404

Publication date Collection: October 2013

Publication date (Print): October 2013

Publication date (Electronic): 24 October 2013

Volume: 9

Issue: 10

Electronic Location Identifier: e1003905

Affiliations

[1] CMPG, Institute of Ecology and Evolution, Berne, Switzerland

[2] Swiss Institute of Bioinformatics, Lausanne, Switzerland

[3] Center for Theoretical Evolutionary Genomics, Department of Integrative Biology, University of California, Berkeley, Berkeley, California, United States of America

[4] School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland

University of Washington, United States of America

Author notes

* E-mail: laurent.excoffier@iee.unibe.ch

The authors have declared that no competing interests exist.

Conceived and designed the experiments: LE MF. Performed the experiments: LE ID EHS VCS. Analyzed the data: LE ID EHS MF VCS. Contributed reagents/materials/analysis tools: LE ID EHS MF VCS. Wrote the paper: LE ID EHS MF VCS.

Article

Publisher ID: PGENETICS-D-13-00576

DOI: 10.1371/journal.pgen.1003905

PMC ID: 3812088

PubMed ID: 24204310

SO-VID: c3c982e9-65ce-49ce-b051-a64ecd17870f

License:

This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

History

Date received: 28 February 2013

Date accepted: 11 September 2013

Page count

Pages: 17

Funding

This work was supported by Swiss NSF grants No 3100-126074, 31003A-143393, and CRSII3_141940 to LE. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
