Spatial Localization of Recent Ancestors for Admixed Individuals

Ancestry analysis from genetic data plays a critical role in studies of human disease and evolution. Recent work has introduced explicit models for the geographic distribution of genetic variation and has shown that such explicit models yield superior accuracy in ancestry inference over nonmodel-based methods. Here we extend such work to introduce a method that models admixture between ancestors from multiple sources across a geographic continuum. We devise efficient algorithms based on hidden Markov models to localize on a map the recent ancestors (e.g., grandparents) of admixed individuals, jointly with assigning ancestry at each locus in the genome. We validate our methods by using empirical data from individuals with mixed European ancestry from the Population Reference Sample study and show that our approach is able to localize their recent ancestors within an average of 470 km of the reported locations of their grandparents. Furthermore, simulations from real Population Reference Sample genotype data show that our method attains high accuracy in localizing recent ancestors of admixed individuals in Europe (an average of 550 km from their true location for localization of two ancestries in Europe, four generations ago). We explore the limits of ancestry localization under our approach and find that performance decreases as the number of distinct ancestries and the number of generations since admixture increase. Finally, we build a map of expected localization accuracy across admixed individuals according to the location of origin within Europe of their ancestors.
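
The accuracy figures in the abstract (470 km and 550 km) are average distances between inferred and reference ancestor locations. The abstract does not say how those distances are computed, so the following is only a minimal sketch that assumes great-circle (haversine) distance on a spherical Earth and assumes each predicted location has already been paired with its reference location. The function names, the Earth radius constant, and the coordinates in the usage lines are hypothetical and are not taken from the article.

```python
import math

# Mean Earth radius in km; an assumption of this sketch, not a value from the article.
EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against tiny floating-point overshoot before sqrt/asin.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def mean_localization_error_km(predicted, reference):
    """Average distance over paired (lat, lon) predictions and reference locations."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("need two non-empty lists of equal length")
    distances = [haversine_km(p[0], p[1], r[0], r[1])
                 for p, r in zip(predicted, reference)]
    return sum(distances) / len(distances)


if __name__ == "__main__":
    # Illustrative coordinates only (hypothetical predicted vs. reported locations).
    predicted = [(48.0, 2.0), (52.0, 13.0)]
    reference = [(46.0, 8.0), (50.0, 20.0)]
    print(round(mean_localization_error_km(predicted, reference), 1), "km")
```

With more than one ancestor per individual, predicted and reference locations would first have to be matched to each other; this sketch leaves that step out.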


            Author and article information

            [*] Department of Computer Science, University of California, Los Angeles, California 90095
            [†] Department of Ecology and Evolutionary Biology, University of California, Los Angeles, California 90095
            [‡] Interdepartmental Program in Bioinformatics, University of California, Los Angeles, California 90095
            [§] Department of Human Genetics, University of California, Los Angeles, California 90095
            [**] Department of Human Genetics, University of Chicago, Chicago, Illinois 60637
            [††] Department of Pathology and Laboratory Medicine, Geffen School of Medicine at University of California, Los Angeles, California 90095
            Author notes
            [1] Corresponding author: Department of Pathology & Laboratory Medicine, Geffen School of Medicine at University of California, Los Angeles, 10833 Le Conte Ave, CHS 33-365, Los Angeles, CA 90095. E-mail: bpasaniuc@
            G3 (Bethesda)
            G3: Genes, Genomes, Genetics
            Genetics Society of America
            3 November 2014
            December 2014
            Volume: 4
            Issue: 12
            Page range: 2505-2518
            Copyright © 2014 Yang et al.

            This is an open-access article distributed under the terms of the Creative Commons Attribution Unported License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

            Pages: 14
            Keywords: genetic variation, genetic continuum, admixture, localization, ancestry inference

