
      Sensitive Detection of Chromosomal Segments of Distinct Ancestry in Admixed Populations



          Identifying the ancestry of chromosomal segments of distinct ancestry has a wide range of applications from disease mapping to learning about history. Most methods require the use of unlinked markers; but, using all markers from genome-wide scanning arrays, it should in principle be possible to infer the ancestry of even very small segments with exquisite accuracy. We describe a method, HAPMIX, which employs an explicit population genetic model to perform such local ancestry inference based on fine-scale variation data. We show that HAPMIX outperforms other methods, and we explore its utility for inferring ancestry, learning about ancestral populations, and inferring dates of admixture. We validate the method empirically by applying it to populations that have experienced recent and ancient admixture: 935 African Americans from the United States and 29 Mozabites from North Africa. HAPMIX will be of particular utility for mapping disease genes in recently admixed populations, as its accurate estimates of local ancestry permit admixture and case-control association signals to be combined, enabling more powerful tests of association than with either signal alone.

          Author Summary

          The genomes of individuals from admixed populations consist of chromosomal segments of distinct ancestry. For example, the genomes of African American individuals contain segments of both African and European ancestry, so that a specific location in the genome may inherit 0, 1, or 2 copies of European ancestry. Inferring an individual's local ancestry, their number of copies of each ancestry at each location in the genome, has important applications in disease mapping and in understanding human history. Here we describe HAPMIX, a method that analyzes data from dense genotyping chips to infer local ancestry with very high precision. An important feature of HAPMIX is that it makes use of data from haplotypes (blocks of nearby markers), which are more informative for ancestry than individual markers. Our simulations demonstrate the utility of HAPMIX for local ancestry inference, and empirical applications to African American and Mozabite data sets uncover important aspects of the history of these populations.
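          The definition of local ancestry above can be illustrated with a minimal sketch. It is not HAPMIX and does no inference: it assumes the ancestry of each of an individual's two haplotypes is already known at every marker (coded 0 for African and 1 for European, a hypothetical encoding chosen for this example) and simply counts the European copies per position. The function names and toy data are illustrative assumptions, not part of the published software.

```python
from typing import List


def local_ancestry(hap_a: List[int], hap_b: List[int]) -> List[int]:
    """Count copies of European ancestry (0, 1, or 2) at each marker.

    hap_a and hap_b hold per-marker ancestry labels for the two haplotypes
    of one individual: 0 = African, 1 = European (hypothetical encoding).
    """
    if len(hap_a) != len(hap_b):
        raise ValueError("haplotypes must cover the same markers")
    if any(label not in (0, 1) for label in hap_a + hap_b):
        raise ValueError("ancestry labels must be 0 or 1")
    return [a + b for a, b in zip(hap_a, hap_b)]


def european_fraction(copies: List[int]) -> float:
    """Average European ancestry across markers, as a fraction of 2 copies."""
    return sum(copies) / (2 * len(copies))


# Toy example: six markers along one chromosome (made-up labels).
hap_a = [0, 0, 1, 1, 1, 0]
hap_b = [0, 1, 1, 0, 0, 0]
copies = local_ancestry(hap_a, hap_b)
print(copies)                     # per-marker copies of European ancestry
print(european_fraction(copies))  # genome-wide European ancestry fraction
```

          In practice the per-haplotype labels are unknown, and a method such as HAPMIX estimates the number of copies at each location probabilistically from dense genotype data.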


                Author and article information

                Role: Editor
                PLoS Genet
                PLoS Genetics
                Public Library of Science (San Francisco, USA )
                June 2009
                19 June 2009
                Volume: 5
                Issue: 6
                [1 ]Department of Epidemiology, Harvard School of Public Health, Boston, Massachusetts, United States of America
                [2 ]Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts, United States of America
                [3 ]Broad Institute of Harvard and MIT, Cambridge, Massachusetts, United States of America
                [4 ]Department of Genetics, Harvard Medical School, Boston, Massachusetts, United States of America
                [5 ]Johns Hopkins Allergy and Asthma Center, Division of Clinical Immunology, Department of Medicine, School of Medicine, Baltimore, Maryland, United States of America
                [6 ]Department of Biostatistics, Johns Hopkins School of Public Health, Baltimore, Maryland, United States of America
                [7 ]Inherited Disease Research Branch, National Human Genome Research Institute, National Institutes of Health, Baltimore, Maryland, United States of America
                [8 ]Department of Statistics, Oxford University, Oxford, United Kingdom
                [9 ]Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, United Kingdom
                University of Chicago, United States of America
                Author notes

                Conceived and designed the experiments: ALP NP DR SM. Performed the experiments: ALP AT NP KCB NR IR THB RM DR SM. Analyzed the data: ALP AT NP DR SM. Contributed reagents/materials/analysis tools: ALP NP KCB NR IR THB RM DR SM. Wrote the paper: ALP AT NP KCB NR IR THB RM DR SM.

                This is an open-access article distributed under the terms of the Creative Commons Public Domain declaration which stipulates that, once placed in the public domain, this work may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose.
                Page count
                Pages: 18
                Research Article
                Genetics and Genomics/Bioinformatics
                Genetics and Genomics/Genetics of Disease
                Genetics and Genomics/Population Genetics


