
      A Quantitative Comparison of the Similarity between Genes and Geography in Worldwide Human Populations

      research-article
      PLoS Genetics
      Public Library of Science


          Abstract

          Multivariate statistical techniques such as principal components analysis (PCA) and multidimensional scaling (MDS) have been widely used to summarize the structure of human genetic variation, often in easily visualized two-dimensional maps. Many recent studies have reported similarity between geographic maps of population locations and MDS or PCA maps of genetic variation inferred from single-nucleotide polymorphisms (SNPs). However, this similarity has been evident primarily in a qualitative sense; and, because different multivariate techniques and marker sets have been used in different studies, it has not been possible to formally compare genetic variation datasets in terms of their levels of similarity with geography. In this study, using genome-wide SNP data from 128 populations worldwide, we perform a systematic analysis to quantitatively evaluate the similarity of genes and geography in different geographic regions. For each of a series of regions, we apply a Procrustes analysis approach to find an optimal transformation that maximizes the similarity between PCA maps of genetic variation and geographic maps of population locations. We consider examples in Europe, Sub-Saharan Africa, Asia, East Asia, and Central/South Asia, as well as in a worldwide sample, finding that significant similarity between genes and geography exists in general at different geographic levels. The similarity is highest in our examples for Asia and, once highly distinctive populations have been removed, Sub-Saharan Africa. Our results provide a quantitative assessment of the geographic structure of human genetic variation worldwide, supporting the view that geography plays a strong role in giving rise to human population structure.
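The Procrustes comparison described in the abstract can be illustrated with a minimal sketch for two-dimensional maps. This is not the authors' implementation. The function names, the use of sqrt(1 - D) as the similarity score (where D is the minimized, standardized sum of squared distances), and the permutation scheme used to gauge significance are all assumptions made here for illustration. The closed form for the optimal rotation or reflection holds only for two-dimensional configurations.

```python
import math
import random


def center_and_scale(points):
    """Move 2-D points to their centroid and scale them to unit total sum of squares."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]
    norm = math.sqrt(sum(x * x + y * y for x, y in centered))
    if norm == 0:
        raise ValueError("all points coincide; the configuration has no shape")
    return [(x / norm, y / norm) for x, y in centered]


def procrustes_similarity(geo, pcs):
    """Return sqrt(1 - D) after the best translation, scaling, rotation or reflection.

    geo: list of (longitude, latitude)-like pairs, one per population (hypothetical input).
    pcs: list of (PC1, PC2) pairs in the same order as geo (hypothetical input).
    """
    if len(geo) != len(pcs):
        raise ValueError("both maps must contain the same populations")
    g = center_and_scale(geo)
    p = center_and_scale(pcs)
    # Entries of the 2 x 2 cross-product matrix M = P^T G.
    a = b = c = d = 0.0
    for (px, py), (gx, gy) in zip(p, g):
        a += px * gx
        b += px * gy
        c += py * gx
        d += py * gy
    # For a 2 x 2 matrix, (s1 + s2)^2 = ||M||_F^2 + 2|det M|, where s1 and s2
    # are the singular values; 1 - D equals (s1 + s2)^2 for unit-norm inputs.
    one_minus_d = a * a + b * b + c * c + d * d + 2.0 * abs(a * d - b * c)
    return math.sqrt(min(max(one_minus_d, 0.0), 1.0))


def permutation_p_value(geo, pcs, n_perm=999, seed=0):
    """Share of random relabelings whose similarity reaches the observed one."""
    rng = random.Random(seed)
    observed = procrustes_similarity(geo, pcs)
    shuffled = list(pcs)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if procrustes_similarity(geo, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

A score near 1 means the PCA map can be superimposed almost exactly on the geographic map; a score near 0 means no rotation, reflection, or rescaling brings the two into agreement. For maps with more than two dimensions, the usual route is a singular value decomposition of the cross-product matrix rather than the closed form used above.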

          Author Summary

          The spatial pattern of human genetic variation provides a basis for investigating the history of human migrations. Statistical techniques such as principal components analysis (PCA) and multidimensional scaling (MDS) have been used to summarize spatial patterns of genetic variation, typically by placing individuals on a two-dimensional map in such a way that pairwise Euclidean distances between individuals on the map approximately reflect corresponding genetic relationships. Although similarity between these statistical maps of genetic variation and the geographic maps of sampling locations is often observed, it has not been assessed systematically across different parts of the world. In this study, we combine genome-wide SNP data from more than 100 populations worldwide to perform a formal comparison between genes and geography in different regions. By examining a worldwide sample and samples from Europe, Sub-Saharan Africa, Asia, East Asia, and Central/South Asia, we find that significant similarity between genes and geography exists in general in different geographic regions and at different geographic levels. Surprisingly, the highest similarity is found in Asia, even though the geographic barrier of the Himalaya Mountains has created a discontinuity on the PCA map of genetic variation.
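The idea of placing individuals on a two-dimensional map so that Euclidean distances on the map approximate their genetic relationships can be sketched with classical (metric) MDS. The snippet below is an illustrative assumption, not the procedure used in the study. The function name and the power-iteration eigensolver are choices made here to keep the example dependency-free, and the input is assumed to be a symmetric matrix of roughly Euclidean pairwise distances.

```python
import math
import random


def classical_mds_2d(dist, iterations=500, seed=0):
    """Embed n items in 2-D from an n x n symmetric distance matrix (classical MDS)."""
    n = len(dist)
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    # Double centering turns squared distances into an inner-product matrix B.
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]

    rng = random.Random(seed)
    axes = []
    for _ in range(2):
        # Power iteration converges to the leading eigenvector of B.
        v = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        for _ in range(iterations):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0:
                break
            v = [x / norm for x in w]
        # Rayleigh quotient gives the matching eigenvalue.
        bv = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        vv = sum(x * x for x in v)
        eigenvalue = sum(v[i] * bv[i] for i in range(n)) / vv if vv else 0.0
        scale = math.sqrt(max(eigenvalue, 0.0))
        axes.append([scale * x for x in v])
        # Deflate B so the next pass finds the second eigenvector.
        for i in range(n):
            for j in range(n):
                b[i][j] -= eigenvalue * v[i] * v[j] / (vv if vv else 1.0)
    return list(zip(axes[0], axes[1]))
```

When the input distances come from points that truly lie in a plane, the recovered coordinates reproduce those distances up to rotation, reflection, and translation, which is why a map produced this way has to be aligned before it is compared with geography.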


                Author and article information

                Contributors
                Role: Editor
                Journal
                PLoS Genetics (PLoS Genet)
                Public Library of Science (San Francisco, USA)
                ISSN: 1553-7390, 1553-7404
                Published: 23 August 2012
                Volume: 8
                Issue: 8
                Article: e1002886
                Affiliations
                [1 ]Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, United States of America
                [2 ]Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, United States of America
                [3 ]Department of Biology, Stanford University, Stanford, California, United States of America
                Dartmouth College, United States of America
                Author notes

                The authors have declared that no competing interests exist.

                Conceived and designed the experiments: CW SZ NAR. Performed the experiments: CW. Analyzed the data: CW. Contributed reagents/materials/analysis tools: CW SZ NAR. Wrote the paper: CW SZ NAR.

                Article
                Manuscript ID: PGENETICS-D-12-00547
                DOI: 10.1371/journal.pgen.1002886
                PMCID: 3426559
                PMID: 22927824
                76845edb-41dd-4a53-8a59-19865f998b31
                Copyright © 2012

                This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

                History
                Received: 2 March 2012
                Accepted: 24 June 2012
                Page count
                Pages: 16
                Funding
                This work was supported by National Institutes of Health grants R01 GM081441 and R01 HG005855, by the Burroughs Wellcome Fund, and by a Howard Hughes Medical Institute International Student Research Fellowship. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
                Categories
                Research Article
                Biology
                Genetics
                Population Genetics
                Genetic Polymorphism
