
      Genomic history and forensic characteristics of Sherpa highlanders on the Tibetan Plateau inferred from high-resolution InDel panel and genome-wide SNPs


          Abstract

          The Sherpa people, one of the populations adapted to high-altitude hypoxia, mainly reside in Nepal and the southern Tibet Autonomous Region. Their genetic origin and detailed evolutionary profile remain to be comprehensively characterized. Here we merged newly generated InDel genotype data from 628 Dingjie Sherpas with 4222 worldwide InDel profiles, and collected genome-wide SNP data (approximately 600K SNPs) from 1612 individuals in 191 modern and ancient populations, to reconstruct the fine-scale genetic structure of Sherpas and their relationships with nearby modern and ancient East Asians based on shared alleles and haplotypes. The forensic parameters of the 57 autosomal InDels (A-InDels) included in the new-generation InDel amplification system we used showed that this panel is informative and polymorphic in Dingjie Sherpas, suggesting that it can serve as a supplementary tool for forensic personal identification and parentage testing in this population. Descriptive findings from PCA, ADMIXTURE, and TreeMix-based phylogenies suggested that the studied Nepal Sherpas showed excess allele sharing with neighboring Tibeto-Burman-speaking Tibetans. Furthermore, patterns of allele sharing in f-statistics demonstrated that Nepal Sherpas had a different evolutionary history from their neighbors in Nepal (Newar and Gurung) but showed genetic similarity with the 2700-year-old Chokhopani individual and modern Tibetans from Tibet. qpAdm/qpGraph-based admixture sources and models further showed that Sherpas, core Tibetans, and Chokhopani formed one clade, which could be fitted as deriving its main ancestry from late Neolithic Qijia millet farmers and additional deep ancestry from early Asians. Chromosome painting profiles and shared IBD fragments inferred from fineSTRUCTURE and ChromoPainter not only confirmed these genomic affinity patterns but also revealed fine-scale genetic microstructure among Sino-Tibetan speakers. Finally, selection scans based on iHS, nSL, and iHH12 revealed natural-selection signatures associated with disease susceptibility in Sherpas. Overall, we provide a comprehensive landscape of the admixture and evolutionary history of the Sherpa people based on shared alleles and haplotypes from InDel genotype data and high-density genome-wide SNP data. A more detailed genetic landscape of the Sherpa people should be further confirmed and characterized using ancient genomes or single-molecule real-time sequencing technology.
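
          The abstract reports forensic parameters for the 57 A-InDels without saying how they were computed. As a rough illustration only, the sketch below shows how two standard parameters, match probability and power of discrimination, follow from the allele frequency of a single biallelic InDel. It assumes Hardy-Weinberg equilibrium within a locus and independence between loci. The function names and the example frequencies are hypothetical and are not taken from the article or its software.

```python
from math import prod


def indel_forensic_params(p_insertion: float) -> dict:
    """Per-locus parameters for one biallelic InDel.

    Assumption (not from the article): genotype frequencies follow
    Hardy-Weinberg proportions for the given insertion-allele frequency.
    """
    if not 0.0 <= p_insertion <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    p = p_insertion
    q = 1.0 - p
    # Genotype frequencies: insertion/insertion, insertion/deletion, deletion/deletion.
    genotype_freqs = (p * p, 2.0 * p * q, q * q)
    # Match probability: chance that two random individuals share a genotype.
    match_probability = sum(g * g for g in genotype_freqs)
    return {
        "expected_heterozygosity": 2.0 * p * q,
        "match_probability": match_probability,
        "power_of_discrimination": 1.0 - match_probability,
    }


def combined_power_of_discrimination(insertion_freqs) -> float:
    """Combined power of discrimination, assuming the loci are independent."""
    combined_match = prod(
        indel_forensic_params(p)["match_probability"] for p in insertion_freqs
    )
    return 1.0 - combined_match


if __name__ == "__main__":
    # Hypothetical allele frequencies, for illustration only.
    example_freqs = [0.5, 0.4, 0.3]
    for freq in example_freqs:
        print(freq, indel_forensic_params(freq))
    print("combined PD:", combined_power_of_discrimination(example_freqs))
```

          A biallelic marker is most informative when both alleles are near 50%, so a panel's combined discriminating power depends on how many of its loci are well balanced in the target population.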

                Author and article information

                Journal: Forensic Science International: Genetics
                Publisher: Elsevier BV
                ISSN: 1872-4973
                Published: January 2022
                Volume: 56
                Article number: 102633
                DOI: 10.1016/j.fsigen.2021.102633
                PMID: 34826721
                © 2022

                https://www.elsevier.com/tdm/userlicense/1.0/
