
      Identity-by-descent detection across 487,409 British samples reveals fine scale population structure and ultra-rare variant associations

      research-article


          Abstract

          Detection of Identical-By-Descent (IBD) segments provides a fundamental measure of genetic relatedness and plays a key role in a wide range of analyses. We develop FastSMC, an IBD detection algorithm that combines a fast heuristic search with accurate coalescent-based likelihood calculations. FastSMC enables biobank-scale detection and dating of IBD segments transmitted within the past several thousand years. We apply FastSMC to 487,409 UK Biobank samples and detect ~214 billion IBD segments transmitted by shared ancestors within the past 1500 years, obtaining a fine-grained picture of genetic relatedness in the UK. Sharing of common ancestors strongly correlates with geographic distance, enabling the use of genomic data to localize a sample’s birth coordinates with a median error of 45 km. We seek evidence of recent positive selection by identifying loci with unusually strong shared ancestry and detect 12 genome-wide significant signals. We devise an IBD-based test for association between phenotype and ultra-rare loss-of-function variation, identifying 29 association signals in 7 blood-related traits.

          Abstract

          Accurately measuring genetic relatedness through Identical-By-Descent (IBD) segments is challenging in biobank-scale genomic data. The authors present FastSMC, an IBD detection method that, when applied to the UK Biobank, gives a detailed picture of genetic relatedness and evolutionary history in the UK over the past 2000 years.


                Author and article information

                Contributors
                juba.naitsaada@stats.ox.ac.uk
                palamara@stats.ox.ac.uk
                Journal
                Nature Communications (Nat Commun)
                Nature Publishing Group UK (London)
                ISSN: 2041-1723
                Published: 30 November 2020
                Volume: 11
                Article number: 6130
                Affiliations
                [1] Department of Statistics, University of Oxford, Oxford, UK (GRID grid.4991.5; ISNI 0000 0004 1936 8948)
                [2] Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115 USA (GRID grid.38142.3c; ISNI 0000 0004 1936 754X)
                [3] Department of Computer Science, University of Oxford, Oxford, UK (GRID grid.4991.5; ISNI 0000 0004 1936 8948)
                [4] Brigham & Women’s Hospital, Division of Genetics, Boston, MA 02215 USA (GRID grid.2515.3; ISNI 0000 0004 0378 8438)
                [5] Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA 02215 USA (GRID grid.65499.37; ISNI 0000 0001 2106 9910)
                [6] Wellcome Centre for Human Genetics, University of Oxford, Oxford, UK (GRID grid.4991.5; ISNI 0000 0004 1936 8948)
                Author information
                http://orcid.org/0000-0002-2197-1151
                http://orcid.org/0000-0002-0729-1125
                http://orcid.org/0000-0002-2345-3390
                http://orcid.org/0000-0002-6221-4288
                http://orcid.org/0000-0002-1572-6782
                http://orcid.org/0000-0002-7980-4620
                http://orcid.org/0000-0002-7999-1972
                Article
                19588
                10.1038/s41467-020-19588-x
                7704644
                33257650
                663609d5-da33-44a3-91af-d9249c762ff7
                © The Author(s) 2020

                Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

                History
                : 2 April 2020
                : 2 October 2020
                Funding
                Funded by: FundRef https://doi.org/10.13039/501100000265, RCUK | Medical Research Council (MRC);
                Award ID: MR/S502509/1
                Award ID: EP/L016044/1
                Funded by: FundRef https://doi.org/10.13039/501100006558, Oxford University | Balliol College, University of Oxford (Balliol College);
                Award ID: Jowett Scholarship
                Funded by: FundRef https://doi.org/10.13039/501100000266, RCUK | Engineering and Physical Sciences Research Council (EPSRC);
                Award ID: EP/L016044/1
                Award ID: D4D00010
                Funded by: FundRef https://doi.org/10.13039/100000002, U.S. Department of Health & Human Services | National Institutes of Health (NIH);
                Award ID: T32 GM135117
                Award ID: R21-HG010748-0
                Funded by: FundRef https://doi.org/10.13039/100004440, Wellcome Trust (Wellcome);
                Award ID: 204826/Z/16/Z
                Funded by: U.S. Department of Health & Human Services | National Institutes of Health (NIH)
                Funded by: FundRef https://doi.org/10.13039/100000051, U.S. Department of Health & Human Services | NIH | National Human Genome Research Institute (NHGRI);
                Award ID: R21-HG010748-0
                Funded by: FundRef https://doi.org/10.13039/100010663, EC | EU Framework Programme for Research and Innovation H2020 | H2020 Priority Excellent Science | H2020 European Research Council (H2020 Excellent Science - European Research Council);
                Award ID: 850869
                Categories
                Article

                Uncategorized
                genome-wide association studies,haplotypes,heritable quantitative trait,population genetics
