      Exploration of haplotype research consortium imputation for genome-wide association studies in 20,032 Generation Scotland participants

      research-article

          Abstract

          Background

          The Generation Scotland: Scottish Family Health Study (GS:SFHS) is a family-based population cohort with DNA, biological samples, socio-demographic, psychological and clinical data from approximately 24,000 adult volunteers across Scotland. Although data collection was cross-sectional, GS:SFHS became a prospective cohort because of the ability to link to routine Electronic Health Record (EHR) data. Over 20,000 participants were selected for genotyping using a large genome-wide array.

          Methods

          GS:SFHS was analysed using genome-wide association studies (GWAS) to test the effects of a large spectrum of variants, imputed using the Haplotype Research Consortium (HRC) dataset, on medically relevant traits measured directly or obtained from EHRs. The HRC dataset is the largest available haplotype reference panel for imputation of variants in populations of European ancestry and allows investigation of variants with low minor allele frequencies within the entire GS:SFHS genotyped cohort.
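A single-variant test on imputed data can be sketched as a regression of the trait on the imputed allele dosage (the expected number of copies of the alternate allele, between 0 and 2). The abstract does not state which statistical model was used, so the following is only a minimal illustration with hypothetical function names and made-up toy values; it uses ordinary least squares and ignores covariates and the relatedness that a family-based cohort such as GS:SFHS would require a real analysis to model.

```python
from math import sqrt


def minor_allele_frequency(dosages):
    """Estimate the minor allele frequency from dosages in [0, 2]."""
    alt_freq = sum(dosages) / (2 * len(dosages))
    return min(alt_freq, 1 - alt_freq)


def dosage_association(dosages, trait):
    """Simple linear regression of a quantitative trait on allele dosage.

    Returns (beta, standard_error, t_statistic).
    Assumption: unrelated individuals and no covariates (illustration only).
    """
    n = len(dosages)
    mean_x = sum(dosages) / n
    mean_y = sum(trait) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, trait))
    beta = sxy / sxx
    intercept = mean_y - beta * mean_x
    # Residual sum of squares around the fitted line
    rss = sum((y - intercept - beta * x) ** 2 for x, y in zip(dosages, trait))
    se = sqrt(rss / (n - 2) / sxx)
    return beta, se, beta / se


# Toy data (invented for illustration, not from the study)
toy_dosages = [0, 0, 0, 1, 1, 2]
toy_trait = [5.0, 5.2, 4.9, 5.6, 5.5, 6.1]

maf = minor_allele_frequency(toy_dosages)
beta, se, t_stat = dosage_association(toy_dosages, toy_trait)
print(f"MAF={maf:.3f} beta={beta:.3f} se={se:.3f} t={t_stat:.2f}")
```

In a genome-wide scan this test would be repeated for every genotyped or imputed variant, with a stringent significance threshold to account for the number of tests.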

          Results

          Genome-wide associations were run on 20,032 individuals using both genotyped and HRC imputed data. We present results for a range of well-studied quantitative traits obtained from clinic visits and for serum urate measures obtained from data linkage to EHRs collected by the Scottish National Health Service. Results replicated known associations and additionally revealed novel findings, mainly with rare variants, validating the use of the HRC imputation panel. For example, we identified two new associations with fasting glucose at variants near Y_RNA and WDR4 and four new associations with heart rate at SNPs within CSMD1 and ASPH, upstream of HTR1F and between PROKR2 and GPCPD1. All were driven by rare variants (minor allele frequencies in the range of 0.08–1%). Proof of principle for use of EHRs was verification of the highly significant association of urate levels with the well-established urate transporter SLC2A9.
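To give a sense of scale for the reported frequency range, the expected number of minor-allele copies at an autosomal variant is twice the sample size times the minor allele frequency. The short sketch below applies that arithmetic to the 20,032 individuals and the 0.08–1% range quoted above; the function name is hypothetical, and in a family-based cohort the copies are not independent observations because relatives share alleles.

```python
N_PARTICIPANTS = 20_032  # sample size quoted in the abstract


def expected_minor_allele_count(maf, n_individuals):
    """Expected copies of the minor allele among diploid individuals."""
    return 2 * n_individuals * maf


# Lower and upper ends of the reported MAF range, in percent
for maf_percent in (0.08, 1.0):
    count = expected_minor_allele_count(maf_percent / 100, N_PARTICIPANTS)
    print(f"MAF {maf_percent}% -> about {count:.0f} minor-allele copies")
```

At the lower end of the range this corresponds to roughly 32 copies of the minor allele across the cohort, and at the upper end to roughly 401, which is why a large sample and accurate imputation both matter for variants this rare.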

          Conclusions

          GS:SFHS provides genetic data on over 20,000 participants alongside a range of phenotypes as well as linkage to National Health Service laboratory and clinical records. We have shown that the combination of deeper genotype imputation and extended phenotype availability makes GS:SFHS an attractive resource for association studies that give insight into the genetic architecture of complex traits.

          Electronic supplementary material

          The online version of this article (doi:10.1186/s13073-017-0414-4) contains supplementary material, which is available to authorized users.

                Author and article information

                Contributors
                Reka.Nagy@ed.ac.uk
                thibaud.boutin@igmm.ed.ac.uk
                Jonathan.Marten@igmm.ed.ac.uk
                jennifer.huffman@nih.gov
                shona.kerr@igmm.ed.ac.uk
                archie.campbell@igmm.ed.ac.uk
                louise.evenden@ed.ac.uk
                jude.gibson@ed.ac.uk
                carmen.amador@igmm.ed.ac.uk
                david.howard@ed.ac.uk
                pau.navarro@ed.ac.uk
                andrew.morris@ed.ac.uk
                ian.deary@ed.ac.uk
                l.hocking@abdn.ac.uk
                Sandosh.Padmanabhan@glasgow.ac.uk
                b.h.smith@dundee.ac.uk
                peter.joshi@ed.ac.uk
                Jim.Wilson@ed.ac.uk
                nicholas.hastie@igmm.ed.ac.uk
                Alan.Wright@igmm.ed.ac.uk
                andrew.mcintosh@ed.ac.uk
                David.Porteous@igmm.ed.ac.uk
                Chris.Haley@igmm.ed.ac.uk
                Veronique.vitart@igmm.ed.ac.uk
                +44 (0)131 651 8751, caroline.hayward@igmm.ed.ac.uk
                Journal
                Genome Med
                Genome Medicine
                BioMed Central (London)
                1756-994X
                7 March 2017
                2017; 9: 23
                Affiliations
                [1] MRC Human Genetics Unit, University of Edinburgh, Institute of Genetics and Molecular Medicine, Western General Hospital, Crewe Road, Edinburgh, EH4 2XU, UK
                [2] Centre for Genomic and Experimental Medicine, University of Edinburgh, Institute of Genetics and Molecular Medicine, Western General Hospital, Edinburgh, UK
                [3] Edinburgh Clinical Research Facility, University of Edinburgh, Edinburgh, UK (ISNI 0000 0004 1936 7988, GRID grid.4305.2)
                [4] Division of Psychiatry, University of Edinburgh, Royal Edinburgh Hospital, Edinburgh, UK (ISNI 0000 0004 1936 7988, GRID grid.4305.2)
                [5] Farr Institute of Health Informatics Research, Edinburgh, UK
                [6] Centre for Cognitive Ageing and Cognitive Epidemiology, Department of Psychology, University of Edinburgh, Edinburgh, UK (ISNI 0000 0004 1936 7988, GRID grid.4305.2)
                [7] Division of Applied Health Sciences, University of Aberdeen, Aberdeen, UK (ISNI 0000 0004 1936 7291, GRID grid.7107.1)
                [8] Division of Cardiovascular and Medical Sciences, University of Glasgow, Glasgow, UK (ISNI 0000 0001 2193 314X, GRID grid.8756.c)
                [9] Medical Research Institute, University of Dundee, Dundee, UK (ISNI 0000 0004 0397 2876, GRID grid.8241.f)
                [10] Usher Institute of Population Health Sciences and Informatics, University of Edinburgh, Edinburgh, EH8 9AG, UK (ISNI 0000 0004 1936 7988, GRID grid.4305.2)
                Author information
                http://orcid.org/0000-0002-9405-9550
                Article
                414
                10.1186/s13073-017-0414-4
                5339960
                28270201
                3281be39-43e7-4be7-9846-8a4bd62e1229
                © The Author(s). 2017

                Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License ( http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

                History
                Received: 17 August 2016
                Accepted: 9 February 2017
                Funding
                Funded by: FundRef http://dx.doi.org/10.13039/501100000265, Medical Research Council;
                Award ID: Core Funding MRCHGU
                Funded by: FundRef http://dx.doi.org/10.13039/501100000589, Chief Scientist Office;
                Award ID: CZD/16/6
                Funded by: FundRef http://dx.doi.org/10.13039/100004440, Wellcome Trust;
                Award ID: 104036/Z/14/Z
                Categories
                Research
                Molecular medicine
                Keywords: genome-wide association studies (GWAS), electronic health records, imputation, quantitative trait, genetics, urate, heart rate, glucose, haplotype research consortium (HRC)
