      Is Open Access

      Is population structure in the genetic biobank era irrelevant, a challenge, or an opportunity?

      research-article


          Abstract

          Replicable genetic association signals have consistently been found through genome-wide association studies in recent years. The recent dramatic expansion of study sizes improves power of estimation of effect sizes, genomic prediction, causal inference, and polygenic selection, but it simultaneously increases susceptibility of these methods to bias due to subtle population structure. Standard methods using genetic principal components to correct for structure might not always be appropriate and we use a simulation study to illustrate when correction might be ineffective for avoiding biases. New methods such as trans-ethnic modeling and chromosome painting allow for a richer understanding of the relationship between traits and population structure. We illustrate the arguments using real examples (stroke and educational attainment) and provide a more nuanced understanding of population structure, which is set to be revisited as a critical aspect of future analyses in genetic epidemiology. We also make simple recommendations for how problems can be avoided in the future. Our results have particular importance for the implementation of GWAS meta-analysis, for prediction of traits, and for causal inference.
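          The abstract's point about principal component correction can be illustrated with a minimal sketch. This is not the authors' simulation study: it assumes the simplest case, two discrete populations under a Balding-Nichols allele-frequency model, a trait that differs between populations for purely non-genetic reasons, and no causal variants at all. All function names and parameter values (`fst`, `env_shift`, sample sizes) are hypothetical choices for illustration. The genomic inflation factor is computed with and without the top genetic principal component as a covariate.

```python
import numpy as np

CHI2_1DF_MEDIAN = 0.4549364  # median of a chi-square distribution with 1 d.f.


def simulate(n_per_pop=500, n_snps=2000, fst=0.05, env_shift=1.0, seed=0):
    """Two populations, null SNPs, and a trait shifted by population only."""
    rng = np.random.default_rng(seed)
    anc = rng.uniform(0.1, 0.9, n_snps)  # ancestral allele frequencies
    # Balding-Nichols model: population frequencies drift around the ancestral ones.
    a = anc * (1 - fst) / fst
    b = (1 - anc) * (1 - fst) / fst
    freqs = rng.beta(a, b, size=(2, n_snps))
    pop = np.repeat([0, 1], n_per_pop)
    geno = rng.binomial(2, freqs[pop]).astype(float)
    geno = geno[:, geno.std(axis=0) > 0]  # drop monomorphic SNPs
    # The trait depends on population label (environment), not on any SNP.
    pheno = env_shift * pop + rng.normal(size=pop.size)
    return geno, pheno


def top_pcs(geno, k):
    """Top k principal components of the standardized genotype matrix."""
    z = (geno - geno.mean(axis=0)) / geno.std(axis=0)
    u, _, _ = np.linalg.svd(z, full_matrices=False)
    return u[:, :k]


def residualize(x, covars):
    """Remove the least-squares fit of covars from x (vector or matrix)."""
    beta, *_ = np.linalg.lstsq(covars, x, rcond=None)
    return x - covars @ beta


def lambda_gc(geno, pheno, covars):
    """Genomic inflation factor from per-SNP tests adjusted for covars."""
    g = residualize(geno, covars)
    y = residualize(pheno, covars)
    r = (g * y[:, None]).sum(axis=0) / np.sqrt((g ** 2).sum(axis=0) * (y ** 2).sum())
    dof = geno.shape[0] - covars.shape[1] - 1
    chi2 = dof * r ** 2 / (1 - r ** 2)  # squared t-statistic per SNP
    return np.median(chi2) / CHI2_1DF_MEDIAN


def run_demo():
    geno, pheno = simulate()
    intercept = np.ones((geno.shape[0], 1))
    naive = lambda_gc(geno, pheno, intercept)
    with_pc = np.hstack([intercept, top_pcs(geno, 1)])
    adjusted = lambda_gc(geno, pheno, with_pc)
    return naive, adjusted


if __name__ == "__main__":
    naive, adjusted = run_demo()
    print(f"lambda_GC without PCs: {naive:.2f}")
    print(f"lambda_GC with 1 PC:   {adjusted:.2f}")
```

          In this deliberately easy setting the first principal component captures the population split, so adjusting for it should bring the inflation factor back towards 1. The abstract's caution concerns subtler structure, which a sketch this simple does not reproduce.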

          Most cited references (117)


          A global reference for human genetic variation

          The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping. We characterized a broad spectrum of genetic variation, in total over 88 million variants (84.7 million single nucleotide polymorphisms (SNPs), 3.6 million short insertions/deletions (indels), and 60,000 structural variants), all phased onto high-quality haplotypes. This resource includes >99% of SNP variants with a frequency of >1% for a variety of ancestries. We describe the distribution of genetic variation across the global sample, and discuss the implications for common disease studies.

            UK Biobank: An Open Access Resource for Identifying the Causes of a Wide Range of Complex Diseases of Middle and Old Age

            Cathie Sudlow and colleagues describe the UK Biobank, a large population-based prospective study, established to allow investigation of the genetic and non-genetic determinants of the diseases of middle and old age.

              GCTA: a tool for genome-wide complex trait analysis.

              For most human complex diseases and traits, SNPs identified by genome-wide association studies (GWAS) explain only a small fraction of the heritability. Here we report a user-friendly software tool called genome-wide complex trait analysis (GCTA), which was developed based on a method we recently developed to address the "missing heritability" problem. GCTA estimates the variance explained by all the SNPs on a chromosome or on the whole genome for a complex trait rather than testing the association of any particular SNP to the trait. We introduce GCTA's five main functions: data management, estimation of the genetic relationships from SNPs, mixed linear model analysis of variance explained by the SNPs, estimation of the linkage disequilibrium structure, and GWAS simulation. We focus on the function of estimating the variance explained by all the SNPs on the X chromosome and testing the hypotheses of dosage compensation. The GCTA software is a versatile tool to estimate and partition complex trait variation with large GWAS data sets.

                Author and article information

                Contributors
                dan.lawson@bristol.ac.uk
                Journal
                Human Genetics (Hum Genet)
                Springer Berlin Heidelberg (Berlin/Heidelberg)
                ISSN: 0340-6717, 1432-1203
                27 April 2019
                2020; 139(1): 23-41
                Affiliations
                [1] MRC Integrative Epidemiology Unit, Population Health Sciences, Bristol Medical School, University of Bristol, Oakfield House, Oakfield Grove, Bristol, BS8 2BN, UK (GRID grid.5337.2, ISNI 0000 0004 1936 7603)
                [2] Institute of Cardiovascular Science, Faculty of Population Health Sciences, University College London, Gower Street, London, WC1E 6BT, UK (GRID grid.83440.3b, ISNI 0000 0001 2190 1201)
                Author information
                http://orcid.org/0000-0002-5311-6213
                Article
                2014
                DOI: 10.1007/s00439-019-02014-8
                PMC ID: 6942007
                PMID: 31030318
                76e22bfb-5b6c-43be-857a-d2892221e9c3
                © The Author(s) 2019

                Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

                History
                Received: 24 November 2018
                Accepted: 12 April 2019
                Funding
                Funded by: Wellcome Trust (GB)
                Award ID: WT104125MA
                Funded by: Medical Research Council (FundRef http://dx.doi.org/10.13039/501100000265)
                Award ID: MC_UU_00011/1
                Categories
                Review
                Custom metadata
                © Springer-Verlag GmbH Germany, part of Springer Nature 2020

                Genetics
