
      Estimating linkage disequilibrium from genotypes under Hardy-Weinberg equilibrium

      research-article
      BMC Genetics
      BioMed Central
      Linkage disequilibrium, Maximum likelihood estimation, Sampling error


          Abstract

          Background

          Measures of linkage disequilibrium (LD) play a key role in a wide range of applications, from disease association to demographic history estimation. The true population LD cannot be measured directly; it can only be inferred from genetic samples, which are unavoidably subject to measurement error. Previous results on $r^2$ (a measure of LD), such as its bias due to finite sample size and its variance, were derived for the special case in which the true population-wise LD is zero. These results generally do not hold for non-zero $r_{true}^2$ values, which are more common in real genetic data.
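For readers unfamiliar with the statistic, the following minimal sketch computes the squared correlation between two biallelic loci from the four haplotype frequencies, using the standard textbook definition. The abstract itself does not spell the definition out, so the function name `r_squared` and its argument names are illustrative assumptions rather than anything taken from the article.

```python
def r_squared(f_AB, f_Ab, f_aB, f_ab):
    """Squared correlation r^2 between two biallelic loci.

    Arguments are the frequencies of haplotypes AB, Ab, aB and ab
    (hypothetical names); they are assumed to sum to 1.
    """
    p_A = f_AB + f_Ab          # frequency of allele A at the first locus
    p_B = f_AB + f_aB          # frequency of allele B at the second locus
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("r^2 is undefined when a locus is monomorphic")
    d = f_AB - p_A * p_B       # disequilibrium coefficient D
    return d * d / denom


# Independent loci give 0; complete association gives 1.
print(r_squared(0.25, 0.25, 0.25, 0.25))
print(r_squared(0.5, 0.0, 0.0, 0.5))
```

When the frequencies come from a finite sample rather than from the population, the value returned is the observed $r^2$, which is the quantity whose bias and variance the abstract discusses.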

          Results

          This work generalises the estimation of $r^2$ to all levels of LD, for both phased and unphased data. First, we provide new formulae for the effect of finite sample size on the observed $r^2$ values. Second, we find a new empirical formula for the variance of the observed $r^2$, equal to $2E[r^2](1 - E[r^2])/n$, where $n$ is the diploid sample size. Third, we propose a new routine, Constrained ML, a likelihood-based method to directly estimate haplotype frequencies and $r^2$ from diploid genotypes under Hardy-Weinberg Equilibrium. While serving the same purpose as the pre-existing Expectation-Maximisation algorithm, the new routine can have better convergence and is simpler to use. A new likelihood-ratio test is also introduced to test for the absence of a particular haplotype. Extensive simulations are run to support these findings.
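The idea of estimating haplotype frequencies directly from unphased genotypes can be illustrated with a generic likelihood sketch. This is not the authors' Constrained ML routine, whose details are not given in the abstract. It assumes two biallelic loci, fixes the allele frequencies at their sample values (an assumption made here for simplicity), and searches a grid over the one remaining free parameter, the frequency of haplotype AB, within its valid range. All function names are hypothetical. The last function simply evaluates the variance formula quoted above.

```python
import math


def genotype_probs(f_AB, p_A, p_B):
    """3x3 genotype probabilities under Hardy-Weinberg equilibrium.

    Entry [i][j] is the probability of carrying i copies of allele A
    and j copies of allele B.
    """
    f_Ab = p_A - f_AB
    f_aB = p_B - f_AB
    f_ab = 1.0 - p_A - p_B + f_AB
    return [
        [f_ab ** 2, 2 * f_aB * f_ab, f_aB ** 2],
        # Double heterozygotes can be AB/ab or Ab/aB, hence the sum.
        [2 * f_Ab * f_ab, 2 * (f_AB * f_ab + f_Ab * f_aB), 2 * f_AB * f_aB],
        [f_Ab ** 2, 2 * f_AB * f_Ab, f_AB ** 2],
    ]


def log_likelihood(f_AB, counts, p_A, p_B):
    """Multinomial log-likelihood of the genotype counts (up to a constant)."""
    probs = genotype_probs(f_AB, p_A, p_B)
    total = 0.0
    for i in range(3):
        for j in range(3):
            if counts[i][j] == 0:
                continue
            if probs[i][j] <= 0.0:
                return -math.inf   # observed genotype is impossible here
            total += counts[i][j] * math.log(probs[i][j])
    return total


def estimate_r2(counts, steps=2000):
    """Grid-search ML estimate of f_AB and r^2 from a 3x3 genotype table."""
    n = sum(sum(row) for row in counts)
    p_A = (2 * sum(counts[2]) + sum(counts[1])) / (2 * n)
    p_B = (2 * sum(r[2] for r in counts) + sum(r[1] for r in counts)) / (2 * n)
    if not (0 < p_A < 1 and 0 < p_B < 1):
        raise ValueError("both loci must be polymorphic")
    # Valid range of f_AB so that all four haplotype frequencies are >= 0.
    lo, hi = max(0.0, p_A + p_B - 1.0), min(p_A, p_B)
    grid = [lo + (hi - lo) * k / steps for k in range(steps + 1)]
    f_AB = max(grid, key=lambda x: log_likelihood(x, counts, p_A, p_B))
    d = f_AB - p_A * p_B
    return f_AB, d * d / (p_A * (1 - p_A) * p_B * (1 - p_B)), n


def r2_variance(expected_r2, n):
    """Variance formula from the abstract: 2 E[r^2] (1 - E[r^2]) / n."""
    return 2 * expected_r2 * (1 - expected_r2) / n


# Rows: copies of A (0, 1, 2); columns: copies of B (0, 1, 2).
counts = [[25, 0, 0], [0, 50, 0], [0, 0, 25]]
f_AB, r2, n = estimate_r2(counts)
print(f_AB, r2, n)
print(r2_variance(0.5, 100))
```

A grid search is used because the likelihood can have more than one local maximum in the haplotype frequency, and an exhaustive scan of a single bounded parameter avoids dependence on starting values. Note that the variance formula is stated in terms of $E[r^2]$; substituting an estimated $r^2$ for that expectation would be a further approximation not described in the abstract.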

          Conclusion

          Most inferences on LD will benefit from our new findings, from point and interval estimation to hypothesis testing. Genetic analyses utilising $r^2$ information will become more accurate as a result.


                Author and article information

                Contributors
                tin-yu.hui11@imperial.ac.uk
                a.burt@imperial.ac.uk
                Journal
                BMC Genet
                BMC Genetics
                BioMed Central (London )
                1471-2156
                26 February 2020
                : 21
                : 21
                Affiliations
                ISNI 0000 0001 2113 8111, GRID grid.7445.2, Department of Life Sciences, Silwood Park Campus, Imperial College London, Ascot, Berkshire SL5 7PY, UK
                Author information
                http://orcid.org/0000-0002-1702-803X
                Article
                818
                10.1186/s12863-020-0818-9
                7045472
                32102657
                © The Author(s). 2020

                Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

                History
                : 3 August 2019
                : 29 January 2020
                Funding
                Funded by: FundRef http://dx.doi.org/10.13039/100000865, Bill and Melinda Gates Foundation;
                Categories
                Research Article

                Genetics
                linkage disequilibrium, maximum likelihood estimation, sampling error
