
      Study of the optimum haplotype length to build genomic relationship matrices

      Genetics, Selection, Evolution : GSE
      BioMed Central


          Abstract

          Background

          As genomic data becomes more abundant, genomic prediction is more routinely used to estimate breeding values. In genomic prediction, the relationship matrix ($\mathbf{A}$), which is traditionally used in genetic evaluations, is replaced by the genomic relationship matrix ($\mathbf{G}$). This paper considers alternative ways of building relationship matrices either using single markers or haplotypes of different lengths. We compared the prediction accuracies and log-likelihoods when using these alternative relationship matrices and the traditional $\mathbf{G}$ matrix, for real and simulated data.
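To make the idea of a marker-based relationship matrix concrete, here is a minimal sketch of one widely used single-marker construction (VanRaden's first method), in which centred allele counts are cross-multiplied and scaled by the sum of 2p(1-p) over SNPs. The abstract does not state which formula the authors used, so this choice, the function name `vanraden_g` and the toy genotypes are all assumptions made for illustration.

```python
def vanraden_g(genotypes):
    """Genomic relationship matrix from 0/1/2 allele counts.

    genotypes: list of rows, one per animal, one column per SNP.
    Assumption: allele frequencies are estimated from the same sample.
    """
    n = len(genotypes)
    m = len(genotypes[0])
    # Frequency of the counted allele at each SNP.
    freqs = [sum(row[j] for row in genotypes) / (2.0 * n) for j in range(m)]
    # Centre each allele count by its expected value 2p.
    z = [[row[j] - 2.0 * freqs[j] for j in range(m)] for row in genotypes]
    # Scaling factor: sum of 2p(1-p) over all SNPs.
    scale = sum(2.0 * p * (1.0 - p) for p in freqs)
    if scale == 0.0:
        raise ValueError("all SNPs are monomorphic")
    return [[sum(z[i][j] * z[k][j] for j in range(m)) / scale
             for k in range(n)]
            for i in range(n)]


# Toy data (hypothetical): 3 animals genotyped at 4 SNPs.
genotypes = [
    [0, 1, 2, 1],
    [1, 1, 0, 2],
    [2, 0, 1, 1],
]
G = vanraden_g(genotypes)
for row in G:
    print([round(value, 3) for value in row])
```

Because the allele counts are centred with frequencies from the same animals, each row of the resulting matrix sums to zero, which is a quick sanity check on any implementation of this construction.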

          Methods

          For real data, we built relationship matrices using 50k genotype data for a population of Brahman cattle to analyze three traits: scrotal circumference (SC), age at puberty (AGECL) and weight at first corpus luteum (WTCL). Haplotypes were phased with hsphase and imputed with BEAGLE. The relationship matrices were built using three methods based on haplotypes of different lengths. The log-likelihood was used to define the optimum haplotype length for each trait and each haplotype-based relationship matrix.
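The abstract does not describe the three haplotype-based methods, so the following is only a hypothetical sketch of the general idea: phased haplotypes are cut into non-overlapping blocks of a chosen number of SNPs, and two animals are considered related at a block to the extent that their haplotypes are identical there. The functions `haplotype_blocks` and `haplotype_relationship`, the sharing rule and the toy data are assumptions, not the authors' implementation.

```python
def haplotype_blocks(haplotype, block_len):
    """Split one phased haplotype (sequence of 0/1 alleles) into
    non-overlapping blocks of block_len SNPs (a shorter last block is kept)."""
    return [tuple(haplotype[start:start + block_len])
            for start in range(0, len(haplotype), block_len)]


def haplotype_relationship(phased, block_len):
    """Illustrative haplotype-sharing matrix (hypothetical rule).

    phased: list of (paternal, maternal) haplotype pairs, one per animal.
    Entry (i, k) is twice the mean, over blocks and over the four
    haplotype pairings between animals i and k, of the indicator that
    the two haplotypes carry identical alleles across the whole block.
    """
    blocks = [(haplotype_blocks(pat, block_len),
               haplotype_blocks(mat, block_len))
              for pat, mat in phased]
    n = len(phased)
    n_blocks = len(blocks[0][0])
    rel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for k in range(i, n):
            # Count identical blocks over the four haplotype pairings.
            matches = sum(hap_i[b] == hap_k[b]
                          for b in range(n_blocks)
                          for hap_i in blocks[i]
                          for hap_k in blocks[k])
            rel[i][k] = rel[k][i] = matches / (2.0 * n_blocks)
    return rel


# Toy phased data (hypothetical): 3 animals, 6 SNPs each.
phased = [
    ((0, 1, 1, 0, 1, 0), (1, 1, 0, 0, 1, 1)),
    ((0, 1, 1, 0, 0, 0), (0, 0, 1, 1, 1, 1)),
    ((1, 1, 0, 0, 1, 1), (1, 1, 0, 0, 1, 1)),  # homozygous at every SNP
]
results = {length: haplotype_relationship(phased, length)
           for length in (1, 2, 3)}
for length, rel in results.items():
    print(length, [[round(value, 2) for value in row] for row in rel])
```

With blocks of equal size, a longer block can only keep or lower a sharing coefficient, because two haplotypes match over a block only if they match at every SNP in it. In a study like the one described, a matrix would be built for each candidate length and the length giving the highest log-likelihood for the trait would be retained.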

          Results

          Based on simulated data, we showed that the inverse of the $\mathbf{G}$ matrix and the inverse of the haplotype relationship matrices for methods using phased haplotypes of one single nucleotide polymorphism (SNP) provided coefficients of determination ($R^2$) close to 1, although the estimated genetic variances differed across methods. Using real data and multiple SNPs in the haplotype segments to build the relationship matrices provided better results than the $\mathbf{G}$ matrix based on one-SNP haplotypes. However, the optimal haplotype length to achieve the highest log-likelihood depended on the method used and the trait. The optimal haplotype length (7 to 8 SNPs) was similar for SC and AGECL. One of the haplotype-based methods achieved the largest increase in log-likelihood for SC, i.e. from −1330 when using $\mathbf{G}$ to −1325 when using haplotypes with eight SNPs.
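For reference, the size of the reported improvement for SC follows directly from the two log-likelihood values quoted above. The symbols $\log L_{\mathbf{G}}$ and $\log L_{\text{hap}}$ are introduced here only for illustration and do not appear in the abstract.

$$\Delta \log L = \log L_{\text{hap}} - \log L_{\mathbf{G}} = -1325 - (-1330) = 5$$

That is, the eight-SNP haplotype matrix raised the log-likelihood for SC by 5 units relative to the $\mathbf{G}$ matrix.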

          Conclusions

          Building the relationship matrix by using haplotypes that comprise multiple SNPs will increase the accuracy of estimated breeding values. However, the optimum haplotype length that shows the correct relationship among individuals for each trait can be derived from the data.


                Author and article information

                Contributors
                mferdosi@myune.edu.au
                John.Henshall@cobb-vantress.com
                btier@une.edu.au
                Journal
                Genetics, Selection, Evolution : GSE (Genet Sel Evol)
                BioMed Central (London)
                ISSN: 0999-193X, 1297-9686
                29 September 2016
                2016; 48: 75
                Affiliations
                [1 ]The Centre for Genetic Analysis and Applications, School of Environmental and Rural Science, University of New England, Armidale, Australia
                [2 ]Animal Genetics and Breeding Unit, University of New England, Armidale, Australia
                [3 ]Cobb-Vantress, Siloam Springs, AR USA
                [4 ]CSIRO Agriculture Flagship, FD McMaster Laboratory Chiswick, Armidale, Australia
                Article
                253
                10.1186/s12711-016-0253-6
                5043651
                27687320
                © The Author(s) 2016

                Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

                History
                21 June 2015
                15 September 2016
                Categories
                Research Article
                Genetics