      Graph-based data selection for the construction of genomic prediction models.

      Genetics
      Genetics Society of America

          Abstract

          Efficient genomic selection in animals or crops requires the accurate prediction of the agronomic performance of individuals from their high-density molecular marker profiles. Using a training data set that contains the genotypic and phenotypic information of a large number of individuals, each marker or marker allele is associated with an estimated effect on the trait under study. These estimated marker effects are subsequently used for making predictions on individuals for which no phenotypic records are available. As most plant and animal breeding programs are currently still phenotype driven, the continuously expanding collection of phenotypic records can only be used to construct a genomic prediction model if a dense molecular marker fingerprint is available for each phenotyped individual. However, as the genotyping budget is generally limited, the genomic prediction model can only be constructed using a subset of the tested individuals and possibly a genome-covering subset of the molecular markers. In this article, we demonstrate how an optimal selection of individuals can be made with respect to the quality of their available phenotypic data. We also demonstrate how the total number of molecular markers can be reduced while a maximum genome coverage is ensured. The third selection problem we tackle is specific to the construction of a genomic prediction model for a hybrid breeding program where only molecular marker fingerprints of the homozygous parents are available. We show how to identify the set of parental inbred lines of a predefined size that has produced the highest number of progeny. These three selection approaches are put into practice in a simulation study where we demonstrate how the trade-off between sample size and sample quality affects the prediction accuracy of genomic prediction models for hybrid maize.
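The third selection problem in the abstract, choosing a fixed number of parental inbred lines so that as many of the tested hybrids as possible have both parents in the chosen set, can be read as a graph problem: parents are nodes and each hybrid is an edge between its two parents. The sketch below illustrates that reading with a simple greedy peeling heuristic, which repeatedly drops the parent that contributes the fewest hybrids. This heuristic is an assumption chosen for illustration and is not necessarily the algorithm used in the article; the function names `select_parents` and `count_progeny` and the toy data are hypothetical.

```python
from collections import defaultdict


def select_parents(hybrids, k):
    """Greedy heuristic: keep k parents that retain many hybrids.

    hybrids: iterable of (parent_a, parent_b) pairs, one per tested hybrid,
             with parent_a != parent_b.
    k:       number of parental lines that may be genotyped.
    Returns a set of k parent names. The result is not guaranteed optimal.
    """
    # Weighted adjacency: number of hybrids recorded for each parent pair.
    adj = defaultdict(lambda: defaultdict(int))
    for a, b in hybrids:
        adj[a][b] += 1
        adj[b][a] += 1

    selected = set(adj)
    if k > len(selected):
        raise ValueError("k exceeds the number of distinct parents")

    # Number of hybrids each parent shares with the other selected parents.
    degree = {p: sum(adj[p].values()) for p in selected}

    while len(selected) > k:
        # Drop the parent with the fewest remaining hybrids
        # (ties broken by name so the result is deterministic).
        worst = min(selected, key=lambda p: (degree[p], p))
        selected.remove(worst)
        for partner, n_hybrids in adj[worst].items():
            if partner in selected:
                degree[partner] -= n_hybrids
    return selected


def count_progeny(hybrids, parents):
    """Count hybrids whose two parents are both in the selected set."""
    return sum(1 for a, b in hybrids if a in parents and b in parents)


# Toy example (hypothetical data): six inbred lines, five hybrids.
hybrids = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("E", "F")]
parents = select_parents(hybrids, k=3)
print(sorted(parents), count_progeny(hybrids, parents))  # ['A', 'B', 'C'] 3
```

With a budget of three genotyped parents, the heuristic keeps lines A, B and C, so three of the five hybrids remain usable for training. Greedy peeling is fast but can miss the best subset on larger graphs, so it is only a starting point for this kind of selection.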

                Author and article information

                Journal: Genetics
                Publisher: Genetics Society of America
                ISSN: 1943-2631, 0016-6731
                Published: Aug 2010
                Volume: 185
                Issue: 4
                Affiliations
                [1] Department of Biosciences and Landscape Architecture, University College Ghent, B-9000 Gent, Belgium. steven.maenhout@hogent.be
                Article
                Publisher ID: genetics.110.116426
                DOI: 10.1534/genetics.110.116426
                PMC ID: 2927770
                PubMed ID: 20479144