      Is Open Access

      Resource Allocation for Maximizing Prediction Accuracy and Genetic Gain of Genomic Selection in Plant Breeding: A Simulation Experiment

      research-article
      G3: Genes|Genomes|Genetics
      Genetics Society of America
      genomic selection, plant breeding, GenPred, Shared data resources


          Abstract

          Allocating resources between population size and replication affects both genetic gain through phenotypic selection and quantitative trait loci detection power and effect estimation accuracy for marker-assisted selection (MAS). It is well known that because alleles are replicated across individuals in quantitative trait loci mapping and MAS, more resources should be allocated to increasing population size compared with phenotypic selection. Genomic selection is a form of MAS using all marker information simultaneously to predict individual genetic values for complex traits and has widely been found superior to MAS. No studies have explicitly investigated how resource allocation decisions affect success of genomic selection. My objective was to study the effect of resource allocation on response to MAS and genomic selection in a single biparental population of doubled haploid lines by using computer simulation. Simulation results were compared with previously derived formulas for the calculation of prediction accuracy under different levels of heritability and population size. Response of prediction accuracy to resource allocation strategies differed between genomic selection models (ridge regression best linear unbiased prediction [RR-BLUP], BayesCπ) and multiple linear regression using ordinary least-squares estimation (OLS), leading to different optimal resource allocation choices between OLS and RR-BLUP. For OLS, it was always advantageous to maximize population size at the expense of replication, but a high degree of flexibility was observed for RR-BLUP. Prediction accuracy of doubled haploid lines included in the training set was much greater than of those excluded from the training set, so there was little benefit to phenotyping only a subset of the lines genotyped. Finally, observed prediction accuracies in the simulation compared well to calculated prediction accuracies, indicating these theoretical formulas are useful for making resource allocation decisions.
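
          The trade-off between population size and replication described in the abstract can be illustrated numerically. The following is a minimal sketch under stated assumptions, not the article's simulation. It assumes a fixed budget of field plots split between N lines and r replications, a line-mean heritability of h² = σ²g / (σ²g + σ²e / r), and, as a stand-in for the "previously derived formulas" (the abstract does not say which ones were used), the commonly cited deterministic approximation r = sqrt(N·h² / (N·h² + Me)), where Me is the number of independent chromosome segments. All function names and parameter values are hypothetical.

```python
import math


def line_mean_heritability(var_g, var_e, reps):
    """Heritability of line means when each line is grown in `reps` plots."""
    return var_g / (var_g + var_e / reps)


def expected_accuracy(n_lines, h2, m_e):
    """Deterministic approximation sqrt(N*h2 / (N*h2 + Me)).

    Assumed here for illustration; the source does not name its formulas.
    """
    return math.sqrt(n_lines * h2 / (n_lines * h2 + m_e))


def allocation_table(total_plots, var_g, var_e, m_e, rep_options=(1, 2, 3, 4, 6)):
    """Expected accuracy for each way of splitting a fixed plot budget."""
    rows = []
    for reps in rep_options:
        n_lines = total_plots // reps  # more replication means fewer lines
        h2 = line_mean_heritability(var_g, var_e, reps)
        rows.append((reps, n_lines, h2, expected_accuracy(n_lines, h2, m_e)))
    return rows


# Hypothetical scenario: 600 plots, plot-level heritability 0.2, Me = 30.
for reps, n_lines, h2, acc in allocation_table(600, var_g=1.0, var_e=4.0, m_e=30):
    print(f"reps={reps}  lines={n_lines}  h2={h2:.3f}  accuracy={acc:.3f}")
```

          Under these assumptions N·h² equals T·σ²g / (r·σ²g + σ²e) for a budget of T plots, which shrinks as r grows, so the approximation always favors a single replication. The abstract reports a high degree of flexibility for RR-BLUP in simulation, so this sketch should not be read as reproducing that result.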


                Author and article information

                Journal
                G3: Genes|Genomes|Genetics (G3 (Bethesda))
                Publisher: Genetics Society of America
                ISSN: 2160-1836
                Published: 1 March 2013
                Volume: 3
                Issue: 3
                Pages: 481-491
                Affiliations
                [1]Department of Agronomy and Horticulture, University of Nebraska, Lincoln, Nebraska 68583
                Author notes

                Supporting information is available online at http://www.g3journal.org/lookup/suppl/doi:10.1534/g3.112.004911/-/DC1

                [1]Corresponding author: 363 Keim Hall, P.O. Box 830915, Lincoln, NE 68583-0915. E-mail: alorenz2@unl.edu
                Article
                Publisher ID: GGG_004911
                DOI: 10.1534/g3.112.004911
                PMCID: PMC3583455
                PMID: 23450123
                Copyright © 2013 Lorenz

                This is an open-access article distributed under the terms of the Creative Commons Attribution Unported License ( http://creativecommons.org/licenses/by/3.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                : 02 November 2012
                : 03 January 2012
                Categories
                Investigations
