A Bayesian Genomic Multi-output Regressor Stacking Model for Predicting Multi-trait Multi-environment Plant Breeding Data

Abstract

In this paper we propose a Bayesian multi-output regressor stacking (BMORS) model that generalizes the multi-trait regressor stacking method. The proposed BMORS model consists of two stages. In the first stage, a univariate genomic best linear unbiased prediction (GBLUP) model that includes the genotype × environment (GE) interaction is fitted to each of the L traits under study. In the second stage, the predictions of all traits are included as covariates in a Ridge regression model. The main objectives of this research were to study alternatives to the existing Bayesian multi-trait multi-environment (BMTME) model with respect to (1) genomic-enabled prediction accuracy, and (2) potential advantages in terms of computing resources and implementation. We compared the predictions of the BMORS model to those of the univariate GBLUP model and the BMTME model using 7 maize and wheat datasets. The proposed BMORS model produced prediction accuracies similar to those of the univariate GBLUP model and the BMTME model; however, the best predictions were obtained under the BMTME model. In terms of computing resources, the BMORS model was at least 9 times faster than the BMTME method. Based on our empirical findings, the proposed BMORS model is an alternative for predicting multi-trait and multi-environment data, which are very common in genomic-enabled prediction in plant and animal breeding programs.
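The two-stage idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' Bayesian implementation: it assumes a kernel ridge regression as a stand-in for the univariate GBLUP, omits the GE interaction term, uses fixed (hypothetical) penalties `lam1` and `lam2` instead of estimated variance components, and runs on simulated markers. All function names are illustrative.

```python
import numpy as np

def gblup_fit_predict(K, y, train, lam=1.0):
    """Kernel ridge stand-in for a univariate GBLUP (assumption, no GE term).

    Fits on the training lines and returns predictions for every line.
    """
    Ktt = K[np.ix_(train, train)]
    mu = y[train].mean()
    alpha = np.linalg.solve(Ktt + lam * np.eye(len(train)), y[train] - mu)
    return mu + K[:, train] @ alpha

def ridge_fit(Z, y, lam=1.0):
    """Ridge regression with an unpenalised intercept; returns (intercept, coefficients)."""
    z_mean, y_mean = Z.mean(axis=0), y.mean()
    Zc = Z - z_mean
    beta = np.linalg.solve(Zc.T @ Zc + lam * np.eye(Z.shape[1]), Zc.T @ (y - y_mean))
    return y_mean - z_mean @ beta, beta

def stack_predict(K, Y, train, test, lam1=1.0, lam2=1.0):
    """Two-stage stacking: per-trait predictions, then ridge on all of them."""
    L = Y.shape[1]
    # Stage 1: one univariate model per trait, trained on the training lines only.
    Z = np.column_stack([gblup_fit_predict(K, Y[:, l], train, lam1) for l in range(L)])
    # Stage 2: each trait is regressed on the stage-1 predictions of all traits.
    preds = np.empty((len(test), L))
    for l in range(L):
        b0, beta = ridge_fit(Z[train], Y[train, l], lam2)
        preds[:, l] = b0 + Z[test] @ beta
    return preds

# Simulated example (hypothetical sizes): n lines, p markers, L traits.
rng = np.random.default_rng(0)
n, p, L = 120, 300, 2
X = rng.standard_normal((n, p))
K = X @ X.T / p                                   # genomic relationship matrix
effects = rng.standard_normal((p, L))
Y = X @ effects / np.sqrt(p) + 0.5 * rng.standard_normal((n, L))

idx = rng.permutation(n)
train_idx, test_idx = idx[:96], idx[96:]          # test phenotypes are never used for fitting
preds = stack_predict(K, Y, train_idx, test_idx)

for l in range(L):
    r = np.corrcoef(preds[:, l], Y[test_idx, l])[0, 1]
    print(f"trait {l}: test correlation = {r:.3f}")
```

Because the second stage is only a small regression on L covariates per trait, its cost is minor next to fitting a joint multi-trait model, which is consistent with the computing advantage the abstract reports for BMORS.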

Most cited references 16

Solving Least Squares Problems

Charles Lawson, Richard Hanson (1995)

Accuracy of multi-trait genomic selection using different methods

Mario Calus, Roel Veerkamp (2011)

Background: Genomic selection has become a very important tool in animal genetics and is rapidly emerging in plant genetics. It holds the promise to be particularly beneficial to select for traits that are difficult or expensive to measure, such as traits that are measured in one environment and selected for in another environment. The objective of this paper was to develop three models that would permit multi-trait genomic selection by combining scarcely recorded traits with genetically correlated indicator traits, and to compare their performance to single-trait models, using simulated datasets.

Methods: Three single nucleotide polymorphism (SNP) based models were used. Model G and BCπ0 assumed that contributed (co)variances of all SNP are equal. Model BSSVS sampled SNP effects from a distribution with large (or small) effects to model SNP that are (or not) associated with a quantitative trait locus. For reasons of comparison, model A including pedigree but not SNP information was fitted as well.

Results: In terms of accuracies for animals without phenotypes, the models generally ranked as follows: BSSVS > BCπ0 > G >> A. Using multi-trait SNP-based models, the accuracy for juvenile animals without any phenotypes increased up to 0.10. For animals with phenotypes on an indicator trait only, accuracy increased up to 0.03 and 0.14, for genetic correlations with the evaluated trait of 0.25 and 0.75, respectively.

Conclusions: When the indicator trait had a genetic correlation lower than 0.5 with the trait of interest in our simulated data, the accuracy was higher if genotypes rather than phenotypes were obtained for the indicator trait. However, when genetic correlations were higher than 0.5, using an indicator trait led to higher accuracies for selection candidates. For different combinations of traits, the level of genetic correlation below which genotyping selection candidates is more effective than obtaining phenotypes for an indicator trait needs to be derived considering at least the heritabilities and the numbers of animals recorded for the traits involved.

Genomic Prediction of Gene Bank Wheat Landraces

Jose Crossa, Diego Jarquín, Jorge Franco … (2016)

This study examines genomic prediction within 8416 Mexican landrace accessions and 2403 Iranian landrace accessions stored in gene banks. The Mexican and Iranian collections were evaluated in separate field trials, including an optimum environment for several traits, and in two separate environments (drought, D and heat, H) for the highly heritable traits, days to heading (DTH), and days to maturity (DTM). Analyses accounting and not accounting for population structure were performed. Genomic prediction models include genotype × environment interaction (G × E). Two alternative prediction strategies were studied: (1) random cross-validation of the data in 20% training (TRN) and 80% testing (TST) (TRN20-TST80) sets, and (2) two types of core sets, “diversity” and “prediction”, including 10% and 20%, respectively, of the total collections. Accounting for population structure decreased prediction accuracy by 15–20% as compared to prediction accuracy obtained when not accounting for population structure. Accounting for population structure gave prediction accuracies for traits evaluated in one environment for TRN20-TST80 that ranged from 0.407 to 0.677 for Mexican landraces, and from 0.166 to 0.662 for Iranian landraces. Prediction accuracy of the 20% diversity core set was similar to accuracies obtained for TRN20-TST80, ranging from 0.412 to 0.654 for Mexican landraces, and from 0.182 to 0.647 for Iranian landraces. The predictive core set gave similar prediction accuracy as the diversity core set for Mexican collections, but slightly lower for Iranian collections. Prediction accuracy when incorporating G × E for DTH and DTM for Mexican landraces for TRN20-TST80 was around 0.60, which is greater than without the G × E term. For Iranian landraces, accuracies were 0.55 for the G × E model with TRN20-TST80. Results show promising prediction accuracies for potential use in germplasm enhancement and rapid introgression of exotic germplasm into elite materials.

Author and article information

Journal

Title: G3: Genes|Genomes|Genetics

Abbreviated Title: G3

Publisher: Genetics Society of America

ISSN (Electronic): 2160-1836

Publication date Created: October 07 2019

Publication date (Print): October 2019

Publication date (Electronic): August 19 2019

Volume: 9

Issue: 10

Pages: 3381-3393

Article

DOI: 10.1534/g3.119.400336

SO-VID: 3f307e90-2a65-4512-a21d-439ee1932753

ScienceOpen disciplines: Evolutionary Biology, Quantitative & Systems biology, Developmental biology, Molecular biology, Bioinformatics & Computational biology, Genetics
