A Bayesian Genomic Multi-output Regressor Stacking Model for Predicting Multi-trait Multi-environment Plant Breeding Data

Abstract

In this paper we propose a Bayesian multi-output regressor stacking (BMORS) model that generalizes the multi-trait regressor stacking method. The proposed BMORS model consists of two stages. In the first stage, a univariate genomic best linear unbiased prediction (GBLUP) model that includes the genotype × environment (GE) interaction is fitted for each of the $L$ traits under study. In the second stage, the predictions of all traits are included as covariates in a Ridge regression model. The main objectives of this research were to study alternatives to the existing Bayesian multi-trait multi-environment (BMTME) model with respect to (1) genomic-enabled prediction accuracy and (2) potential advantages in terms of computing resources and implementation. We compared the predictions of the BMORS model to those of the univariate GBLUP model and the BMTME model using 7 maize and wheat datasets. In terms of prediction accuracy, the proposed BMORS produced predictions similar to those of the univariate GBLUP model and the BMTME model; however, the best predictions were obtained under the BMTME model. In terms of computing resources, the BMORS was at least 9 times faster than the BMTME method. Based on our empirical findings, the proposed BMORS model is an alternative for predicting multi-trait and multi-environment data, which are very common in genomic-enabled prediction in plant and animal breeding programs.
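The two-stage structure described in the abstract can be illustrated with a minimal, non-Bayesian sketch. It rests on several assumptions that are not taken from the article: plain closed-form ridge regression on markers stands in for the univariate GBLUP model (with no GE term), the penalties `lam1` and `lam2` are arbitrary fixed values rather than estimated variance components, and the data are simulated. All function names are hypothetical.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge on centered data: solve (X'X + lam*I) b = X'y."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    beta = np.linalg.solve(Xc.T @ Xc + lam * np.eye(X.shape[1]), Xc.T @ yc)
    return beta, y_mean - x_mean @ beta  # coefficients, intercept

def ridge_predict(X, beta, intercept):
    return X @ beta + intercept

def stacking_fit(X, Y, lam1=1.0, lam2=1.0):
    """Stage 1: one single-trait model per trait (ridge as a GBLUP stand-in).
    Stage 2: per trait, ridge on the stage-1 predictions of all traits."""
    n_traits = Y.shape[1]
    stage1 = [ridge_fit(X, Y[:, t], lam1) for t in range(n_traits)]
    Z = np.column_stack([ridge_predict(X, b, a) for b, a in stage1])
    stage2 = [ridge_fit(Z, Y[:, t], lam2) for t in range(n_traits)]
    return stage1, stage2

def stacking_predict(X, stage1, stage2):
    # Stage-1 predictions for new lines become the stage-2 covariates.
    Z = np.column_stack([ridge_predict(X, b, a) for b, a in stage1])
    return np.column_stack([ridge_predict(Z, b, a) for b, a in stage2])

# Simulated data (assumption): 120 lines, 30 markers, 3 traits.
rng = np.random.default_rng(0)
n, p, L = 120, 30, 3
X = rng.normal(size=(n, p))
B = rng.normal(size=(p, L))
Y = X @ B + rng.normal(scale=0.5, size=(n, L))

train, test = np.arange(90), np.arange(90, n)
s1, s2 = stacking_fit(X[train], Y[train])
Y_hat = stacking_predict(X[test], s1, s2)  # one column of predictions per trait
```

In this sketch the second stage sees only the $L$ stage-1 predictions, so it is cheap to fit no matter how many markers enter the first stage. The Bayesian estimation, the GE term, and the cross-validation scheme used in the article are not reproduced here.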


Author and article information

Journal

Journal ID (nlm-ta): G3 (Bethesda)

Journal ID (iso-abbrev): Genetics

Journal ID (hwp): G3: Genes, Genomes, Genetics

Journal ID (pmc): G3: Genes, Genomes, Genetics

Journal ID (publisher-id): G3: Genes, Genomes, Genetics

Title: G3: Genes|Genomes|Genetics

Publisher: Genetics Society of America

ISSN (Electronic): 2160-1836

Publication date (Electronic): 19 August 2019

Publication date Collection: October 2019

Volume: 9

Issue: 10

Pages: 3381-3393

Affiliations

[*] Facultad de Telemática, Universidad de Colima, Colima, Colima, 28040, México

[†] Departamento de Matemáticas, Centro Universitario de Ciencias Exactas e Ingenierías (CUCEI), Universidad de Guadalajara, Guadalajara, Jalisco, 44430, México

[‡] International Maize and Wheat Improvement Center (CIMMYT), Apdo. Postal 6-641, Ciudad de México, 06600, México

[§] Universidad de Quintana Roo, Chetumal, Quintana Roo, México

[**] Departamento de Estadística, Centro de Investigación en Matemáticas, Guanajuato, Guanajuato, 36023, México

[††] Department of Plant Sciences, Norwegian University of Life Sciences, IHA/CIGENE, P.O. Box 5003, NO-1432 Ås, Norway

Author notes

[1] Corresponding authors: Biometrics and Statistics Unit, International Maize and Wheat Improvement Center (CIMMYT), Apdo. Postal 6-641, Ciudad de México, 06600, México. E-mail: j.crossa@cgiar.org. Departamento de Matemáticas, Centro Universitario de Ciencias Exactas e Ingenierías (CUCEI), Universidad de Guadalajara, Guadalajara, Jalisco, 44430, México. E-mail: aml_uach@hotmail.com

Author information

José Crossa http://orcid.org/0000-0001-9429-5855

Article

Publisher ID: GGG_400336

DOI: 10.1534/g3.119.400336

PMC ID: 6778812

PubMed ID: 31427455

SO-VID: 3f307e90-2a65-4512-a21d-439ee1932753

License:

This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

History

Date received: 09 May 2019

Date accepted: 15 August 2019

Page count

Figures: 7, Tables: 3, Equations: 3, References: 33, Pages: 13

