Prediction of Multiple-Trait and Multiple-Environment Genomic Data Using Recommender Systems

Abstract

In genomic-enabled prediction, improving the accuracy of predicting lines in environments is difficult because the available information is generally sparse and the correlations between traits are usually low. In current genomic selection, although researchers have a large amount of information and appropriate statistical models to process it, computing efficiency remains limited. Many statistical models are mathematically elegant but computationally inefficient, and they are impractical for many traits, lines, environments, and years because they need to sample from very large multivariate normal distributions. For these reasons, this study explores two recommender systems, item-based collaborative filtering (IBCF) and the matrix factorization algorithm (MF), in the context of multiple traits and multiple environments. The IBCF and MF methods were compared with two conventional methods on simulated and real data. Results on the simulated and real data sets show that the IBCF technique was slightly better in terms of prediction accuracy than the two conventional methods and the MF method when the correlation was moderately high. The IBCF technique is very attractive because it produces good predictions when there is high correlation between items (environment–trait combinations) and its implementation is computationally feasible, which can be useful for plant breeders who deal with very large data sets.
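As a rough illustration of the IBCF idea described in the abstract, the sketch below treats lines as rows and environment–trait combinations as items (columns), then fills each missing cell with a similarity-weighted average of the same line's observed items. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the toy values, the column-wise standardization, and the use of cosine similarity over co-observed rows are all assumptions chosen for illustration.

```python
import numpy as np


def standardize_columns(R):
    """Z-score each column using only its observed (non-NaN) entries."""
    mean = np.nanmean(R, axis=0)
    std = np.nanstd(R, axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant columns
    return (R - mean) / std, mean, std


def item_cosine_similarity(R):
    """Cosine similarity between columns, using rows where both are observed."""
    n_items = R.shape[1]
    S = np.zeros((n_items, n_items))
    for j in range(n_items):
        for k in range(j + 1, n_items):
            both = ~np.isnan(R[:, j]) & ~np.isnan(R[:, k])
            if not both.any():
                continue
            a, b = R[both, j], R[both, k]
            denom = np.linalg.norm(a) * np.linalg.norm(b)
            if denom > 0:
                S[j, k] = S[k, j] = a @ b / denom
    return S


def ibcf_predict(R):
    """Fill each missing cell with a similarity-weighted average of the
    observed items in the same row; observed cells are left unchanged."""
    S = item_cosine_similarity(R)
    P = R.copy()
    for u, j in zip(*np.where(np.isnan(R))):
        obs = ~np.isnan(R[u])
        weights = S[j, obs]
        norm = np.abs(weights).sum()
        if norm > 0:  # otherwise the cell stays NaN
            P[u, j] = weights @ R[u, obs] / norm
    return P


def impute(R):
    """Standardize columns, predict on the z-scale, then map back."""
    Z, mean, std = standardize_columns(R)
    return ibcf_predict(Z) * std + mean


# Hypothetical toy data: 5 lines x 3 environment-trait combinations,
# with NaN marking the line/item cells to be predicted.
toy = np.array([
    [5.1, 4.8, 70.0],
    [6.0, 5.9, 75.0],
    [4.2, np.nan, 66.0],
    [5.5, 5.6, np.nan],
    [np.nan, 5.0, 71.0],
])
filled = impute(toy)
print(np.round(filled, 2))
```

Because each prediction only needs item-to-item similarities and one weighted average, the cost grows with the number of items squared rather than with the size of a joint multivariate normal, which is consistent with the computational argument made in the abstract.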

Most cited references 4

Genomic selection.

M. E. Goddard, B. Hayes (2007)

Genomic selection is a form of marker-assisted selection in which genetic markers covering the whole genome are used so that all quantitative trait loci (QTL) are in linkage disequilibrium with at least one marker. This approach has become feasible thanks to the large number of single nucleotide polymorphisms (SNP) discovered by genome sequencing and new methods to efficiently genotype large numbers of SNP. Simulation results and limited experimental results suggest that breeding values can be predicted with high accuracy using genetic markers alone, but more validation is required, especially in samples of the population different from that in which the effect of the markers was estimated. The ideal method to estimate the breeding value from genomic data is to calculate the conditional mean of the breeding value given the genotype of the animal at each QTL. This conditional mean can only be calculated by using a prior distribution of QTL effects, so this should be part of the research carried out to implement genomic selection. In practice, this method of estimating breeding values is approximated by using the marker genotypes instead of the QTL genotypes, but the ideal method is likely to be approached more closely as more sequence and SNP data are obtained. Implementation of genomic selection is likely to have major implications for genetic evaluation systems and for genetic improvement programmes generally, and these are discussed.

A Genomic Bayesian Multi-trait and Multi-environment Model

Osval Montesinos López, Abelardo Montesinos-López, Jose Crossa … (2016)

When information on multiple genotypes evaluated in multiple environments is recorded, a multi-environment single trait model for assessing genotype × environment interaction (G × E) is usually employed. Comprehensive models that simultaneously take into account the correlated traits and trait × genotype × environment interaction (T × G × E) are lacking. In this research, we propose a Bayesian whole-genome prediction (WGP) model for analyzing multiple traits and multiple environments. For this model, we used half-t priors on each standard deviation term and uniform priors on each correlation of the covariance matrix. These priors were not informative and led to posterior inferences that were insensitive to the choice of hyper-parameters. We also developed a computationally efficient Markov Chain Monte Carlo (MCMC) under the above priors, which allowed us to obtain all required full conditional distributions of the parameters leading to an exact Gibbs sampling for the posterior distribution. We used two real data sets to implement and evaluate the proposed Bayesian method and found that when the correlation between traits was high (>0.5), the proposed model (with unstructured variance–covariance) improved prediction accuracy compared to the model with diagonal and standard variance–covariance structures. The R-software package Bayesian Multi-Trait and Multi-Environment (BMTME) offers optimized C++ routines to efficiently perform the analyses.

A Bayesian Poisson-lognormal Model for Count Data for Multiple-Trait Multiple-Environment Genomic-Enabled Prediction

Osval A. Montesinos-López, Abelardo Montesinos-López, Jose Crossa … (2017)

When a plant scientist wishes to make genomic-enabled predictions of multiple traits measured in multiple individuals in multiple environments, the most common strategy for performing the analysis is to use a single trait at a time taking into account genotype × environment interaction (G × E), because there is a lack of comprehensive models that simultaneously take into account the correlated count traits and G × E. For this reason, in this study we propose a multiple-trait and multiple-environment model for count data. The proposed model was developed under the Bayesian paradigm, for which we developed a Markov Chain Monte Carlo (MCMC) with noninformative priors. This allows obtaining all required full conditional distributions of the parameters, leading to an exact Gibbs sampler for the posterior distribution. Our model was tested with simulated data and a real data set. Results show that the proposed multi-trait, multi-environment model is an attractive alternative for modeling multiple count traits measured in multiple environments.

Author and article information

Journal

Journal ID (nlm-ta): G3 (Bethesda)

Journal ID (iso-abbrev): Genetics

Journal ID (hwp): G3: Genes, Genomes, Genetics

Journal ID (pmc): G3: Genes, Genomes, Genetics

Journal ID (publisher-id): G3: Genes, Genomes, Genetics

Title: G3: Genes|Genomes|Genetics

Publisher: Genetics Society of America

ISSN (Electronic): 2160-1836

Publication date (Electronic): 04 January 2018

Publication date Collection: January 2018

Volume: 8

Issue: 1

Pages: 131-147

Affiliations

[* ]Facultad de Telemática, Universidad de Colima, 28040 Colima, México

[† ]Departamento de Matemáticas, Centro Universitario de Ciencias Exactas e Ingenierías (CUCEI), Universidad de Guadalajara, 44430 Jalisco, México

[‡ ]International Maize and Wheat Improvement Center (CIMMYT), Apdo. Postal 6-641, 06600 México City, México

[§ ]Departamento de Estadística, Centro de Investigación en Matemáticas (CIMAT), 36240 Guanajuato, México

[** ]Department of Entomology, Michigan State University, East Lansing, Michigan 48824

[†† ]Department of Computer Science, Aalto University, FI-00076, Finland

Author notes

[1 ]Corresponding authors: Facultad de Telemática, Universidad de Colima, 28040 Colima, México. E-mail: oamontes1@ucol.mx; and Biometrics and Statistics Unit, International Maize and Wheat Improvement Center (CIMMYT), Apdo. Postal 6-641, 06600 México City, México. E-mail: j.crossa@cgiar.org

Author information

José Crossa http://orcid.org/0000-0001-9429-5855

Article

Publisher ID: GGG_300309

DOI: 10.1534/g3.117.300309

PMC ID: 5765342

PubMed ID: 29097376

SO-VID: 2d423c56-9825-463f-985b-fc2336b22370

License:

This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

History

Date received: 27 September 2017

Date accepted: 31 October 2017

Page count

Figures: 3, Tables: 11, Equations: 11, References: 11, Pages: 17
