Genome-wide prediction of traits with different genetic architecture through efficient variable selection.

Abstract

In genome-based prediction there is considerable uncertainty about the statistical model and method required to maximize prediction accuracy. For traits influenced by a small number of quantitative trait loci (QTL), predictions are expected to benefit from methods performing variable selection [e.g., BayesB or the least absolute shrinkage and selection operator (LASSO)] compared to methods distributing effects across the genome [ridge regression best linear unbiased prediction (RR-BLUP)]. We investigate the assumptions underlying successful variable selection by combining computer simulations with large-scale experimental data sets from rice (Oryza sativa L.), wheat (Triticum aestivum L.), and Arabidopsis thaliana (L.). We demonstrate that variable selection can be successful when the number of phenotyped individuals is much larger than the number of causal mutations contributing to the trait. We show that the sample size required for efficient variable selection increases dramatically with decreasing trait heritabilities and increasing extent of linkage disequilibrium (LD). We contrast and discuss contradictory results from simulation and experimental studies with respect to superiority of variable selection methods over RR-BLUP. Our results demonstrate that due to long-range LD, medium heritabilities, and small sample sizes, superiority of variable selection methods cannot be expected in plant breeding populations even for traits like FRIGIDA gene expression in Arabidopsis and flowering time in rice, assumed to be influenced by a few major QTL. We extend our conclusions to the analysis of whole-genome sequence data and infer upper bounds for the number of causal mutations which can be identified by LASSO. Our results have major impact on the choice of statistical method needed to make credible inferences about genetic architecture and prediction accuracy of complex traits.
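The contrast drawn above between variable selection (LASSO) and RR-BLUP can be illustrated with a small simulation. The following is a minimal sketch and not the authors' analysis. It assumes independent standard-normal markers (that is, no LD), equal-sized causal effects with random signs, a ridge penalty derived from the simulated heritability, and a LASSO penalty fixed at the universal threshold. All function and parameter names are hypothetical.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Ridge solution (X'X + lam*I)^-1 X'y, the marker-effect form of RR-BLUP."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)


def lasso_fit(X, y, alpha, n_sweeps=100):
    """LASSO by cyclic coordinate descent on (1/2n)||y - Xb||^2 + alpha*||b||_1."""
    n, p = X.shape
    b = np.zeros(p)
    col_ms = (X ** 2).mean(axis=0)       # mean square of each marker column
    resid = y.copy()
    for _ in range(n_sweeps):
        for j in range(p):
            resid += X[:, j] * b[j]      # partial residual without marker j
            rho = X[:, j] @ resid / n
            # Soft-thresholding: effects with |rho| <= alpha are set to zero.
            b[j] = np.sign(rho) * max(abs(rho) - alpha, 0.0) / col_ms[j]
            resid -= X[:, j] * b[j]
    return b


def compare_methods(n_train, n_test, p, q, h2, seed=0):
    """Simulate a trait with q causal markers out of p and compare both methods."""
    rng = np.random.default_rng(seed)
    n = n_train + n_test
    X = rng.standard_normal((n, p))      # assumption: independent markers, no LD
    beta = np.zeros(p)
    causal = rng.choice(p, size=q, replace=False)
    beta[causal] = rng.choice([-1.0, 1.0], size=q)   # assumption: equal effect sizes
    g = X @ beta                         # true genetic values, variance about q
    sigma_e = np.sqrt(q * (1.0 - h2) / h2)           # noise sd implied by h2
    y = g + sigma_e * rng.standard_normal(n)

    X_tr, X_te = X[:n_train], X[n_train:]
    y_tr = y[:n_train] - y[:n_train].mean()
    g_te = g[n_train:]

    lam = p * (1.0 - h2) / h2            # sigma_e^2 / sigma_beta^2 for this simulation
    alpha = sigma_e * np.sqrt(2.0 * np.log(p) / n_train)   # universal threshold
    b_ridge = ridge_fit(X_tr, y_tr, lam)
    b_lasso = lasso_fit(X_tr, y_tr, alpha)

    def accuracy(b):
        # Correlation between predicted and true genetic values in the test set.
        pred = X_te @ b
        if pred.std() == 0.0:
            return 0.0
        return float(np.corrcoef(pred, g_te)[0, 1])

    return {
        "ridge_accuracy": accuracy(b_ridge),
        "lasso_accuracy": accuracy(b_lasso),
        "lasso_selected": int(np.count_nonzero(b_lasso)),
    }


if __name__ == "__main__":
    print(compare_methods(n_train=400, n_test=200, p=200, q=5, h2=0.8, seed=1))
```

Rerunning `compare_methods` with a smaller `n_train`, a lower `h2`, or a larger `q` gives a rough feel for how the comparison shifts with sample size, heritability, and the number of causal markers. Because the sketch omits LD entirely, it cannot reproduce the LD-related findings reported in the abstract.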

Author and article information

Journal

Journal ID (iso-abbrev): Genetics

Title: Genetics

Publisher: Genetics Society of America

ISSN (Electronic): 1943-2631

ISSN (Print): 0016-6731

Publication date (Electronic): Oct 2013

Volume: 195

Issue: 2

Affiliations

[1] Plant Breeding, Technische Universität München, 85354 Freising, Germany.

Article

Publisher Item ID: genetics.113.150078

DOI: 10.1534/genetics.113.150078

PMC ID: 3781982

PubMed ID: 23934883

SO-VID: 389707bd-4f2b-4d61-8519-4e2528535e17

Keywords: GenPred, complex traits, genetic architecture, genome-based prediction, plant breeding populations, shared data resources, variable selection

