Genome-Wide Regression and Prediction with the BGLR Statistical Package


Abstract

Many modern genomic data analyses require implementing regressions where the number of parameters (p, e.g., the number of marker effects) exceeds sample size (n). Implementing these large-p-with-small-n regressions poses several statistical and computational challenges, some of which can be confronted using Bayesian methods. This approach allows integrating various parametric and nonparametric shrinkage and variable selection procedures in a unified and consistent manner. The BGLR R-package implements a large collection of Bayesian regression models, including parametric variable selection and shrinkage methods and semiparametric procedures (Bayesian reproducing kernel Hilbert spaces regressions, RKHS). The software was originally developed for genomic applications; however, the methods implemented are useful for many nongenomic applications as well. The response can be continuous (censored or not) or categorical (either binary or ordinal). The algorithm is based on a Gibbs sampler with scalar updates and the implementation takes advantage of efficient compiled C and Fortran routines. In this article we describe the methods implemented in BGLR, present examples of the use of the package, and discuss practical issues emerging in real-data analysis.
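To make the phrase "Gibbs sampler with scalar updates" concrete, the following is a minimal sketch in Python, not the package's own R/C/Fortran code. It assumes the simplest shrinkage setting, a Gaussian prior on every effect (a ridge-type model), with scaled-inverse chi-square priors on the two variances. The function name `gibbs_ridge` and the hyperparameters `df0` and `scale0` are hypothetical choices made for illustration only.

```python
import math
import random


def gibbs_ridge(X, y, n_iter=300, burn_in=100, seed=1):
    """Gibbs sampler with scalar (one-coefficient-at-a-time) updates for
    y = mu + X b + e, with b_j ~ N(0, var_b) and e_i ~ N(0, var_e)."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    cols = [[X[i][j] for i in range(n)] for j in range(p)]  # column access
    xtx = [sum(v * v for v in col) for col in cols]         # x_j' x_j

    # Hypothetical scaled-inverse chi-square hyperparameters (assumption).
    df0, scale0 = 5.0, 0.5
    mu, b = sum(y) / n, [0.0] * p
    var_e, var_b = 1.0, 1.0
    resid = [yi - mu for yi in y]                           # y - mu - X b, with b = 0

    b_sum, var_e_sum, kept = [0.0] * p, 0.0, 0
    for it in range(n_iter):
        # Intercept (flat prior): mu | rest ~ N(mean(y - X b), var_e / n).
        new_mu = rng.gauss(mu + sum(resid) / n, math.sqrt(var_e / n))
        resid = [r - (new_mu - mu) for r in resid]
        mu = new_mu

        # Scalar updates: each b_j is drawn from its univariate normal
        # conditional, so no p x p system is ever formed or inverted.
        lam = var_e / var_b
        for j in range(p):
            col, old = cols[j], b[j]
            rhs = sum(c * r for c, r in zip(col, resid)) + xtx[j] * old
            c_jj = xtx[j] + lam
            new = rng.gauss(rhs / c_jj, math.sqrt(var_e / c_jj))
            delta = new - old
            resid = [r - c * delta for c, r in zip(col, resid)]
            b[j] = new

        # Variances: scaled-inverse chi-square draws, using
        # chi-square(k) = Gamma(shape = k / 2, scale = 2).
        ss_b = sum(v * v for v in b)
        var_b = (ss_b + df0 * scale0) / rng.gammavariate((p + df0) / 2.0, 2.0)
        ss_e = sum(r * r for r in resid)
        var_e = (ss_e + df0 * scale0) / rng.gammavariate((n + df0) / 2.0, 2.0)

        if it >= burn_in:
            kept += 1
            var_e_sum += var_e
            b_sum = [s + v for s, v in zip(b_sum, b)]

    # Posterior means of the effects and of the residual variance.
    return {"b": [s / kept for s in b_sum], "var_e": var_e_sum / kept}


if __name__ == "__main__":
    # Simulated large-p-with-small-n data: 40 predictors, 20 records.
    sim = random.Random(0)
    X = [[sim.gauss(0, 1) for _ in range(40)] for _ in range(20)]
    y = [2 * row[0] - 2 * row[1] + sim.gauss(0, 0.5) for row in X]
    fit = gibbs_ridge(X, y)
    print(len(fit["b"]), fit["var_e"])
```

Because each coefficient is sampled conditionally on all the others and the residual vector is updated in place, one sweep over the coefficients costs on the order of n times p operations. This is one reason the scheme remains feasible when p exceeds n.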


Author and article information

Journal

Journal ID (nlm-ta): Genetics

Journal ID (iso-abbrev): Genetics

Journal ID (hwp): genetics

Journal ID (pmc): genetics

Journal ID (publisher-id): genetics

Title: Genetics

Publisher: Genetics Society of America

ISSN (Print): 0016-6731

ISSN (Electronic): 1943-2631

Publication date (Print): October 2014

Publication date (Electronic): 9 July 2014

Publication date PMC-release: 9 July 2014

Volume: 198

Issue: 2

Pages: 483-495

Affiliations

[*] Socio Economía Estadística e Informática, Colegio de Postgraduados 56230, México

[†] Department of Biostatistics, Section on Statistical Genetics, University of Alabama, Birmingham, Alabama 35294

Author notes

[1] Corresponding author: Colegio de Postgraduados, Km. 36.5, Carretera Mexico, Montecillo Texcoco, Estado de México, México 56230. E-mail: perpdgo@colpos.mx

Article

Publisher ID: 164442

DOI: 10.1534/genetics.114.164442

PMC ID: 4196607

PubMed ID: 25009151

SO-VID: 1425fbb6-82d4-4099-b977-ea51d2bdc7c1

License:

Available freely online through the author-supported open access option.

History

Date received: 22 March 2014

Date accepted: 26 June 2014

Page count

Pages: 13

Custom metadata


ScienceOpen disciplines: Genetics

Keywords: Bayesian methods, regression, whole-genome regression, whole-genome prediction, genome-wide regression, variable selection, shrinkage, semiparametric regression, reproducing kernel Hilbert spaces regressions, RKHS, R, GenPred, shared data resource

