Genomic BLUP decoded: a look into the black box of genomic prediction.

Abstract

Genomic best linear unbiased prediction (BLUP) is a statistical method that uses relationships between individuals calculated from single-nucleotide polymorphisms (SNPs) to capture relationships at quantitative trait loci (QTL). We show that genomic BLUP exploits not only linkage disequilibrium (LD) and additive-genetic relationships, but also cosegregation to capture relationships at QTL. Simulations were used to study the contributions of those types of information to accuracy of genomic estimated breeding values (GEBVs), their persistence over generations without retraining, and their effect on the correlation of GEBVs within families. We show that accuracy of GEBVs based on additive-genetic relationships can decline with increasing training data size and speculate that modeling polygenic effects via pedigree relationships jointly with genomic breeding values using Bayesian methods may prevent that decline. Cosegregation information from half sibs contributes little to accuracy of GEBVs in current dairy cattle breeding schemes but from full sibs it contributes considerably to accuracy within family in corn breeding. Cosegregation information also declines with increasing training data size, and its persistence over generations is lower than that of LD, suggesting the need to model LD and cosegregation explicitly. The correlation between GEBVs within families depends largely on additive-genetic relationship information, which is determined by the effective number of SNPs and training data size. As genomic BLUP cannot capture short-range LD information well, we recommend Bayesian methods with t-distributed priors.
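To make "relationships between individuals calculated from SNPs" concrete, here is a minimal sketch of genomic BLUP on simulated data. It is not taken from the article. It assumes a centered and scaled genomic relationship matrix (a common construction), a known heritability `h2`, and the sample mean as the only fixed effect. The function names, the simulated genotypes, and all parameter values are illustrative.

```python
import numpy as np

def genomic_relationship_matrix(M):
    """Build a genomic relationship matrix from genotypes coded 0/1/2.

    M has shape (n_individuals, n_snps). Allele frequencies are estimated
    from the sample itself, which is an assumption of this sketch.
    """
    p = M.mean(axis=0) / 2.0                 # estimated allele frequency per SNP
    Z = M - 2.0 * p                          # center each SNP column
    scale = 2.0 * np.sum(p * (1.0 - p))      # expected total SNP variance
    return Z @ Z.T / scale

def gblup(G, y, train_idx, h2):
    """Genomic estimated breeding values (GEBVs) for all individuals.

    Uses GEBV = G[:, train] (G[train, train] + lam * I)^-1 (y_train - mean),
    with lam = (1 - h2) / h2. The mean is the plain training average,
    a simplification of the generalized least-squares estimate.
    """
    lam = (1.0 - h2) / h2
    G_tt = G[np.ix_(train_idx, train_idx)]
    V = G_tt + lam * np.eye(len(train_idx))
    resid = y[train_idx] - y[train_idx].mean()
    return G[:, train_idx] @ np.linalg.solve(V, resid)

# Simulated example (all values hypothetical).
rng = np.random.default_rng(0)
n, m, n_qtl, h2 = 200, 500, 50, 0.5
freqs = rng.uniform(0.1, 0.9, size=m)
M = rng.binomial(2, freqs, size=(n, m)).astype(float)

# A random subset of SNPs acts as QTL with normally distributed effects.
qtl = rng.choice(m, size=n_qtl, replace=False)
g = M[:, qtl] @ rng.normal(size=n_qtl)
g = (g - g.mean()) / g.std()                          # true breeding values, variance 1
y = g + rng.normal(scale=np.sqrt((1 - h2) / h2), size=n)

train_idx = np.arange(150)                            # first 150 individuals are phenotyped
valid_idx = np.arange(150, n)

G = genomic_relationship_matrix(M)
gebv = gblup(G, y, train_idx, h2)
accuracy = np.corrcoef(gebv[valid_idx], g[valid_idx])[0, 1]
print(f"validation accuracy (corr of GEBV with true BV): {accuracy:.3f}")
```

The simulated individuals here are unrelated and the SNPs are independent, so this example contains none of the pedigree structure, cosegregation, or LD that the abstract discusses. It only shows the mechanics of the predictor.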

Author and article information

Journal

Journal ID (iso-abbrev): Genetics

Title: Genetics

Publisher: Genetics Society of America

ISSN (Electronic): 1943-2631

ISSN (Print): 0016-6731

Publication date (Electronic): Jul 2013

Volume: 194

Issue: 3

Affiliations

[1] Department of Animal Science and Center for Integrated Animal Genomics, Iowa State University, Ames, Iowa 50011, USA. Email: dhabier@gmail.com

Article

Publisher Item ID: genetics.113.152207

DOI: 10.1534/genetics.113.152207

PMC ID: 3697966

PubMed ID: 23640517

SO-VID: c563c75c-9e71-4f1c-aa92-72c41b9a6782

Keywords: GenPred, Shared data resources, additive-genetic relationships, cosegregation, genomic best linear unbiased prediction (BLUP), genomic selection, linkage disequilibrium (LD)
