
      Genomic selection in loblolly pine - from lab to field

      BMC Proceedings
      BioMed Central
      IUFRO Tree Biotechnology Conference 2011: From Genomes to Integration and Delivery
      26 June-2 July 2011


          Abstract

Background

Tree breeding is logistically complex and expensive, and breeders have long sought to use molecular markers to accelerate it. The candidate gene approach, which tests for association between DNA sequence variation in or near candidate genes and phenotypic variation in a population, has long been explored [1,2]. However, the candidate gene (QTL) approach has not been successful in breeding [3,4]. QTL-trait associations detected in one genetic background are often not observed in other families, because genes recombine during segregation and levels of linkage disequilibrium (LD) in the population are low.

A new technology called genomic selection (GS) is revolutionizing dairy cattle breeding. In GS, marker effects are first estimated in a large training population (>500) that has both phenotypic and genotypic data. The estimated marker effects are then used to predict breeding values in validation populations for which marker genotypes, but not phenotypes, are available [4]. Several dairy cattle breeding companies now routinely use GS to select and market bulls. The success of GS in cattle breeding rests largely on bovine genome sequencing and the discovery of thousands of SNP markers. If successful, GS will have a great impact on forest tree breeding, because tree breeding programs are complex and logistically difficult. Although several simulation studies have examined the effects of effective population size, linkage disequilibrium, and heritability on the predicted accuracy of GS in tree breeding [5], GS has not yet been demonstrated for forest trees using empirical marker data, mainly because sufficiently dense markers are lacking.

Methods

Biallelic SNP markers provided by the CTGN project (http://dendrome.ucdavis.edu/ctgn/) were used for genotyping. A population of 149 cloned full-sib offspring of loblolly pine (Pinus taeda L.) was phenotyped.
Fitting 3406 informative SNP markers simultaneously, we estimated genome-wide breeding values and compared them with breeding values from a pedigree model. Variances explained by the marker additive and dominance effects were obtained.

Results

The accuracy of the genomic estimated breeding values ranged from 0.30 to 0.83 for growth and wood quality traits. GS accuracies were higher for lignin and cellulose content than for growth traits. The accuracies were comparable with those of breeding values calculated from the traditional pedigree model. Once the time needed to complete progeny testing is taken into account, GS would be more efficient than classical progeny testing for some traits. The marker additive effects explained 18% and 23% of the variance for lignin and cellulose, respectively. Variances could not be determined for height and volume, because the Gibbs sampler failed to converge even after five million iterations. We speculate that the accuracies observed in this study reflect familial linkage rather than historical LD with trait loci, because of the small population size and relatively deep pedigrees. The markers are sampling haplotypes and thus reconstructing the pedigree rather than explaining phenotypic variance. Nevertheless, the results are promising, and we expect that, as genotyping costs fall, GS has the potential to fundamentally change tree breeding in the near future.

Challenges of GS applications in tree breeding

Despite promising results from early work based on empirical data, there are some challenges to overcome before GS can be used routinely in tree breeding. Conifer genome sizes range from 18,000 to 40,000 Mbp [6]. Conifer populations have low levels of LD, which decays rapidly: LD in loblolly pine decays to less than r² = 0.25 within 2000 bp [7]. Low LD results from genetic recombination over the evolutionary history of the species and causes QTL-marker associations to be inconsistent.
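The r² statistic quoted above measures how strongly the alleles at two loci are correlated. As a minimal sketch, assuming phased haplotypes coded 0/1 at two biallelic SNPs (the function name `ld_r2` and the toy haplotypes are illustrative and not taken from the study):

```python
def ld_r2(hap_a, hap_b):
    """Squared allelic correlation (r^2) between two biallelic loci.

    hap_a and hap_b are equal-length lists of 0/1 alleles, one entry per
    sampled haplotype (hypothetical input format).
    """
    n = len(hap_a)
    if n == 0 or n != len(hap_b):
        raise ValueError("haplotype lists must be non-empty and equal in length")
    p_a = sum(hap_a) / n                    # frequency of allele 1 at locus A
    p_b = sum(hap_b) / n                    # frequency of allele 1 at locus B
    p_ab = sum(1 for a, b in zip(hap_a, hap_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                    # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a locus is monomorphic")
    return d * d / denom


# Alleles that always travel together give complete LD.
print(ld_r2([0, 1, 0, 1], [0, 1, 0, 1]))   # 1.0
# Alleles that combine at random give no LD.
print(ld_r2([0, 0, 1, 1], [0, 1, 0, 1]))   # 0.0
```

A marker is only informative about a trait locus when r² between the two is appreciable, which is why rapid LD decay translates into a need for dense markers.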
Large genome size and historically low LD mean that large numbers of dense markers are required to explain a considerable amount of phenotypic variation in complex traits. Another challenge is the lack of genetic maps in forest trees. With a few exceptions, the genomes of forest trees have not been sequenced, so the precise locations of SNP markers are unknown, which hinders the use of haplotypes. Using haplotypes reduces the dimensions of the data and thus requires far fewer computing resources for analysis. More importantly, with haplotypes, larger variation between trees can be captured using allelic combinations, although larger training populations are required to adequately sample the effects of all the haplotypes. High marker genotyping cost is the major obstacle to applying GS in forest trees. More cost-efficient genotyping technologies, such as genotyping-by-sequencing and restriction digestion, are being explored to reduce the cost of markers. Advances in computing power have made it possible to analyze large amounts of complex data, but bioinformatics challenges remain in analyzing sequence data and calling SNP markers. Further research is needed on the development of training models and the calibration of prediction models. The number of generations over which statistical models can be used before losing accuracy remains to be determined in forest trees. Another question is the validity of models across different populations. In cattle breeding, the difference in GS accuracy between dairy and beef cattle remains a challenge. For some tree species, GxE interaction could be an issue: a marker-trait association observed in one population may not hold in another environment.

Conclusions

We are currently constructing a realized genomic relationship matrix based on SNP markers for use in predicting breeding values. This method provides flexibility in fitting common environmental effects in mixed models.
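One widely used construction of a realized genomic relationship matrix is VanRaden's first method, G = ZZ' / (2 Σ p_j (1 − p_j)), where Z holds the genotypes centered by twice the allele frequency p_j of each marker. The sketch below assumes that construction and a 0/1/2 genotype coding; the abstract does not say which formulation the authors use, and the function name and toy genotypes are illustrative only.

```python
def genomic_relationship_matrix(genotypes):
    """Realized relationship matrix from SNP genotypes (VanRaden method 1).

    genotypes: list of rows, one per individual, each a list of 0/1/2
    counts of the reference allele at every marker (assumed coding).
    """
    n = len(genotypes)
    m = len(genotypes[0])
    # Allele frequency of the counted allele at each marker.
    freqs = [sum(row[j] for row in genotypes) / (2 * n) for j in range(m)]
    scale = 2 * sum(p * (1 - p) for p in freqs)
    if scale == 0:
        raise ValueError("all markers are monomorphic")
    # Center each genotype by its expected value 2p.
    z = [[g - 2 * p for g, p in zip(row, freqs)] for row in genotypes]
    return [
        [sum(a * b for a, b in zip(z[i], z[k])) / scale for k in range(n)]
        for i in range(n)
    ]


# Three individuals, two markers (toy data).
toy = [[0, 2], [2, 0], [1, 1]]
for row in genomic_relationship_matrix(toy):
    print(row)
```

In a mixed model, a matrix of this kind takes the place of the pedigree-based relationship matrix, so other random effects can be fitted alongside it.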
We expect that decreases in marker genotyping costs will make GS in pine breeding feasible in the near future. Our group will work on pilot projects with forestry companies in the southern US, and plans are underway to revise breeding strategies to incorporate genomic selection.
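
The train-then-predict workflow described in the Background can be illustrated with ridge regression on marker genotypes (often called RR-BLUP). This is a hedged sketch on simulated data, not the Gibbs-sampler analysis used in the study; the function names, the shrinkage parameter `lam`, and every simulated value are assumptions made for illustration.

```python
import random


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ridge(genotypes, phenotypes, lam):
    """Estimate marker effects in the training population (lam > 0)."""
    n, k = len(genotypes), len(genotypes[0])
    means = [sum(row[j] for row in genotypes) / n for j in range(k)]
    y_mean = sum(phenotypes) / n
    z = [[g - mu for g, mu in zip(row, means)] for row in genotypes]
    # Normal equations: (Z'Z + lam * I) beta = Z'(y - mean(y))
    lhs = [[sum(r[i] * r[j] for r in z) + (lam if i == j else 0.0)
            for j in range(k)] for i in range(k)]
    rhs = [sum(r[i] * (y - y_mean) for r, y in zip(z, phenotypes))
           for i in range(k)]
    return means, y_mean, solve(lhs, rhs)


def predict(model, genotypes):
    """Genomic estimated breeding values for genotyped-only individuals."""
    means, y_mean, beta = model
    return [y_mean + sum((g - mu) * b for g, mu, b in zip(row, means, beta))
            for row in genotypes]


def pearson(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5


def simulate_accuracy(seed=1, n_train=120, n_valid=40, n_markers=30):
    """Correlation of predicted with simulated true breeding values."""
    rng = random.Random(seed)
    effects = [rng.gauss(0, 1) for _ in range(n_markers)]
    geno = [[rng.choice((0, 1, 2)) for _ in range(n_markers)]
            for _ in range(n_train + n_valid)]
    tbv = [sum(g * e for g, e in zip(row, effects)) for row in geno]
    pheno = [t + rng.gauss(0, 3) for t in tbv]   # add environmental noise
    model = fit_ridge(geno[:n_train], pheno[:n_train], lam=10.0)
    gebv = predict(model, geno[n_train:])        # validation: genotypes only
    return pearson(gebv, tbv[n_train:])


if __name__ == "__main__":
    print("simulated prediction accuracy:", round(simulate_accuracy(), 2))
```

With thousands of markers and fewer than 200 individuals, as in the study, solving a marker-by-marker system this way would be impractical; the sketch uses few markers to stay readable.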

          Most cited references (3)


          Genomic selection.

          Genomic selection is a form of marker-assisted selection in which genetic markers covering the whole genome are used so that all quantitative trait loci (QTL) are in linkage disequilibrium with at least one marker. This approach has become feasible thanks to the large number of single nucleotide polymorphisms (SNP) discovered by genome sequencing and new methods to efficiently genotype large number of SNP. Simulation results and limited experimental results suggest that breeding values can be predicted with high accuracy using genetic markers alone but more validation is required especially in samples of the population different from that in which the effect of the markers was estimated. The ideal method to estimate the breeding value from genomic data is to calculate the conditional mean of the breeding value given the genotype of the animal at each QTL. This conditional mean can only be calculated by using a prior distribution of QTL effects so this should be part of the research carried out to implement genomic selection. In practice, this method of estimating breeding values is approximated by using the marker genotypes instead of the QTL genotypes but the ideal method is likely to be approached more closely as more sequence and SNP data is obtained. Implementation of genomic selection is likely to have major implications for genetic evaluation systems and for genetic improvement programmes generally and these are discussed.

            Nucleotide diversity and linkage disequilibrium in loblolly pine.

            Outbreeding species with large, stable population sizes, such as widely distributed conifers, are expected to harbor relatively more DNA sequence polymorphism. Under the neutral theory of molecular evolution, the expected heterozygosity is a function of the product 4N_e μ, where N_e is the effective population size and μ is the per-generation mutation rate, and the genomic scale of linkage disequilibrium is determined by 4N_e r, where r is the per-generation recombination rate between adjacent sites. These parameters were estimated in the long-lived, outcrossing gymnosperm loblolly pine (Pinus taeda L.) from a survey of single nucleotide polymorphisms across approximately 18 kb of DNA distributed among 19 loci from a common set of 32 haploid genomes. Estimates of 4N_e μ at silent and nonsynonymous sites were 0.00658 and 0.00108, respectively, and both were statistically heterogeneous among loci. By Tajima's D statistic, the site frequency spectrum of no locus was observed to deviate from that predicted by neutral theory. Substantial recombination in the history of the sampled alleles was observed and linkage disequilibrium declined within several kilobases. The composite likelihood estimate of 4N_e r based on all two-site sample configurations equaled 0.00175. When geological dating, an assumed generation time (25 years), and an estimated divergence from Pinus pinaster Ait. are used, the effective population size of loblolly pine should be 5.6 × 10^5. The emerging narrow range of estimated silent site heterozygosities (relative to the vast range of population sizes) for humans, Drosophila, maize, and pine parallels the paradox described earlier for allozyme polymorphism and challenges simple equilibrium models of molecular evolution.

              Evolution of Genome Size and Complexity in Pinus

              Background Genome evolution in the gymnosperm lineage of seed plants has given rise to many of the most complex and largest plant genomes, however the elements involved are poorly understood. Methodology/Principal Findings Gymny is a previously undescribed retrotransposon family in Pinus that is related to Athila elements in Arabidopsis. Gymny elements are dispersed throughout the modern Pinus genome and occupy a physical space at least the size of the Arabidopsis thaliana genome. In contrast to previously described retroelements in Pinus, the Gymny family was amplified or introduced after the divergence of pine and spruce (Picea). If retrotransposon expansions are responsible for genome size differences within the Pinaceae, as they are in angiosperms, then they have yet to be identified. In contrast, molecular divergence of Gymny retrotransposons together with other families of retrotransposons can account for the large genome complexity of pines along with protein-coding genic DNA, as revealed by massively parallel DNA sequence analysis of Cot fractionated genomic DNA. Conclusions/Significance Most of the enormous genome complexity of pines can be explained by divergence of retrotransposons, however the elements responsible for genome size variation are yet to be identified. Genomic resources for Pinus including those reported here should assist in further defining whether and how the roles of retrotransposons differ in the evolution of angiosperm and gymnosperm genomes.

                Author and article information

                Conference
                BMC Proceedings (BMC Proc), BioMed Central
                ISSN 1753-6561
                13 September 2011; 5(Suppl 7): I8
                Affiliations
                [1] Cooperative Tree Improvement Program, North Carolina State University, Raleigh, USA
                Article
                1753-6561-5-S7-I8
                DOI: 10.1186/1753-6561-5-S7-I8
                PMC ID: 3239875
                0edd00fb-a6d0-42cf-8981-517c8869e391
                Copyright ©2011 Isik et al; licensee BioMed Central Ltd.

                This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

                IUFRO Tree Biotechnology Conference 2011: From Genomes to Integration and Delivery
                Arraial d'Ajuda, Bahia, Brazil
                26 June-2 July 2011
                Categories
                Invited Speaker Presentation

                Medicine
