      Genotype Imputation To Improve the Cost-Efficiency of Genomic Selection in Farmed Atlantic Salmon


          Genomic selection uses genome-wide marker information to predict breeding values for traits of economic interest, and is more accurate than pedigree-based methods. The development of high density SNP arrays for Atlantic salmon has enabled genomic selection in selective breeding programs, alongside high-resolution association mapping of the genetic basis of complex traits. However, in sibling testing schemes typical of salmon breeding programs, trait records are available on many thousands of fish with close relationships to the selection candidates. Therefore, routine high density SNP genotyping may be prohibitively expensive. One means of reducing genotyping cost is the use of genotype imputation, where selected key animals (e.g., breeding program parents) are genotyped at high density, and the majority of individuals (e.g., performance tested fish and selection candidates) are genotyped at much lower density, followed by imputation to high density. The main objectives of the current study were to assess the feasibility and accuracy of genotype imputation in the context of a salmon breeding program. The specific aims were: (i) to measure the accuracy of genotype imputation using medium (25 K) and high (78 K) density mapped SNP panels, by masking varying proportions of the genotypes and assessing the correlation between the imputed genotypes and the true genotypes; and (ii) to assess the efficacy of imputed genotype data in genomic prediction of key performance traits (sea lice resistance and body weight). Imputation accuracies of up to 0.90 were observed using the simple two-generation pedigree dataset, and moderately high accuracy (0.83) was possible even with very low density SNP data (∼250 SNPs). The performance of genomic prediction using imputed genotype data was comparable to using true genotype data, and both were superior to pedigree-based prediction. 
These results demonstrate that the genotype imputation approach used in this study can provide a cost-effective method for generating robust genome-wide SNP data for genomic prediction in Atlantic salmon. Genotype imputation approaches are likely to form a critical component of cost-efficient genomic selection programs to improve economically important traits in aquaculture.
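
The masking procedure described in aim (i) can be illustrated with a minimal Python sketch. It is not the study's imputation software or pipeline: the genotypes are simulated, and the imputation step is a deliberately naive stand-in that fills each masked call with the mean of the unmasked genotypes at that SNP. The function names (`simulate_genotypes`, `mask_genotypes`, `mean_impute`, `pearson`) and all parameter values are assumptions chosen for illustration only.

```python
import random
from statistics import mean

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def simulate_genotypes(n_animals, n_snps, rng):
    """Simulate 0/1/2 allele counts with one allele frequency per SNP (hypothetical data)."""
    freqs = [rng.uniform(0.1, 0.9) for _ in range(n_snps)]
    return [[(rng.random() < p) + (rng.random() < p) for p in freqs]
            for _ in range(n_animals)]

def mask_genotypes(genos, proportion, rng):
    """Return a copy with a random proportion of calls set to None, plus the masked cells."""
    masked = [row[:] for row in genos]
    cells = [(i, j) for i in range(len(genos)) for j in range(len(genos[0]))]
    hidden = rng.sample(cells, int(proportion * len(cells)))
    for i, j in hidden:
        masked[i][j] = None
    return masked, hidden

def mean_impute(masked):
    """Naive stand-in for imputation: fill each missing call with the SNP's observed mean."""
    col_means = []
    for j in range(len(masked[0])):
        observed = [row[j] for row in masked if row[j] is not None]
        col_means.append(mean(observed) if observed else 0.0)
    return [[col_means[j] if g is None else g for j, g in enumerate(row)]
            for row in masked]

rng = random.Random(1)  # fixed seed so the sketch is reproducible
true = simulate_genotypes(200, 50, rng)
masked, hidden = mask_genotypes(true, 0.5, rng)
imputed = mean_impute(masked)

# Imputation accuracy: correlation between imputed and true genotypes at the masked cells only.
accuracy = pearson([imputed[i][j] for i, j in hidden],
                   [true[i][j] for i, j in hidden])
print(f"masked calls: {len(hidden)}, accuracy: {accuracy:.2f}")
```

A pedigree- or haplotype-based imputation method would replace `mean_impute`; the masking and correlation steps would stay the same, which is why the evaluation is shown separately from the imputation itself.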

          Most cited references 33

          Genomic prediction in animals and plants: simulation of data, validation, reporting, and benchmarking.

          The genomic prediction of phenotypes and breeding values in animals and plants has developed rapidly into its own research field. Results of genomic prediction studies are often difficult to compare because data simulation varies, real or simulated data are not fully described, and not all relevant results are reported. In addition, some new methods have been compared only in limited genetic architectures, leading to potentially misleading conclusions. In this article we review simulation procedures, discuss validation and reporting of results, and apply benchmark procedures for a variety of genomic prediction methods in simulated and real example data. Plant and animal breeding programs are being transformed by the use of genomic data, which are becoming widely available and cost-effective to predict genetic merit. A large number of genomic prediction studies have been published using both simulated and real data. The relative novelty of this area of research has made the development of scientific conventions difficult with regard to description of the real data, simulation of genomes, validation and reporting of results, and forward in time methods. In this review article we discuss the generation of simulated genotype and phenotype data, using approaches such as the coalescent and forward in time simulation. We outline ways to validate simulated data and genomic prediction results, including cross-validation. The accuracy and bias of genomic prediction are highlighted as performance indicators that should be reported. We suggest that a measure of relatedness between the reference and validation individuals be reported, as its impact on the accuracy of genomic prediction is substantial. A large number of methods were compared in example simulated and real (pine and wheat) data sets, all of which are publicly available. 
In our limited simulations, most methods performed similarly in traits with a large number of quantitative trait loci (QTL), whereas in traits with fewer QTL variable selection did have some advantages. In the real data sets examined here all methods had very similar accuracies. We conclude that no single method can serve as a benchmark for genomic prediction. We recommend comparing accuracy and bias of new methods to results from genomic best linear prediction and a variable selection approach (e.g., BayesB), because, together, these methods are appropriate for a range of genetic architectures. An accompanying article in this issue provides a comprehensive review of genomic prediction methods and discusses a selection of topics related to application of genomic prediction in plants and animals.
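
The two performance indicators named above, accuracy and bias, can be sketched in a few lines of Python. The definitions used here are a common convention and are an assumption on our part, since the abstract does not spell them out: accuracy is taken as the correlation between predicted and observed values in the validation set, and bias as the slope of the regression of observed on predicted values (a slope of 1 indicating no inflation or deflation). The helper names `accuracy_and_bias` and `kfold_indices` are hypothetical.

```python
from statistics import mean

def accuracy_and_bias(observed, predicted):
    """Return (correlation, slope of observed regressed on predicted)."""
    mo, mp = mean(observed), mean(predicted)
    sxy = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    soo = sum((o - mo) ** 2 for o in observed)
    spp = sum((p - mp) ** 2 for p in predicted)
    accuracy = sxy / (soo * spp) ** 0.5
    slope = sxy / spp
    return accuracy, slope

def kfold_indices(n, k):
    """Assign individuals 0..n-1 to k validation folds by interleaving."""
    return [list(range(fold, n, k)) for fold in range(k)]

# Toy example: predictions are perfectly ranked but twice as spread out as the observations.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 4.0, 6.0, 8.0]
acc, slope = accuracy_and_bias(observed, predicted)
print(acc, slope)            # correlation of 1.0, slope of 0.5 (predictions are inflated)
print(kfold_indices(5, 2))   # [[0, 2, 4], [1, 3]]
```

In a cross-validation, the two metrics would be computed within each validation fold and then summarized across folds.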

            Genomic selection using low-density marker panels.

            Genomic selection (GS) using high-density single-nucleotide polymorphisms (SNPs) is promising to improve response to selection in populations that are under artificial selection. High-density SNP genotyping of all selection candidates each generation, however, may not be cost effective. Smaller panels with SNPs that show strong associations with phenotype can be used, but this may require separate SNPs for each trait and each population. As an alternative, we propose to use a panel of evenly spaced low-density SNPs across the genome to estimate genome-assisted breeding values of selection candidates in pedigreed populations. The principle of this approach is to utilize cosegregation information from low-density SNPs to track effects of high-density SNP alleles within families. Simulations were used to analyze the loss of accuracy of estimated breeding values from using evenly spaced and selected SNP panels compared to using all high-density SNPs in a Bayesian analysis. Forward stepwise selection and a Bayesian approach were used to select SNPs. Loss of accuracy was nearly independent of the number of simulated quantitative trait loci (QTL) with evenly spaced SNPs, but increased with number of QTL for the selected SNP panels. Loss of accuracy with evenly spaced SNPs increased steadily over generations but was constant when the smaller number of individuals that are selected for breeding each generation were also genotyped using the high-density SNP panel. With equal numbers of low-density SNPs, panels with SNPs selected on the basis of the Bayesian approach had the smallest loss in accuracy for a single trait, but a panel with evenly spaced SNPs at 10 cM was only slightly worse, whereas a panel with SNPs selected by forward stepwise selection was inferior. 
Panels with evenly spaced SNPs can, however, be used across traits and populations and their performance is independent of the number of QTL affecting the trait and of the methods used to estimate effects in the training data and are, therefore, preferred for broad applications in pedigreed populations under artificial selection.
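
Choosing an evenly spaced low-density panel from a high-density map can be sketched as follows. This covers only the panel-selection step, not the cosegregation tracking or the Bayesian analysis described in the abstract. The function name `evenly_spaced_panel`, the nearest-SNP rule, and the example map positions are assumptions made for illustration; only the 10 cM spacing comes from the text.

```python
def evenly_spaced_panel(positions_cm, spacing_cm=10.0):
    """For one chromosome, pick the index of the SNP nearest each grid point
    0, spacing, 2*spacing, ... up to the last mapped position."""
    if not positions_cm:
        return []
    chosen = []
    target = 0.0
    last = max(positions_cm)
    while target <= last:
        # Index of the SNP closest to the current grid point (first one wins on ties).
        idx = min(range(len(positions_cm)),
                  key=lambda i: abs(positions_cm[i] - target))
        if idx not in chosen:  # avoid picking the same SNP for two grid points
            chosen.append(idx)
        target += spacing_cm
    return chosen

# Hypothetical genetic map positions (cM) of high-density SNPs on one chromosome.
positions = [0.0, 3.5, 9.0, 12.0, 21.0, 30.5, 38.0]
panel = evenly_spaced_panel(positions, spacing_cm=10.0)
print(panel)                          # [0, 2, 4, 5]
print([positions[i] for i in panel])  # [0.0, 9.0, 21.0, 30.5]
```

Because the rule depends only on map position, the same panel can be reused across traits and populations, which is the property the abstract highlights.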

              A hidden Markov model combining linkage and linkage disequilibrium information for haplotype reconstruction and quantitative trait locus fine mapping.

              Faithful reconstruction of haplotypes from diploid marker data (phasing) is important for many kinds of genetic analyses, including mapping of trait loci, prediction of genomic breeding values, and identification of signatures of selection. In human genetics, phasing most often exploits population information (linkage disequilibrium), while in animal genetics the primary source of information is familial (Mendelian segregation and linkage). We herein develop and evaluate a method that simultaneously exploits both sources of information. It builds on hidden Markov models that were initially developed to exploit population information only. We demonstrate that the approach improves the accuracy of allele phasing as well as imputation of missing genotypes. Reconstructed haplotypes are assigned to hidden states that are shown to correspond to clusters of genealogically related chromosomes. We show that these cluster states can directly be used to fine map QTL. The method is computationally effective at handling large data sets based on high-density SNP panels.

                Author and article information

                G3 (Bethesda)
                G3: Genes, Genomes, Genetics
                Genetics Society of America
                1 March 2017
                April 2017
                Volume: 7
                Issue: 4
                Pages: 1377-1383
                [* ]The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Midlothian EH25 9RG, United Kingdom
                []Hendrix Genetics Aquaculture BV/ Netherlands, Villa 'de Körver', Spoorstraat 69, 5831 CK Boxmeer, The Netherlands
                []Edinburgh Genomics, Ashworth Laboratories, University of Edinburgh, EH9 3JT, United Kingdom
                [§ ]Institute of Biodiversity, Animal Health & Comparative Medicine, University of Glasgow, G61 1QH, United Kingdom
                [** ]Institute of Aquaculture, University of Stirling, FK9 4LA, United Kingdom
                Author notes

                Present address: Benchmark Breeding and Genetics Ltd, Bush House, Edinburgh Technopole, Edinburgh EH26 0BB, UK.


                Present address: Department of Animal, Plant and Soil Sciences, La Trobe University, Agribio Building, 5 Ring Road, Bundoora, Victoria 3086, Australia.

                [3 ]Corresponding author: The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Easter Bush, Midlothian EH25 9RG, UK. E-mail: ross.houston@
                Copyright © 2017 Tsai et al.

                This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

                Page count
                Figures: 3, Tables: 2, Equations: 1, References: 47, Pages: 7
                Genomic Selection


                imputation, shared data resources, genpred, genomic selection, disease resistance, aquaculture

