
      Patterns of cross-contamination in a multispecies population genomic project: detection, quantification, impact, and solutions

      research-article


          Abstract

          Background

          Contamination is a well-known but often neglected problem in molecular biology. Here, we investigated the prevalence of cross-contamination among 446 samples from 116 distinct species of animals, which were processed in the same laboratory and subjected to subcontracted transcriptome sequencing.

          Results

          Using cytochrome oxidase 1 as a barcode, we identified a minimum of 782 events of between-species contamination, with approximately 80% of our samples being affected. An analysis of laboratory metadata revealed a strong effect of the sequencing center: nearly all the detected events of between-species contamination involved species that were sent the same day to the same company. We introduce new methods to address the amount of within-species, between-individual contamination, and to correct for this problem when calling genotypes from base read counts.

          Conclusions

          We report evidence for pervasive within-species contamination in this data set, and show that classical population genomic statistics, such as synonymous diversity, the ratio of non-synonymous to synonymous diversity, inbreeding coefficient F_IT, and Tajima’s D, are sensitive to this problem to various extents. Control analyses suggest that our published results are probably robust to the problem of contamination. Recommendations on how to prevent or avoid contamination in large-scale population genomics/molecular ecology are provided based on this analysis.

          Electronic supplementary material

          The online version of this article (doi:10.1186/s12915-017-0366-6) contains supplementary material, which is available to authorized users.

          Most cited references (38)

          Double indexing overcomes inaccuracies in multiplex sequencing on the Illumina platform

          Due to the increasing throughput of current DNA sequencing instruments, sample multiplexing is necessary for making economical use of available sequencing capacities. A widely used multiplexing strategy for the Illumina Genome Analyzer utilizes sample-specific indexes, which are embedded in one of the library adapters. However, this and similar multiplex approaches come with a risk of sample misidentification. By introducing indexes into both library adapters (double indexing), we have developed a method that reveals the rate of sample misidentification within current multiplex sequencing experiments. At ~0.3%, these rates are orders of magnitude higher than expected and may severely confound applications in cancer genomics and other fields requiring accurate detection of rare variants. We identified the occurrence of mixed clusters on the flow cell as the predominant source of error. The accuracy of sample identification is further impaired if indexed oligonucleotides are cross-contaminated or if indexed libraries are amplified in bulk. Double indexing eliminates these problems and increases both the scope and accuracy of multiplex sequencing on the Illumina platform.

            Comparative population genomics in animals uncovers the determinants of genetic diversity.

            Genetic diversity is the amount of variation observed between DNA sequences from distinct individuals of a given species. This pivotal concept of population genetics has implications for species health, domestication, management and conservation. Levels of genetic diversity seem to vary greatly in natural populations and species, but the determinants of this variation, and particularly the relative influences of species biology and ecology versus population history, are still largely mysterious. Here we show that the diversity of a species is predictable, and is determined in the first place by its ecological strategy. We investigated the genome-wide diversity of 76 non-model animal species by sequencing the transcriptome of two to ten individuals in each species. The distribution of genetic diversity between species revealed no detectable influence of geographic range or invasive status but was accurately predicted by key species traits related to parental investment: long-lived or low-fecundity species with brooding ability were genetically less diverse than short-lived or highly fecund ones. Our analysis demonstrates the influence of long-term life-history strategies on species response to short-term environmental perturbations, a result with immediate implications for conservation policies.

              Targets of balancing selection in the human genome.

              Balancing selection is potentially an important biological force for maintaining advantageous genetic diversity in populations, including variation that is responsible for long-term adaptation to the environment. By serving as a means to maintain genetic variation, it may be particularly relevant to maintaining phenotypic variation in natural populations. Nevertheless, its prevalence and specific targets in the human genome remain largely unknown. We have analyzed the patterns of diversity and divergence of 13,400 genes in two human populations using an unbiased single-nucleotide polymorphism data set, a genome-wide approach, and a method that incorporates demography in neutrality tests. We identified an unbiased catalog of genes with signatures of long-term balancing selection, which includes immunity genes as well as genes encoding keratins and membrane channels; the catalog also shows enrichment in functional categories involved in cellular structure. Patterns are mostly concordant in the two populations, with a small fraction of genes showing population-specific signatures of selection. Power considerations indicate that our findings represent a subset of all targets in the genome, suggesting that although balancing selection may not have an obvious impact on a large proportion of human genes, it is a key force affecting the evolution of a number of genes in humans.

                Author and article information

                Contributors
                nicolas.galtier@univ-montp2.fr
                Journal
                BMC Biol
                BMC Biology
                BioMed Central (London)
                ISSN: 1741-7007
                29 March 2017
                Volume: 15
                Article number: 25
                Affiliations
                [1] ISNI 0000 0001 2097 0141, GRID grid.121334.6, UMR5554 – Institute of Evolutionary Sciences, University Montpellier, CNRS, IRD, EPHE; Place Eugène Bataillon, CC64, 34095 Montpellier, France
                [2] ISNI 0000 0001 2203 0006, GRID grid.464101.6, UMR7144 - Adaptation et Diversité en Milieu Marin - CNRS, Université Pierre et Marie Curie, Station Biologique de Roscoff; 29680 Roscoff, France
                Article
                366
                10.1186/s12915-017-0366-6
                5370491
                28356154
                bd4a10da-e921-4f94-8bd2-4df28fb8ecd2
                © Galtier et al. 2017

                Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

                History
                Received: 6 December 2016
                Accepted: 13 March 2017
                Funding
                Funded by: FundRef http://dx.doi.org/10.13039/501100000781, European Research Council
                Award ID: 23971
                Funded by: FundRef http://dx.doi.org/10.13039/501100001665, Agence Nationale de la Recherche
                Award ID: ANR-10-BINF-01-01
                Award ID: ANR-15-CE12-0010
                Funded by: Swiss National Foundation
                Award ID: CRSII3_160723
                Categories
                Methodology Article
                Custom metadata
                © The Author(s) 2017

                Life sciences
                rnaseq, transcriptome, animals, snp calling, genotyping, within-species
