      Is Open Access

      Two New Computational Methods for Universal DNA Barcoding: A Benchmark Using Barcode Sequences of Bacteria, Archaea, Animals, Fungi, and Land Plants

      research-article
      PLoS ONE
      Public Library of Science


          Abstract

          Taxonomic identification of biological specimens based on DNA sequence information (a.k.a. DNA barcoding) is becoming increasingly common in biodiversity science. Although several methods have been proposed, many of them are not universally applicable due to the need for prerequisite phylogenetic/machine-learning analyses, the need for huge computational resources, or the lack of a firm theoretical background. Here, we propose two new computational methods of DNA barcoding and show a benchmark for bacterial/archaeal 16S, animal COX1, fungal internal transcribed spacer, and three plant chloroplast (rbcL, matK, and trnH-psbA) barcode loci that can be used to compare the performance of existing and new methods. The benchmark was performed under two alternative situations: query sequences were available in the corresponding reference sequence databases in one, but were not available in the other. In the former situation, the commonly used “1-nearest-neighbor” (1-NN) method, which assigns the taxonomic information of the most similar sequence in a reference database (i.e., the BLAST-top-hit reference sequence) to a query, displays the highest rate and highest precision of successful taxonomic identification. However, in the latter situation, the 1-NN method produced extremely high rates of misidentification for all the barcode loci examined. In contrast, one of our new methods, the query-centric auto-k-nearest-neighbor (QCauto) method, consistently produced low rates of misidentification for all the loci examined in both situations. These results indicate that the 1-NN method is most suitable if the reference sequences of all potentially observable species are available in databases; otherwise, the QCauto method returns the most reliable identification results. The benchmark results also indicated that the taxon coverage of reference sequences is far from complete for genus or species level identification in all the barcode loci examined. Therefore, we need to accelerate the registration of reference barcode sequences to apply high-throughput DNA barcoding to genus or species level identification in biodiversity research.
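          The behaviour of the 1-NN method in the two benchmark situations can be illustrated with a toy sketch. This is not the authors' implementation: the article describes 1-NN as taking the BLAST top hit, whereas the sketch below substitutes a simple per-position identity over pre-aligned, equal-length sequences. The function names (`identity`, `assign_1nn`), the taxon labels, and the sequences are all hypothetical. The QCauto method is not sketched because the abstract does not describe its procedure.

```python
from typing import Dict, Tuple


def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two pre-aligned, equal-length sequences.

    Assumption: a stand-in for BLAST similarity, used only for illustration.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def assign_1nn(query: str, references: Dict[str, str]) -> Tuple[str, float]:
    """Return the taxon of the single most similar reference and its identity."""
    best_taxon, best_seq = max(
        references.items(), key=lambda item: identity(query, item[1])
    )
    return best_taxon, identity(query, best_seq)


# Hypothetical reference database: taxon label -> barcode sequence.
references = {
    "Genus_a species_1": "ACGTACGTAC",
    "Genus_a species_2": "ACGTACGTTT",
    "Genus_b species_3": "TTGTACCTAC",
}

# Situation 1: the query sequence is present in the reference database.
in_db = assign_1nn("ACGTACGTAC", references)

# Situation 2: the query comes from a species with no reference sequence.
# 1-NN still returns its nearest neighbour, i.e. a species-level label that
# cannot be correct, because it has no way to decline an assignment.
not_in_db = assign_1nn("ACGAACGAAC", references)

print(in_db)
print(not_in_db)
```

          In the second call the best match is only a partial match, yet a species name is still returned. This is the failure mode the abstract reports for 1-NN when reference coverage is incomplete.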


                Author and article information

                Contributors
                Role: Editor
                Journal
                PLoS ONE
                Publisher: Public Library of Science (San Francisco, USA)
                ISSN: 1932-6203
                Published: 18 October 2013
                Volume: 8
                Issue: 10
                Article: e76910
                Affiliations
                [1] Graduate School of Global Environmental Studies, Kyoto University, Kyoto, Kyoto, Japan
                [2] Research Center for Aquatic Genomics, National Research Institute of Fisheries Science, Fisheries Research Agency, Yokohama, Kanagawa, Japan
                [3] Graduate School of Human and Environmental Studies, Kyoto University, Kyoto, Kyoto, Japan
                Consiglio Nazionale delle Ricerche (CNR), Italy
                Author notes

                Competing Interests: The authors have declared that no competing interests exist.

                Conceived and designed the experiments: AST HT. Performed the experiments: AST. Analyzed the data: AST. Wrote the paper: AST HT. Developed and implemented the new methods: AST.

                Article
                Manuscript ID: PONE-D-13-21549
                DOI: 10.1371/journal.pone.0076910
                PMCID: PMC3799923
                PMID: 24204702
                662ce314-5b22-4c23-b029-6e5be906d3aa
                Copyright © 2013

                This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

                History
                Received: 27 May 2013
                Accepted: 25 August 2013
                Page count
                Pages: 11
                Funding
                This work was supported by a grant-in-aid from the Funding Program for Next Generation World-Leading Researchers (GS014) of the Japan Society for the Promotion of Science. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
                Categories
                Research Article

