Blog
About

9
views
0
recommends
+1 Recommend
0 collections
    0
    shares
      • Record: found
      • Abstract: found
      • Article: found
      Is Open Access

      Analysis of sequencing strategies and tools for taxonomic annotation: Defining standards for progressive metagenomics

      Read this article at

      Bookmark
          There is no author summary for this article yet. Authors can add summaries to their articles on ScienceOpen to make them more accessible to a non-specialist audience.

          Abstract

          Metagenomics research has recently thrived due to DNA sequencing technologies improvement, driving the emergence of new analysis tools and the growth of taxonomic databases. However, there is no all-purpose strategy that can guarantee the best result for a given project and there are several combinations of software, parameters and databases that can be tested. Therefore, we performed an impartial comparison, using statistical measures of classification for eight bioinformatic tools and four taxonomic databases, defining a benchmark framework to evaluate each tool in a standardized context. Using in silico simulated data for 16S rRNA amplicons and whole metagenome shotgun data, we compared the results from different software and database combinations to detect biases related to algorithms or database annotation. Using our benchmark framework, researchers can define cut-off values to evaluate the expected error rate and coverage for their results, regardless the score used by each software. A quick guide to select the best tool, all datasets and scripts to reproduce our results and benchmark any new method are available at https://github.com/Ales-ibt/Metagenomic-benchmark. Finally, we stress out the importance of gold standards, database curation and manual inspection of taxonomic profiling results, for a better and more accurate microbial diversity description.

          Related collections

          Most cited references 21

          • Record: found
          • Abstract: found
          • Article: not found

          Comparison of the predicted and observed secondary structure of T4 phage lysozyme.

          Predictions of the secondary structure of T4 phage lysozyme, made by a number of investigators on the basis of the amino acid sequence, are compared with the structure of the protein determined experimentally by X-ray crystallography. Within the amino terminal half of the molecule the locations of helices predicted by a number of methods agree moderately well with the observed structure, however within the carboxyl half of the molecule the overall agreement is poor. For eleven different helix predictions, the coefficients giving the correlation between prediction and observation range from 0.14 to 0.42. The accuracy of the predictions for both beta-sheet regions and for turns are generally lower than for the helices, and in a number of instances the agreement between prediction and observation is no better than would be expected for a random selection of residues. The structural predictions for T4 phage lysozyme are much less successful than was the case for adenylate kinase (Schulz et al. (1974) Nature 250, 140-142). No one method of prediction is clearly superior to all others, and although empirical predictions based on larger numbers of known protein structure tend to be more accurate than those based on a limited sample, the improvement in accuracy is not dramatic, suggesting that the accuracy of current empirical predictive methods will not be substantially increased simply by the inclusion of more data from additional protein structure determinations.
            Bookmark
            • Record: found
            • Abstract: found
            • Article: found
            Is Open Access

            ETE 3: Reconstruction, Analysis, and Visualization of Phylogenomic Data

            The Environment for Tree Exploration (ETE) is a computational framework that simplifies the reconstruction, analysis, and visualization of phylogenetic trees and multiple sequence alignments. Here, we present ETE v3, featuring numerous improvements in the underlying library of methods, and providing a novel set of standalone tools to perform common tasks in comparative genomics and phylogenetics. The new features include (i) building gene-based and supermatrix-based phylogenies using a single command, (ii) testing and visualizing evolutionary models, (iii) calculating distances between trees of different size or including duplications, and (iv) providing seamless integration with the NCBI taxonomy database. ETE is freely available at http://etetoolkit.org
              Bookmark
              • Record: found
              • Abstract: found
              • Article: found
              Is Open Access

              The truth about metagenomics: quantifying and counteracting bias in 16S rRNA studies

              Background Characterizing microbial communities via next-generation sequencing is subject to a number of pitfalls involving sample processing. The observed community composition can be a severe distortion of the quantities of bacteria actually present in the microbiome, hampering analysis and threatening the validity of conclusions from metagenomic studies. We introduce an experimental protocol using mock communities for quantifying and characterizing bias introduced in the sample processing pipeline. We used 80 bacterial mock communities comprised of prescribed proportions of cells from seven vaginally-relevant bacterial strains to assess the bias introduced in the sample processing pipeline. We created two additional sets of 80 mock communities by mixing prescribed quantities of DNA and PCR product to quantify the relative contribution to bias of (1) DNA extraction, (2) PCR amplification, and (3) sequencing and taxonomic classification for particular choices of protocols for each step. We developed models to predict the “true” composition of environmental samples based on the observed proportions, and applied them to a set of clinical vaginal samples from a single subject during four visits. Results We observed that using different DNA extraction kits can produce dramatically different results but bias is introduced regardless of the choice of kit. We observed error rates from bias of over 85% in some samples, while technical variation was very low at less than 5% for most bacteria. The effects of DNA extraction and PCR amplification for our protocols were much larger than those due to sequencing and classification. The processing steps affected different bacteria in different ways, resulting in amplified and suppressed observed proportions of a community. When predictive models were applied to clinical samples from a subject, the predicted microbiome profiles were better reflections of the physiology and diagnosis of the subject at the visits than the observed community compositions. Conclusions Bias in 16S studies due to DNA extraction and PCR amplification will continue to require attention despite further advances in sequencing technology. Analysis of mock communities can help assess bias and facilitate the interpretation of results from environmental samples. Electronic supplementary material The online version of this article (doi:10.1186/s12866-015-0351-6) contains supplementary material, which is available to authorized users.
                Bookmark

                Author and article information

                Contributors
                alexsf@ibt.unam.mx
                Journal
                Sci Rep
                Sci Rep
                Scientific Reports
                Nature Publishing Group UK (London )
                2045-2322
                13 August 2018
                13 August 2018
                2018
                : 8
                Affiliations
                [1 ]ISNI 0000 0001 2159 0001, GRID grid.9486.3, Consorcio de Investigación del Golfo de México (CIGOM), , Instituto de Biotecnología, Universidad Nacional Autónoma de México, ; Cuernvaca, Mexico
                [2 ]ISNI 0000 0001 2159 0001, GRID grid.9486.3, Instituto de Biotecnología, , Universidad Nacional Autónoma de México, ; Cuernvaca, Mexico
                [3 ]ISNI 0000 0000 9071 1447, GRID grid.462226.6, Departamento de Innovación Biomédica, , CICESE. Carretera Ensenada-Tijuana 3918, Zona Playitas, ; Ensenada, BC Mexico
                Article
                30515
                10.1038/s41598-018-30515-5
                6089906
                30104688
                © The Author(s) 2018

                Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

                Funding
                Funded by: Salary funded by the National Council of Science and Technology of Mexico - Mexican Ministry of Energy - Hydrocarbon Trust, project 201441. This is a contribution of the Gulf of Mexico Research Consortium (CIGoM).
                Categories
                Article
                Custom metadata
                © The Author(s) 2018

                Uncategorized

                Comments

                Comment on this article