38
views
0
recommends
+1 Recommend
0 collections
    0
    shares
      • Record: found
      • Abstract: found
      • Article: found
      Is Open Access

      Impact of library preparation protocols and template quantity on the metagenomic reconstruction of a mock microbial community

      research-article

      Read this article at

      Bookmark
          There is no author summary for this article yet. Authors can add summaries to their articles on ScienceOpen to make them more accessible to a non-specialist audience.

          Abstract

          Background

          The rapid development of sequencing technologies has provided access to environments that were either once thought inhospitable to life altogether or that contain too few cells to be analyzed using genomics approaches. While 16S rRNA gene microbial community sequencing has revolutionized our understanding of community composition and diversity over time and space, it only provides a crude estimate of microbial functional and metabolic potential. Alternatively, shotgun metagenomics allows comprehensive sampling of all genetic material in an environment, without any underlying primer biases. Until recently, one of the major bottlenecks of shotgun metagenomics has been the requirement for large initial DNA template quantities during library preparation.

          Results

          Here, we investigate the effects of varying template concentrations across three low biomass library preparation protocols on their ability to accurately reconstruct a mock microbial community of known composition. We analyze the effects of input DNA quantity and library preparation method on library insert size, GC content, community composition, assembly quality and metagenomic binning. We found that library preparation method and the amount of starting material had significant impacts on the mock community metagenomes. In particular, GC content shifted towards more GC rich sequences at the lower input quantities regardless of library prep method, the number of low quality reads that could not be mapped to the reference genomes increased with decreasing input quantities, and the different library preparation methods had an impact on overall metagenomic community composition.

          Conclusions

          This benchmark study provides recommendations for library creation of representative and minimally biased metagenome shotgun sequencing, enabling insights into functional attributes of low biomass ecosystem microbial communities.

          Electronic supplementary material

          The online version of this article (doi:10.1186/s12864-015-2063-6) contains supplementary material, which is available to authorized users.

          Related collections

          Most cited references25

          • Record: found
          • Abstract: found
          • Article: found
          Is Open Access

          Evaluation of genomic high-throughput sequencing data generated on Illumina HiSeq and Genome Analyzer systems

          Background The generation and analysis of high-throughput sequencing data are becoming a major component of many studies in molecular biology and medical research. Illumina's Genome Analyzer (GA) and HiSeq instruments are currently the most widely used sequencing devices. Here, we comprehensively evaluate properties of genomic HiSeq and GAIIx data derived from two plant genomes and one virus, with read lengths of 95 to 150 bases. Results We provide quantifications and evidence for GC bias, error rates, error sequence context, effects of quality filtering, and the reliability of quality values. By combining different filtering criteria we reduced error rates 7-fold at the expense of discarding 12.5% of alignable bases. While overall error rates are low in HiSeq data we observed regions of accumulated wrong base calls. Only 3% of all error positions accounted for 24.7% of all substitution errors. Analyzing the forward and reverse strands separately revealed error rates of up to 18.7%. Insertions and deletions occurred at very low rates on average but increased to up to 2% in homopolymers. A positive correlation between read coverage and GC content was found depending on the GC content range. Conclusions The errors and biases we report have implications for the use and the interpretation of Illumina sequencing data. GAIIx and HiSeq data sets show slightly different error profiles. Quality filtering is essential to minimize downstream analysis artifacts. Supporting previous recommendations, the strand-specificity provides a criterion to distinguish sequencing errors from low abundance polymorphisms.
            Bookmark
            • Record: found
            • Abstract: found
            • Article: found
            Is Open Access

            Metagenomics - a guide from sampling to data analysis

            Metagenomics applies a suite of genomic technologies and bioinformatics tools to directly access the genetic content of entire communities of organisms. The field of metagenomics has been responsible for substantial advances in microbial ecology, evolution, and diversity over the past 5 to 10 years, and many research laboratories are actively engaged in it now. With the growing numbers of activities also comes a plethora of methodological knowledge and expertise that should guide future developments in the field. This review summarizes the current opinions in metagenomics, and provides practical guidance and advice on sample processing, sequencing technology, assembly, binning, annotation, experimental design, statistical analysis, data storage, and data sharing. As more metagenomic datasets are generated, the availability of standardized procedures and shared data storage and analysis becomes increasingly important to ensure that output of individual projects can be assessed and compared.
              Bookmark
              • Record: found
              • Abstract: found
              • Article: found
              Is Open Access

              Ray Meta: scalable de novo metagenome assembly and profiling

              Voluminous parallel sequencing datasets, especially metagenomic experiments, require distributed computing for de novo assembly and taxonomic profiling. Ray Meta is a massively distributed metagenome assembler that is coupled with Ray Communities, which profiles microbiomes based on uniquely-colored k-mers. It can accurately assemble and profile a three billion read metagenomic experiment representing 1,000 bacterial genomes of uneven proportions in 15 hours with 1,024 processor cores, using only 1.5 GB per core. The software will facilitate the processing of large and complex datasets, and will help in generating biological insights for specific environments. Ray Meta is open source and available at http://denovoassembler.sf.net.
                Bookmark

                Author and article information

                Contributors
                rmbowers@lbl.gov
                aclum@lbl.gov
                hundley5@llnl.gov
                jhlim@lbl.gov
                ksingh@lbl.gov
                dgciobanu@lbl.gov
                cyngan@lbl.gov
                jfcheng@lbl.gov
                sgtringe@lbl.gov
                925-296-5939 , twoyke@lbl.gov
                Journal
                BMC Genomics
                BMC Genomics
                BMC Genomics
                BioMed Central (London )
                1471-2164
                24 October 2015
                24 October 2015
                2015
                : 16
                : 856
                Affiliations
                Microbial Genomics Program Lead, DOE Joint Genome Institute, 2800 Mitchell Dr, Walnut Creek, CA USA
                Article
                2063
                10.1186/s12864-015-2063-6
                4619416
                26496746
                0954f8c0-6e65-458c-83cb-c9faf3b61111
                © Bowers et al. 2015

                Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License ( http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

                History
                : 13 July 2015
                : 13 October 2015
                Categories
                Methodology Article
                Custom metadata
                © The Author(s) 2015

                Genetics
                low input,low biomass,metagenome,library preparation protocol,microbiome
                Genetics
                low input, low biomass, metagenome, library preparation protocol, microbiome

                Comments

                Comment on this article