      Natrix: a Snakemake-based workflow for processing, clustering, and taxonomically assigning amplicon sequencing reads

      research-article

          Abstract

          Background

          Sequencing of marker genes amplified from environmental samples, known as amplicon sequencing, allows us to resolve some of the hidden diversity and elucidate evolutionary relationships and ecological processes among complex microbial communities. The analysis of large numbers of samples at high sequencing depths generated by high throughput sequencing technologies requires efficient, flexible, and reproducible bioinformatics pipelines. Only a few existing workflows can be run in a user-friendly, scalable, and reproducible manner on different computing devices using an efficient workflow management system.

          Results

          We present Natrix, an open-source bioinformatics workflow for preprocessing raw amplicon sequencing data. The workflow contains all analysis steps, from quality assessment, read assembly, dereplication, chimera detection, split-sample merging, and sequence representative assignment (OTUs or ASVs) to the taxonomic assignment of sequence representatives. The workflow is written using Snakemake, a workflow management engine for developing data analysis workflows, which ensures reproducibility, while Conda provides version control of the utilized programs. The encapsulation of rules and their dependencies supports hassle-free sharing of rules between workflows and easy adaptation and extension of existing workflows. Natrix is freely available on GitHub (https://github.com/MW55/Natrix) or as a Docker container on DockerHub (https://hub.docker.com/r/mw55/natrix).
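
To make one of the listed steps concrete, the following is a minimal Python sketch of dereplication: identical reads are collapsed into unique sequences with abundance counts. It is an illustration only and not the implementation used in Natrix; the function name `dereplicate` and the example reads are hypothetical.

```python
from collections import Counter


def dereplicate(reads):
    """Collapse identical reads into (sequence, abundance) pairs.

    Hypothetical helper for illustration; not taken from Natrix.
    """
    # Normalise case so that "acgt" and "ACGT" count as the same sequence.
    counts = Counter(seq.upper() for seq in reads)
    # Most abundant first; ties are broken alphabetically for determinism.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Made-up example reads.
reads = ["ACGT", "acgt", "TTGA", "ACGT", "GGCC", "TTGA"]
for sequence, abundance in dereplicate(reads):
    print(sequence, abundance)
```

Sorting by decreasing abundance is a common convention, since downstream steps such as chimera detection and clustering often treat the most abundant sequences as the most reliable.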

          Conclusion

          Natrix is a user-friendly and highly extensible workflow for processing Illumina amplicon data.

                Author and article information

                Contributors
                daniela.beisser@uni-due.de
                Journal
                BMC Bioinformatics
                BioMed Central (London)
                ISSN: 1471-2105
                Published: 16 November 2020
                Volume 21, article 526
                Affiliations
                [1] Department of Mathematics and Computer Science, University of Marburg, Marburg, Germany (GRID grid.10253.35, ISNI 0000 0004 1936 9756)
                [2] Department of Bioinformatics and Computational Biophysics, University of Duisburg-Essen, Essen, Germany (GRID grid.5718.b, ISNI 0000 0001 2187 5445)
                [3] Department of Biodiversity, University of Duisburg-Essen, Essen, Germany (GRID grid.5718.b, ISNI 0000 0001 2187 5445)
                Author information
                http://orcid.org/0000-0002-0679-6631
                Article
                3852
                DOI: 10.1186/s12859-020-03852-4
                PMC ID: 7667751
                PMID: 33198651
                c8d2c669-0799-4749-b238-0eeee03d57f2
                © The Author(s) 2020

                Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

                History
                Received: 29 May 2020
                Accepted: 30 October 2020
                Funding
                Funded by: Deutsche Forschungsgemeinschaft (FundRef http://dx.doi.org/10.13039/501100001659)
                Award ID: BO 3245/19-1
                Funded by: Hessisches Ministerium für Wissenschaft und Kunst (FundRef http://dx.doi.org/10.13039/501100003495)
                Award ID: HMWK LOEWE, MOSLA research cluster
                Funded by: Projekt DEAL
                Categories
                Software
                Custom metadata
                Bioinformatics & Computational biology
                amplicon sequencing, operational taxonomic units, amplicon sequence variants, Snakemake, pipeline, Illumina
