
      Swarm: robust and fast clustering method for amplicon-based studies

      research-article


          Abstract

          Popular de novo amplicon clustering methods suffer from two fundamental flaws: arbitrary global clustering thresholds, and input-order dependency induced by centroid selection. Swarm was developed to address these issues by first clustering nearly identical amplicons iteratively using a local threshold, and then by using clusters’ internal structure and amplicon abundances to refine its results. This fast, scalable, and input-order independent approach reduces the influence of clustering parameters and produces robust operational taxonomic units.
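The first phase described in the abstract (iteratively clustering nearly identical amplicons with a local threshold) can be illustrated with a minimal Python sketch. This is not the Swarm implementation: the function names `edit_distance` and `cluster_local`, the parameter `d`, and the toy data are assumptions made for illustration. The quadratic all-against-all comparison is only suitable for tiny inputs, and the refinement phase based on internal structure and abundances is not shown.

```python
from collections import deque


def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance (substitutions, insertions, deletions)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                       # deletion
                current[j - 1] + 1,                    # insertion
                previous[j - 1] + (char_a != char_b),  # match or substitution
            ))
        previous = current
    return previous[-1]


def cluster_local(abundances: dict, d: int = 1) -> list:
    """Grow clusters by repeatedly linking amplicons within d differences.

    `abundances` maps each amplicon sequence to its abundance (hypothetical
    input format chosen for this sketch).
    """
    # Sorting by (abundance descending, sequence) removes any dependence
    # on the order in which amplicons were supplied.
    pool = sorted(abundances, key=lambda s: (-abundances[s], s))
    unassigned = set(pool)
    clusters = []
    for seed in pool:
        if seed not in unassigned:
            continue  # already absorbed by an earlier cluster
        unassigned.remove(seed)
        cluster, queue = [seed], deque([seed])
        while queue:
            current = queue.popleft()
            # The threshold d is applied locally, to each member in turn,
            # rather than globally around a single centroid.
            neighbours = sorted(
                (s for s in unassigned if edit_distance(current, s) <= d),
                key=lambda s: (-abundances[s], s),
            )
            for s in neighbours:
                unassigned.remove(s)
                cluster.append(s)
                queue.append(s)
        clusters.append(cluster)
    return clusters


amplicons = {"ACGT": 10, "ACGA": 4, "ACAA": 2, "TTTT": 5, "TTTA": 1}
print(cluster_local(amplicons, d=1))
# [['ACGT', 'ACGA', 'ACAA'], ['TTTT', 'TTTA']]
```

In this toy data, `ACGT` and `ACAA` differ at two positions yet end up in the same cluster because `ACGA` links them. That chaining is what a local threshold allows, in contrast to a global threshold measured from one centroid.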


                Author and article information

                Contributors
                Journal
                PeerJ
                PeerJ Inc. (San Francisco, USA)
                ISSN: 2167-8359
                Published: 25 September 2014
                Volume: 2
                Article: e593
                Affiliations
                [1] CNRS, UMR 7144, EPEP – Évolution des Protistes et des Écosystèmes Pélagiques, Station Biologique de Roscoff, Roscoff, France
                [2] Sorbonne Universités, UPMC Univ Paris 06, UMR 7144, Station Biologique de Roscoff, Roscoff, France
                [3] Department of Ecology, University of Kaiserslautern, Kaiserslautern, Germany
                [4] Department of Microbiology, Oslo University Hospital, Rikshospitalet, Oslo, Norway
                [5] Department of Informatics, University of Oslo, Oslo, Norway
                [6] School of Engineering, University of Glasgow, Glasgow, UK
                Article
                593
                DOI: 10.7717/peerj.593
                PMCID: PMC4178461
                PMID: 25276506
                © 2014 Mahé et al.

                This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.

                History
                Received: 13 May 2014
                Accepted: 3 September 2014
                Funding
                Funded by: EU EraNet BiodivErsA program BioMarKs
                Award ID: 2008-6530
                Funded by: French government “Investissements d’Avenir” project OCEANOMICS
                Award ID: ANR-11-BTBR-0008
                Funded by: Deutsche Forschungsgemeinschaft
                Award ID: DU1319/1-1
                Funded by: EPSRC Career Acceleration Fellowship
                Award ID: EP/H003851/1
                FM and CdeV were supported by the EU EraNet BiodivErsA program BioMarKs (grant #2008-6530), the French government “Investissements d’Avenir” project OCEANOMICS (ANR-11-BTBR-0008), and the EU FP7 program MicroB3 (contract number 287589). FM and MD were supported by the Deutsche Forschungsgemeinschaft (grant #DU1319/1-1). TR was supported by a Centre of Excellence grant from the Research Council of Norway to CMBN. CQ is funded by an EPSRC Career Acceleration Fellowship (EP/H003851/1). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
                Categories
                Biodiversity
                Bioinformatics
                Ecology
                Microbiology
                Molecular Biology

                environmental diversity, barcoding, molecular operational taxonomic units
