3
views
0
recommends
+1 Recommend
0 collections
    0
    shares
      • Record: found
      • Abstract: found
      • Article: found
      Is Open Access

      Unique k-mers as Strain-Specific Barcodes for Phylogenetic Analysis and Natural Microbiome Profiling

      research-article

      Read this article at

      Bookmark
          There is no author summary for this article yet. Authors can add summaries to their articles on ScienceOpen to make them more accessible to a non-specialist audience.

          Abstract

          The need for a comparative analysis of natural metagenomes stimulated the development of new methods for their taxonomic profiling. Alignment-free approaches based on the search for marker k-mers turned out to be capable of identifying not only species, but also strains of microorganisms with known genomes. Here, we evaluated the ability of genus-specific k-mers to distinguish eight phylogroups of Escherichia coli (A, B1, C, E, D, F, G, B2) and assessed the presence of their unique 22-mers in clinical samples from microbiomes of four healthy people and four patients with Crohn’s disease. We found that a phylogenetic tree inferred from the pairwise distance matrix for unique 18-mers and 22-mers of 124 genomes was fully consistent with the topology of the tree, obtained with concatenated aligned sequences of orthologous genes. Therefore, we propose strain-specific “barcodes” for rapid phylotyping. Using unique 22-mers for taxonomic analysis, we detected microbes of all groups in human microbiomes; however, their presence in the five samples was significantly different. Pointing to the intraspecies heterogeneity of E. coli in the natural microflora, this also indicates the feasibility of further studies of the role of this heterogeneity in maintaining population homeostasis.

          Related collections

          Most cited references53

          • Record: found
          • Abstract: found
          • Article: not found

          Comparative Analysis of Human Gut Microbiota by Barcoded Pyrosequencing

          Humans host complex microbial communities believed to contribute to health maintenance and, when in imbalance, to the development of diseases. Determining the microbial composition in patients and healthy controls may thus provide novel therapeutic targets. For this purpose, high-throughput, cost-effective methods for microbiota characterization are needed. We have employed 454-pyrosequencing of a hyper-variable region of the 16S rRNA gene in combination with sample-specific barcode sequences which enables parallel in-depth analysis of hundreds of samples with limited sample processing. In silico modeling demonstrated that the method correctly describes microbial communities down to phylotypes below the genus level. Here we applied the technique to analyze microbial communities in throat, stomach and fecal samples. Our results demonstrate the applicability of barcoded pyrosequencing as a high-throughput method for comparative microbial ecology.
            Bookmark
            • Record: found
            • Abstract: found
            • Article: not found

            Quake: quality-aware detection and correction of sequencing errors

            We introduce Quake, a program to detect and correct errors in DNA sequencing reads. Using a maximum likelihood approach incorporating quality values and nucleotide specific miscall rates, Quake achieves the highest accuracy on realistically simulated reads. We further demonstrate substantial improvements in de novo assembly and SNP detection after using Quake. Quake can be used for any size project, including more than one billion human reads, and is freely available as open source software from http://www.cbcb.umd.edu/software/quake.
              Bookmark
              • Record: found
              • Abstract: found
              • Article: found
              Is Open Access

              KAT: a K-mer analysis toolkit to quality control NGS datasets and genome assemblies

              Motivation: De novo assembly of whole genome shotgun (WGS) next-generation sequencing (NGS) data benefits from high-quality input with high coverage. However, in practice, determining the quality and quantity of useful reads quickly and in a reference-free manner is not trivial. Gaining a better understanding of the WGS data, and how that data is utilized by assemblers, provides useful insights that can inform the assembly process and result in better assemblies. Results: We present the K-mer Analysis Toolkit (KAT): a multi-purpose software toolkit for reference-free quality control (QC) of WGS reads and de novo genome assemblies, primarily via their k-mer frequencies and GC composition. KAT enables users to assess levels of errors, bias and contamination at various stages of the assembly process. In this paper we highlight KAT’s ability to provide valuable insights into assembly composition and quality of genome assemblies through pairwise comparison of k-mers present in both input reads and the assemblies. Availability and Implementation: KAT is available under the GPLv3 license at: https://github.com/TGAC/KAT. Contact: bernardo.clavijo@earlham.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online.
                Bookmark

                Author and article information

                Journal
                Int J Mol Sci
                Int J Mol Sci
                ijms
                International Journal of Molecular Sciences
                MDPI
                1422-0067
                31 January 2020
                February 2020
                : 21
                : 3
                : 944
                Affiliations
                [1 ]Institute of Mathematical Problems of Biology RAS—the Branch of Keldysh Institute of Applied Mathematics of Russian Academy of Sciences, 142290 Pushchino, Russia; panyukov@ 123456itaec.ru
                [2 ]Structural and Functional Genomics Group, Federal Research Center “Pushchino Scientific Center for Biological Research of the Russian Academy of Sciences”, 142290 Pushchino, Russia; anthyllium@ 123456gmail.com
                [3 ]Institute of Cell Biophysics of the Russian Academy of Sciences, 142290 Pushchino, Russia
                Author notes
                [* ]Correspondence: ozoline@ 123456rambler.ru
                Article
                ijms-21-00944
                10.3390/ijms21030944
                7037511
                32023871
                22614d4b-e1ed-4d36-a506-7da08e542522
                © 2020 by the authors.

                Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license ( http://creativecommons.org/licenses/by/4.0/).

                History
                : 29 November 2019
                : 28 January 2020
                Categories
                Article

                Molecular biology
                bacterial genomes,genome barcodes,k-mers,alignment-free algorithms,phylogenetic trees,metagenomes,taxonomic profiling,phylotyping,human microbiome

                Comments

                Comment on this article