      Is Open Access

      The EnteroBase user's guide, with case studies on Salmonella transmissions, Yersinia pestis phylogeny, and Escherichia core genomic diversity




          EnteroBase is an integrated software environment that supports the identification of global population structures within several bacterial genera that include pathogens. Here, we provide an overview of how EnteroBase works, what it can do, and its future prospects. EnteroBase has currently assembled more than 300,000 genomes from Illumina short reads from Salmonella, Escherichia, Yersinia, Clostridioides, Helicobacter, Vibrio, and Moraxella and genotyped those assemblies by core genome multilocus sequence typing (cgMLST). Hierarchical clustering of cgMLST sequence types allows mapping a new bacterial strain to predefined population structures at multiple levels of resolution within a few hours after uploading its short reads. Case Study 1 illustrates this process for local transmissions of Salmonella enterica serovar Agama between neighboring social groups of badgers and humans. EnteroBase also supports single nucleotide polymorphism (SNP) calls from both genomic assemblies and after extraction from metagenomic sequences, as illustrated by Case Study 2 which summarizes the microevolution of Yersinia pestis over the last 5000 years of pandemic plague. EnteroBase can also provide a global overview of the genomic diversity within an entire genus, as illustrated by Case Study 3, which presents a novel, global overview of the population structure of all of the species, subspecies, and clades within Escherichia.
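The idea of mapping strains to clusters at multiple levels of resolution can be illustrated with a minimal sketch. It assumes that each strain is represented by a cgMLST allelic profile (one allele number per locus), that the distance between two strains is the number of loci at which their alleles differ, and that clusters are formed by single linkage at several distance thresholds. The strain names, profiles, thresholds, and function names below are hypothetical and chosen for illustration only; this is not EnteroBase's own implementation.

```python
from itertools import combinations


def allelic_distance(a, b):
    """Count the loci at which two allelic profiles differ."""
    return sum(x != y for x, y in zip(a, b))


def single_linkage_clusters(profiles, threshold):
    """Group strains linked by chains of pairwise distances <= threshold."""
    names = list(profiles)
    parent = {n: n for n in names}  # union-find forest, one tree per cluster

    def find(n):
        # Walk up to the cluster representative, compressing the path.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in combinations(names, 2):
        if allelic_distance(profiles[a], profiles[b]) <= threshold:
            parent[find(a)] = find(b)  # merge the two clusters

    clusters = {}
    for n in names:
        clusters.setdefault(find(n), []).append(n)
    return sorted(sorted(members) for members in clusters.values())


def assign_levels(profiles, thresholds=(0, 1, 2)):
    """Cluster the same strains at each threshold (hypothetical levels)."""
    return {t: single_linkage_clusters(profiles, t) for t in thresholds}


# Toy allelic profiles over six loci (illustrative values, not real data).
profiles = {
    "strain_A": (1, 1, 1, 1, 1, 1),
    "strain_B": (1, 1, 1, 1, 1, 2),  # differs from strain_A at 1 locus
    "strain_C": (1, 1, 1, 1, 3, 3),  # differs from strain_A and strain_B at 2 loci
    "strain_D": (4, 4, 4, 4, 4, 4),  # differs from all others at every locus
}

if __name__ == "__main__":
    for threshold, clusters in assign_levels(profiles).items():
        print(threshold, clusters)
```

With tighter thresholds the strains fall into many small clusters, and with looser thresholds the same strains merge into fewer, broader ones. A newly added profile only needs to be compared against existing profiles to find its cluster at each level.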



                Author and article information

                Genome Res
                Genome Research
                Cold Spring Harbor Laboratory Press
                January 2020
                Volume: 30
                Issue: 1
                Pages: 138-152
                Warwick Medical School, University of Warwick, Coventry CV4 7AL, United Kingdom;
                [3] Scottish Salmonella Reference Laboratory, Glasgow G31 2ER, UK;
                [4] Public Health England (PHE), Colindale, London NW9 5EQ, UK;
                [5] National Wildlife Management Centre, APHA, Sand Hutton, York YO41 1LZ, UK;
                [6] Austrian Agency for Health and Food Safety (AGES), Institute for Medical Microbiology and Hygiene, 8010 Graz, Austria;
                [7] German Federal Institute for Risk Assessment, D-10589 Berlin, Germany (Study Centre for Genome Sequencing and Analysis);
                [8] Animal and Plant Health Agency (APHA), Addlestone KT15 3NB, UK;
                [9] Environment and Sustainability Institute, University of Exeter, Penryn TR10 9FE, UK;
                [10] Warwick Medical School, University of Warwick, Coventry CV4 7AL, UK;
                [11] Institut Pasteur, 75724 Paris cedex, France;
                [12] Department of Epidemiology and Population Health, Institute of Infection and Global Health, University of Liverpool, Neston CH64 7TE, UK
                Author notes

                Coequal first author


                A complete list of the Agama Study Group coauthors appears at the end of this paper.

                Corresponding author: m.achtman@warwick.ac.uk
                © 2020 Zhou et al.; Published by Cold Spring Harbor Laboratory Press

                This article, published in Genome Research, is available under a Creative Commons License (Attribution 4.0 International), as described at http://creativecommons.org/licenses/by/4.0/.

                Received: 20 April 2019
                Accepted: 3 December 2019
                Page count
                Pages: 15
                Funded by: Biotechnology and Biological Sciences Research Council, open-funder-registry 10.13039/501100000268;
                Award ID: BB/L020319/1
                Funded by: Wellcome Trust, open-funder-registry 10.13039/100004440;
                Award ID: 202792/Z/16/Z

