      Is Open Access

      SNP-sites: rapid efficient extraction of SNPs from multi-FASTA alignments


          Abstract

          Rapidly decreasing genome sequencing costs have led to a proportionate increase in the number of samples used in prokaryotic population studies. Extracting single nucleotide polymorphisms (SNPs) from a large whole genome alignment is now a routine task, but existing tools have failed to scale efficiently with the increased size of studies. These tools are slow, memory inefficient and are installed through non-standard procedures. We present SNP-sites, which can rapidly extract SNPs from a multi-FASTA alignment using modest resources and can output results in multiple formats for downstream analysis. SNPs can be extracted from an 8.3 GB alignment file (1842 taxa, 22 618 sites) in 267 seconds using 59 MB of RAM and 1 CPU core, making it feasible to run on modest computers. It is easy to install through the Debian and Homebrew package managers, and has been successfully tested on more than 20 operating systems. SNP-sites is implemented in C and is available under the open source license GNU GPL version 3.
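
To make the task in the abstract concrete, the following is a minimal Python sketch of what "extracting SNPs from a multi-FASTA alignment" means. It is not the SNP-sites implementation (which is written in C), and it makes no claim about that tool's algorithm or its handling of gaps and ambiguity codes. The function names `read_fasta`, `snp_columns` and `extract_snps` are hypothetical, and the rule used here (a column is variable when at least two distinct bases from A, C, G, T occur in it, with everything else ignored) is an assumption chosen for illustration.

```python
import io


def read_fasta(handle):
    """Yield (name, sequence) pairs from a FASTA-formatted text stream."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:].strip(), []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def snp_columns(records):
    """Return 0-based indices of columns holding two or more distinct bases.

    Assumption for this sketch: only A, C, G and T count as bases;
    gaps, N and other symbols are ignored when deciding variability.
    """
    seen = None
    for name, seq in records:
        if seen is None:
            seen = [set() for _ in seq]
        elif len(seq) != len(seen):
            raise ValueError(f"sequence {name!r} has a different length")
        for i, base in enumerate(seq.upper()):
            if base in "ACGT":
                seen[i].add(base)
    return [i for i, bases in enumerate(seen or []) if len(bases) > 1]


def extract_snps(fasta_text):
    """Return (variable column indices, {name: SNP-only sequence})."""
    records = list(read_fasta(io.StringIO(fasta_text)))
    cols = snp_columns(records)
    snps = {name: "".join(seq[i] for i in cols) for name, seq in records}
    return cols, snps


if __name__ == "__main__":
    alignment = ">s1\nACGTAC\n>s2\nACGTTC\n>s3\nAC-TAG\n"
    cols, snps = extract_snps(alignment)
    print(cols)  # [4, 5]
    print(snps)  # {'s1': 'AC', 's2': 'TC', 's3': 'AG'}
```

The sketch holds every sequence in memory, so it would not reach the resource figures reported in the abstract. A tool aimed at multi-gigabyte alignments would instead stream the file and keep only per-column state.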


                Author and article information

                Journal
                Microbial Genomics (Microb Genom; MGen)
                Publisher: Microbiology Society
                ISSN: 2057-5858
                Published: 29 April 2016
                Volume 2, Issue 4, e000056
                Affiliations
                [1] Pathogen Genomics, Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SA, UK
                [2] Computing, Engineering and Mathematics, University of Brighton, Moulsecoomb, Brighton, BN2 4GJ, UK
                [3] Victorian Life Sciences Computation Initiative, The University of Melbourne, Parkville, Australia
                Author notes
                Correspondence: Andrew J. Page (ap13@sanger.ac.uk)

                All supporting data, code and protocols have been provided within the article or through supplementary data files.

                Article
                Publisher ID: mgen000056
                DOI: 10.1099/mgen.0.000056
                PMCID: PMC5320690
                PMID: 28348851
                © 2016 The Authors

                This is an open access article under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution and reproduction in any medium, provided the original author and source are credited.

                History
                : 28 January 2016
                : 15 March 2016
                : 18 March 2016
                Funding
                Funded by: Wellcome Trust
                Award ID: 098051
                Categories
                Methods Paper
                Systems Microbiology: Large-scale comparative genomics

                Keywords: software, SNP calling, high throughput
