MOST: a modified MLST typing tool based on short read sequencing

Abstract

Multilocus sequence typing (MLST) is an effective method for describing bacterial populations. Conventionally, MLST involves polymerase chain reaction (PCR) amplification of housekeeping genes followed by Sanger DNA sequencing. Public Health England (PHE) is replacing the conventional MLST methodology with a method based on short read sequence data derived from whole genome sequencing (WGS). This paper compares the reliability of MLST results derived from WGS data, by mapping-based and assembly-based approaches, with results from the conventional method, using 323 bacterial genomes of diverse species. The sensitivity of the two WGS-based methods was further investigated with 26 mixed and 29 low coverage genomic data sets from Salmonella enteritidis and Streptococcus pneumoniae. Of the 323 samples, full MLST profiles were derived for 92.9% (n = 300) by the conventional method, 97.5% (n = 315) by the assembly-based approach and 99.7% (n = 322) by the mapping-based approach. Among the samples typed by the conventional method (92.9%) and by both WGS methods, concordance was 100%. From the 55 mixed and low coverage genomes, full MLST profiles were derived for 89.1% (n = 49) by the mapping-based approach and 67.3% (n = 37) by the assembly-based approach. In conclusion, deriving MLST from WGS data is more sensitive than the conventional method. Of the two WGS-based methods, the mapping-based approach was the more sensitive. In addition, the mapping-based approach described here derives quality metrics, which are difficult to determine quantitatively using conventional and WGS assembly-based approaches.
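
The percentages in the abstract follow directly from the reported counts. The short Python sketch below recomputes them as a consistency check; the variable and function names are illustrative only and are not taken from the MOST software.

```python
# Counts of full MLST profiles reported in the abstract (out of 323 samples).
TOTAL_SAMPLES = 323
full_profiles = {
    "conventional": 300,
    "assembly": 315,
    "mapping": 322,
}

# The 55 challenging data sets: 26 mixed plus 29 low coverage.
TOTAL_CHALLENGE = 26 + 29
challenge_profiles = {
    "mapping": 49,
    "assembly": 37,
}


def percent(count, total):
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100.0 * count / total, 1)


main_rates = {m: percent(n, TOTAL_SAMPLES) for m, n in full_profiles.items()}
challenge_rates = {m: percent(n, TOTAL_CHALLENGE) for m, n in challenge_profiles.items()}

for method, rate in main_rates.items():
    print(f"{method}: {rate}% of {TOTAL_SAMPLES} samples")
for method, rate in challenge_rates.items():
    print(f"{method}: {rate}% of {TOTAL_CHALLENGE} mixed/low coverage data sets")
```

The recomputed values agree with the abstract: 92.9%, 97.5% and 99.7% for the 323 samples, and 89.1% and 67.3% for the 55 mixed and low coverage data sets.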

Most cited references (3)

Identification of Salmonella for public health surveillance using whole genome sequencing

Philip Ashton, Satheesh Nair, Tansy Peters … (2016)

In April 2015, Public Health England implemented whole genome sequencing (WGS) as a routine typing tool for public health surveillance of Salmonella, adopting a multilocus sequence typing (MLST) approach as a replacement for traditional serotyping. The WGS derived sequence type (ST) was compared to the phenotypic serotype for 6,887 isolates of S. enterica subspecies I, and of these, 6,616 (96%) were concordant. Of the 4% (n = 271) of isolates of subspecies I exhibiting a mismatch, 119 were due to a process error in the laboratory, 26 were likely caused by the serotype designation in the MLST database being incorrect and 126 occurred when two different serovars belonged to the same ST. The population structure of S. enterica subspecies II–IV differs markedly from that of subspecies I and, based on current data, defining the serovar from the clonal complex may be less appropriate for the classification of this group. Novel sequence types that were not present in the MLST database were identified in 8.6% of the total number of samples tested (including S. enterica subspecies I–IV and S. bongori) and these 654 isolates belonged to 326 novel STs. For S. enterica subspecies I, WGS MLST derived serotyping is a high throughput, accurate, robust, reliable typing method, well suited to routine public health surveillance. The combined output of ST and serovar supports the maintenance of traditional serovar nomenclature while providing additional insight on the true phylogenetic relationship between isolates.
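
The mismatch breakdown in this abstract can be checked with a few lines of arithmetic. The Python sketch below uses only the figures quoted above; the dictionary keys are paraphrased labels, not terms from the cited paper's data.

```python
# Figures quoted in the abstract for S. enterica subspecies I.
total_isolates = 6887
concordant = 6616

# Reported causes of ST/serotype mismatch (labels paraphrased).
mismatch_causes = {
    "laboratory process error": 119,
    "incorrect serotype designation in MLST database": 26,
    "two serovars sharing one ST": 126,
}

mismatched = total_isolates - concordant
concordance_pct = round(100 * concordant / total_isolates)
mismatch_pct = round(100 * mismatched / total_isolates)

print(f"mismatched isolates: {mismatched}")
print(f"sum of listed causes: {sum(mismatch_causes.values())}")
print(f"concordance: {concordance_pct}%, mismatch: {mismatch_pct}%")
```

The three listed causes account for all 271 mismatched isolates, and the rounded proportions match the reported 96% and 4%.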

Short read sequence typing (SRST): multi-locus sequence types from short reads

Michael Inouye, Thomas Conway, Justin Zobel … (2012)

Background: Multi-locus sequence typing (MLST) has become the gold standard for population analyses of bacterial pathogens. This method focuses on the sequences of a small number of loci (usually seven) to divide the population and is simple, robust and facilitates comparison of results between laboratories and over time. Over the last decade, researchers and population health specialists have invested substantial effort in building up public MLST databases for nearly 100 different bacterial species, and these databases contain a wealth of important information linked to MLST sequence types such as time and place of isolation, host or niche, serotype and even clinical or drug resistance profiles. Recent advances in sequencing technology mean it is increasingly feasible to perform bacterial population analysis at the whole genome level. This offers massive gains in resolving power and genetic profiling compared to MLST, and will eventually replace MLST for bacterial typing and population analysis. However, given the wealth of data currently available in MLST databases, it is crucial to maintain backwards compatibility with MLST schemes so that new genome analyses can be understood in their proper historical context.

Results: We present a software tool, SRST, for quick and accurate retrieval of sequence types from short read sets, using inputs easily downloaded from public databases. SRST uses read mapping and an allele assignment score incorporating sequence coverage and variability, to determine the most likely allele at each MLST locus. Analysis of over 3,500 loci in more than 500 publicly accessible Illumina read sets showed SRST to be highly accurate at allele assignment. SRST output is compatible with common analysis tools such as eBURST, Clonal Frame or PhyloViz, allowing easy comparison between novel genome data and MLST data. Alignment, fastq and pileup files can also be generated for novel alleles.

Conclusions: SRST is a novel software tool for accurate assignment of sequence types using short read data. Several uses for the tool are demonstrated, including quality control for high-throughput sequencing projects, plasmid MLST and analysis of genomic data during outbreak investigation. SRST is open-source, requires Python, BWA and SamTools, and is available from http://srst.sourceforge.net.
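
Once an allele has been called at each locus, the sequence type (ST) is found by looking up the combination of allele numbers in the scheme's profile table. The Python sketch below illustrates only that final lookup step. It is not the SRST or MOST implementation, it omits read mapping and allele scoring entirely, and the locus names, allele numbers and STs are hypothetical toy values.

```python
from typing import Dict, Optional, Tuple

# Hypothetical seven-locus scheme; these locus names are placeholders.
LOCI: Tuple[str, ...] = ("locA", "locB", "locC", "locD", "locE", "locF", "locG")

# Toy profile table: allele numbers in LOCI order mapped to a sequence type.
PROFILE_TABLE: Dict[Tuple[int, ...], int] = {
    (1, 1, 1, 1, 1, 1, 1): 1,
    (1, 2, 1, 3, 1, 1, 2): 2,
}


def assign_st(alleles: Dict[str, Optional[int]]) -> Optional[int]:
    """Return the ST for a full allele profile.

    Returns None when any locus is missing (a partial profile) or when the
    full profile is not in the table (a candidate novel ST).
    """
    profile = tuple(alleles.get(locus) for locus in LOCI)
    if any(allele is None for allele in profile):
        return None
    return PROFILE_TABLE.get(profile)


# A full profile present in the toy table.
full = dict(zip(LOCI, (1, 2, 1, 3, 1, 1, 2)))
print(assign_st(full))

# A partial profile: one locus could not be called.
partial = dict(full, locD=None)
print(assign_st(partial))
```

Returning None for both partial and unknown profiles keeps the sketch short; a real tool would normally report the two cases separately, since an unknown full profile may represent a novel ST.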

Automated extraction of typing information for bacterial pathogens from whole genome sequence data: Neisseria meningitidis as an exemplar.

Keith Jolley, M Maiden (2012)

Whole genome sequence (WGS) data are increasingly used to characterise bacterial pathogens. These data provide detailed information on the genotypes and likely phenotypes of aetiological agents, enabling the relationships of samples from potential disease outbreaks to be established precisely. However, the generation of increasing quantities of sequence data does not, in itself, resolve the problems that many microbiological typing methods have addressed over the last 100 years or so; indeed, providing large volumes of unstructured data can confuse rather than resolve these issues. Here we review the nascent field of storage of WGS data for clinical application and show how curated sequence-based typing schemes on websites have generated an infrastructure that can exploit WGS for bacterial typing efficiently. We review the tools that have been implemented within the PubMLST website to extract clinically useful, strain-characterisation information that can be provided to physicians and public health professionals in a timely, concise and understandable way. These data can be used to inform medical decisions such as how to treat a patient, whether to instigate public health action, and what action might be appropriate. The information is compatible both with previous sequence-based typing data and also with data obtained in the absence of WGS, providing a flexible infrastructure for WGS-based clinical microbiology.

Author and article information

Contributors

Rediat Tewolde

Journal

Journal ID (nlm-ta): PeerJ

Journal ID (iso-abbrev): PeerJ

Journal ID (publisher-id): peerj

Journal ID (pmc): peerj

Title: PeerJ

Publisher: PeerJ Inc. (San Francisco, USA)

ISSN (Electronic): 2167-8359

Publication date (Electronic): 17 August 2016

Publication date Collection: 2016

Volume: 4

Electronic Location Identifier: e2308

Affiliations

[1] Infectious Disease Informatics Unit, Public Health England, London, United Kingdom

[2] Gastrointestinal Bacteria Reference Unit, Public Health England, London, United Kingdom

[3] Respiratory and Vaccine Preventable Bacteria Reference Unit, Public Health England, London, United Kingdom

[4] Antimicrobial Resistance and Healthcare Associated Infection Unit, Public Health England, NIS, London, United Kingdom

Article

Publisher ID: 2308

DOI: 10.7717/peerj.2308

PMC ID: 4991843

PubMed ID: 27602279

SO-VID: d206a59d-0f61-4ebc-99e1-826230fe45ea

License:

This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.

History

Date received: 19 April 2016

Date accepted: 10 July 2016

Funding

The authors received no funding for this work.
