Detection of recombination events in bacterial genomes from large population samples

Abstract

Analysis of important human pathogen populations is currently transitioning toward whole-genome sequencing of growing numbers of samples collected on a global scale. Because recombination is often an important factor shaping bacterial evolution, enabling resistance elements and virulence traits to transfer rapidly from one evolutionary lineage to another, it is highly beneficial to have access to tools that can detect recombination events. Multiple advanced statistical methods exist for this purpose; however, they are typically limited either to only a few samples or to data from relatively short regions of the genome. Harnessing recent advances in Bayesian modeling techniques, we introduce here a method for detecting homologous recombination events from whole-genome sequence data for large bacterial population samples. Our statistical approach can efficiently handle hundreds of whole-genome-sequenced population samples and identify separate origins of the recombinant sequence, offering enhanced insight into the diversification of bacterial clones at the level of the whole genome. A data set of 241 whole-genome sequences from an important pandemic lineage of Streptococcus pneumoniae is used together with multiple simulated data sets to demonstrate the potential of our approach.

Author and article information

Journal

Journal ID (nlm-ta): Nucleic Acids Res

Journal ID (publisher-id): nar

Journal ID (hwp): nar

Title: Nucleic Acids Research

Publisher: Oxford University Press

ISSN (Print): 0305-1048

ISSN (Electronic): 1362-4962

Publication date Collection: January 2012

Publication date (Print): January 2012

Publication date (Electronic): 7 November 2011

Publication date PMC-release: 7 November 2011

Volume: 40

Issue: 1

Page: e6

Affiliations

¹Department of Biomedical Engineering and Computational Science (BECS), Aalto University, P.O. Box 12200, FI-00076 AALTO, Finland, ²Center for Communicable Disease Dynamics, Harvard School of Public Health, 677 Huntington Avenue, Boston, MA 02115, USA, ³The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK, ⁴Department of Mathematics, Åbo Akademi University, Piispankatu 8, FI-20500 Turku, Finland and ⁵Department of Mathematics and Statistics, University of Helsinki, P.O. Box 68, FI-00014 University of Helsinki, Finland

Author notes

*To whom correspondence should be addressed. Tel: +358 44 3030349; Fax: +358 9 470 23182; Email: pekka.marttinen@aalto.fi

Article

Publisher ID: gkr928

DOI: 10.1093/nar/gkr928

PMC ID: 3245952

PubMed ID: 22064866

License:

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

History

Date received: 28 June 2011

Date revision received: 7 October 2011

Date accepted: 10 October 2011

Page count

Pages: 12
