
      Detection of recombination events in bacterial genomes from large population samples

      research-article


          Abstract

          Analysis of important human pathogen populations is currently under transition toward whole-genome sequencing of growing numbers of samples collected on a global scale. Since recombination in bacteria is often an important factor shaping their evolution by enabling resistance elements and virulence traits to rapidly transfer from one evolutionary lineage to another, it is highly beneficial to have access to tools that can detect recombination events. Multiple advanced statistical methods exist for such purposes; however, they are typically limited either to only a few samples or to data from relatively short regions of a total genome. By harnessing the power of recent advances in Bayesian modeling techniques, we introduce here a method for detecting homologous recombination events from whole-genome sequence data for bacterial population samples on a large scale. Our statistical approach can efficiently handle hundreds of whole genome sequenced population samples and identify separate origins of the recombinant sequence, offering an enhanced insight into the diversification of bacterial clones at the level of the whole genome. A data set of 241 whole genome sequences from an important pandemic lineage of Streptococcus pneumoniae is used together with multiple simulated data sets to demonstrate the potential of our approach.

          Most cited references (37)

          Dating of the human-ape splitting by a molecular clock of mitochondrial DNA.

          A new statistical method for estimating divergence dates of species from DNA sequence data by a molecular clock approach is developed. This method takes into account effectively the information contained in a set of DNA sequence data. The molecular clock of mitochondrial DNA (mtDNA) was calibrated by setting the date of divergence between primates and ungulates at the Cretaceous-Tertiary boundary (65 million years ago), when the extinction of dinosaurs occurred. A generalized least-squares method was applied in fitting a model to mtDNA sequence data, and the clock gave dates of 92.3 +/- 11.7, 13.3 +/- 1.5, 10.9 +/- 1.2, 3.7 +/- 0.6, and 2.7 +/- 0.6 million years ago (where the second of each pair of numbers is the standard deviation) for the separation of mouse, gibbon, orangutan, gorilla, and chimpanzee, respectively, from the line leading to humans. Although there is some uncertainty in the clock, this dating may pose a problem for the widely believed hypothesis that the bipedal creature Australopithecus afarensis, which lived some 3.7 million years ago at Laetoli in Tanzania and at Hadar in Ethiopia, was ancestral to man and evolved after the human-ape splitting. Another likelier possibility is that mtDNA was transferred through hybridization between a proto-human and a proto-chimpanzee after the former had developed bipedalism.
            Analyzing the mosaic structure of genes.

            Some genes in prokaryotes consist of a mosaic of regions derived from different ancestors by horizontal gene transfer. A method is described for demonstrating the statistical significance of such mosaic structure and for locating the crossover points separating different regions.
              Enhanced Bayesian modelling in BAPS software for learning genetic structures of populations

              Background: During the most recent decade many Bayesian statistical models and software for answering questions related to the genetic structure underlying population samples have appeared in the scientific literature. Most of these methods utilize molecular markers for the inferences, while some are also capable of handling DNA sequence data. In a number of earlier works, we have introduced an array of statistical methods for population genetic inference that are implemented in the software BAPS. However, the complexity of biological problems related to genetic structure analysis keeps increasing such that in many cases the current methods may provide either inappropriate or insufficient solutions.

              Results: We discuss the necessity of enhancing the statistical approaches to face the challenges posed by the ever-increasing amounts of molecular data generated by scientists over a wide range of research areas and introduce an array of new statistical tools implemented in the most recent version of BAPS. With these methods it is possible, e.g., to fit genetic mixture models using user-specified numbers of clusters and to estimate levels of admixture under a genetic linkage model. Also, alleles representing a different ancestry compared to the average observed genomic positions can be tracked for the sampled individuals, and a priori specified hypotheses about genetic population structure can be directly compared using Bayes' theorem. In general, we have improved further the computational characteristics of the algorithms behind the methods implemented in BAPS facilitating the analyses of large and complex datasets. In particular, analysis of a single dataset can now be spread over multiple computers using a script interface to the software.

              Conclusion: The Bayesian modelling methods introduced in this article represent an array of enhanced tools for learning the genetic structure of populations. Their implementations in the BAPS software are designed to meet the increasing need for analyzing large-scale population genetics data. The software is freely downloadable for Windows, Linux and Mac OS X systems at .

                Author and article information

                Journal
                Nucleic Acids Res
                nar
                Nucleic Acids Research
                Oxford University Press
                0305-1048
                1362-4962
                January 2012
                7 November 2011
                Volume: 40
                Issue: 1
                Pages: e6
                Affiliations
                1 Department of Biomedical Engineering and Computational Science (BECS), Aalto University, P.O. Box 12200, FI-00076 AALTO, Finland
                2 Center for Communicable Disease Dynamics, Harvard School of Public Health, 677 Huntington Avenue, Boston, MA 02115, USA
                3 The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK
                4 Department of Mathematics, Abo Akademi University, Piispankatu 8, FI-20500 Turku, Finland
                5 Department of Mathematics and Statistics, University of Helsinki, P.O. Box 68, FI-00014 University of Helsinki, Finland
                Author notes
                *To whom correspondence should be addressed. Tel: +358 44 3030349; Fax: +358 9 470 23182; Email: pekka.marttinen@aalto.fi
                Article
                gkr928
                10.1093/nar/gkr928
                3245952
                22064866
                b9fd37a3-4418-43a7-9fdc-1b88fa52c8e4
                © The Author(s) 2011. Published by Oxford University Press.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                Received: 28 June 2011
                Revised: 7 October 2011
                Accepted: 10 October 2011
                Page count
                Pages: 12
                Categories
                Methods Online

                Genetics
