Metagenomic binning and association of plasmids with bacterial host genomes using DNA methylation

There is no author summary for this article yet. Authors can add summaries to their articles on ScienceOpen to make them more accessible to a non-specialist audience.

Abstract

Shotgun metagenomics methods enable characterization of microbial communities in human microbiome and environmental samples. Assembly of metagenome sequences does not output whole genomes, so computational binning methods have been developed to cluster sequences into genome ‘bins’. These methods exploit sequence composition, species abundance, or chromosome organization but cannot fully distinguish closely related species and strains. We present a binning method that incorporates bacterial DNA methylation signatures, which are detected using single-molecule real-time sequencing. Our method takes advantage of these endogenous epigenetic barcodes to resolve individual reads and assembled contigs into species- and strain-level bins. We validated our method using synthetic and real microbiome sequences. In addition to genome binning, we show that our method links plasmids and other mobile genetic elements to their host species in a real microbiome sample. Incorporation of DNA methylation information into shotgun metagenomics analyses will complement existing methods to enable more accurate sequence binning.

Related collections

Most cited references 34

Record: found
Abstract: found
Article: not found

Community structure and metabolism through reconstruction of microbial genomes from the environment.

Gene W. Tyson, Jarrod Chapman, Philip Hugenholtz … (2004)

Microbial communities are vital in the functioning of all ecosystems; however, most microorganisms are uncultivated, and their roles in natural systems are unclear. Here, using random shotgun sequencing of DNA from a natural acidophilic biofilm, we report reconstruction of near-complete genomes of Leptospirillum group II and Ferroplasma type II, and partial recovery of three other genomes. This was possible because the biofilm was dominated by a small number of species populations and the frequency of genomic rearrangements and gene insertions or deletions was relatively low. Because each sequence read came from a different individual, we could determine that single-nucleotide polymorphisms are the predominant form of heterogeneity at the strain level. The Leptospirillum group II genome had remarkably few nucleotide polymorphisms, despite the existence of low-abundance variants. The Ferroplasma type II genome seems to be a composite from three ancestral strains that have undergone homologous recombination to form a large population of mosaic genomes. Analysis of the gene complement for each organism revealed the pathways for carbon and nitrogen fixation and energy generation, and provided insights into survival strategies in an extreme environment.

0 comments Cited 650 times – based on 0 reviews      Review now

Bookmark

Record: found
Abstract: found
Article: found

Is Open Access

The NumPy array: a structure for efficient numerical computation

Gael Varoquaux, S. Chris Colbert, Stefan van der Walt (2011)

In the Python world, NumPy arrays are the standard representation for numerical data. Here, we show how these arrays enable efficient implementation of numerical computations in a high-level language. Overall, three techniques are applied to improve performance: vectorizing calculations, avoiding copying data in memory, and minimizing operation counts. We first present the NumPy array structure, then show how to use it for efficient computation, and finally how to share array data with other libraries.

0 comments Cited 536 times – based on 0 reviews

Preprint

     Review now

Bookmark

Record: found
Abstract: found
Article: not found

Continuous base identification for single-molecule nanopore DNA sequencing.

James Clarke, Hai-Chen Wu, Lakmal Jayasinghe … (2009)

A single-molecule method for sequencing DNA that does not require fluorescent labelling could reduce costs and increase sequencing speeds. An exonuclease enzyme might be used to cleave individual nucleotide molecules from the DNA, and when coupled to an appropriate detection system, these nucleotides could be identified in the correct order. Here, we show that a protein nanopore with a covalently attached adapter molecule can continuously identify unlabelled nucleoside 5'-monophosphate molecules with accuracies averaging 99.8%. Methylated cytosine can also be distinguished from the four standard DNA bases: guanine, adenine, thymine and cytosine. The operating conditions are compatible with the exonuclease, and the kinetic data show that the nucleotides have a high probability of translocation through the nanopore and, therefore, of not being registered twice. This highly accurate tool is suitable for integration into a system for sequencing nucleic acids and for analysing epigenetic modifications.

0 comments Cited 447 times – based on 0 reviews      Review now

Bookmark

All references

Author and article information

Journal

Journal ID (nlm-journal-id): 9604648

Journal ID (pubmed-jr-id): 20305

Journal ID (nlm-ta): Nat Biotechnol

Journal ID (iso-abbrev): Nat. Biotechnol.

Title: Nature biotechnology

ISSN (Print): 1087-0156

ISSN (Electronic): 1546-1696

Publication date Nihms-submitted: 14 November 2017

Publication date (Electronic): 11 December 2017

Publication date (Print): January 2018

Publication date PMC-release: 11 June 2018

Volume: 36

Issue: 1

Pages: 61-69

Affiliations

[1 ]Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA

[2 ]Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA

[3 ]Immunology Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA

[4 ]Department of Medicine, New York University School of Medicine, New York 10016, USA

[5 ]Department of Microbiology and Cell Science, Institute of Food and Agricultural Sciences, University of Florida, Gainesville, FL, 32611, USA

Author notes

[# ]Address correspondence to: gang.fang@ 123456mssm.edu

Article

Manuscript ID: NIHMS920168

DOI: 10.1038/nbt.4037

PMC ID: 5762413

PubMed ID: 29227468

SO-VID: 52c3eaf8-34e3-4d75-a1e6-01a754196ee3

License:

Users may view, print, copy, and download text and data-mine the content in such documents, for the purposes of academic research, subject always to the full Conditions of use: http://www.nature.com/authors/editorial_policies/license.html#terms

Metagenomic binning and association of plasmids with bacterial host genomes using DNA methylation

Read this article at

Abstract

Related collections

Genome Engineering using CRISPR

Most cited references 34

Community structure and metabolism through reconstruction of microbial genomes from the environment.

The NumPy array: a structure for efficient numerical computation

Continuous base identification for single-molecule nanopore DNA sequencing.

Author and article information

Journal

Affiliations

Author notes

Article

History

Categories

Comments

Comment on this article

Similar content 317

Cited by 56

Most referenced authors 3,193