Evaluating metagenomics tools for genome binning with real metagenomic datasets and CAMI datasets

Abstract

Background

Shotgun metagenomics, which is based on untargeted sequencing, can reveal the taxonomic profile and functions of unknown microorganisms in a sample, and it compensates for the limitations of amplicon sequencing. Binning assembled sequences into individual groups that represent microbial genomes is the key step and a major challenge in metagenomic research. Both supervised and unsupervised machine learning methods have been applied to binning. Genome binning is an unsupervised approach: it clusters contigs into individual genome bins using machine learning methods, without the help of any reference database. Many genome binning tools have been developed so far, and evaluating them is of great importance to microbiological research. In this study, we evaluate 15 genome binning tools (12 original binning tools and 3 refining binning tools) by comparing their performance on chicken gut metagenomic datasets and the first CAMI challenge datasets.
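To make the idea of reference-free clustering concrete, the following is a minimal sketch of one feature that composition-based binners commonly use: the tetranucleotide (4-mer) frequency profile of a contig. The abstract does not state which features each evaluated tool relies on, so this is an illustration under that assumption, not the method of any specific tool. The function names and the toy sequences are hypothetical.

```python
from itertools import product
from math import sqrt


def kmer_profile(seq, k=4):
    """Return the normalised k-mer frequencies of seq in a fixed k-mer order."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # skip windows containing N or other ambiguity codes
            counts[word] += 1
    total = sum(counts.values())
    return [counts[w] / total if total else 0.0 for w in kmers]


def euclidean(p, q):
    """Euclidean distance between two equal-length profiles."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Toy sequences (hypothetical): two AT-rich fragments with the same
# composition, and one GC-rich fragment standing in for another genome.
contig_a1 = "ATATATTTAAATATTAATAT" * 5
contig_a2 = contig_a1[7:] + contig_a1[:7]
contig_b1 = "GCGGCCGCGGGCCGCCGGCG" * 5

d_same = euclidean(kmer_profile(contig_a1), kmer_profile(contig_a2))
d_diff = euclidean(kmer_profile(contig_a1), kmer_profile(contig_b1))
print(f"same composition: {d_same:.3f}, different composition: {d_diff:.3f}")
```

A binner would feed profiles like these, usually together with per-sample coverage, into a clustering algorithm. Real tools differ in how they normalise, which k-mers they merge, and how they cluster.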

Results

For the chicken gut metagenomic datasets, the original genome binners MetaBat, Groopm2 and Autometa performed better than the other original binners, and MetaWrap, which combined their binning results, generated the most high-quality genome bins. For the CAMI datasets, Groopm2 achieved the highest purity (> 0.9) with good completeness (> 0.8) and reconstructed the most high-quality genome bins among the original genome binners. Compared with Groopm2, MetaBat2 performed similarly, with higher completeness and lower purity. The genome refining binner DASTool predicted the most high-quality genome bins among all genome binners. Most genome binners performed well for unique strains. Nonetheless, reconstructing common strains remains a substantial challenge for all genome binners.

Conclusions

In conclusion, we tested a set of currently available, state-of-the-art metagenomics hybrid binning tools and provide a guide for selecting tools for metagenomic binning by comparing their ranges of purity, completeness and adjusted Rand index, as well as the number of high-quality reconstructed bins. Furthermore, we summarize the information available to inform future binning strategies.
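The metrics named above can be sketched as follows for a dataset with a known gold standard, as in CAMI. This is a minimal illustration that assumes one common set of definitions: purity and completeness are measured in base pairs against the majority genome of each bin, and the adjusted Rand index is computed over contig counts. The abstract does not give the exact formulas used in the study, so these may differ in detail. All names and the toy data are hypothetical.

```python
from collections import defaultdict
from math import comb


def bin_purity_completeness(contigs):
    """contigs: iterable of (length_bp, bin_id, true_genome) tuples."""
    bin_genome_bp = defaultdict(lambda: defaultdict(int))
    genome_bp = defaultdict(int)
    for length, bin_id, genome in contigs:
        bin_genome_bp[bin_id][genome] += length
        genome_bp[genome] += length
    scores = {}
    for bin_id, per_genome in bin_genome_bp.items():
        # The genome contributing the most base pairs is taken as the bin's genome.
        best_genome, best_bp = max(per_genome.items(), key=lambda kv: kv[1])
        purity = best_bp / sum(per_genome.values())
        completeness = best_bp / genome_bp[best_genome]
        scores[bin_id] = (best_genome, purity, completeness)
    return scores


def adjusted_rand_index(pred, truth):
    """Adjusted Rand index between two labelings of the same items."""
    n = len(pred)
    if n < 2:
        return 1.0
    pairs, rows, cols = defaultdict(int), defaultdict(int), defaultdict(int)
    for p, t in zip(pred, truth):
        pairs[(p, t)] += 1
        rows[p] += 1
        cols[t] += 1
    sum_ij = sum(comb(c, 2) for c in pairs.values())
    sum_a = sum(comb(c, 2) for c in rows.values())
    sum_b = sum(comb(c, 2) for c in cols.values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0
    return (sum_ij - expected) / (max_index - expected)


# Toy gold standard (hypothetical): contig length, assigned bin, true genome.
contigs = [
    (6000, "bin1", "genomeA"),
    (3000, "bin1", "genomeA"),
    (1000, "bin1", "genomeB"),
    (4000, "bin2", "genomeB"),
    (2000, "bin2", "genomeA"),
]
scores = bin_purity_completeness(contigs)
ari = adjusted_rand_index([c[1] for c in contigs], [c[2] for c in contigs])
for bin_id, (genome, purity, completeness) in sorted(scores.items()):
    print(bin_id, genome, round(purity, 3), round(completeness, 3))
print("ARI:", round(ari, 3))
```

For real metagenomes without a gold standard, such as the chicken gut datasets, purity and completeness cannot be computed this way and are typically estimated from marker genes instead.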

Author and article information

Contributors

Yi Yue:

ORCID: http://orcid.org/0000-0002-6520-7171

yyyue@ahau.edu.cn

Hao Huang:

ORCID: http://orcid.org/0000-0001-6803-3210

huanghao_2013@qq.com

Zhao Qi: 403069355@qq.com

Hui-Min Dou: 1379686103@qq.com

Xin-Yi Liu: 812401670@qq.com

Tian-Fei Han: 2361501042@qq.com

Yue Chen: 939909885@qq.com

Xiang-Jun Song: sxj@ahau.edu.cn

You-Hua Zhang: zhangyh@ahau.edu.cn

Jian Tu: tujian1980@126.com

Journal

Journal ID (nlm-ta): BMC Bioinformatics

Journal ID (iso-abbrev): BMC Bioinformatics

Title: BMC Bioinformatics

Publisher: BioMed Central (London)

ISSN (Electronic): 1471-2105

Publication date (Electronic): 28 July 2020

Publication date PMC-release: 28 July 2020

Publication date Collection: 2020

Volume: 21

Electronic Location Identifier: 334

Affiliations

[1] GRID grid.411389.6, ISNI 0000 0004 1760 4804, Anhui Province Key Laboratory of Veterinary Pathobiology and Disease Control, Anhui Agricultural University, Hefei, 230036 China

[2] GRID grid.411389.6, ISNI 0000 0004 1760 4804, School of Information & Computer, Anhui Agricultural University, Hefei, 230036 China

[3] GRID grid.411389.6, ISNI 0000 0004 1760 4804, School of Life Sciences, Anhui Agricultural University, Hefei, 230036 China

[4] GRID grid.411389.6, ISNI 0000 0004 1760 4804, School of Animal Science and Technology, Anhui Agricultural University, Hefei, 230036 China

Article

Publisher ID: 3667

DOI: 10.1186/s12859-020-03667-3

PMC ID: 7469296

PubMed ID: 32723290

SO-VID: f082e11b-b0c2-4cdd-bd30-76525d72645e

License:

Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

History

Date received : 2 November 2019

Date accepted : 16 July 2020

Funding

Funded by: FundRef http://dx.doi.org/10.13039/501100001809, National Natural Science Foundation of China;

Award ID: 31772707

Award ID: 31972642

Award Recipient : Jian Tu

Funded by: Construction of Biology Peak Discipline in Anhui Province

Award ID: 03019001

Award Recipient : Yi Yue

Custom metadata

ScienceOpen disciplines: Bioinformatics & Computational biology

Keywords: metagenomics, genome binning, clustering, benchmarking, comparison
