
      Global analysis of the biosynthetic chemical space of marine prokaryotes





          Marine prokaryotes are a rich source of novel bioactive secondary metabolites for drug discovery. Recent genome mining studies have revealed their great potential to biosynthesize novel secondary metabolites. However, the biosynthetic chemical space encoded by marine prokaryotes has yet to be systematically evaluated.


          We first investigated the secondary metabolic potential of marine prokaryotes by analyzing the diversity and novelty of the biosynthetic gene clusters (BGCs) in 7,541 prokaryotic genomes from cultivated and single cells, along with 26,363 newly assembled medium-to-high-quality genomes from marine environmental samples. To quantitatively evaluate the unexplored biosynthetic chemical space of marine prokaryotes, the clustering thresholds used to construct the biosynthetic gene cluster networks and the molecular networks were optimized against the MIBiG database so that gene cluster family (GCF)-encoded metabolites and molecular family (MF) scaffolds reached a similar level of chemical similarity. The global genome mining analysis showed that the 70,011 predicted BGCs were organized into 24,536 GCFs, almost all of them new (99.5%), whereas the reported marine prokaryotic natural products were classified into only 778 MFs at the optimized clustering thresholds. The number of MF scaffolds is only 3.2% of the number of GCF-encoded scaffolds, suggesting that at least 96.8% of the secondary metabolic potential in marine prokaryotes is untapped. This unexplored biosynthetic chemical space was illustrated by 88 potentially novel antimicrobial peptides encoded by ribosomally synthesized and post-translationally modified peptide BGCs. Furthermore, a seawater-derived Aquimarina strain was selected to illustrate the diverse biosynthetic chemical space through untargeted metabolomics and genomics approaches, which identified the potential biosynthetic pathways of a group of novel polyketides and two known compounds (didemnilactone B and macrolactin A 15-ketone).
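          The 3.2% and 96.8% figures follow directly from the two counts reported above (778 MFs and 24,536 GCFs). A minimal sketch of that arithmetic is given below; the function name `explored_and_untapped` is a hypothetical helper introduced here for illustration, not part of the authors' pipeline, and it assumes, as the abstract does, that one MF scaffold corresponds to one GCF-encoded scaffold.

```python
def explored_and_untapped(n_mf: int, n_gcf: int) -> tuple[float, float]:
    """Return (explored, untapped) fractions of GCF-encoded scaffolds.

    Assumes one molecular family (MF) maps to one gene cluster family (GCF),
    which is the simplification behind the ratio quoted in the abstract.
    """
    explored = n_mf / n_gcf
    return explored, 1.0 - explored


# Counts taken from the abstract: 778 MFs and 24,536 GCFs.
explored, untapped = explored_and_untapped(778, 24_536)
print(f"explored: {explored:.1%}, untapped: {untapped:.1%}")
# prints "explored: 3.2%, untapped: 96.8%"
```

          Because the ratio treats every GCF as encoding a distinct scaffold, it is best read as an estimate of the gap between predicted and reported chemistry rather than an exact count of undiscovered compounds.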


          The present bioinformatics and cheminformatics analyses highlight the promising potential to explore the biosynthetic chemical diversity of marine prokaryotes and provide valuable knowledge for the targeted discovery and biosynthesis of novel marine prokaryotic natural products.

          Supplementary Information

          The online version contains supplementary material available at 10.1186/s40168-023-01573-3.


                Author and article information

                BioMed Central (London )
                28 June 2023
                : 11
                : 144
                [1] GRID grid.469325.f, ISNI 0000 0004 1761 325X, College of Pharmaceutical Science & Collaborative Innovation Center of Yangtze River Delta Region Green Pharmaceuticals, Key Laboratory of Marine Fishery Resources Exploitment & Utilization of Zhejiang Province, Zhejiang University of Technology, Hangzhou, 310014 China
                [2] GRID grid.453137.7, ISNI 0000 0004 0406 0561, Key Laboratory of Marine Ecosystem and Biogeochemistry, Ministry of Natural Resources & Second Institute of Oceanography, Ministry of Natural Resources, Hangzhou, 310012 China
                [3] GRID grid.47100.32, ISNI 0000000419368710, Department of Chemistry, Institute of Biomolecular Design & Discovery, Yale University, West Haven, CT 06516 USA
                [4] GRID grid.469325.f, ISNI 0000 0004 1761 325X, Institute of Cyberspace Security, College of Information Engineering, Zhejiang University of Technology, Hangzhou, 310023 China
                © The Author(s) 2023

                Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

                : 13 January 2023
                : 15 May 2023
                Funded by: FundRef http://dx.doi.org/10.13039/501100012166, National Key Research and Development Program of China;
                Award ID: 2022YFC2804104
                Award ID: 2022YFC2804700
                Funded by: National Natural Science Foundation of China
                Award ID: 42276137
                Custom metadata
                © BioMed Central Ltd., part of Springer Nature 2023

                marine prokaryotes, biosynthetic gene clusters, secondary metabolite, genomics, cheminformatics

