
      gapseq: informed prediction of bacterial metabolic pathways and reconstruction of accurate metabolic models

      research-article


          Abstract

          Genome-scale metabolic models of microorganisms are powerful frameworks to predict phenotypes from an organism’s genotype. While manual reconstructions are laborious, automated reconstructions often fail to recapitulate known metabolic processes. Here we present gapseq (https://github.com/jotech/gapseq), a new tool to predict metabolic pathways and automatically reconstruct microbial metabolic models using a curated reaction database and a novel gap-filling algorithm. On the basis of scientific literature and experimental data for 14,931 bacterial phenotypes, we demonstrate that gapseq outperforms state-of-the-art tools in predicting enzyme activity, carbon source utilisation, fermentation products, and metabolic interactions within microbial communities.

          Supplementary Information

          The online version contains supplementary material available at (10.1186/s13059-021-02295-1).


                Author and article information

                Contributors
                j.zimmermann@iem.uni-kiel.de
                c.kaleta@iem.uni-kiel.de
                s.waschina@nutrinf.uni-kiel.de
                Journal
                Genome Biol
                Genome Biology
                BioMed Central (London)
                1474-7596
                1474-760X
                10 March 2021
                2021; 22: 81
                Affiliations
                [1] GRID grid.9764.c, ISNI 0000 0001 2153 9986, Christian-Albrechts-University Kiel, Institute of Experimental Medicine, Research Group Medical Systems Biology, Michaelis-Str. 5, Kiel, 24105 Germany
                [2] GRID grid.9764.c, ISNI 0000 0001 2153 9986, Christian-Albrechts-University Kiel, Institute of Human Nutrition and Food Science, Nutriinformatics, Heinrich-Hecht-Platz 10, Kiel, 24118 Germany
                Author information
                http://orcid.org/0000-0002-6290-3593
                Article
                2295
                10.1186/s13059-021-02295-1
                7949252
                33691770
                552bb784-4ae5-4561-ab9c-e22707871018
                © The Author(s) 2021

                Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

                History
                : 27 May 2020
                : 10 February 2021
                Funding
                Funded by: FundRef http://dx.doi.org/10.13039/501100001659, Deutsche Forschungsgemeinschaft;
                Award ID: EXC 2167
                Funded by: FundRef http://dx.doi.org/10.13039/501100001659, Deutsche Forschungsgemeinschaft;
                Award ID: SFB 1182
                Categories
                Software

                Genetics
                metabolic pathway analysis, metabolic networks, genome-scale metabolic models, benchmark, community simulation, microbiome, metagenome
