
      BOFdat: Generating biomass objective functions for genome-scale metabolic models from experimental data

      research-article


          Abstract

          Genome-scale metabolic models (GEMs) are mathematically structured knowledge bases of metabolism that provide phenotypic predictions from genomic information. GEM-guided predictions of growth phenotypes rely on the accurate definition of a biomass objective function (BOF) that is designed to include key cellular biomass components such as the major macromolecules (DNA, RNA, proteins), lipids, coenzymes, inorganic ions and species-specific components. Despite its importance, no standardized computational platform is currently available to generate species-specific biomass objective functions in a data-driven, unbiased fashion. To fill this gap in the metabolic modeling software ecosystem, we implemented BOFdat, a Python package for the definition of a Biomass Objective Function from experimental data. BOFdat has a modular implementation that divides the BOF definition process into three independent modules, defined here as steps: 1) the coefficients for major macromolecules are calculated, 2) coenzymes and inorganic ions are identified and their stoichiometric coefficients estimated, 3) the remaining species-specific metabolic biomass precursors are algorithmically extracted in an unbiased way from experimental data. We used BOFdat to reconstruct the BOF of the Escherichia coli model iML1515, a gold standard in the field. Compared to other methods, the BOF generated by BOFdat resulted in the most concordant biomass composition, growth rate, and gene essentiality prediction accuracy. Installation instructions for BOFdat are available in the documentation and the source code is available on GitHub (https://github.com/jclachance/BOFdat).
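          To make step 1 more concrete, the following is a minimal sketch of how a macromolecule's weight fraction can be turned into per-monomer stoichiometric coefficients (mmol per gram dry weight). It does not use or reproduce the BOFdat API: the function name `monomer_coefficients`, the monomer names, and all numbers are hypothetical toy values chosen for easy arithmetic, and details such as water loss on polymerization are deliberately ignored.

```python
def monomer_coefficients(mole_fractions, molar_masses, weight_fraction):
    """Convert a macromolecule weight fraction into monomer coefficients.

    mole_fractions  -- relative molar abundance of each monomer (any scale)
    molar_masses    -- molar mass of each monomer in g/mol
    weight_fraction -- grams of this macromolecule per gram dry weight
    Returns a dict of mmol monomer per gram dry weight (mmol/gDW).
    """
    total = sum(mole_fractions.values())
    # Normalize so the fractions sum to 1.
    normalized = {m: f / total for m, f in mole_fractions.items()}
    # Mass of one mole of "average" monomer, in g/mol.
    mean_mass = sum(normalized[m] * molar_masses[m] for m in normalized)
    # Moles of monomer (all kinds together) per gram dry weight.
    mol_per_gdw = weight_fraction / mean_mass
    # Split by monomer and convert mol to mmol.
    return {m: 1000.0 * normalized[m] * mol_per_gdw for m in normalized}


# Toy, purely illustrative values (not measured data).
toy_fractions = {"monomer_a": 0.5, "monomer_b": 0.5}
toy_masses = {"monomer_a": 100.0, "monomer_b": 200.0}
coefficients = monomer_coefficients(toy_fractions, toy_masses, weight_fraction=0.3)

# Sanity check: the coefficients should add back up to the weight fraction.
recovered_mass = sum(coefficients[m] * toy_masses[m] for m in coefficients) / 1000.0
```

          In this toy case the recovered mass equals the input weight fraction, which is the basic mass-balance property a BOF coefficient calculation of this kind should satisfy.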

          Author summary

          The formulation of phenotypic predictions by genome-scale models (GEMs) is dependent on the specified objective. The idea of a biomass objective function (BOF) is to represent all metabolites necessary for cells to double so that optimizing the BOF is equivalent to optimizing growth. Knowledge of the organism’s qualitative and quantitative composition (i.e., which metabolites are necessary for growth and in what proportion) is critical for accurate predictions. We implemented BOFdat with the idea that experimental data should drive the definition of the biomass composition. As omic datasets become more available, integrating them to obtain a condition-specific biomass composition is within reach, and this integration is one of the main features of BOFdat. While major macromolecules, coenzymes, and inorganic ions are ubiquitous components across species, several species-specific components exist in the cell that should be accounted for in the BOF. To identify these, we implemented an approach that minimizes the error between experimental essentiality data and GEM-driven predictions. Hence BOFdat provides an unbiased, data-driven approach to defining the BOF that has the potential to improve the quality of new genome-scale models and greatly decrease the time required to generate a new reconstruction.
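          The idea of minimizing the error between experimental and predicted essentiality can be sketched as scoring candidate biomass compositions by how well their predictions agree with the data. The example below is an illustration only, not BOFdat's implementation: it assumes the Matthews correlation coefficient (MCC) as the agreement score, and the gene names, candidate names, and predicted essentiality calls are hypothetical (in practice the predictions would come from simulating knockouts in the GEM).

```python
import math


def matthews_corrcoef(experimental, predicted):
    """MCC between two {gene: is_essential} dicts over the experimental genes."""
    tp = fp = tn = fn = 0
    for gene, is_essential in experimental.items():
        call = predicted[gene]
        if is_essential and call:
            tp += 1
        elif is_essential and not call:
            fn += 1
        elif not is_essential and call:
            fp += 1
        else:
            tn += 1
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return 0.0 if denominator == 0 else (tp * tn - fp * fn) / denominator


# Hypothetical experimental essentiality calls.
experimental = {"gene1": True, "gene2": True, "gene3": False, "gene4": False}

# Hypothetical predictions obtained with two candidate biomass compositions.
candidate_predictions = {
    "candidate_a": {"gene1": True, "gene2": True, "gene3": False, "gene4": False},
    "candidate_b": {"gene1": True, "gene2": False, "gene3": True, "gene4": False},
}

# Score each candidate and keep the one that agrees best with the data.
scores = {
    name: matthews_corrcoef(experimental, predictions)
    for name, predictions in candidate_predictions.items()
}
best_candidate = max(scores, key=scores.get)
```

          Maximizing an agreement score such as MCC is equivalent to minimizing the prediction error; MCC is a reasonable choice here because essential genes are usually a minority class, but any other error measure could be substituted in this sketch.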


                Author and article information

                Contributors
                Roles: Conceptualization; Data curation; Formal analysis; Investigation; Methodology; Software; Visualization; Writing – original draft; Writing – review & editing
                Roles: Conceptualization; Data curation; Methodology; Software; Writing – review & editing
                Roles: Conceptualization; Methodology; Software; Writing – review & editing
                Roles: Conceptualization; Methodology; Software; Validation
                Roles: Conceptualization; Methodology; Writing – review & editing
                Roles: Validation; Writing – review & editing
                Roles: Funding acquisition; Resources; Supervision
                Roles: Conceptualization; Funding acquisition; Resources; Supervision; Visualization; Writing – review & editing
                Roles: Conceptualization; Supervision; Writing – review & editing
                Roles: Conceptualization; Methodology; Supervision; Visualization; Writing – review & editing
                Roles: Conceptualization; Funding acquisition; Methodology; Resources; Supervision; Visualization; Writing – review & editing
                Role: Editor
                Journal
                PLoS Computational Biology (PLoS Comput Biol)
                Publisher: Public Library of Science (San Francisco, CA, USA)
                ISSN: 1553-734X, 1553-7358
                Published: 22 April 2019
                Volume 15, Issue 4, e1006971
                Affiliations
                [1] Département de Biologie, Université de Sherbrooke, Sherbrooke, Québec, Canada
                [2] Department of Bioengineering, University of California, San Diego, La Jolla, CA, United States of America
                [3] Bioinformatics and Systems Biology Program, University of California, San Diego, La Jolla, CA, United States of America
                [4] Department of Pediatrics, University of California, San Diego, La Jolla, CA, United States of America
                [5] Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kemitorvet, Lyngby, Denmark
                Hebrew University of Jerusalem, ISRAEL
                Author notes

                The authors have declared that no competing interests exist.

                Author information
                http://orcid.org/0000-0002-3096-6995
                http://orcid.org/0000-0001-6663-7643
                http://orcid.org/0000-0002-8293-3909
                http://orcid.org/0000-0003-2357-6785
                http://orcid.org/0000-0002-8630-4800
                http://orcid.org/0000-0003-1238-1499
                http://orcid.org/0000-0002-3961-294X
                Article
                Manuscript number: PCOMPBIOL-D-18-01998
                DOI: 10.1371/journal.pcbi.1006971
                PMC ID: 6497307
                PubMed ID: 31009451
                51b3164e-0ccd-4a37-9496-8efc2c1dafa4
                © 2019 Lachance et al

                This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

                History
                Received: 29 November 2018
                Accepted: 21 March 2019
                Page count
                Figures: 6, Tables: 0, Pages: 20
                Funding
                Funded by: Natural Sciences and Engineering Research Council of Canada (funder ID: http://dx.doi.org/10.13039/501100000038)
                Funded by: Fonds de recherche du Québec – Nature et technologies (FRQNT); Award ID: 206064
                Funded by: Novo Nordisk Foundation through the Center for Biosustainability at the Technical University of Denmark; Award ID: NNF10CC1016517
                This work was supported by Natural Sciences and Engineering Research Council of Canada (NSERC) [to JCL, P-ÉJ and SR], Fonds de recherche du Québec – Nature et technologies (FRQNT) [206064 to SR], Université de Sherbrooke, and by the Novo Nordisk Foundation through the Center for Biosustainability at the Technical University of Denmark [NNF10CC1016517 to BOP, AMF and ZAK]. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
                Categories
                Research Article
                Biology and Life Sciences > Biochemistry > Metabolism > Metabolites
                Computer and Information Sciences > Network Analysis > Metabolic Networks
                Physical Sciences > Chemistry > Polymer Chemistry > Macromolecules
                Biology and Life Sciences > Biochemistry > Enzymology > Enzyme Chemistry > Biochemical Cofactors > Coenzymes
                Biology and Life Sciences > Evolutionary Biology > Evolutionary Genetics
                Biology and Life Sciences > Cell Biology > Cell Physiology > Cell Metabolism
                Biology and Life Sciences > Biochemistry > Lipids
                Biology and Life Sciences > Computational Biology > Genome Analysis > Gene Prediction
                Biology and Life Sciences > Genetics > Genomics > Genome Analysis > Gene Prediction
                Custom metadata
                The data underlying the results presented in the study are available from https://github.com/jclachance/BOFdat.

                Quantitative & Systems biology
