1
views
0
recommends
+1 Recommend
0 collections
    0
    shares
      • Record: found
      • Abstract: found
      • Article: found
      Is Open Access

      Chemical rules for optimization of chemical mutagenicity via matched molecular pairs analysis and machine learning methods

      research-article

      Read this article at

      Bookmark
          There is no author summary for this article yet. Authors can add summaries to their articles on ScienceOpen to make them more accessible to a non-specialist audience.

          Abstract

          Chemical mutagenicity is a serious issue that needs to be addressed in early drug discovery. Over a long period of time, medicinal chemists have manually summarized a series of empirical rules for the optimization of chemical mutagenicity. However, given the rising amount of data, it is getting more difficult for medicinal chemists to identify more comprehensive chemical rules behind the biochemical data. Herein, we integrated a large Ames mutagenicity data set with 8576 compounds to derive mutagenicity transformation rules for reversing Ames mutagenicity via matched molecular pairs analysis. A well-trained consensus model with a reasonable applicability domain was constructed, which showed favorable performance in the external validation set with an accuracy of 0.815. The model was used to assess the generalizability and validity of these mutagenicity transformation rules. The results demonstrated that these rules were of great value and could provide inspiration for the structural modifications of compounds with potential mutagenic effects. We also found that the local chemical environment of the attachment points of rules was critical for successful transformation. To facilitate the use of these mutagenicity transformation rules, we integrated them into ADMETopt2 ( http://lmmd.ecust.edu.cn/admetsar2/admetopt2/), a free web server for optimization of chemical ADMET properties. The above-mentioned approach would be extended to the optimization of other toxicity endpoints.

          Graphical Abstract

          Supplementary Information

          The online version contains supplementary material available at 10.1186/s13321-023-00707-x.

          Related collections

          Most cited references51

          • Record: found
          • Abstract: found
          • Article: found
          Is Open Access

          DrugBank 5.0: a major update to the DrugBank database for 2018

          Abstract DrugBank (www.drugbank.ca) is a web-enabled database containing comprehensive molecular information about drugs, their mechanisms, their interactions and their targets. First described in 2006, DrugBank has continued to evolve over the past 12 years in response to marked improvements to web standards and changing needs for drug research and development. This year’s update, DrugBank 5.0, represents the most significant upgrade to the database in more than 10 years. In many cases, existing data content has grown by 100% or more over the last update. For instance, the total number of investigational drugs in the database has grown by almost 300%, the number of drug-drug interactions has grown by nearly 600% and the number of SNP-associated drug effects has grown more than 3000%. Significant improvements have been made to the quantity, quality and consistency of drug indications, drug binding data as well as drug-drug and drug-food interactions. A great deal of brand new data have also been added to DrugBank 5.0. This includes information on the influence of hundreds of drugs on metabolite levels (pharmacometabolomics), gene expression levels (pharmacotranscriptomics) and protein expression levels (pharmacoprotoemics). New data have also been added on the status of hundreds of new drug clinical trials and existing drug repurposing trials. Many other important improvements in the content, interface and performance of the DrugBank website have been made and these should greatly enhance its ease of use, utility and potential applications in many areas of pharmacological research, pharmaceutical science and drug education.
            Bookmark
            • Record: found
            • Abstract: found
            • Article: not found

            What is a support vector machine?

            Support vector machines (SVMs) are becoming popular in a wide variety of biological applications. But, what exactly are SVMs and how do they work? And what are their most promising applications in the life sciences?
              Bookmark
              • Record: found
              • Abstract: found
              • Article: not found

              Random forest: a classification and regression tool for compound classification and QSAR modeling.

              A new classification and regression tool, Random Forest, is introduced and investigated for predicting a compound's quantitative or categorical biological activity based on a quantitative description of the compound's molecular structure. Random Forest is an ensemble of unpruned classification or regression trees created by using bootstrap samples of the training data and random feature selection in tree induction. Prediction is made by aggregating (majority vote or averaging) the predictions of the ensemble. We built predictive models for six cheminformatics data sets. Our analysis demonstrates that Random Forest is a powerful tool capable of delivering performance that is among the most accurate methods to date. We also present three additional features of Random Forest: built-in performance assessment, a measure of relative importance of descriptors, and a measure of compound similarity that is weighted by the relative importance of descriptors. It is the combination of relatively high prediction accuracy and its collection of desired features that makes Random Forest uniquely suited for modeling in cheminformatics.
                Bookmark

                Author and article information

                Contributors
                ytang234@ecust.edu.cn
                Journal
                J Cheminform
                J Cheminform
                Journal of Cheminformatics
                Springer International Publishing (Cham )
                1758-2946
                20 March 2023
                20 March 2023
                2023
                : 15
                : 35
                Affiliations
                GRID grid.28056.39, ISNI 0000 0001 2163 4895, Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, School of Pharmacy, , East China University of Science and Technology, ; Shanghai, 200237 China
                Article
                707
                10.1186/s13321-023-00707-x
                10029263
                36941726
                73bc2260-ccda-40dc-aec5-1a0cfcbc4185
                © The Author(s) 2023

                Open AccessThis article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

                History
                : 11 November 2022
                : 6 March 2023
                Funding
                Funded by: FundRef http://dx.doi.org/10.13039/501100012166, National Key Research and Development Program of China;
                Award ID: Grant 2019YFA0904800
                Funded by: National Natural Science Foundation of China
                Award ID: Grants 81872800
                Award ID: 82173746
                Funded by: 111 Project
                Award ID: Grant BP0719034
                Categories
                Research
                Custom metadata
                © The Author(s) 2023

                Chemoinformatics
                mutagenicity optimization,lead optimization,matched molecular pairs analysis,machine learning,consensus model

                Comments

                Comment on this article