      Machine learning with physicochemical relationships: solubility prediction in organic solvents and water

      research-article


          Abstract

          Solubility prediction remains a critical challenge in drug development, synthetic route and chemical process design, extraction and crystallisation. Here we report a successful approach to solubility prediction in organic solvents and water using a combination of machine learning (ANN, SVM, RF, ExtraTrees, Bagging and GP) and computational chemistry. Rational interpretation of the dissolution process as a numerical problem led to a small set of selected descriptors and subsequent predictions which are independent of the applied machine learning method. These models gave significantly more accurate predictions than benchmarked open-access and commercial tools, achieving accuracy close to the expected level of noise in the training data (LogS ± 0.7). Finally, they reproduced the physicochemical relationships between solubility and molecular properties in different solvents, which led to rational approaches to improving the accuracy of each model.
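The abstract compares model accuracy against a noise level of about 0.7 LogS units. As a minimal sketch of how such a comparison could be made, the snippet below computes the root-mean-square error of a set of predictions and the fraction that fall within a tolerance of the measured value. The function names, the use of 0.7 as a per-compound tolerance, and all numeric values are assumptions made for illustration; the abstract does not state which metrics the authors used, and none of the data come from the article.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between measured and predicted LogS values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def fraction_within(y_true, y_pred, tol=0.7):
    """Fraction of predictions whose absolute error is at most `tol` LogS units.

    The default of 0.7 mirrors the noise level quoted in the abstract;
    using it as a per-compound tolerance is an assumption of this sketch.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(1 for t, p in zip(y_true, y_pred) if abs(t - p) <= tol)
    return hits / len(y_true)


# Hypothetical LogS values, invented for illustration only.
measured = [-1.0, -2.5, -3.0, 0.5, -4.0]
predicted = [-1.5, -2.0, -4.0, 0.5, -3.5]

print(f"RMSE: {rmse(measured, predicted):.3f} LogS units")
print(f"Within ±0.7: {fraction_within(measured, predicted):.0%}")
```

A model whose RMSE approaches the noise level of its training data has little room left to improve on that data set, which is one way to read the abstract's claim.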

          Abstract

          Accurate prediction of solubility represents a challenge for traditional computational approaches due to the complex nature of the phenomena involved. Here the authors report a successful approach to solubility prediction in organic solvents and water using a combination of machine learning and computational chemistry.

                Author and article information

                Contributors
                b.nguyen@leeds.ac.uk
                Journal
                Nat Commun
                Nature Communications
                Nature Publishing Group UK (London)
                2041-1723
                13 November 2020
                2020
                Volume: 11
                Article number: 5753
                Affiliations
                [1] Institute of Process Research & Development, School of Chemistry, University of Leeds, Woodhouse Lane, Leeds, LS2 9JT, UK (GRID grid.9909.9; ISNI 0000 0004 1936 8403)
                [2] Chemical Development, Pharmaceutical Technology and Development, Operations, AstraZeneca, Macclesfield, SK10 2NA, UK (GRID grid.417815.e; ISNI 0000 0004 5929 4381)
                Author information
                http://orcid.org/0000-0002-3166-2782
                http://orcid.org/0000-0003-3872-7996
                http://orcid.org/0000-0002-0254-025X
                Article
                19594
                DOI: 10.1038/s41467-020-19594-z
                PMC ID: 7666209
                PubMed ID: 33188226
                d575325b-4f64-4e73-816e-74f1ed978768
                © The Author(s) 2020

                Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

                History
                Received: 23 April 2020
                Accepted: 12 October 2020
                Categories
                Article
                Uncategorized
                cheminformatics, computational chemistry, computational science, statistics
