
      Predicting reaction performance in C–N cross-coupling using machine learning

      Science
      American Association for the Advancement of Science (AAAS)


          Abstract

          Machine learning methods are becoming integral to scientific inquiry in numerous disciplines. Here we demonstrate that machine learning can be used to predict the performance of a synthetic reaction in multidimensional chemical space using data obtained via high-throughput experimentation. We created scripts to compute and extract atomic, molecular, and vibrational descriptors for the components of a palladium-catalyzed Buchwald-Hartwig cross-coupling of aryl halides with 4-methylaniline in the presence of various potentially inhibitory additives. Using these descriptors as inputs and reaction yield as output, we show that a random forest algorithm provides significantly improved predictive performance over linear regression analysis. The random forest model was also successfully applied to sparse training sets and out-of-sample prediction, suggesting its value in facilitating adoption of synthetic methodology.
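The comparison described in the abstract (descriptors in, yield out, random forest versus linear regression) can be illustrated with a minimal, dependency-free Python sketch. This is not the authors' code or data: the three synthetic "descriptors", the window-shaped yield function, and the helper names (`fit_linear`, `fit_forest`, `make_data`) are all assumptions made for illustration, and the trees are depth-limited for speed.

```python
import random

def fit_linear(X, y):
    """Ordinary least squares with intercept, via Gauss-Jordan on the normal equations."""
    rows = [[1.0] + list(x) for x in X]
    n = len(rows[0])
    # Augmented system [A^T A | A^T y]
    M = [[sum(r[i] * r[j] for r in rows) for j in range(n)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda k: abs(M[k][c]))  # partial pivoting
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    w = [M[i][n] / M[i][i] for i in range(n)]
    return lambda x: w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))

def sse(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)

def build_tree(X, y, rng, mtry, depth=0, max_depth=6, min_leaf=3):
    """Regression tree; each node considers only `mtry` randomly chosen features."""
    mean = sum(y) / len(y)
    if depth >= max_depth or len(y) < 2 * min_leaf or max(y) == min(y):
        return mean  # leaf: predict the mean yield
    best = None
    for j in rng.sample(range(len(X[0])), mtry):
        for t in sorted(set(x[j] for x in X))[1:]:  # candidate thresholds
            left = [yi for x, yi in zip(X, y) if x[j] < t]
            right = [yi for x, yi in zip(X, y) if x[j] >= t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            score = sse(left) + sse(right)
            if best is None or score < best[0]:
                best = (score, j, t)
    if best is None:
        return mean
    _, j, t = best
    lo = [(x, yi) for x, yi in zip(X, y) if x[j] < t]
    hi = [(x, yi) for x, yi in zip(X, y) if x[j] >= t]
    return (j, t,
            build_tree([x for x, _ in lo], [v for _, v in lo], rng, mtry,
                       depth + 1, max_depth, min_leaf),
            build_tree([x for x, _ in hi], [v for _, v in hi], rng, mtry,
                       depth + 1, max_depth, min_leaf))

def predict_tree(node, x):
    """Walk from the root to a leaf and return the leaf value."""
    while isinstance(node, tuple):
        j, t, left, right = node
        node = left if x[j] < t else right
    return node

def fit_forest(X, y, n_trees=15, mtry=2, seed=0):
    """Bagged trees: one bootstrap sample per tree, predictions averaged."""
    rng = random.Random(seed)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]  # sample with replacement
        trees.append(build_tree([X[i] for i in idx], [y[i] for i in idx], rng, mtry))
    return lambda x: sum(predict_tree(t, x) for t in trees) / len(trees)

def make_data(n, rng):
    """Synthetic stand-in for descriptors and yields (an assumption, not real data)."""
    X = [[rng.random() for _ in range(3)] for _ in range(n)]
    # Yield is high only inside a window of descriptor 0: a non-linear response.
    y = [(80.0 if 0.3 < x[0] < 0.7 else 10.0) + 5.0 * x[1] + rng.gauss(0, 2)
         for x in X]
    return X, y

def mse(model, X, y):
    return sum((model(x) - t) ** 2 for x, t in zip(X, y)) / len(y)

rng = random.Random(42)
X_train, y_train = make_data(120, rng)
X_test, y_test = make_data(100, rng)
lin_mse = mse(fit_linear(X_train, y_train), X_test, y_test)
rf_mse = mse(fit_forest(X_train, y_train), X_test, y_test)
print(f"linear regression test MSE: {lin_mse:.1f}")
print(f"random forest test MSE:     {rf_mse:.1f}")
```

On synthetic data of this kind, where yield responds non-monotonically to a descriptor, the forest would be expected to reach a lower held-out error than the linear fit. How large the gap is on real reaction data depends on the dataset and descriptors.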


          Most cited references (28)


          Random forest: a classification and regression tool for compound classification and QSAR modeling.

          A new classification and regression tool, Random Forest, is introduced and investigated for predicting a compound's quantitative or categorical biological activity based on a quantitative description of the compound's molecular structure. Random Forest is an ensemble of unpruned classification or regression trees created by using bootstrap samples of the training data and random feature selection in tree induction. Prediction is made by aggregating (majority vote or averaging) the predictions of the ensemble. We built predictive models for six cheminformatics data sets. Our analysis demonstrates that Random Forest is a powerful tool capable of delivering performance that is among the most accurate methods to date. We also present three additional features of Random Forest: built-in performance assessment, a measure of relative importance of descriptors, and a measure of compound similarity that is weighted by the relative importance of descriptors. It is the combination of relatively high prediction accuracy and its collection of desired features that makes Random Forest uniquely suited for modeling in cheminformatics.

            Applied Regression Analysis


              Computer-Assisted Synthetic Planning: The End of the Beginning


                Author and article information

                Journal
                Science
                American Association for the Advancement of Science (AAAS)
                ISSN: 0036-8075 (print), 1095-9203 (online)
                April 12 2018
                April 13 2018
                April 13 2018
                February 15 2018
                Volume: 360
                Issue: 6385
                Pages: 186-190
                Article
                DOI: 10.1126/science.aar5169
                PMID: 29449509
                © 2018

                http://www.sciencemag.org/about/science-licenses-journal-article-reuse
