      Open Access

      Multistep retrosynthesis combining a disconnection aware triple transformer loop with a route penalty score guided tree search†

      research-article
      David Kreutter and Jean-Louis Reymond
      Chemical Science
      The Royal Society of Chemistry


          Abstract

          Computer-aided synthesis planning (CASP) aims to automatically learn organic reactivity from the literature and to perform retrosynthesis of unseen molecules. CASP systems must learn reactions precisely enough to propose realistic disconnections, while avoiding overfitting so as to leave room for diverse options, and must explore possible routes broadly enough that short synthetic sequences can emerge. Herein we report an open-source CASP tool proposing original solutions to both challenges. First, we use a triple transformer loop (TTL) predicting starting materials (T1), reagents (T2), and products (T3) to explore various disconnection sites defined by combining systematic, template-based, and transformer-based tagging procedures. Second, we integrate the TTL into a multistep tree search algorithm (TTLA) that prioritizes sequences using a route penalty score (RPScore) considering the number of steps, their confidence scores, and the simplicity of all intermediates along the route. Our approach favours short synthetic routes to commercial starting materials, as exemplified by retrosynthetic analyses of recently approved drugs.
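The abstract describes the RPScore only qualitatively: it penalizes the number of steps, low-confidence disconnections, and complex intermediates. The paper's exact formula is not reproduced here; the following is a minimal illustrative sketch in which the weights and the heavy-atom-count simplicity proxy are assumptions for demonstration only.

```python
# Toy route-penalty score in the spirit of the RPScore described in the
# abstract: lower penalty = shorter, more confident route with simpler
# intermediates. Weights and the simplicity proxy are illustrative.

def route_penalty(route, step_weight=1.0, conf_weight=1.0, simpl_weight=1.0):
    """Score a retrosynthetic route; lower is better.

    `route` is a list of steps, each a dict with:
      - "confidence": model confidence in (0, 1] for the disconnection
      - "intermediate_atoms": heavy-atom count of the intermediate,
        a crude simplicity proxy (smaller = simpler).
    """
    penalty = step_weight * len(route)  # favour short routes
    for step in route:
        penalty += conf_weight * (1.0 - step["confidence"])         # low confidence hurts
        penalty += simpl_weight * step["intermediate_atoms"] / 100.0  # complexity hurts
    return penalty

# A short, confident route should beat the same route with an extra
# low-confidence step through a larger intermediate.
short_confident = [{"confidence": 0.9, "intermediate_atoms": 20},
                   {"confidence": 0.8, "intermediate_atoms": 15}]
long_uncertain = short_confident + [{"confidence": 0.3, "intermediate_atoms": 40}]
assert route_penalty(short_confident) < route_penalty(long_uncertain)
```

Ranking candidate routes by such a penalty is what lets the tree search prefer short sequences to commercial building blocks over long, speculative ones.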

          Abstract

          An efficient transformer-based retrosynthesis model, the triple-transformer loop algorithm (TTLA), is reported and proposes short routes from commercial building blocks for a variety of drugs.

          Related collections

          Most cited references (46)

          Attention Is All You Need

          The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train. Our model achieves 28.4 BLEU on the WMT 2014 English-to-German translation task, improving over the existing best results, including ensembles, by over 2 BLEU. On the WMT 2014 English-to-French translation task, our model establishes a new single-model state-of-the-art BLEU score of 41.8 after training for 3.5 days on eight GPUs, a small fraction of the training costs of the best models from the literature. We show that the Transformer generalizes well to other tasks by applying it successfully to English constituency parsing both with large and limited training data.
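The core operation of the Transformer referenced here is scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V. A dependency-free Python sketch of that single operation (omitting multi-head projections and masking):

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(QK^T / sqrt(d_k)) V.

    Q, K, V are lists of equal-width row vectors (lists of floats).
    Returns one output row per query row.
    """
    d_k = len(K[0])
    out = []
    for q in Q:
        # similarity of this query to every key, scaled by sqrt(d_k)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in K]
        weights = softmax(scores)
        # weighted average of the value rows
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out

# A query aligned with the first key attends mostly to the first value.
out = attention([[1.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
assert out[0][0] > out[0][1]
```

Because attention weights are a softmax, each output row is a convex combination of the value rows, which is what lets the model mix information across sequence positions without recurrence.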

            SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules


              Planning chemical syntheses with deep neural networks and symbolic AI

              To plan the syntheses of small organic molecules, chemists use retrosynthesis, a problem-solving technique in which target molecules are recursively transformed into increasingly simpler precursors. Computer-aided retrosynthesis would be a valuable tool but at present it is slow and provides results of unsatisfactory quality. Here we use Monte Carlo tree search and symbolic artificial intelligence (AI) to discover retrosynthetic routes. We combined Monte Carlo tree search with an expansion policy network that guides the search, and a filter network to pre-select the most promising retrosynthetic steps. These deep neural networks were trained on essentially all reactions ever published in organic chemistry. Our system solves for almost twice as many molecules, thirty times faster than the traditional computer-aided search method, which is based on extracted rules and hand-designed heuristics. In a double-blind AB test, chemists on average considered our computer-generated routes to be equivalent to reported literature routes.
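The scheme described above, policy-guided expansion of target molecules into precursors until everything is purchasable, can be illustrated in greatly simplified form by a best-first search over a toy template table. This is not the paper's MCTS with neural policy and filter networks: molecules are plain strings, and `TEMPLATES` and `STOCK` are invented illustrative data standing in for the learned policy and the building-block catalogue.

```python
import heapq

STOCK = {"A", "B", "C"}        # purchasable building blocks
TEMPLATES = {                   # target -> (alternative precursor sets, policy score)
    "AB":  ([("A", "B")], 0.9),
    "ABC": ([("AB", "C"), ("A", "BC")], 0.8),
    "BC":  ([("B", "C")], 0.7),
}

def solve(target, max_iters=100):
    """Best-first retrosynthetic search; returns a list of
    (product, precursors) steps reaching STOCK, or None."""
    # frontier entries: (cost, unsolved molecules, steps so far);
    # low-confidence disconnections accumulate cost, so confident
    # expansions are explored first.
    frontier = [(0.0, (target,), [])]
    for _ in range(max_iters):
        if not frontier:
            return None
        cost, open_mols, steps = heapq.heappop(frontier)
        if not open_mols:
            return steps            # every branch reached stock
        mol, rest = open_mols[0], open_mols[1:]
        if mol in STOCK:
            heapq.heappush(frontier, (cost, rest, steps))
            continue
        if mol not in TEMPLATES:
            continue                # dead end: no known disconnection
        precursor_sets, score = TEMPLATES[mol]
        for precursors in precursor_sets:
            heapq.heappush(frontier,
                           (cost + (1.0 - score),
                            rest + precursors,
                            steps + [(mol, precursors)]))
    return None

route = solve("ABC")
assert route is not None and route[0][0] == "ABC"
```

The real systems replace the static template table with learned expansion policies and add rollout or value estimates, but the control flow (expand the most promising open molecule, stop when all leaves are purchasable) is the same shape.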

                Author and article information

                Journal
                Chem Sci
                Chem Sci
                SC
                CSHCBM
                Chemical Science
                The Royal Society of Chemistry
                2041-6520
                2041-6539
                1 September 2023
                20 September 2023
                Volume: 14
                Issue: 36
                Pages: 9959-9969
                Affiliations
                [a] Department of Chemistry, Biochemistry and Pharmaceutical Sciences, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland. david.kreutter@unibe.ch; jean-louis.reymond@unibe.ch
                Author information
                https://orcid.org/0000-0003-2487-1049
                https://orcid.org/0000-0003-2724-2942
                Article
                d3sc01604h
                DOI: 10.1039/d3sc01604h
                PMCID: PMC10510629
                PMID: 37736648
                5cf1978c-dc7b-49ae-a28b-e822929e1b1f
                This journal is © The Royal Society of Chemistry
                History
                Received: 27 March 2023
                Accepted: 30 August 2023
                Page count
                Pages: 11
                Funding
                Funded by: Novartis, doi 10.13039/100004336;
                Award ID: Unassigned
                Funded by: University of Bern, doi 10.13039/100009068;
                Award ID: Unassigned
                Categories
                Chemistry
                Custom metadata
                Paginated Article
