Deep learning for molecular design—a review of the state of the art

Author(s): Daniel C. Elton, Zois Boukouvalas, Mark D. Fuge, Peter W. Chung

Publication date (Electronic): 2019

Journal: Molecular Systems Design & Engineering

Publisher: Royal Society of Chemistry (RSC)

Abstract

We review a recent groundswell of work which uses deep learning techniques to generate and optimize molecules.

In the space of only a few years, deep generative modeling has revolutionized how we think of artificial creativity, yielding autonomous systems which produce original images, music, and text. Inspired by these successes, researchers are now applying deep generative modeling techniques to the generation and optimization of molecules; in our review we found 45 papers on the subject published in the past two years. These works point to a future where such systems will be used to generate lead molecules, greatly reducing resources spent downstream synthesizing and characterizing bad leads in the lab. In this review we survey the increasingly complex landscape of models and representation schemes that have been proposed. The four classes of techniques we describe are recurrent neural networks, autoencoders, generative adversarial networks, and reinforcement learning. After first discussing some of the mathematical fundamentals of each technique, we draw high-level connections and comparisons with other techniques and expose the pros and cons of each. Several important high-level themes emerge as a result of this work, including the shift away from the SMILES string representation of molecules towards more sophisticated representations such as graph grammars and 3D representations, the importance of reward function design, the need for better standards for benchmarking and testing, and the benefits of adversarial training and reinforcement learning over maximum-likelihood-based training.

Most cited references 70

Machine learning for molecular and materials science

Olexandr Isayev, Hugh Cartwright, Daniel W. Davies … (2018)

Planning chemical syntheses with deep neural networks and symbolic AI

Mike Preuss, Mark Waller, Marwin H. S. Segler (2018)

To plan the syntheses of small organic molecules, chemists use retrosynthesis, a problem-solving technique in which target molecules are recursively transformed into increasingly simpler precursors. Computer-aided retrosynthesis would be a valuable tool but at present it is slow and provides results of unsatisfactory quality. Here we use Monte Carlo tree search and symbolic artificial intelligence (AI) to discover retrosynthetic routes. We combined Monte Carlo tree search with an expansion policy network that guides the search, and a filter network to pre-select the most promising retrosynthetic steps. These deep neural networks were trained on essentially all reactions ever published in organic chemistry. Our system solves for almost twice as many molecules, thirty times faster than the traditional computer-aided search method, which is based on extracted rules and hand-designed heuristics. In a double-blind AB test, chemists on average considered our computer-generated routes to be equivalent to reported literature routes.