      Is Open Access

      Recent applications of deep learning and machine intelligence on in silico drug discovery: methods, tools and databases




          The identification of interactions between drugs/compounds and their targets is crucial for the development of new drugs. In vitro screening experiments (i.e. bioassays) are frequently used for this purpose; however, experimental approaches are insufficient to explore novel drug-target interactions, mainly because of feasibility problems: they are labour intensive, costly and time consuming. A computational field known as ‘virtual screening’ (VS) has emerged in the past decades to aid experimental drug discovery studies by statistically estimating unknown bio-interactions between compounds and biological targets. These methods use the physico-chemical and structural properties of compounds and/or target proteins, along with experimentally verified bio-interaction information, to generate predictive models. Lately, sophisticated machine learning techniques have been applied in VS to improve predictive performance.

          The objective of this study is to examine and discuss the recent applications of machine learning techniques in VS, including deep learning, which became highly popular after giving rise to epochal developments in the fields of computer vision and natural language processing. The past 3 years have witnessed an unprecedented number of research studies considering the application of deep learning in biomedicine, including computational drug discovery. In this review, we first describe the main instruments of VS methods, including compound and protein features (i.e. representations and descriptors), frequently used libraries and toolkits for VS, bioactivity databases and gold-standard data sets for system training and benchmarking. We subsequently review recent VS studies with a strong emphasis on deep learning applications. Finally, we discuss the present state of the field, including the current challenges, and suggest future directions. We believe that this survey will provide insight to researchers working in the field of computational drug discovery in terms of comprehending and developing novel bio-prediction methods.
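
As a rough illustration of the ligand-based VS idea summarised in the abstract (compound features plus experimentally verified activity labels are used to build a predictive model), the sketch below scores a query compound by the activity labels of its most similar training compounds. It is a minimal toy example and is not taken from the article: the fingerprints are made-up sets of bit indices rather than real molecules, and the names `tanimoto`, `predict_active` and `TRAINING` are hypothetical. A nearest-neighbour vote stands in for the far more sophisticated machine learning models the review covers.

```python
from typing import FrozenSet, List, Tuple

# A binary fingerprint is represented by the set of its "on" bit indices.
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto (Jaccard) similarity between two bit sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def predict_active(
    query: Fingerprint,
    training: List[Tuple[Fingerprint, int]],
    k: int = 3,
) -> float:
    """Return the fraction of the k most similar training compounds labelled active."""
    if not training or k < 1:
        raise ValueError("need at least one training compound and k >= 1")
    ranked = sorted(training, key=lambda item: tanimoto(query, item[0]), reverse=True)
    top = ranked[:k]
    return sum(label for _, label in top) / len(top)


# Hypothetical toy data for one target: (fingerprint, 1 = active, 0 = inactive).
TRAINING: List[Tuple[Fingerprint, int]] = [
    (frozenset({1, 2, 3, 4}), 1),
    (frozenset({1, 2, 3, 5}), 1),
    (frozenset({2, 3, 4, 6}), 1),
    (frozenset({7, 8, 9}), 0),
    (frozenset({7, 8, 10}), 0),
    (frozenset({8, 9, 11}), 0),
]

if __name__ == "__main__":
    query = frozenset({1, 2, 3, 6})
    print(predict_active(query, TRAINING, k=3))
```

In practice the fingerprints would come from a cheminformatics toolkit and the labels from a bioactivity database, both of which the review surveys; the toy sets here only make the similarity arithmetic easy to follow.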

                Author and article information

                Journal: Briefings in Bioinformatics (Brief Bioinform)
                Publisher: Oxford University Press
                Issue date: September 2019
                Published online: 31 July 2018
                Volume: 20
                Issue: 5
                Pages: 1878-1912
                [1] Department of Computer Engineering, Middle East Technical University, Ankara, Turkey
                [1a] Department of Computer Engineering, İskenderun Technical University, Hatay, Turkey
                [2] Cancer System Biology Laboratory (CanSyL), Graduate School of Informatics, Middle East Technical University, Ankara, Turkey
                [3] European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL–EBI), Cambridge, Hinxton, UK
                [4] Cancer System Biology Laboratory (CanSyL), Graduate School of Informatics, Middle East Technical University, Ankara, Turkey and European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL–EBI), Cambridge, Hinxton, UK
                Author notes
                Corresponding author: Tunca Doğan, Cancer System Biology Laboratory (CanSyL), Graduate School of Informatics, Middle East Technical University, Ankara, 06800, Turkey. E-mail: tuncadogan@ 123456gmail.com
                © The Author(s) 2018. Published by Oxford University Press.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

                : 25 January 2018
                : 25 May 2018
                Page count
                Pages: 36
                Funded by: Turkish Ministry of Development
                Funded by: KanSiL
                Award ID: KanSil_2016K121540
                Funded by: Newton/Katip Celebi Institutional Links
                Funded by: TUBITAK
                Funded by: Turkey and British Council
                Award ID: 116E930
                Funded by: European Molecular Biology Laboratory 10.13039/100013060
                Review Articles

                Bioinformatics & Computational biology
                Keywords: virtual screening, drug-target interactions, ligand-based VS and proteochemometric modelling, machine learning, deep learning, compound and bioactivity databases, gold-standard data sets

