      DeepConv-DTI: Prediction of drug-target interactions via deep learning with convolution on protein sequences

      research-article
      PLoS Computational Biology
      Public Library of Science

          Abstract

          Identification of drug-target interactions (DTIs) plays a key role in drug discovery. The high cost and labor-intensive nature of in vitro and in vivo experiments have highlighted the importance of in silico DTI prediction approaches. In several computational models, conventional protein descriptors have been shown to be insufficiently informative for accurate DTI prediction. Thus, in this study, we propose a deep learning-based DTI prediction model that captures local residue patterns of proteins participating in DTIs. We apply a convolutional neural network (CNN) to raw protein sequences, performing convolution over amino acid subsequences of various lengths to capture local residue patterns of generalized protein classes. We train our model with large-scale DTI information and demonstrate its performance on an independent dataset that is not seen during the training phase. Our model performs better than previous protein descriptor-based models and better than recently developed deep learning models for massive prediction of DTIs. By examining pooled convolution results, we confirmed that our model can detect binding sites of proteins for DTIs. In conclusion, our prediction model for detecting local residue patterns of target proteins successfully enriches the protein features of a raw protein sequence, yielding better prediction results than previous approaches. Our code is available at https://github.com/GIST-CSBL/DeepConv-DTI.
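The idea of convolving over amino acid subsequences of several lengths and then pooling can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation (see their repository for that): the one-hot encoding, the ReLU activation, the use of global max pooling, and all names (`one_hot`, `conv_max_pool`, `protein_features`) are assumptions made for illustration, and the kernel weights here are hand-set rather than learned.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot(sequence):
    """Encode a protein sequence as a list of 20-dimensional one-hot vectors.

    Assumption: unknown residue letters are encoded as all-zero vectors.
    """
    encoded = []
    for aa in sequence:
        vec = [0.0] * len(AMINO_ACIDS)
        if aa in AA_INDEX:
            vec[AA_INDEX[aa]] = 1.0
        encoded.append(vec)
    return encoded


def conv_max_pool(encoded, kernel):
    """Slide one kernel (window x 20 weights) along the sequence.

    Returns (max activation, start position of that window). A position
    of -1 means no window produced a positive activation after ReLU.
    """
    window = len(kernel)
    best_score, best_pos = 0.0, -1
    for start in range(len(encoded) - window + 1):
        score = sum(
            kernel[k][c] * encoded[start + k][c]
            for k in range(window)
            for c in range(len(AMINO_ACIDS))
        )
        score = max(score, 0.0)  # ReLU (assumed activation)
        if score > best_score:
            best_score, best_pos = score, start
    return best_score, best_pos


def protein_features(sequence, kernels_by_window):
    """Apply kernels of several window sizes and max-pool each one.

    kernels_by_window maps a window size to a list of kernels of that size.
    Returns the pooled feature vector and, per feature, the
    (window size, start position) that produced the maximum.
    """
    encoded = one_hot(sequence)
    features, positions = [], []
    for window, kernels in sorted(kernels_by_window.items()):
        for kernel in kernels:
            score, pos = conv_max_pool(encoded, kernel)
            features.append(score)
            positions.append((window, pos))
    return features, positions


# Hand-set kernel of window size 2 that responds to the residue pair "GK".
gk_kernel = [[0.0] * len(AMINO_ACIDS) for _ in range(2)]
gk_kernel[0][AA_INDEX["G"]] = 1.0
gk_kernel[1][AA_INDEX["K"]] = 1.0

features, positions = protein_features("AAGKAA", {2: [gk_kernel]})
print(features, positions)  # [2.0] [(2, 2)]
```

Keeping the position of each pooled maximum is what makes it possible, in principle, to ask which part of the sequence a filter responded to; that is the kind of inspection the abstract refers to when it mentions examining pooled convolution results. In a trained model the pooled features would then be combined with drug features to predict an interaction.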

          Author summary

          Drugs work by interacting with target proteins to activate or inhibit a target’s biological process. Therefore, identification of DTIs is a crucial step in drug discovery. However, identifying drug candidates via biological assays is time-consuming and costly, which creates the need for computational approaches to DTI prediction. In this work, we constructed a novel DTI prediction model that extracts local residue patterns from target protein sequences using a CNN-based deep learning approach. The detected local features of protein sequences perform better for DTI prediction than other protein descriptors, and the model outperforms previous models on independent test datasets from PubChem. That is, capturing local residue patterns with a CNN successfully enriches protein features from a raw sequence.

                Author and article information

                Contributors
                Roles: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Software, Validation, Visualization, Writing – original draft, Writing – review & editing
                Roles: Conceptualization, Data curation, Methodology, Software, Validation
                Roles: Conceptualization, Funding acquisition, Investigation, Methodology, Project administration, Resources, Supervision, Validation, Writing – review & editing
                Role: Editor
                Journal: PLoS Computational Biology (PLoS Comput Biol)
                Publisher: Public Library of Science (San Francisco, CA, USA)
                ISSN: 1553-734X, 1553-7358
                Published: 14 June 2019
                Volume 15, Issue 6, e1007129
                Affiliations
                [001] School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology, Buk-ku, Gwangju, Republic of Korea
                University of Houston, UNITED STATES
                Author notes

                No authors have competing interests.

                Author information
                http://orcid.org/0000-0002-8958-2945
                http://orcid.org/0000-0002-5109-9114
                Article
                Manuscript ID: PCOMPBIOL-D-18-01686
                DOI: 10.1371/journal.pcbi.1007129
                PMCID: PMC6594651
                PMID: 31199797
                fcfa5021-7c28-4c48-a15d-f82ff218158c
                © 2019 Lee et al

                This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

                History
                Received: 1 October 2018
                Accepted: 24 May 2019
                Page count
                Figures: 7, Tables: 0, Pages: 21
                Funding
                Funded by: Ministry of Science, ICT and Future Planning (funder ID: http://dx.doi.org/10.13039/501100003621)
                Award ID: NRF-2018M3A9A7053266
                Funded by: Bio-Synergy Research Project
                Award ID: NRF-2017M3A9C4092978
                This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (NRF-2018M3A9A7053266) and by the Bio-Synergy Research Project (NRF-2017M3A9C4092978) of the Ministry of Science and ICT through the National Research Foundation. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
                Categories
                Research Article
                Research and Analysis Methods > Mathematical and Statistical Techniques > Mathematical Functions > Convolution
                Research and Analysis Methods > Mathematical and Statistical Techniques > Statistical Methods > Forecasting
                Physical Sciences > Mathematics > Statistics > Statistical Methods > Forecasting
                Research and Analysis Methods > Database and Informatics Methods > Bioinformatics > Sequence Analysis > Sequence Motif Analysis
                Computer and Information Sciences > Artificial Intelligence > Machine Learning > Deep Learning
                Computer and Information Sciences > Neural Networks
                Biology and Life Sciences > Neuroscience > Neural Networks
                Biology and Life Sciences > Biochemistry > Enzymology > Enzymes > Protein Kinases
                Biology and Life Sciences > Biochemistry > Proteins > Enzymes > Protein Kinases
                Medicine and Health Sciences > Pharmacology > Drug Research and Development > Drug Discovery
                Research and Analysis Methods > Extraction Techniques > Protein Extraction
                Custom metadata
                vor-update-to-uncorrected-proof
                2019-06-26
                All code used in the manuscript is available from the GitHub repository (https://github.com/GIST-CSBL/DeepConv-DTI).

                Quantitative & Systems biology