
      DeepAffinity: interpretable deep learning of compound–protein affinity through unified recurrent and convolutional neural networks

      Bioinformatics
      Oxford University Press (OUP)


          Abstract

          Motivation

          Drug discovery demands rapid quantification of compound–protein interaction (CPI). However, there is a lack of methods that can predict compound–protein affinity from sequences alone with high applicability, accuracy and interpretability.

          Results

          We present a seamless integration of domain knowledge and learning-based approaches. Under novel representations of structurally annotated protein sequences, we propose a semi-supervised deep learning model that unifies recurrent and convolutional neural networks to exploit both unlabeled and labeled data, jointly encoding molecular representations and predicting affinities. Our representations and models outperform conventional options, achieving relative error in IC50 within 5-fold for test cases and 20-fold for protein classes not included in training. Performance for new protein classes with few labeled data is further improved by transfer learning. Furthermore, separate and joint attention mechanisms are developed and embedded in our model to add to its interpretability, as illustrated in case studies for predicting and explaining selective drug–target interactions. Lastly, alternative representations using protein sequences or compound graphs and a unified RNN/GCNN-CNN model using graph CNN (GCNN) are also explored to reveal algorithmic challenges ahead.
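
To make the "within 5-fold" and "within 20-fold" figures concrete, the sketch below shows how a fold error in IC50 relates to an error on the log10 scale. It assumes that fold error means the ratio between the predicted and the measured IC50 (taken so that it is always at least 1); the abstract does not define the metric, and the function names `fold_error` and `fold_to_log10_error` are hypothetical helpers, not part of the DeepAffinity code.

```python
import math


def fold_error(measured_ic50, predicted_ic50):
    """Multiplicative (fold) error between two positive IC50 values.

    Assumption: fold error is the ratio of the two values, oriented so
    that the result is always >= 1 (over- and under-prediction are
    treated symmetrically).
    """
    if measured_ic50 <= 0 or predicted_ic50 <= 0:
        raise ValueError("IC50 values must be positive")
    log_diff = abs(math.log10(predicted_ic50) - math.log10(measured_ic50))
    return 10 ** log_diff


def fold_to_log10_error(fold):
    """Absolute error in log10(IC50) units that corresponds to a fold error."""
    if fold < 1:
        raise ValueError("fold error must be >= 1")
    return math.log10(fold)


# A prediction of 500 nM against a measured 100 nM is a 5-fold error,
# and so is a prediction of 20 nM.
five_fold_over = fold_error(100.0, 500.0)
five_fold_under = fold_error(100.0, 20.0)

# 5-fold and 20-fold correspond to roughly 0.70 and 1.30 log10 units.
log_err_5 = fold_to_log10_error(5.0)
log_err_20 = fold_to_log10_error(20.0)
```

Under this reading, a 5-fold error is about 0.7 log10 units and a 20-fold error about 1.3 log10 units, which is a convenient way to compare fold-based statements with errors reported on a logarithmic affinity scale.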

          Availability and implementation

          Data and source code are available at https://github.com/Shen-Lab/DeepAffinity.

          Supplementary information

          Supplementary data are available at Bioinformatics online.


                Author and article information

                Contributors
                Journal
                Bioinformatics
                Oxford University Press (OUP)
                1367-4803
                1460-2059
                September 15 2019
                February 15 2019
                Volume: 35
                Issue: 18
                Pages: 3329-3338
                Affiliations
                [1] Department of Electrical and Computer Engineering, College Station, TX, USA
                [2] TEES–AgriLife Center for Bioinformatics and Genomic Systems Engineering, College Station, TX, USA
                [3] Department of Computer Science and Engineering, Texas A&M University, College Station, TX, USA
                Article
                10.1093/bioinformatics/btz111
                30768156
                © 2019

                https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model
