      DeepCLIP: predicting the effect of mutations on protein–RNA binding with deep learning

      research-article


          Abstract

          Nucleotide variants can cause functional changes by altering protein–RNA binding in various ways that are not easy to predict. This can affect processes such as splicing, nuclear shuttling, and stability of the transcript. Therefore, correct modeling of protein–RNA binding is critical when predicting the effects of sequence variations. Many RNA-binding proteins recognize a diverse set of motifs and binding is typically also dependent on the genomic context, making this task particularly challenging. Here, we present DeepCLIP, the first method for context-aware modeling and predicting protein binding to RNA nucleic acids using exclusively sequence data as input. We show that DeepCLIP outperforms existing methods for modeling RNA-protein binding. Importantly, we demonstrate that DeepCLIP predictions correlate with the functional outcomes of nucleotide variants in independent wet lab experiments. Furthermore, we show how DeepCLIP binding profiles can be used in the design of therapeutically relevant antisense oligonucleotides, and to uncover possible position-dependent regulation in a tissue-specific manner. DeepCLIP is freely available as a stand-alone application and as a webtool at http://deepclip.compbio.sdu.dk.
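The idea of scoring a nucleotide variant from sequence alone can be illustrated with a minimal sketch: encode the reference and variant sequences, score each with a binding model, and take the difference. Everything named below (`one_hot`, `toy_binding_score`, `variant_effect`, `UGU_MOTIF`) is hypothetical and introduced only for illustration. The scorer is a single sliding weight matrix standing in for a trained model; it is not the DeepCLIP network and does not use the DeepCLIP application's interface.

```python
ALPHABET = "ACGU"


def one_hot(seq):
    """Encode an RNA sequence as a list of 4-element one-hot rows (T is read as U)."""
    seq = seq.upper().replace("T", "U")
    # Bases outside ACGU (for example N) become all-zero rows.
    return [[1.0 if base == letter else 0.0 for letter in ALPHABET] for base in seq]


def toy_binding_score(seq, motif_weights):
    """Hypothetical stand-in for a trained model: the best match of one
    weight matrix slid along the sequence. NOT the DeepCLIP network."""
    encoded = one_hot(seq)
    width = len(motif_weights)
    if len(encoded) < width:
        raise ValueError("sequence is shorter than the motif")
    best = float("-inf")
    for start in range(len(encoded) - width + 1):
        score = sum(
            encoded[start + i][j] * motif_weights[i][j]
            for i in range(width)
            for j in range(len(ALPHABET))
        )
        best = max(best, score)
    return best


def variant_effect(reference, variant, motif_weights):
    """Score change caused by the variant; negative means weaker predicted binding."""
    return toy_binding_score(variant, motif_weights) - toy_binding_score(reference, motif_weights)


# Assumed toy motif that rewards an exact "UGU" match.
UGU_MOTIF = one_hot("UGU")

delta = variant_effect("AAUGUAA", "AAUCUAA", UGU_MOTIF)
print(delta)  # -1.0
```

In this toy case the G-to-C change breaks the best motif match, so the score drops from 3.0 to 2.0. A context-aware model would replace `toy_binding_score` with a learned function of the whole sequence, while the reference-versus-variant comparison would stay the same.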


                Author and article information

                Journal
                Nucleic Acids Research (Nucleic Acids Res)
                Oxford University Press
                ISSN: 0305-1048 (print), 1362-4962 (online)
                Issue date: 27 July 2020
                Published online: 19 June 2020
                Volume: 48
                Issue: 13
                Pages: 7099-7118
                Affiliations
                Department of Biochemistry and Molecular Biology, University of Southern Denmark , 5230 Odense M, Denmark
                Villum Center for Bioanalytical Sciences, University of Southern Denmark , 5230 Odense M, Denmark
                Department of Mathematics and Computer Science, University of Southern Denmark , 5230 Odense M, Denmark
                Chair of Experimental Bioinformatics, TUM School of Life Sciences Weihenstephan, Technical University of Munich , 85354 Freising, Germany
                Author notes
                To whom correspondence should be addressed. Tel: +45 6550 2416; Email: thomaskd@bmb.sdu.dk

                The authors wish it to be known that, in their opinion, the first two authors should be regarded as Joint First Authors.

                Author information
                https://orcid.org/0000-0002-5315-8325
                https://orcid.org/0000-0001-8076-5035
                https://orcid.org/0000-0001-7488-3035
                Article
                Publisher ID: gkaa530
                DOI: 10.1093/nar/gkaa530
                PMCID: PMC7367176
                PMID: 32558887
                b8bcba9b-d0b5-49cc-a17a-8388dc451e46
                © The Author(s) 2020. Published by Oxford University Press on behalf of Nucleic Acids Research.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution License ( http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                Received: 23 October 2019
                Revised: 11 May 2020
                Accepted: 10 June 2020
                Page count
                Pages: 20
                Funding
                Funded by: Lundbeckfonden, DOI 10.13039/501100003554;
                Award ID: R231-2016-2823
                Funded by: Muskelsvindfonden, DOI 10.13039/501100011641;
                Award ID: 4181-00515
                Funded by: Novo Nordisk Fonden, DOI 10.13039/501100009708;
                Award ID: NNF17OC0029240
                Funded by: ODEx;
                Funded by: VILLUM Young Investigator;
                Award ID: 73528
                Funded by: H2020, DOI 10.13039/100010661;
                Award ID: 777111
                Categories
                AcademicSubjects/SCI00010
                Narese/16
                Narese/24
                Computational Biology

                Genetics
