
      TS-m6A-DL: Tissue-specific identification of N6-methyladenosine sites using a universal deep learning model


          Graphical abstract

          The first step is data pre-processing to retrieve the sequences from the raw data. The second step is to encode the sequences using one-hot encoding to make the data readable for the network. The third step is the construction of the neural network model, and the last step is to classify each sequence as methylated or non-methylated.
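The encoding step can be illustrated with a minimal sketch. The helper name `one_hot_encode`, the nucleotide order A, C, G, U, and the choice to map unknown symbols to an all-zero row are assumptions made for illustration; the article's own implementation may differ.

```python
# Illustrative sketch only: the column order A, C, G, U is an assumption,
# not taken from the article.
NUCLEOTIDES = "ACGU"


def one_hot_encode(sequence):
    """Return one 4-element row per nucleotide of an RNA/DNA sequence."""
    rows = []
    # Normalise case and treat DNA-style T as U.
    for base in sequence.upper().replace("T", "U"):
        row = [0] * len(NUCLEOTIDES)
        index = NUCLEOTIDES.find(base)
        if index >= 0:
            row[index] = 1  # mark the column of the observed base
        # Unknown symbols (for example N) are left as an all-zero row.
        rows.append(row)
    return rows


example = "GGACU"
for base, row in zip(example, one_hot_encode(example)):
    print(base, row)
```

Each sequence of length L becomes an L x 4 binary matrix, which is the kind of numerical input a neural network can consume directly.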

          Abstract

          The most common post-transcriptional modification, N6-methyladenosine (m6A), is associated with a number of crucial biological processes. The precise detection of m6A sites across the genome is critical for revealing its regulatory function and providing new insights into drug design. Although both experimental and computational methods for detecting m6A sites have been introduced, these conventional methods are laborious and expensive. Furthermore, only a handful of these models can detect m6A sites in various tissues. Therefore, a more generic and optimized computational method for detecting m6A sites in different tissues is required. In this paper, we propose a universal model based on a deep neural network (DNN), named TS-m6A-DL, which can classify m6A sites in several tissues of humans (Homo sapiens), mice (Mus musculus), and rats (Rattus norvegicus). To extract RNA sequence features and convert the input into a numerical format for the network, we used one-hot encoding. The model was tested using fivefold cross-validation, and its stability was measured using independent datasets. TS-m6A-DL achieved accuracies in the range of 75–85% using fivefold cross-validation and 72–84% on the independent datasets. Finally, to verify the generalization of the model, we performed cross-species testing and demonstrated its generalization ability by achieving state-of-the-art results.
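The fivefold cross-validation protocol mentioned in the abstract can be sketched as follows. This is a generic illustration, not the authors' code: `stratified_folds`, `cross_validate`, and the `train_and_score` callback are hypothetical names, stratification by class is an assumption, and the toy majority-class scorer stands in for training the actual DNN.

```python
import random


def stratified_folds(labels, k=5, seed=0):
    """Split sample indices into k folds with roughly equal class balance."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        indices = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(indices)
        # Deal the shuffled indices of this class round-robin into the folds.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


def cross_validate(labels, train_and_score, k=5, seed=0):
    """Hold out each fold once; return the mean score and per-fold scores."""
    scores = []
    for held_out in stratified_folds(labels, k, seed):
        held = set(held_out)
        train_idx = [i for i in range(len(labels)) if i not in held]
        scores.append(train_and_score(train_idx, held_out))
    return sum(scores) / k, scores


# Toy data: 10 methylated (1) and 10 non-methylated (0) samples.
labels = [1] * 10 + [0] * 10


def majority_score(train_idx, test_idx):
    """Placeholder model: predict the most frequent training label."""
    train = [labels[i] for i in train_idx]
    guess = max(set(train), key=train.count)
    return sum(labels[i] == guess for i in test_idx) / len(test_idx)


mean_accuracy, per_fold = cross_validate(labels, majority_score)
print(mean_accuracy, per_fold)
```

In practice the callback would train the network on `train_idx` and report accuracy on the held-out fold; the independent datasets described in the abstract would be kept entirely outside this loop.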


                Author and article information

                Journal
                Comput Struct Biotechnol J
                Computational and Structural Biotechnology Journal
                Research Network of Computational and Structural Biotechnology
                ISSN: 2001-0370
                Published: 10 August 2021
                Volume: 19
                Pages: 4619-4625
                Affiliations
                [a] Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju 54896, South Korea
                [b] Institute of Avionics and Aeronautics (IAA), Air University, Islamabad 44000, Pakistan
                [c] School of International Engineering and Science, Jeonbuk National University, Jeonju 54896, South Korea
                [d] Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054, China
                [e] Advanced Electronics and Information Research Center, Jeonbuk National University, Jeonju 54896, South Korea
                Author notes
                [*] Corresponding authors at: Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054, China (Q. Zou). Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju 54896, South Korea (K.T. Chong). zouquan@nclab.net kitchong@jbnu.ac.kr
                [1] Zeeshan Abbas and Hilal Tayara contributed equally.

                Article
                PII: S2001-0370(21)00345-7
                DOI: 10.1016/j.csbj.2021.08.014
                PMCID: PMC8383060
                PMID: 34471503
                389ca1c4-1b1f-4bff-b23d-e342a054da4e
                © 2021 The Authors

                This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

                History: 24 May 2021; 8 August 2021; 9 August 2021
                Categories
                Research Article

                Keywords: binary-encoding, deep neural network, N6-methyladenosine (m6A), motif, tissue-specific
