      Is Open Access

      iPro2L-DG: Hybrid network based on improved densenet and global attention mechanism for identifying promoter sequences

      research-article


          Abstract

          The promoter is a key DNA sequence that controls when gene transcription is initiated and how strongly the gene is expressed. Accurate identification of promoters is essential for understanding gene expression. Traditional sequencing techniques for identifying promoters are costly and time-consuming, so computational methods for promoter identification have become critical. Because deep learning methods show great potential for this task, this study proposes a new promoter prediction model called iPro2L-DG. The iPro2L-DG predictor is built on an improved Densely Connected Convolutional Network (DenseNet) and a Global Attention Mechanism (GAM). Promoter sequences are encoded by combining C2 encoding with nucleotide chemical property (NCP) encoding. The improved DenseNet extracts high-level features from the combined encoding, GAM weights the importance of those features along the channel and spatial dimensions, and a fully connected neural network (FNN) then produces the prediction probabilities. In the experiments, iPro2L-DG reached an accuracy of 94.10% with a Matthews correlation coefficient (MCC) of 0.8833 in the first layer (promoter identification), and an accuracy of 89.42% with an MCC of 0.7915 in the second layer (promoter strength prediction). The iPro2L-DG predictor significantly outperforms existing predictors in both promoter identification and promoter strength prediction, and we therefore regard it as a state-of-the-art promoter prediction tool. The source code of the iPro2L-DG model is available at https://github.com/leirufeng/iPro2L-DG.
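The combined feature encoding described in the abstract can be illustrated with a minimal Python sketch. The NCP table below is the commonly used one (ring structure, functional group, hydrogen bonding). The C2 table is an assumption made for illustration: a plain two-bit code per nucleotide, which may differ from the mapping used in the paper. The function name `encode_sequence` and the column order (C2 bits first, then NCP) are hypothetical and are not taken from the iPro2L-DG repository.

```python
import numpy as np

# NCP: (ring structure, functional group, hydrogen bonding) per nucleotide.
NCP = {"A": (1, 1, 1), "C": (0, 1, 0), "G": (1, 0, 0), "T": (0, 0, 1)}

# C2: two bits per nucleotide. ASSUMPTION: this particular bit assignment
# is illustrative; the paper's own table may assign the bits differently.
C2 = {"A": (0, 0), "C": (0, 1), "G": (1, 0), "T": (1, 1)}


def encode_sequence(seq):
    """Return an (L, 5) array: 2 C2 columns followed by 3 NCP columns."""
    rows = []
    for base in seq.upper():
        if base not in NCP:
            raise ValueError(f"unsupported nucleotide: {base!r}")
        rows.append(C2[base] + NCP[base])
    # reshape keeps the 5-column layout even for an empty sequence
    return np.array(rows, dtype=np.float32).reshape(-1, 5)


features = encode_sequence("ACGT")
print(features.shape)
```

Each sequence position becomes a five-dimensional vector, so a sequence of length L yields an L x 5 matrix that a convolutional network such as DenseNet could consume as input.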
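The two metrics reported in the abstract, accuracy and the Matthews correlation coefficient, are both computed from a binary confusion matrix. The sketch below uses the standard definitions; the helper name `accuracy_and_mcc` and the counts in the example are made up for illustration and are not results from the paper.

```python
import math


def accuracy_and_mcc(tp, tn, fp, fn):
    """Compute accuracy and MCC from binary confusion-matrix counts."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    accuracy = (tp + tn) / total
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is set to 0 when any marginal sum is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, mcc


# Hypothetical counts, NOT taken from the iPro2L-DG experiments.
acc, mcc = accuracy_and_mcc(tp=90, tn=85, fp=15, fn=10)
print(f"accuracy={acc:.4f}, MCC={mcc:.4f}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix and ranges from -1 to 1, which is why promoter predictors are often compared on both values.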


                Author and article information

                Contributors
                Journal
                Heliyon
                Elsevier
                2405-8440
                06 March 2024
                30 March 2024
                06 March 2024
                Volume: 10
                Issue: 6
                Article number: e27364
                Affiliations
                [a ]School of Information Engineering, Jingdezhen Ceramic University, Jingdezhen, 333403, China
                [b ]Business School, Jiangxi Institute of Fashion Technology, Nanchang, 330044, China
                Author notes
                Corresponding author: jjh163yx@163.com
                Article
                S2405-8440(24)03395-4 e27364
                10.1016/j.heliyon.2024.e27364
                10950492
                38510021
                523c9fd6-a74a-41fa-9ed0-65e8a6bede93
                © 2024 The Authors

                This is an open access article under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).

                History
                : 26 October 2023
                : 24 February 2024
                : 28 February 2024
                Categories
                Research Article

                Keywords: promoter, promoter strength, DenseNet, global attention mechanism, encoding
