
      DGA-5mC: A 5-methylcytosine site prediction model based on an improved DenseNet and bidirectional GRU method

      Mathematical Biosciences and Engineering
      American Institute of Mathematical Sciences (AIMS)


          Abstract

          5-Methylcytosine (5mC) in promoter regions plays a significant role in biological processes and diseases. Researchers often detect 5mC modification sites with high-throughput sequencing technologies or traditional machine learning algorithms. However, high-throughput identification is laborious, time-consuming and expensive, and the existing machine learning algorithms are comparatively limited. There is therefore an urgent need for a more efficient computational approach to replace these traditional methods. Because deep learning algorithms are widely used and offer powerful computational advantages, we constructed a novel prediction model, DGA-5mC, that identifies 5mC modification sites in promoter regions using an improved densely connected convolutional network (DenseNet) combined with a bidirectional GRU. We also added a self-attention module to evaluate the importance of the various 5mC features. The DGA-5mC model automatically handles strongly imbalanced proportions of positive and negative samples, which highlights its reliability and superiority. To the best of the authors' knowledge, this is the first time that an improved DenseNet has been combined with a bidirectional GRU to predict 5mC modification sites in promoter regions. Using a combination of one-hot coding, nucleotide chemical property coding and nucleotide density coding, DGA-5mC achieved, on the independent test dataset, a sensitivity of 90.19%, specificity of 92.74%, accuracy of 92.54%, Matthews correlation coefficient (MCC) of 64.64%, area under the curve of 96.43% and Gmean of 91.46%. All datasets and source code for the DGA-5mC model are freely accessible at https://github.com/lulukoss/DGA-5mC.
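The three feature codings named in the abstract can be illustrated with a minimal Python sketch. It is not taken from the DGA-5mC repository. The one-hot column order, the chemical-property table (the convention commonly used in the sequence-encoding literature) and the function names `nucleotide_density`, `encode` and `g_mean` are all assumptions made for illustration. The `g_mean` helper assumes the usual definition of Gmean as the square root of sensitivity times specificity, and is included only to check the reported figures against each other.

```python
import math

# Assumed one-hot column order (A, C, G, T); the paper may order columns differently.
ONE_HOT = {"A": (1, 0, 0, 0), "C": (0, 1, 0, 0), "G": (0, 0, 1, 0), "T": (0, 0, 0, 1)}

# Commonly used chemical-property code (assumption):
# (ring structure: purine=1, functional group: amino=1, hydrogen bonding: weak=1).
NCP = {"A": (1, 1, 1), "C": (0, 1, 0), "G": (1, 0, 0), "T": (0, 0, 1)}


def nucleotide_density(seq):
    """Density at position i: occurrences of seq[i] in seq[:i+1], divided by i+1."""
    counts = {}
    density = []
    for i, base in enumerate(seq, start=1):
        counts[base] = counts.get(base, 0) + 1
        density.append(counts[base] / i)
    return density


def encode(seq):
    """Return one 8-dimensional vector per nucleotide: 4 one-hot + 3 NCP + 1 density.

    Only A, C, G and T are handled; any other symbol raises KeyError.
    """
    seq = seq.upper()
    dens = nucleotide_density(seq)
    return [list(ONE_HOT[b]) + list(NCP[b]) + [d] for b, d in zip(seq, dens)]


def g_mean(sensitivity, specificity):
    """Geometric mean of sensitivity and specificity (both given as fractions)."""
    return math.sqrt(sensitivity * specificity)


if __name__ == "__main__":
    for row in encode("ACGCA"):
        print(row)
    # Consistency check on the reported sensitivity and specificity.
    print(round(100 * g_mean(0.9019, 0.9274), 2))
```

With the reported sensitivity of 90.19% and specificity of 92.74%, this definition gives a Gmean of about 91.46%, which agrees with the value stated in the abstract.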


                Author and article information

                Journal: Mathematical Biosciences and Engineering (MBE)
                Publisher: American Institute of Mathematical Sciences (AIMS)
                ISSN: 1551-0018
                Year: 2023
                Volume 20, Issue 6, Pages 9759-9780
                Type: Article
                DOI: 10.3934/mbe.2023428
                PMID: 37322910
                Record ID: 797ea308-bc70-446a-a9ed-79bcf184c6e6
                © 2023