
      DCNN-4mC: Densely connected neural network based N4-methylcytosine site prediction in multiple species


          Graphical abstract

          The workflow has five steps. First, the dataset is prepared by extracting sequences from the raw data. Second, the sequences are one-hot encoded so that the network can process them. Third, the CNN model is trained. Fourth, the trained model is evaluated. Finally, a web server is made available to researchers.
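The encoding step can be illustrated with a minimal sketch. It is not the authors' code: the channel order A, C, G, T, the function name `one_hot_encode`, and the choice to map unknown bases to an all-zero row are all assumptions made for illustration.

```python
# Illustrative one-hot encoding of a DNA sequence.
# Assumption: channels are ordered A, C, G, T; the paper's ordering is not stated here.
NUCLEOTIDES = "ACGT"


def one_hot_encode(sequence):
    """Return one 4-element row per base; unknown bases (e.g. N) become all zeros."""
    rows = []
    for base in sequence.upper():
        row = [0, 0, 0, 0]
        index = NUCLEOTIDES.find(base)  # -1 when the base is not A, C, G or T
        if index >= 0:
            row[index] = 1
        rows.append(row)
    return rows


encoded = one_hot_encode("ACGTN")
for base, row in zip("ACGTN", encoded):
    print(base, row)
```

Each sequence of length L becomes an L x 4 matrix, which is the usual input shape for a one-dimensional convolutional layer.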

          Abstract

          DNA N4-methylcytosine (4mC) is a significant genetic modification that plays a major role in controlling biological functions such as DNA replication, DNA repair, gene regulation and gene expression levels. Identifying 4mC sites is important for gaining insight into different biological mechanisms. However, predicting modifications with experimental methods is challenging because the techniques are expensive and time-consuming. Computational tools are therefore an attractive alternative for identifying modifications. Various computational tools have been proposed in the literature, but their generalization and prediction performance require improvement. For this reason, we propose a neural network based tool named DCNN-4mC for identifying 4mC sites. The proposed model consists of a set of neural network layers with a skip connection that shares shallow features with the dense layers. The skip connection allows the model to gather crucial information about 4mC sites. In the literature, different models are applied to different species, so in many cases several datasets are available for a single species. In this research, we combined all available datasets to create a single benchmark dataset for each species. To the best of our knowledge, no model in the literature has been applied to more than six species. To ensure the generalizability of DCNN-4mC, we used 12 species for performance evaluation. The DCNN-4mC tool attained 2% to 14% higher accuracy than state-of-the-art tools on all available datasets of the different species. Furthermore, independent test datasets were also used, and DCNN-4mC yielded high overall performance on them as well.
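The idea of a skip connection that passes shallow features on to later dense layers can be sketched in a few lines. This is a toy forward pass in plain Python, not the DCNN-4mC architecture: the layer sizes, the random weights, and the helper names `dense`, `random_layer` and `forward_with_skip` are hypothetical, and the skip connection is assumed to be a simple concatenation.

```python
import math
import random


def relu(values):
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """Fully connected layer; weights holds one row per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def random_layer(n_in, n_out, rng):
    """Hypothetical helper: small random weights and zero biases."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def forward_with_skip(features, shallow_layer, deep_layer, output_layer):
    shallow = relu(dense(features, *shallow_layer))  # shallow features
    deep = relu(dense(shallow, *deep_layer))         # deeper features
    merged = shallow + deep                          # skip connection: concatenate both
    logit = dense(merged, *output_layer)[0]
    return 1.0 / (1.0 + math.exp(-logit))            # probability of a 4mC site


rng = random.Random(0)
shallow_layer = random_layer(8, 6, rng)
deep_layer = random_layer(6, 4, rng)
output_layer = random_layer(6 + 4, 1, rng)  # input size = shallow + deep features

features = [rng.random() for _ in range(8)]
probability = forward_with_skip(features, shallow_layer, deep_layer, output_layer)
print(probability)
```

Because the output layer receives both `shallow` and `deep`, information captured early in the network reaches the classifier directly instead of only through the deeper transformation.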



                Author and article information

                Journal
                Title: Computational and Structural Biotechnology Journal
                Abbreviated title: Comput Struct Biotechnol J
                Publisher: Research Network of Computational and Structural Biotechnology
                ISSN: 2001-0370
                Publication date: 01 November 2021
                Volume: 19
                Pages: 6009-6019
                Affiliations
                [a] Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju 54896, South Korea
                [b] Department of Avionics Engineering, Air University, Islamabad 44000, Pakistan
                [c] School of International Engineering and Science, Jeonbuk National University, Jeonju 54896, South Korea
                [d] Advanced Electronics and Information Research Center, Jeonbuk National University, Jeonju 54896, South Korea
                Author notes
                [*] Corresponding authors at: School of International Engineering and Science, Jeonbuk National University, Jeonju 54896, South Korea (Hilal Tayara); Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju 54896, South Korea (Kil To Chong). hilaltayara@jbnu.ac.kr, kitchong@jbnu.ac.kr
                Article
                PII: S2001-0370(21)00456-6
                DOI: 10.1016/j.csbj.2021.10.034
                PMCID: PMC8605313
                PMID: 34849205
                Record ID: b4c37445-c9a9-4c2b-bab5-ade9ffb928e7
                © 2021 The Author(s)

                This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

                History
                Received: 28 September 2021
                Revised: 27 October 2021
                Accepted: 28 October 2021
                Categories
                Research Article

                Keywords: bioinformatics, 4mC modification, computational biology, convolutional neural network, deep learning
