
      StackedEnC-AOP: prediction of antioxidant proteins using transform evolutionary and sequential features based multi-scale vector with stacked ensemble learning

      research-article


          Abstract

          Background

          Antioxidant proteins are involved in several biological processes and can protect DNA and cells from damage caused by free radicals. These proteins regulate the body's oxidative stress and play a significant role in many antioxidant-based drugs. Current in vitro-based approaches are costly, time-consuming, and unable to efficiently screen and identify the targeted motifs of antioxidant proteins.

          Methods

          We propose StackedEnC-AOP, a prediction method for discriminating antioxidant proteins. The training sequences are encoded by incorporating a discrete wavelet transform (DWT) into the evolutionary matrix: the PSSM-based images are decomposed via two levels of DWT to form a pseudo position-specific scoring matrix (PsePSSM-DWT) based embedded vector. In addition, the evolutionary difference formula and composite physicochemical properties methods are employed to collect structural and sequential descriptors. The sequential features, evolutionary descriptors, and physicochemical properties are then combined into a single vector to compensate for the weaknesses of the individual encoding schemes. To reduce the computational cost of the combined feature vector, the optimal features are selected using minimum redundancy and maximum relevance (mRMR). A stacking-based ensemble meta-model is then trained on the optimal feature vector.
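The two-level wavelet decomposition described above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a one-dimensional Haar wavelet applied to each PSSM column and a hypothetical choice of summary statistics (mean, standard deviation, maximum, minimum) per sub-band, whereas the abstract does not specify the wavelet family or the statistics used. The function names `haar_step`, `dwt_features`, and `encode_pssm` are illustrative only.

```python
import math
from statistics import mean, pstdev


def haar_step(signal):
    """One level of the Haar DWT: returns (approximation, detail) coefficients."""
    if len(signal) % 2:
        # Assumption: pad odd-length signals by repeating the last value.
        signal = signal + [signal[-1]]
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def dwt_features(column, levels=2):
    """Summary statistics of each detail band plus the final approximation band."""
    feats = []
    approx = list(column)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        feats += [mean(detail), pstdev(detail), max(detail), min(detail)]
    feats += [mean(approx), pstdev(approx), max(approx), min(approx)]
    return feats


def encode_pssm(pssm, levels=2):
    """Concatenate DWT features over every column of an L x 20 PSSM (list of rows)."""
    n_cols = len(pssm[0])
    vector = []
    for j in range(n_cols):
        column = [row[j] for row in pssm]
        vector += dwt_features(column, levels)
    return vector
```

With two levels and four statistics per band, each PSSM column contributes 12 values, so a 20-column PSSM yields a fixed-length vector of 240 features regardless of sequence length. That fixed length is what allows proteins of different sizes to be fed to the same classifier.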

          Results

          The StackedEnC-AOP method achieved a prediction accuracy of 98.40% and an AUC of 0.99 on the training sequences. When validated on an independent set, the trained StackedEnC-AOP model achieved an accuracy of 96.92% and an AUC of 0.98.
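For readers unfamiliar with the two reported metrics, the following minimal sketch shows how accuracy and AUC are commonly computed for a binary classifier. The labels and scores in the usage note are made-up toy values, not data from the study, and the function names are illustrative.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

For example, with toy labels `[1, 1, 0, 0]` and scores `[0.9, 0.4, 0.6, 0.1]`, three of the four positive-negative pairs are ranked correctly, so `roc_auc` returns 0.75. Accuracy depends on the chosen decision threshold, while AUC summarizes ranking quality across all thresholds.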

          Conclusion

          The proposed StackedEnC-AOP strategy performed significantly better than current computational models, improving accuracy by ~5% on the training set and ~3% on the independent set. Its efficacy and consistency make it a valuable tool for data scientists, and it can play a key role in academic research and drug design.

          Supplementary Information

          The online version contains supplementary material available at 10.1186/s12859-024-05884-6.


                Author and article information

                Contributors
                zouquan@nclab.net
                Journal
                BMC Bioinformatics
                BioMed Central (London)
                ISSN: 1471-2105
                4 August 2024
                Volume 25, Article 256
                Affiliations
                [1] Department of Zoology, Abdul Wali Khan University Mardan (https://ror.org/03b9y4e65), Mardan, 23200 KP, Pakistan
                [2] Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China (https://ror.org/04qr3zq92), Chengdu, 610054, People’s Republic of China
                [3] Department of Computer Science, Abdul Wali Khan University Mardan (https://ror.org/03b9y4e65), Mardan, 23200 KP, Pakistan
                [4] Department of Management Information Systems (MIS), School of Business, King Faisal University (KFU) (https://ror.org/00dn43547), 31982 Al-Ahsa, Saudi Arabia
                [5] Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China (GRID grid.54549.39, ISNI 0000 0004 0369 4060), Quzhou, 324000, People’s Republic of China
                Article
                5884
                10.1186/s12859-024-05884-6
                11298090
                39098908
                © The Author(s) 2024

                Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

                History
                Received: 19 April 2024
                Accepted: 29 July 2024
                Funding
                Funded by: National Natural Science Foundation of China
                Award ID: 62131004
                Funded by: National Key Research and Development Program of China (FundRef http://dx.doi.org/10.13039/501100012166)
                Award ID: 2022ZD0117700
                Categories
                Research
                Custom metadata
                © BioMed Central Ltd., part of Springer Nature 2024

                Bioinformatics & Computational biology
                antioxidant proteins, transformation, evolutionary features, stacked ensemble model, feature selection, prediction
