      CrossFuse-XGBoost: accurate prediction of the maximum recommended daily dose through multi-feature fusion, cross-validation screening and extreme gradient boosting

      research-article

          Abstract

          In the drug development process, approximately 30% of failures are attributed to drug safety issues. In particular, the first-in-human (FIH) trial of a new drug represents one of the highest safety risks, and initial dose selection is crucial for ensuring safety in clinical trials. With traditional dose estimation methods, which extrapolate data from animals to humans, catastrophic events have occurred during Phase I clinical trials due to interspecies differences in compound sensitivity and unknown molecular mechanisms. To address this issue, this study proposes a CrossFuse-extreme gradient boosting (XGBoost) method that can directly predict the maximum recommended daily dose of a compound based on existing human research data, providing a reference for FIH dose selection. This method not only integrates multiple features, including molecular representations, physicochemical properties and compound–protein interactions, but also improves feature selection based on cross-validation. The results demonstrate that the CrossFuse-XGBoost method not only improves prediction accuracy compared to that of existing locally weighted methods [k-nearest neighbor (k-NN) and variable k-NN (v-NN)] but also solves the low prediction coverage issue of v-NN, achieving full coverage of the external validation set and enabling more reliable predictions. Furthermore, this study offers a high level of interpretability by identifying the importance of different features in model construction. The 241 features with the most significant impact on the maximum recommended daily dose were selected, providing references for optimizing the structure of new compounds and guiding experimental research. The datasets and source code are freely available at https://github.com/cqmu-lq/CrossFuse-XGBoost.
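The abstract names cross-validation-based feature screening with gradient boosting but does not spell out the procedure. The sketch below shows one plausible reading, under stated assumptions: features are ranked by the squared-error gain they contribute to a boosted model, and the size of the retained subset is chosen by k-fold cross-validated RMSE. It is not the authors' implementation. A tiny pure-Python gradient booster built from decision stumps stands in for XGBoost, the data are synthetic, and every function name (`fit_stump`, `fit_boost`, `cv_rmse`, `screen`) is hypothetical.

```python
import random

def mean(v):
    return sum(v) / len(v)

def fit_stump(X, r):
    """Return (sse, feature, threshold, left_value, right_value) for the best split on residuals r."""
    m = mean(r)
    best = (sum((v - m) ** 2 for v in r), 0, float("inf"), m, m)  # "no split" fallback
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X})[:-1]:  # skip max so both sides are non-empty
            left = [v for row, v in zip(X, r) if row[j] <= t]
            right = [v for row, v in zip(X, r) if row[j] > t]
            lm, rm = mean(left), mean(right)
            sse = sum((v - lm) ** 2 for v in left) + sum((v - rm) ** 2 for v in right)
            if sse < best[0]:
                best = (sse, j, t, lm, rm)
    return best

def fit_boost(X, y, rounds=15, lr=0.3):
    """Gradient boosting on squared error with stumps; also returns per-feature gain."""
    base = mean(y)
    pred = [base] * len(y)
    stumps, gain = [], [0.0] * len(X[0])
    for _ in range(rounds):
        resid = [yi - pi for yi, pi in zip(y, pred)]
        m = mean(resid)
        before = sum((v - m) ** 2 for v in resid)
        sse, j, t, lm, rm = fit_stump(X, resid)
        gain[j] += before - sse  # error reduction credited to feature j
        stumps.append((j, t, lm, rm))
        pred = [p + lr * (lm if row[j] <= t else rm) for p, row in zip(pred, X)]
    return (base, lr, stumps), gain

def predict(model, row):
    base, lr, stumps = model
    return base + lr * sum(lm if row[j] <= t else rm for j, t, lm, rm in stumps)

def cv_rmse(X, y, cols, k=5):
    """k-fold cross-validated RMSE using only the feature columns in cols."""
    n, errs = len(y), []
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        test = [i for i in range(n) if i % k == fold]
        model, _ = fit_boost([[X[i][c] for c in cols] for i in train],
                             [y[i] for i in train])
        errs += [(predict(model, [X[i][c] for c in cols]) - y[i]) ** 2 for i in test]
    return mean(errs) ** 0.5

def screen(X, y, k=5):
    """Rank features by gain, then keep the top-n prefix with the lowest CV RMSE."""
    _, gain = fit_boost(X, y)
    ranked = sorted(range(len(gain)), key=lambda j: -gain[j])
    scores = {n: cv_rmse(X, y, ranked[:n], k) for n in range(1, len(ranked) + 1)}
    best_n = min(scores, key=scores.get)
    return ranked[:best_n], scores

# Synthetic data (an assumption for illustration): only features 0 and 1 carry signal.
random.seed(0)
X = [[random.randint(0, 9) for _ in range(5)] for _ in range(60)]
y = [5 * row[0] + row[1] + random.gauss(0, 0.5) for row in X]

selected, scores = screen(X, y)
print("selected feature indices:", selected)
print("CV RMSE by subset size:", scores)
```

Because the candidate subsets are nested prefixes of the ranking, the selected subset can never score worse in cross-validation than the full feature set. In practice the stand-in booster would be replaced by a real gradient boosting library, and the ranking would ideally be computed inside each training fold to avoid leaking information from held-out compounds.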


                Author and article information

                Contributors
                Journal
                Brief Bioinform
                Briefings in Bioinformatics
                Oxford University Press
                1467-5463
                1477-4054
                January 2024
                12 January 2024
                Volume: 25
                Issue: 1
                Article ID: bbad511
                Affiliations
                Basic Medicine Research and Innovation Center for Novel Target and Therapeutic Intervention, Ministry of Education, Institute of Life Sciences, Chongqing Medical University, Chongqing 400016, China
                Author notes
                Corresponding author: Jianbo Pan, Basic Medicine Research and Innovation Center for Novel Target and Therapeutic Intervention, Ministry of Education, Institute of Life Sciences, Chongqing Medical University, 1 Yixueyuan Road, Yuzhong District, Chongqing 400016, China. Tel.: +86 23 684 80209; Fax: +86 23 684 80209; E-mail: panjianbo@cqmu.edu.cn
                Author information
                https://orcid.org/0000-0002-7001-6310
                https://orcid.org/0009-0000-9402-8909
                https://orcid.org/0000-0001-6014-8160
                Article
                DOI: 10.1093/bib/bbad511
                PMCID: PMC10786712
                PMID: 38216539
                © The Author(s) 2024. Published by Oxford University Press.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com

                History
                Received: 28 July 2023
                Revised: 4 December 2023
                Accepted: 13 December 2023
                Page count
                Pages: 13
                Funding
                Funded by: National Natural Science Foundation of China, DOI 10.13039/501100001809;
                Award ID: 82104063
                Funded by: Top-notch Talent Cultivation Program for Graduate Students of Chongqing Medical University;
                Award ID: BJRC202110
                Categories
                Problem Solving Protocol

                Bioinformatics & Computational biology
                Keywords: CrossFuse-XGBoost, multi-feature fusion, cross-validation screening, maximum recommended daily dose
