
      Development and validation of a novel diagnostic model for initially clinical diagnosed gastrointestinal stromal tumors using an extreme gradient-boosting machine





          Gastrointestinal stromal tumor (GIST) is the most common gastrointestinal soft tissue tumor. Clinical diagnosis mainly relies on enhanced CT, endoscopy and endoscopic ultrasound (EUS), but the misdiagnosis rate remains high without fine needle aspiration biopsy. We aimed to develop a novel diagnostic model by analyzing patients' preoperative data.


          We used data from patients who were initially diagnosed with gastric GIST and underwent partial gastrectomy. The patients were randomly divided into a training dataset and a test dataset at a ratio of 3 to 1. After pre-experimental screening, max_depth = 2, eta = 0.1, gamma = 0.5, and nrounds = 200 were selected as the best parameters, and with these we developed the initial extreme gradient-boosting (XGBoost) model. Based on the importance of the features in the initial model, we improved the model by excluding the hematological features. We thereby obtained the final XGBoost model and validated it using the test dataset.
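          The 3:1 random split and the reported hyperparameters can be sketched as follows. This is a minimal illustration, not the authors' code: the `split_3_to_1` helper, the fixed seed, and the placeholder patient IDs are assumptions introduced here, and `nrounds` is read as the number of boosting rounds, following the naming used by the R `xgboost` package.

```python
import random

# Hyperparameters as reported in the abstract.
PARAMS = {"max_depth": 2, "eta": 0.1, "gamma": 0.5}
NROUNDS = 200  # number of boosting rounds ("nrounds" in the abstract)


def split_3_to_1(records, seed=0):
    """Randomly split records into training and test sets at about 3:1.

    The seed is an assumption added for reproducibility; the abstract
    does not report one.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * 3 / 4)
    return shuffled[:n_train], shuffled[n_train:]


# Placeholder patient IDs, purely illustrative (not study data).
patients = list(range(100))
train, test = split_3_to_1(patients)
```

          With the `xgboost` Python package, values like these would typically be passed as `xgb.train(PARAMS, dtrain, num_boost_round=NROUNDS)`; the training objective is not stated in the abstract and would have to be chosen separately.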


          In the initial XGBoost model, we found that the hematological indicators (including inflammatory and nutritional indicators) examined before surgery had little effect on the outcome, so we subsequently excluded them. Similarly, we screened the features from enhanced CT and endoscopic ultrasound (EUS), and finally determined the 6 most important predictors for GIST diagnosis: the ratio of long to short diameter on CT, the CT value of the tumor, the enhancement of the tumor in the arterial phase and in the venous phase, and the presence of liquid and calcific areas inside the tumor on EUS. Round or round-like tumors with a CT value of around 30 (25–37) and delayed enhancement, together with a liquid area but no calcific area inside the tumor, best indicate the diagnosis of GIST.
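          The screening step described here (rank features by their importance in the initial model, then exclude a low-impact group) might look like the sketch below. The importance scores and the feature names are hypothetical placeholders, not results or variable names from the study, and `exclude_group` is an illustrative helper.

```python
def exclude_group(importance, group):
    """Return the features not in `group`, ranked by descending importance."""
    kept = {name: score for name, score in importance.items() if name not in group}
    return sorted(kept, key=kept.get, reverse=True)


# Hypothetical importance scores for illustration only (NOT values from the study).
importance = {
    "ct_long_short_ratio": 0.30,
    "ct_value": 0.25,
    "arterial_enhancement": 0.15,
    "venous_enhancement": 0.12,
    "eus_liquid_area": 0.10,
    "eus_calcific_area": 0.05,
    "neutrophil_lymphocyte_ratio": 0.02,  # hypothetical hematological feature
    "albumin": 0.01,                      # hypothetical hematological feature
}
hematological = {"neutrophil_lymphocyte_ratio", "albumin"}

# Features remaining after the hematological group is dropped.
selected = exclude_group(importance, hematological)
```

          In this sketch the remaining features would then be used to refit the model; how the authors actually ranked and pruned features beyond what the abstract states is not shown here.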


          We developed a model to further differentiate GIST from other tumors in patients with an initial clinical diagnosis of gastric GIST, by analyzing the results of clinical examinations that most patients will have completed before surgical resection.

          Supplementary Information

          The online version contains supplementary material available at 10.1186/s12876-021-02048-1.

                Author and article information

                BMC Gastroenterology (BMC Gastroenterol), BioMed Central (London), 18 December 2021; 21: 481
                GRID grid.411634.5, ISNI 0000 0004 0632 4559, Department of Gastrointestinal Surgery, Peking University People’s Hospital, No. 11 Xizhimen South Street, Xicheng District, Beijing 100044, China
                © The Author(s) 2021

                Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

                Received: 5 September 2021
                Accepted: 1 December 2021
                Funded by: FundRef http://dx.doi.org/10.13039/501100015083, Peking University People's Hospital;
                Award ID: RDL2020-06

                Gastroenterology & Hepatology
                gastrointestinal stromal tumor, diagnostic model, xgboost

