      Is Open Access

      Machine Learning Model for Predicting Postoperative Survival of Patients with Colorectal Cancer

      research-article


          Abstract

          Purpose

          Machine learning (ML) is a strong candidate for making accurate predictions because it can apply powerful computational algorithms to large amounts of data. We developed an ML-based model to predict the survival of patients with colorectal cancer (CRC) using data from two independent datasets.

          Materials and Methods

          A total of 364,316 and 1,572 CRC patients were included from the Surveillance, Epidemiology, and End Results (SEER) dataset and a Korean dataset, respectively. As SEER combines data from 18 cancer registries, internal validation was done using 18-fold cross-validation; external validation was then performed by testing the trained model on the Korean dataset. Performance was evaluated using the area under the receiver operating characteristic curve (AUROC), sensitivity, and positive predictive value.
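The validation scheme described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes the 18 folds correspond to the 18 registries (one registry held out per fold), which the abstract suggests but does not state outright. The function names `leave_one_group_out` and `auroc`, and the toy registry labels, are hypothetical. AUROC is computed from its rank interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case.

```python
from collections import defaultdict


def leave_one_group_out(groups):
    """Yield (held_out_group, train_indices, test_indices), one fold per group."""
    by_group = defaultdict(list)
    for idx, group in enumerate(groups):
        by_group[group].append(idx)
    for held_out in sorted(by_group):
        test_idx = by_group[held_out]
        train_idx = sorted(
            i for g, idxs in by_group.items() if g != held_out for i in idxs
        )
        yield held_out, train_idx, test_idx


def auroc(labels, scores):
    """AUROC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: one registry label per patient (not real SEER records).
registries = ["A", "A", "B", "B", "C", "C"]
folds = list(leave_one_group_out(registries))

# Hypothetical labels (1 = event) and model scores for a single held-out fold.
toy_auc = auroc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1])
```

With 18 distinct registry labels the generator would produce 18 folds. The pairwise AUROC loop is quadratic in the number of patients, so a rank-based implementation would be preferable at SEER scale.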

          Results

          Clinicopathological characteristics differed significantly between the two datasets, and the SEER dataset showed a significantly lower 5-year survival rate than the Korean dataset (60.1% vs. 75.3%, p < 0.001). The ML-based model using the Light Gradient Boosting algorithm predicted 5-year survival better than American Joint Committee on Cancer stage (AUROC, 0.804 vs. 0.736; p < 0.001). The most important features influencing model performance were age, number of examined lymph nodes, and tumor size. In the validation set, sensitivity for predicting 5-year survival was 68.14% for the dead class and 77.51% for the alive class, and positive predictive value was 49.88% and 88.1%, respectively. Survival probability can be checked using the web-based survival predictor (http://colorectalcancer.pythonanywhere.com).
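The per-class sensitivity and positive predictive value reported above follow from simple counts of true and predicted labels. The sketch below shows that calculation under stated assumptions: the function name `per_class_metrics` and the toy labels are hypothetical and are not taken from the study data.

```python
def per_class_metrics(y_true, y_pred, classes=("dead", "alive")):
    """Return sensitivity and positive predictive value for each class."""
    metrics = {}
    for c in classes:
        # Cases of class c that the model also labelled as c.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        actual = sum(1 for t in y_true if t == c)     # all true members of c
        predicted = sum(1 for p in y_pred if p == c)  # all predictions of c
        metrics[c] = {
            "sensitivity": tp / actual if actual else float("nan"),
            "ppv": tp / predicted if predicted else float("nan"),
        }
    return metrics


# Hypothetical toy labels for six patients (illustration only).
y_true = ["dead", "dead", "alive", "alive", "alive", "alive"]
y_pred = ["dead", "alive", "dead", "alive", "alive", "alive"]
result = per_class_metrics(y_true, y_pred)
```

Sensitivity divides by the number of patients who truly belong to a class, whereas positive predictive value divides by the number predicted to belong to it. Positive predictive value therefore depends on class prevalence, which is one reason it can differ sharply between a minority class and a majority class.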

          Conclusion

          The ML-based model performed substantially better than staging in the individualized estimation of survival for patients with CRC.


                Author and article information

                Journal
                Cancer Res Treat
                CRT
                Cancer Research and Treatment : Official Journal of Korean Cancer Association
                Korean Cancer Association
                1598-2998
                2005-9256
                April 2022
                15 June 2021
                Volume: 54
                Issue: 2
                Pages: 517-524
                Affiliations
                [1 ]Faculty of Medicine, Zagazig University, Zagazig, Egypt
                [2 ]Faculty of Pharmacy, British University in Egypt (BUE), El Shorouk, Egypt
                [3 ]Department of Surgery, Gangnam Severance Hospital, Yonsei University College of Medicine, Seoul, Korea
                [4 ]Department of Surgery, Severance Hospital, Yonsei University College of Medicine, Seoul, Korea
                Author notes
                Correspondence: Jeonghyun Kang, Department of Surgery, Gangnam Severance Hospital, Yonsei University College of Medicine, 211 Eonju-ro, Gangnam-gu, Seoul 06273, Korea, Tel: 82-2-2019-3372, Fax: 82-2-3462-5994, E-mail: ravic@naver.com
                Article
                crt-2021-206
                DOI: 10.4143/crt.2021.206
                PMCID: PMC9016295
                PMID: 34126702
                34139eb2-9195-4605-802b-c1a3ab131100
                Copyright © 2022 by the Korean Cancer Association

                This is an Open-Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                Received: 10 February 2021
                Accepted: 13 June 2021
                Categories
                Original Article
                Gastrointestinal Cancer

                Oncology & Radiotherapy
                Keywords: machine learning, LightGBM, colorectal neoplasms, area under the curve, mortality, SEER
