
      Machine learning to predict early recurrence after oesophageal cancer surgery




          Early cancer recurrence after oesophagectomy is a common problem, with an incidence of 20–30 per cent despite the widespread use of neoadjuvant treatment. Quantification of this risk is difficult and existing models perform poorly. This study aimed to develop a predictive model for early recurrence after surgery for oesophageal adenocarcinoma using a large multinational cohort and machine learning approaches.


          Consecutive patients who underwent oesophagectomy for adenocarcinoma and had neoadjuvant treatment in one Dutch and six UK oesophagogastric units were analysed. Using clinical characteristics and postoperative histopathology, models were generated using elastic net regression (ELR) and the machine learning methods random forest (RF) and extreme gradient boosting (XGB). Finally, a combined (ensemble) model of these was generated. The relative importance of factors to outcome was calculated as a percentage contribution to the model.
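          The abstract does not say how the three models were combined or how importance was scaled, so the following is only a minimal sketch under two assumptions: the ensemble is an unweighted mean of each model's predicted probability, and relative importance is a raw score rescaled to sum to 100 per cent. The function names and all numbers are hypothetical and are not taken from the study.

```python
from statistics import mean


def ensemble_probability(per_model_probs):
    """Average predicted recurrence probabilities across models, patient by patient.

    Assumption: an unweighted mean; the study may have combined models differently.
    """
    models = list(per_model_probs.values())
    n = len(models[0])
    if any(len(p) != n for p in models):
        raise ValueError("every model must score the same patients")
    return [mean(p[i] for p in models) for i in range(n)]


def relative_importance(raw_importance):
    """Rescale non-negative importance scores so that they sum to 100 per cent."""
    total = sum(raw_importance.values())
    if total <= 0:
        raise ValueError("importance scores must sum to a positive value")
    return {name: 100.0 * value / total for name, value in raw_importance.items()}


# Hypothetical predicted probabilities for three patients (illustrative only).
probs = {
    "ELR": [0.10, 0.40, 0.70],
    "RF": [0.20, 0.50, 0.90],
    "XGB": [0.30, 0.30, 0.80],
}
ensemble = ensemble_probability(probs)

# Hypothetical raw importance scores (illustrative only).
importance = relative_importance(
    {"positive_nodes": 5.0, "lymphovascular_invasion": 3.0, "other": 12.0}
)
```

          Averaging probabilities is the simplest way to combine models; a weighted or stacked ensemble would replace only `ensemble_probability`.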


          A total of 812 patients were included. The recurrence rate at less than 1 year was 29·1 per cent. All of the models demonstrated good discrimination. Internally validated areas under the receiver operating characteristic (ROC) curve (AUCs) were similar, with the ensemble model performing best (AUC 0·791 for ELR, 0·801 for RF, 0·804 for XGB, 0·805 for ensemble). Performance was similar when internal–external validation was used (validation across sites, AUC 0·804 for ensemble). In the final model, the most important variables were number of positive lymph nodes (25·7 per cent) and lymphovascular invasion (16·9 per cent).
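          To make the two validation ideas concrete, the sketch below computes an AUC and runs a leave-one-site-out loop, which is one common reading of internal-external validation across sites. It is not the authors' pipeline: the single-predictor logistic model is a stand-in for ELR, RF and XGB, the cohort is synthetic, and every name and parameter is hypothetical.

```python
import math
import random


def auc(labels, scores):
    """AUC as the probability that a randomly chosen event is scored higher
    than a randomly chosen non-event (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit_logistic(xs, ys, lr=0.1, epochs=300):
    """Single-predictor logistic regression fitted by batch gradient descent.

    Stand-in model only; the study used ELR, RF and XGB.
    """
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        gw = gb = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            gw += (p - y) * x
            gb += p - y
        w -= lr * gw / n
        b -= lr * gb / n
    return lambda x: 1.0 / (1.0 + math.exp(-(w * x + b)))


def internal_external_auc(rows):
    """Hold out each site in turn, fit on the remaining sites, score the held-out site."""
    results = {}
    for site in sorted({r["site"] for r in rows}):
        train = [r for r in rows if r["site"] != site]
        test = [r for r in rows if r["site"] == site]
        model = fit_logistic([r["x"] for r in train], [r["y"] for r in train])
        results[site] = auc([r["y"] for r in test], [model(r["x"]) for r in test])
    return results


# Synthetic cohort: three hypothetical sites with 60 patients each.
rng = random.Random(0)
rows = []
for site in ["A", "B", "C"]:
    for _ in range(60):
        x = rng.randint(0, 6)  # stand-in for the number of positive lymph nodes
        p = 1.0 / (1.0 + math.exp(-(0.6 * x - 2.0)))
        rows.append({"site": site, "x": x, "y": 1 if rng.random() < p else 0})

per_site = internal_external_auc(rows)
```

          Each value in `per_site` comes from a site the model never saw during fitting, which is why this scheme probes transportability across centres more directly than resampling within the pooled cohort.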


          The model derived using machine learning approaches and an international data set provided excellent performance in quantifying the risk of early recurrence after surgery, and will be useful in prognostication for clinicians and patients.


          Early recurrence after surgery for adenocarcinoma of the oesophagus is common. A risk prediction model derived using modern machine learning methods accurately predicts the risk of early recurrence from postoperative pathology.

          Machine learning may help



                Author and article information

                Br J Surg
                The British Journal of Surgery
                John Wiley & Sons, Ltd (Chichester, UK )
                30 January 2020
                July 2020
                Volume: 107
                Issue: 8 ( doiID: 10.1002/bjs.v107.8 )
                Pages: 1042-1052
                [ 1 ] Cancer Sciences Unit University of Southampton Southampton UK
                [ 2 ] Department of Public Health Sciences and Medical Statistics University of Southampton Southampton UK
                [ 3 ] Department of Surgery Nottingham University Hospitals NHS Trust Nottingham UK
                [ 4 ] Department of Surgery Portsmouth Hospitals NHS Trust Portsmouth UK
                [ 5 ] Department of Upper Gastrointestinal Surgery University Hospitals Birmingham NHS Foundation Trust Birmingham UK
                [ 6 ] Cambridge Oesophagogastric Centre Addenbrookes Hospital, Cambridge University Hospitals Foundation Trust Cambridge UK
                [ 7 ] Hutchison/Medical Research Council Cancer Unit University of Cambridge Cambridge UK
                [ 8 ] Centre for Cancer Research and Cell Biology Queen's University Belfast Belfast UK
                [ 9 ] Department of Surgery University Medical Centre Utrecht the Netherlands
                Author notes
                [* ] Correspondence to: Professor T. J. Underwood, Cancer Sciences Unit, University of Southampton, Tremona Road, Southampton SO16 6YD, UK (e‐mail: tju@ ; @TimTheSurgeon, @SaqRahman, @Robwalker27, @uoscares, @HeartburnCancer)

                Members of the OCCAMS Consortium are co‐authors of this study and are listed in Appendix S1 (supporting information)

                © 2020 The Authors. BJS published by John Wiley & Sons Ltd on behalf of BJS Society Ltd.

                This is an open access article under the terms of the License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.

                Page count
                Figures: 3, Tables: 5, Pages: 11, Words: 5809
                Funded by: Programme Grant from Cancer Research UK
                Award ID: RG81771
                Award ID: RG84119
                Funded by: Cancer Research UK and Royal College of Surgeons of England Advanced Clinician Scientist Fellowship
                Award ID: A23924
                Upper GI
                Original Article


