1. INTRODUCTION
Cancer is a major public health problem and a leading cause of death worldwide; gliomas, lung cancer, liver cancer and colorectal cancer (CRC) are among the most common tumors endangering human health [1]. Surgery, radiotherapy, chemotherapy and immunotherapy are recommended anticancer therapies based on precise diagnosis from medical images. However, conventional image assessment has recognized limitations, owing to differences in individual radiologists’ interpretation and the inability to read hidden high-dimensional features. Artificial intelligence (AI) can recognize complex patterns in imaging automatically and provide quantitative, rather than qualitative, assessment of radiographic characteristics [2]. AI has been frequently and successfully applied in the medical image analysis field [3, 4].
AI is a new technical science that studies and develops theories, methods, technologies and application systems to simulate, extend and expand human intelligence. Machine learning is a branch of AI that aims to automatically extract trends and features from data and help make wise decisions. According to whether labels are needed, machine learning algorithms can be divided into supervised learning, semisupervised learning and unsupervised learning. In medical imaging, the use of radiomics or deep learning for differential diagnosis, segmentation and detection of tumors is a promising field.
Radiomics [5] is a research field aimed at establishing models that may play important roles in diagnosis, prognosis and prediction. First-order and high-order image features of medical images are key in helping the models analyze and make correct judgements. Because the tumor target is relatively small, and the internal heterogeneity of images is high, researchers often use radiomics for differential diagnosis of tumors [6]. The main steps in radiomics include data standardization, region of interest (ROI) placement, feature extraction [7], feature standardization, feature selection and model construction. With the development of radiomics, several available tools [8, 9] help researchers better perform radiomics.
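The feature extraction step listed above can be illustrated with a minimal sketch. It assumes a 2D image and a binary ROI mask held as NumPy arrays; the function name `first_order_features`, the bin count and the synthetic data are illustrative choices, not part of any tool cited here, and dedicated radiomics packages compute many more features.

```python
import numpy as np

def first_order_features(image, mask, bins=16):
    """Compute a few first-order features from the voxels inside a binary ROI."""
    voxels = image[mask > 0].astype(float)
    mean = voxels.mean()
    std = voxels.std()
    # Skewness is the third standardized moment; define it as 0 for a flat ROI.
    skew = 0.0 if std == 0 else float(np.mean(((voxels - mean) / std) ** 3))
    # Shannon entropy of the discretized intensity histogram (in bits).
    counts, _ = np.histogram(voxels, bins=bins)
    p = counts[counts > 0] / counts.sum()
    entropy = float(-np.sum(p * np.log2(p)))
    return {
        "mean": float(mean),
        "variance": float(std ** 2),
        "skewness": skew,
        "entropy": entropy,
    }

# Synthetic example: a noisy image with a square ROI (hypothetical data).
rng = np.random.default_rng(0)
image = rng.normal(100.0, 10.0, size=(32, 32))
mask = np.zeros((32, 32), dtype=np.uint8)
mask[8:24, 8:24] = 1
features = first_order_features(image, mask)
```

Higher-order (texture) features follow the same pattern but are computed from matrices that encode spatial relationships between voxels rather than from the intensity histogram alone.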
However, because neural networks have more parameters than the traditional machine learning models used in radiomics, their performance is often better on larger datasets.
Deep learning is a nonlinear parametric model, and most deep learning algorithms are based on neural networks. Convolutional neural networks (CNNs) are the main neural networks used in computer vision and medical imaging. Owing to more nonlinear combinations and hyperparameters, deep learning tends to achieve good performance on large datasets [10]. Because labeling medical image data is difficult, deep learning strategies based on semisupervised or self-supervised methods are gradually being applied [11, 12]. Although these methods are not yet as effective as supervised learning, they have great potential for future applications because of the difficulty of medical data annotation. The transformer framework, proposed by Google in 2017, is another branch of deep learning. On the basis of this framework, the vision transformer (ViT) has been proposed [13]. ViT is composed mainly of multihead attention, position encoding, feed-forward, input/output embedding and normalization layers. Compared with CNNs, ViT has advantages in some tasks, including differential diagnosis of diseases, lesion segmentation and image registration [14]. Although deep learning performs better on large datasets, a deep learning algorithm is akin to a “black box” and cannot currently be fully explained at the level of mathematical theory. Therefore, the interpretability of deep learning features is poor. Although the locations on which the network focuses (such as lesion locations and regions) can be revealed through feature map visualization, the meaning of deep learning features cannot be qualitatively explained. Moreover, the focus of the network reflected by the feature maps may not coincide with the actual focus of physicians.
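To make the vision transformer components named above more concrete, the following is a minimal NumPy sketch of patch embedding, position encoding and multihead self-attention. It is an illustration under simplifying assumptions: the weights are random rather than trained, the feed-forward and normalization layers are omitted, and all names (`patchify`, `multihead_self_attention`, `d_model`) are hypothetical.

```python
import numpy as np

def patchify(image, patch):
    """Split an (H, W) image into flattened, non-overlapping square patches."""
    h, w = image.shape
    rows, cols = h // patch, w // patch
    blocks = image[: rows * patch, : cols * patch].reshape(rows, patch, cols, patch)
    return blocks.transpose(0, 2, 1, 3).reshape(rows * cols, patch * patch)

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)

def multihead_self_attention(tokens, wq, wk, wv, wo, heads):
    """Scaled dot-product self-attention computed independently per head."""
    n, d = tokens.shape
    dh = d // heads

    def split(x):
        # (n, d) -> (heads, n, dh)
        return x.reshape(n, heads, dh).transpose(1, 0, 2)

    q, k, v = split(tokens @ wq), split(tokens @ wk), split(tokens @ wv)
    weights = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(dh))  # (heads, n, n)
    merged = (weights @ v).transpose(1, 0, 2).reshape(n, d)
    return merged @ wo, weights

rng = np.random.default_rng(0)
image = rng.random((32, 32))                  # stand-in for one image slice
patches = patchify(image, patch=8)            # 16 patches of 64 pixels each
d_model, heads = 32, 4
embed = rng.normal(0.0, 0.02, (patches.shape[1], d_model))     # input embedding
position = rng.normal(0.0, 0.02, (patches.shape[0], d_model))  # position encoding
tokens = patches @ embed + position
wq, wk, wv, wo = (rng.normal(0.0, 0.02, (d_model, d_model)) for _ in range(4))
out, weights = multihead_self_attention(tokens, wq, wk, wv, wo, heads)
```

Each row of `weights` describes how strongly one patch attends to every other patch, which is the quantity usually visualized when attention maps are inspected.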
Because radiomics and deep learning each have their own advantages in feature dimension and interpretability, the method of joint feature modeling is expected to provide improvements in solving existing medical problems [15].
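One simple form of joint feature modeling is to standardize each feature block and concatenate the blocks before fitting a classifier. The sketch below assumes that radiomics and deep learning features have already been extracted for the same patients; the function names and the toy arrays are hypothetical, and published fusion models may combine features differently.

```python
import numpy as np

def standardize(block):
    """Z-score each feature column; constant columns are left at zero."""
    x = np.asarray(block, dtype=float)
    std = x.std(axis=0)
    std[std == 0] = 1.0
    return (x - x.mean(axis=0)) / std

def fuse_features(radiomics, deep):
    """Concatenate standardized radiomics and deep features per patient."""
    radiomics = np.asarray(radiomics, dtype=float)
    deep = np.asarray(deep, dtype=float)
    if radiomics.shape[0] != deep.shape[0]:
        raise ValueError("both blocks must describe the same patients")
    return np.hstack([standardize(radiomics), standardize(deep)])

# Toy example: 3 patients, 2 radiomics features and 3 deep features.
radiomics = [[1.0, 10.0], [2.0, 10.0], [3.0, 10.0]]
deep = [[0.1, 0.5, 0.9], [0.2, 0.4, 0.8], [0.3, 0.6, 0.7]]
fused = fuse_features(radiomics, deep)
```

Standardizing before concatenation keeps one block from dominating the other purely because of its numeric scale.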
Therefore, this article provides a summary of current clinical applications of AI in determining the diagnosis and treatment response for several cancers (Figure 1).
2. GLIOMAS
Gliomas, the most common primary intracranial tumors, represent approximately 81% of primary malignant intracranial tumors [16, 17]. Compared with other cancers, gliomas require a relatively high resource burden for diagnosis and management [18]. For the detection, diagnosis, classification and prognosis prediction of gliomas, magnetic resonance imaging (MRI) is the imaging method of choice [19]. Advanced MRI provides more details of gliomas than conventional MRI. However, the assessment of multiparameter MRI is time consuming and challenging for radiologists. With the development of computer technology, AI reveals more invisible details and is useful in the diagnosis and management of gliomas.
2.1 Diagnosis of gliomas
Automatic glioma detection and categorization are key components of preoperative evaluation. Biopsy is the gold standard for diagnosis, but the results across studies are widely heterogeneous [20]. Preoperative MRI plays an important role in the detection and categorization of glioma [19]. Glioma and brain metastasis are the two most common malignant intracranial tumors in adults, and differentiating the two is challenging for radiologists [21]. The application of AI techniques helps radiologists improve the discriminatory performance, with an AUC >0.80 [22, 23]. The differential diagnosis of glioma, lymphoma, meningioma and pituitary tumors in adults and children has also been improved by AI techniques [24–26].
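The AUC values quoted throughout this article can be read as the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case. A minimal sketch of that pairwise definition follows; the labels and scores are made up for illustration, and `auc_score` is a hypothetical helper rather than a function from any cited study.

```python
def auc_score(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical scores: 1 = glioma, 0 = brain metastasis.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = auc_score(labels, scores)  # 8 of 9 pairs are ranked correctly
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so the threshold of 0.80 mentioned above means that roughly four of five such pairs are ordered correctly.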
On the basis of histological appearance, gliomas have been assigned WHO grades I, II, III and IV, which indicate different degrees of malignancy [27]. Because MRI manifestations are diverse, radiologists have limited ability to distinguish high-grade from low-grade gliomas. Radiomics, machine learning and deep learning models have demonstrated good performance in glioma grading [28–30]. Novel deep neural network models that extract the common and distinctive information from multimodal data have effectively combined that information and increased the accuracy of glioma grading [31]. Although deep learning models have good or even excellent performance, they are slowed by the large number of MRI parameters and complex calculations. A lightweight 3D UNet deep learning framework has been developed to address this low speed and provide a simple, effective and noninvasive diagnostic approach [32].
2.2 Genetics and molecular marker detection
Molecular biomarkers in gliomas have shown more potential than WHO grading in predicting prognosis and accurately classifying gliomas; examples include IDH mutation, 1p/19q codeletion and O-6-methylguanine-DNA methyltransferase [27, 33]. The new classification based on molecular biomarkers was added to the World Health Organization criteria [34]. Accurate classification of molecular subtypes of gliomas is crucial for treatment planning and prognostic assessment, and tumor biopsy is the gold standard. However, tumor biopsy is invasive and not feasible in all patients. Noninvasive recognition of molecular subtypes is desirable in routine clinical practice.
Indeed, even experienced radiologists find that accurately classifying molecular subtypes on the basis of preoperative MR images is challenging [35, 36]. Machine learning and deep learning models have shown good performance in molecular biomarker prediction, with an accuracy of 83.8%–97.14% [37–39]. However, the diagnostic accuracy of AI algorithms based on routine preoperative MRI sequences must be further improved before they can be reliably applied in routine clinical practice [35, 40]. The inclusion of advanced MR techniques has improved the diagnostic performance of AI models. Bumes et al. have applied magnetic resonance spectroscopy to build a machine learning model and achieved good diagnostic performance (sensitivity: 82.6%, specificity: 72.7%) [41]. Several other advanced MRI techniques have also been applied to build AI models, such as DWI, ADC, T1 perfusion and ASL.
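Sensitivity, specificity and accuracy, as reported above, all derive from the same confusion matrix. The following sketch shows the arithmetic on made-up predictions; `diagnostic_metrics` and the example labels are illustrative and do not reproduce the data of any cited study.

```python
def diagnostic_metrics(y_true, y_pred):
    """Sensitivity, specificity and accuracy for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return {
        "sensitivity": tp / (tp + fn),  # fraction of true positives detected
        "specificity": tn / (tn + fp),  # fraction of true negatives detected
        "accuracy": (tp + tn) / len(pairs),
    }

# Hypothetical example: 1 = IDH-mutant, 0 = IDH-wildtype.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
metrics = diagnostic_metrics(y_true, y_pred)
# Here tp = 3, fn = 1, tn = 4 and fp = 2.
```

Because accuracy depends on the proportion of positive cases in the cohort, sensitivity and specificity are usually reported alongside it.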
In summary, the diagnosis and classification of gliomas remain challenging for radiologists, and the development of AI techniques and advanced MRI sequences has shown substantial improvement and promising results in clinical applications.
2.3 Monitoring response to treatment
Predicting the survival of patients with glioma has substantial clinical value and poses challenges for radiologists in routine clinical practice. Radiomics models based on multimodal images perform better than monomodal images [42]. Advanced MRI images (diffusion, perfusion, arterial spin labeling and MR spectroscopy) have also been used for prognostication [43–46]. However, missing MRI sequences in the multimodal imaging feature-based prediction model limit its application and generalization in routine clinical use. Deep learning models can synthesize missing sequences to ensure the successful implementation of models [47–49]. The post-treatment imaging feature-based prediction model is generalizable and performs better than the pretreatment imaging feature-based prediction model [50]. Multimodal and multi-time-point glioma segmentation is time consuming, and its clinical application is immensely challenging. The deep learning model performs rapidly and accurately in the segmentation of both preoperative and postoperative gliomas [51]. Li et al. have developed a deep learning model for predicting the prognosis of gliomas from whole-brain MRI without tumor segmentation and achieved good performance [52].
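Survival models such as those discussed above are often scored with a concordance index (C-index), as in several entries of Table 1. A minimal sketch of Harrell's C-index follows, assuming right-censored data in which `events[i] == 1` marks an observed death or progression; the function name and the four-patient example are hypothetical.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored survival data.

    A pair (i, j) is comparable when patient i had an observed event and
    a shorter follow-up time than patient j. The pair is concordant when
    the model assigned patient i the higher risk score.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # censored patients cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Hypothetical follow-up in months; the third patient is censored.
times = [5, 8, 12, 20]
events = [1, 1, 0, 1]
risks = [0.9, 0.4, 0.6, 0.1]
c_index = concordance_index(times, events, risks)  # 4 of 5 comparable pairs
```

As with the AUC, a value of 0.5 indicates random ordering of patients and 1.0 indicates that predicted risk ranks every comparable pair correctly.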
Distinguishing pseudoprogression from true tumor progression is challenging and has critical implications for treatment protocols [53]. MRI with gadolinium contrast is the standard imaging method for determining tumor growth. Because traditional MRI cannot reliably distinguish pseudoprogression from true tumor progression, several approaches, including advanced MRI imaging, radiomics and machine learning, have been applied to solve this problem. AI models with all available MRI parameters have also demonstrated better diagnostic performance in distinguishing pseudoprogression from true tumor progression than a model with a single MRI parameter [54, 55].
A summary of the AI studies in gliomas can be found in Table 1.
Reference | Application | No. of cases | Imaging modality | Algorithm | Image feature type | Type of validation | Result |
---|---|---|---|---|---|---|---|
Conte et al. [47] | Synthesizing missing MRI sequences | 210 | T1 WI, T2 WI, FLAIR, T1-CE | Generative adversarial networks | Deep learning | Internal validation | MSE of 0.004–0.103 |
Bathla et al. [26] | Differentiating between glioblastoma and lymphoma | 94 | T1 WI, T2 WI, DWI | Machine learning | Radiomics | Fivefold cross-validation | AUC of 0.977 |
Ahammed et al. [30] | Glioma grade identification | 557 | T2 WI | Wndchrm tool based classifier and VGG-19 DNN | Wndchrm feature and deep learning | Internal validation | Accuracy of 92.86% for the Wndchrm classifier and 98.25% for VGG-19 DNN classifier |
Bangalore et al. [37] | Classification of IDH mutation status | 214 | T1 WI, T2 WI, FLAIR, T1-CE | Deep-learning network | Deep learning | Threefold cross-validation | Sensitivity of 0.98±0.02, specificity of 0.97±0.001, and AUC of 0.99±0.01 |
Bumes et al. [41] | Classification of IDH mutation status | 101 | MRS | Machine learning | Voxel for MRS | External validation | Sensitivity of 82.6% and specificity of 72.7% |
Cheng et al. [31] | Glioma grade identification | 350 | T1 WI, T2 WI, FLAIR, T1-CE | MMD-VAE model | Radiomics | External validation | AUC of 0.9611–0.9939 |
Kim et al. [55] | Differentiation of pseudoprogression from early tumor progression | 118 | T1-CE, FLAIR, ADC and CBV | Radiomics model | Radiomics | External validation | AUC of 0.85 |
Li et al. [52] | Prediction of glioma overall survival | 1556 | T1 WI, T2 WI, FLAIR, T1-CE | DeepRisk model | Whole-brain MRI without tumor segmentation | External validation | AUC of 0.77–0.94 |
George et al. [50] | Prediction of progression-free survival and overall survival | 113 | Pretreatment and first on-treatment time point MR imaging | Random survival forest algorithm | Radiomics | External validation | Concordance index of 0.680–0.715 |
Yu et al. [32] | Glioma grade identification | 560 | T1 WI, T2 WI, FLAIR, T1-CE | Lightweight 3D UNet deep learning framework | Deep learning | Internal validation | Accuracy of 89.29% |
MSE, mean squared error; DNN, deep convolutional neural network; IDH, isocitrate dehydrogenase; T1-CE, contrast-enhanced T1 WI; MRS, magnetic resonance spectroscopy; MMD-VAE, multimodal disentangled variational autoencoder.
3. LUNG CANCER
Currently, lung cancer is the second most common cancer and the leading cause of cancer-associated death worldwide [1]. Non-small cell lung cancer (NSCLC) accounts for 85–90% of lung cancers and includes adenocarcinoma, squamous cell carcinoma, large cell carcinoma and adenosquamous carcinoma [56]. Early diagnosis, screening and personalized treatment regimens for patients are the main aims of precision medicine [57]. In clinical practice, AI has been broadly used for diagnosing pulmonary nodules and evaluating lung cancer-associated treatment responses.
3.1 Diagnosis of NSCLC
Screening is becoming an important part of early lung cancer diagnosis. In 2020, Maldonado et al. [58] constructed a radiomics model with an AUC of 0.90 to predict the risk of incidentally discovered indeterminate pulmonary nodules. The model performs well and may facilitate the management of indeterminate pulmonary nodules. Meng et al. [59] have established a nomogram to distinguish indolent from invasive lung adenocarcinomas manifesting as ground-glass nodules (GGNs) by combining radiomics, CT semantic features and clinical information, with an AUC reaching 0.946 in the test set. This radiomics nomogram can accurately predict the invasiveness of GGNs before surgery to assist clinicians in formulating individualized treatment strategies. Hu et al. [60] have differentiated benign from malignant GGNs by using a fusion model combining deep learning and radiomics features, with an accuracy of 75.6%, a value higher than those of the DNN model and the radiomics model individually. In addition, the authors found that transfer learning was helpful for building a DNN model with a limited dataset.
Different histopathological subtypes of lung cancer determine the choice of treatment modality. Pathological diagnosis is informative but is also invasive. In addition, missampling due to tumor heterogeneity can affect biopsy results. These limitations have encouraged researchers to develop noninvasive and accurate imaging markers to assess the overall tumor and to complement the deficiencies in invasive pathological examinations. Studies have attempted to predict histopathological subtypes of lung cancer on the basis of radiomics. Zhu et al. [61] have extracted five imaging features to classify adenocarcinoma vs. squamous cell carcinoma in NSCLC, and achieved an AUC of 0.893 in the test set, thus demonstrating the potential of noninvasive imaging in the histopathological classification of lung cancer. Maldonado et al. [62] have developed a noninvasive computer-aided nodule risk assessment tool to stratify the adenocarcinoma spectrum. Song et al. [63] have interrogated an entire tumor in a non-invasive manner by establishing a radiomics model, which has provided additional diagnostic information to recognize the micropapillary component and thus guide treatment planning.
3.2 Genetics and molecular marker detection
Currently, targeted therapy and immune therapy for lung cancer have been widely used in clinical settings, and the selection of patients who would benefit depends mainly on genetic mutation status (such as EGFR or ALK) and the expression of tumor immune microenvironment markers (such as PD-L1). Increasing evidence indicates that AI can identify and predict tumor gene mutation status and immune expression on the basis of images. Tan et al. [64] have constructed a stacked ensemble model by fusing one deep learning model and five machine learning models for predicting EGFR and ALK mutation status. The final stacked ensemble yielded an optimal diagnostic performance, thereby providing crucial guidance in TKI selection for targeted therapy in patients with NSCLC. Park et al. [65] have predicted a cytolytic activity score (CytAct) by using a deep learning-based biomarker, which has predicted outcomes in lung adenocarcinoma by estimating a tumor immune profile with FDG-PET noninvasively. Wu et al. [66] have developed an AI system using whole slide images to automatically assess the tumor proportion score of PD-L1 expression. The deep learning model also improves diagnostic repeatability and efficiency for pathologists.
3.3 Treatment response of NSCLC
TNM stage is considered the main prognostic factor after surgery. However, patients with the same stage show significantly different responses to treatment [67]; therefore, the capability to stratify patients according to prognosis is urgently needed, to select those at high risk of recurrence and achieve individualized treatment. In 2020, Wang et al. [68] confirmed that the radiomics label can be used as an independent imaging signature to predict the postoperative recurrence-free survival in patients with early NSCLC and to guide personalized clinical treatment. Kim et al. [69] have developed a CT-based deep learning model to predict disease-free survival, which performs well for patients with clinical stage I lung adenocarcinoma. Adjuvant chemotherapy based on cisplatin is recommended for patients with stage II NSCLC [70]. However, the overall 5-year survival benefit of adjuvant chemotherapy compared with that of surgery alone has been controversial [71, 72]. Because results from different trials have been inconsistent, an accurate biomarker is needed to select patients who may benefit from adjuvant chemotherapy. Vaidya et al. [73] have constructed a nomogram using a radiomics signature that can noninvasively evaluate the survival benefit and predict the actual efficacy of adjuvant chemotherapy.
Preoperative neoadjuvant chemotherapy improves resectability by decreasing tumor size and metastasis. However, the survival benefits of surgery compared with chemoradiotherapy for patients with stage IIIa disease have long been controversial [74, 75]. Khorrami et al. [76] have retrospectively analyzed 90 patients with stage IIIa disease and selected a total of 13 intratumor and peritumor radiomics features that can predict pathological remission after neoadjuvant chemotherapy (major pathologic response, defined as ≤10% residual tumor), with AUCs of 0.90 in the training group and 0.86 in the test group; the radiomics characteristics were also associated with overall and progression-free survival. Platinum-based chemotherapy is the standard first-line treatment for advanced NSCLC. However, the objective response rate of the initial treatment regimen is only 24%–31% [77]. Currently, no clinically validated biomarkers can be used to select patients with advanced NSCLC who will benefit from platinum chemotherapy regimens. Khorrami et al. [77] have performed a retrospective study of 125 patients receiving pemetrexed plus platinum chemotherapy. Clinical data were analyzed jointly with intratumor and peritumor radiomics features to predict lung cancer response to chemotherapy (AUCs of 0.82 and 0.77 in the training and test groups, respectively). The study has indicated that a radiomics signature built on baseline CT images can predict individual response to chemotherapy and is significantly associated with tumor progression and overall survival time.
Before clinical decision-making, the therapeutic effects of EGFR-TKIs should be predicted, and priority in alternative treatments should be given to patients at high risk of rapid tumor progression [78]. A multicenter retrospective study has used an imaging radiomics model to predict progression-free survival (PFS) after EGFR-TKI therapy for patients with stage IV NSCLC with EGFR mutation. The study included 117 patients treated with EGFR-TKIs and selected 12 key features to establish a predictive model. The AUC for 10-month PFS was close to 0.72, and that for 1-year PFS was close to 0.80. The CT-based predictive strategy can predict the PFS probability of EGFR-TKI therapy in NSCLC, thus improving the personalized management of TKIs [79].
AI has a potentially important role in predicting the response to immunotherapy. Vaidya et al. [80] have retrospectively analyzed 109 patients with advanced NSCLC receiving PD-L1/PD-1 inhibitor treatment and divided them into responders and nonresponders. The random forest classifier successfully distinguished excessive progression from other treatment responses. This study suggests that AI may help identify patients with advanced NSCLC who are at high risk of progression when treated with PD-L1/PD-1 inhibitors. Another study has used PET-CT images to construct a model to select patients who are most likely to benefit from immunotherapy. These data can therefore provide accurate and personalized evidence to support clinical treatment decisions for patients with advanced NSCLC [81].
A summary of the AI studies in NSCLC can be found in Table 2.
Reference | Application | No. of cases | Imaging modality | Algorithm | Image feature type | Type of validation | Result |
---|---|---|---|---|---|---|---|
Diagnosis of NSCLC | |||||||
Maldonado et al. [58] | Prediction of the risk of incidentally detected indeterminate pulmonary nodules | 855 | LDCT | Brock model, BRODERS classifier | Radiomics | Internal validation | AUC of 0.87 for the Brock model; AUC of 0.90 for the BRODERS model. |
Hu et al. [60] | Classification of benign versus malignant GGNs | 513 | CT | 3D U-Net and DNN classification model, SVM classifier | Deep learning, radiomics, fusion | Internal validation | Accuracy of 75.6% |
Zhu et al. [61] | Distinguishing SCC from ADC | 129 | CT | LASSO logistic regression model | Radiomics | Internal validation | AUC of 0.893 |
Genetics and Molecular Marker Detection | |||||||
Park et al. [65] | Prediction of CytAct score | 152 | FDG-PET | 3D CNN model | Deep learning | External validation | Spearman rho = 0.32, p = 0.04 in SNUH cohort; Spearman rho = 0.47, p = 0.07 in TCGA cohort |
Wu et al. [66] | Assessment of the TPS of PD-L1 expression | 251 | WSIs | U-Net structure with residual blocks | Deep learning | External validation | Accuracy of 0.9326; specificity of 0.9641 |
Treatment response of NSCLC | |||||||
Wang et al. [68] | Prediction of RFS in patients with resected stage I NSCLC | 378 | CT | LASSO logistic regression model | Radiomics | Internal validation | Radiomics signature: C-index of 0.776; radiomics signature integrated with the histology: C-index of 0.829 |
Kim et al. [69] | Prediction of DFS in patients with lung adenocarcinoma | 908 | CT | DLPM model | Deep learning | Internal and external validation | C indexes of 0.74–0.80 in the internal validation and 0.71–0.78 in the external validation |
Khorrami et al. [76] | Prediction of MPR | 90 | CT | Multivariate Cox regression model | Radiomics | Internal validation | AUC of 0.90±0.025 in the training set and 0.86 in the test set; radiomic signature also significantly associated with OS (HR = 11.18) and DFS (HR = 2.78) in the testing set |
Song et al. [79] | PFS prediction for TKI therapy in multicenter patients with stage IV EGFR-mutated NSCLC | 314 | CT | LASSO logistic regression model | Radiomics | External validation | C-index of the nomogram of 0.743 for the training cohort and 0.718 and 0.720 for the two validation cohorts, respectively |
Vaidya et al. [80] | Characterization of the response patterns in patients with NSCLC treated with PD-1/PD-L1 inhibitors | 109 | CT | Random forest classifier | Radiomics | Internal validation | AUC of 0.85±0.06 in the training set and 0.96 in the validation set |
NSCLC, non-small cell lung cancer; GGNs, ground-glass nodules; SCC, squamous cell carcinoma; ADC, lung adenocarcinoma; CytAct, cytolytic activity score; WSIs, whole slide images; TPS, tumor proportion score; RFS, recurrence-free survival; DFS, disease-free survival; MPR, major pathological response; PFS, progression-free survival; PD-1, programmed cell death 1; PD-L1, programmed cell death ligand 1.
4. LIVER CANCER
Primary liver cancer is the fourth leading cause of cancer-associated deaths worldwide. Because 90% of liver cancer cases are due to cirrhosis, very early liver cancer nodules are difficult to diagnose by visual evaluation, which depends on the diagnostic experience of radiologists and is qualitative. Therefore, developing a highly sensitive, objective and quantitative diagnostic method is urgently needed in clinical practice. AI technology can efficiently use imaging examination results to provide more valuable information for preoperative diagnosis and prediction of treatment response to aid in developing individual treatment plans.
4.1 Diagnosis of liver cancer
With AI, the accuracy of liver cancer diagnosis has improved significantly. Nie [82] has developed CT-based imaging models for the preoperative differentiation of hepatocellular carcinoma (HCC) from hepatocellular adenoma and focal nodular hyperplasia. Each model has good differentiation ability, with an AUC exceeding 0.94. Ponnoprat [83] has developed a radiomics model based on multiphase contrast-enhanced CT images to distinguish HCC from intrahepatic cholangiocarcinoma (ICC), and achieved a model accuracy of 88%. Some researchers have also applied radiomics techniques to the differential diagnosis of HCC subtypes. Liu [84] has reported an MRI radiomics model for distinguishing HCC, ICC and HCC-ICC, with an AUC of 0.77. Huang [85] has constructed a diagnostic model for dual-phenotype HCC based on Gd-EOB-DTPA-enhanced MR images. Currently, CT- or MR-based radiomics techniques have shown good diagnostic performance similar to that of experienced radiologists.
Many preliminary studies have assessed the potential value of deep learning in the differential diagnosis of liver cancer. Most of these studies have classified liver lesions into different categories on the basis of MR or CT images and compared the results with the diagnostic findings of radiologists with different levels of experience. Yasaka [86] has developed an algorithm for liver tumor classification based on multiphase contrast-enhanced CT images, with an accuracy of 0.84 on the test set. Hamm [87] has used an iteratively optimized CNN to diagnose six common hepatic focal lesions in multiphase contrast-enhanced MR images. The model’s accuracy, sensitivity and specificity are 0.92, 0.92 and 0.98, respectively [88]. An interpretable model analysis system has been constructed, and the decision-making principle of the pretrained deep neural network has been illustrated through analysis of key image features of internal subsets.
4.2 Genetics and molecular marker detection
Many studies on the molecular typing of HCC have shown that radiomics technology can extract complete high-dimensional image information from whole tumor images. The radiomics method supplements traditional genetic analysis, which may have sampling errors, thereby increasing the accuracy of diagnosis.
CK19 is a biliary-specific marker associated with poor prognosis of HCC. Wang [89] has used radiomics features to predict CK19 status in HCC on the basis of gadoxetic acid-enhanced MRI. For the model combining radiomics characteristics and clinical factors, the sensitivity and specificity in validation were 0.769 and 0.818, respectively.
The Ki-67 index is another molecular biomarker for recurrence and poor prognosis in HCC. Recently, several studies have predicted Ki-67 on the basis of preoperative CT and MRI. Wu [90] has evaluated the correlation between radiomics features and the Ki-67 labeling index. The radiomics parameters predicted Ki-67 status with AUCs ranging from 0.777 to 0.836. Ye [91] has investigated the whole-lesion texture analysis based on preoperative gadoxetic acid-enhanced MRI for predicting Ki-67 status in HCC. The model combining the texture features and clinical data showed high discrimination ability, with an AUC of 0.795.
4.3 Response to treatment
Radiomics methods have been used to evaluate the response to local ablation, recurrence, surgical resection and transarterial chemoembolization (TACE) [92].
For local ablation, Wen [93] has built a model to predict early recurrence (within 2 years) of liver cancer after radiofrequency ablation. Multivariate logistic regression analysis yielded a predictive model including preoperative platelet count and the radiographic Rad-Score, with an AUC of 0.98. Ma [94] has developed a radiomics model based on dynamic contrast-enhanced ultrasound to predict recurrence after ablative therapy in patients with HCC with a diameter of less than 5 cm. The joint imaging radiomics model based on a deep learning reference has shown better prediction for early recurrence, with an AUC of 0.89 vs. 0.84. Furthermore, a model combining dynamic contrast-enhanced ultrasound imaging and clinical features has been used to stratify the risk of late relapse.
For surgical liver resection, Zheng [95] has developed models combining CT radiomics and clinical features to predict recurrence-free survival and overall survival after hepatectomy for solitary HCC. Kong [96] has explored the feasibility of using preoperative MRI features to predict the response of liver tumors to TACE. Eight imaging features associated with TACE response were screened, and a predictive model was successfully constructed. For liver radiotherapy, Kozzi [24] has performed radiomics analysis on contrast-enhanced CT images of patients with HCC before radiotherapy. Combined with the RECIST criteria, this radiomics model can be used to evaluate the efficacy of radiotherapy. Chen [97] has applied radiomics analysis to contrast-enhanced CT images of patients with liver cancer before radiotherapy and concluded that the radiomics model can evaluate the efficacy of radiotherapy, with an AUC of 0.80.
A summary of the AI studies in liver cancer can be found in Table 3.
Reference | Application | No. of cases | Imaging modality | Algorithm | Image feature type | Type of validation | Result |
---|---|---|---|---|---|---|---|
Diagnosis of liver cancer | |||||||
Nie et al. [82] | Preoperative differentiation of FNH from HCC | 156 patients with FNH (n = 55) and HCC (n = 101) | Contrast CT image | Multivariate logistic regression analysis | Radiomics | Internal validation | AUC of 0.917 (95% CI, 0.800–1.000) |
Ponnoprat et al. [83] | Differentiation between HCC and ICC | 187 | Contrast CT image | CNN model, support vector machine model | Radiomics | Tenfold cross-validation | Correct detection of 89.30% of HCC and 84.42% of ICC |
Liu et al. [84] | Differentiation between combined HCC-CC vs. CC and HCC | 85 | MRI and CT | Support vector machine classifier | Radiomics | Internal validation | AUC of 0.77 |
Huang et al. [85] | Description of clinical characteristics and outcomes of DPHCC | 50 | Gd-EOB-DTPA-enhanced MRI | Multi-layer perceptron, support vector machines, logistic regression and K-nearest neighbor | Radiomics | Internal validation | Accuracy of 0.766 in PVP, 0.798 in DP, 0.756 in HBP, 0.798 in PVP |
Yasaka et al. [86] | Differentiation of liver masses | 460 | Contrast CT image | CNN | Deep learning | Internal validation | Accuracy of 0.84. |
Treatment response | |||||||
Wen et al. [93] | Preoperative prediction of early recurrence (≤2 years) of small HCC | 111 | MRI | Multivariable logistic regression | Radiomics | Fivefold cross-validation | AUC of 0.980 |
Ma et al. [94] | Prediction of early and late recurrence in patients with a single HCC lesion ≤ 5 cm after thermal ablation | 318 | Contrast-enhanced ultrasound | Multivariable logistic regression | Radiomics | Internal validation | C-index of 0.77 |
Zheng et al. [95] | Estimation of postoperative recurrence and survival in patients with solitary HCC | 319 | Contrast CT image | Cox’s hazard regression | Radiomics | Internal validation | Recurrence hazard ratio, 2.472; survival hazard ratio, 1.558 |
Kong et al. [96] | Prediction of the response of HCC to TACE | 86 | MRI | Multivariable logistic regression | Radiomics | Internal validation | AUC of 0.794 |
Chen et al. [97] | Prediction of treatment response to first TACE in patients with intermediate-stage HCC | 595 | Contrast CT image | Multivariable logistic regression | Radiomics | External validation | AUC of 0.90 |
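FNH, focal nodular hyperplasia; HCC, hepatocellular carcinoma; CC, cholangiocarcinoma; ICC, intrahepatic cholangiocarcinoma; DPHCC, dual-phenotype hepatocellular carcinoma; TACE, transarterial chemoembolization; PVP, portal venous phase; DP, delayed phase; HBP, hepatobiliary phase; CNN, convolutional neural network; AUC, area under the receiver operating characteristic curve; CI, confidence interval.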
5. COLORECTAL CANCER
CRC ranks third among all malignant tumors worldwide [1, 98]. CT and MRI of CRC are crucial for tumor evaluation, which relies heavily on radiologists’ experience, thus affecting diagnostic accuracy and consistency. Recently, AI has provided more precise methods to evaluate CRC.
5.1 Diagnosis of colorectal cancer
For primary tumor (T), lymph node (N) and metastasis (M) staging in CRC, both CT and MRI are essential. AI has proven useful in TNM staging based on CT and MRI. Wang et al. [99] have developed four network models to distinguish between cancer and normal, M0 and M1, normal and elevated carcinoembryonic antigen, and clinical stage I–II and III–IV, on the basis of The Cancer Genome Atlas database. The predictive accuracy in tenfold cross-validation was 93.75–99.39%, 80.58–88.24%, 67.21–92.31% and 59.13–68.85%, respectively. You et al. [100] have developed a support vector machine model for T staging (T1/T2 vs. T3/T4) using MRI data for 154 patients. In the test set, the models based on T2WI, ADC and the two sequences combined achieved AUCs of 0.845, 0.881 and 0.910, respectively. The prediction of lymph node metastasis remains a major topic. Huang et al. [101] have built a radiomics nomogram to predict lymph node status in CRC on the basis of CT of primary tumors, with a C-index of 0.778 in internal validation. Liu et al. [102] have built five support vector machine models to predict lymph node metastasis from the tumor and mesorectum in rectal cancer. The model based on the combined data, including clinical information and radiomics features from the tumor and mesorectal areas, achieved the best performance, with an AUC of 0.832. Metastasis is an important risk marker of poor prognosis [103–105]. Li et al. [8] have developed and assessed a radiomics model to predict liver metastasis in CRC on the basis of primary tumor CT data. The model using both the radiomics and clinical information signatures performed best, with an AUC of 0.899 in the test set.
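Most of the diagnostic studies above report performance as an AUC. As a minimal sketch of what that number measures, the following Python function computes the AUC as the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case (the rank, or Mann-Whitney, formulation). The function name `auc_score` and the toy labels and scores are illustrative assumptions and are not taken from any of the cited studies.

```python
import numpy as np

def auc_score(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1 for positive cases (e.g., metastasis present), 0 for negative.
    scores: model outputs, where higher means "more likely positive".
    """
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos = scores[labels]
    neg = scores[~labels]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("AUC needs at least one positive and one negative case")
    # Compare every positive score with every negative score.
    diff = pos[:, None] - neg[None, :]
    # A positive scored above a negative counts 1; a tie counts 0.5.
    wins = np.sum(diff > 0) + 0.5 * np.sum(diff == 0)
    return float(wins / (pos.size * neg.size))

# Hypothetical toy data: four patients, two of them positive.
example_auc = auc_score([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is why values such as 0.832 or 0.899 are read as good but imperfect discrimination.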
5.2 Histopathological aspects and genetic detection
AI provides a new method to predict risk factors in CRC that cannot be assessed visually by radiologists. Patients with tumor invasion of nerve structures or of the lymphovascular wall, histologically defined as perineural invasion and lymphatic vascular infiltration, respectively, may have poor prognosis. Guo et al. [106] have built a radiomics model based on T2WI and CE-CT to predict perineural invasion, with an AUC of 0.884 in the testing dataset. Zhang et al. [107] have built a similar radiomics model to predict lymphatic vascular infiltration, with an AUC of 0.876. Microsatellite instability (MSI) is crucial for making individualized treatment plans, and is defined as the loss of one or more mismatch repair proteins according to immunohistochemistry [108, 109]. Fan et al. [110] have extracted radiomic features from the portal venous phase of CE-CT to predict MSI status in stage II CRC, and achieved an AUC of 0.752. Zhang et al. [111] have built a DL model to predict MSI on MRI, with an AUC of 0.868 in the test cohort. Kirsten rat sarcoma viral oncogene homolog (KRAS) gene mutations are also key to optimal individualized therapeutic strategies in CRC. He et al. [112] have investigated a DL model to predict KRAS mutation by using pretreatment contrast-enhanced CT in patients with CRC, with an AUC of 0.90 in the testing dataset. Hence, AI has the potential to assist in noninvasive estimation of histopathological and genetic characteristics to aid in personalized treatment.
5.3 Treatment response of colorectal cancer
Treatment response evaluation in locally advanced or advanced CRC is a critical clinical process [113]. Distinguishing minimal residual tumors after neoadjuvant chemoradiotherapy and predicting treatment effectiveness remain challenges in CRC [114]. Kim et al. [115], in a study aimed at identifying pathological complete response (pCR, defined as ypT0), have shown that a radiomics model based on T2WI performs better than radiologists (AUC, 0.82 vs. 0.74; sensitivity, 80.0% vs. 15.6%). Several studies [15, 116, 117] have developed radiomic models to predict pCR (defined as ypT0N0), with AUCs of 0.812–0.89 [116, 117]. Furthermore, Tian et al. [15] have constructed a RadioPathomics Integrated Prediction System to predict pCR after neoadjuvant chemoradiotherapy by using MRI and biopsy whole-slide images, which may aid in clinical decision-making.
Adjuvant chemotherapy is an important treatment strategy, and predicting the response to adjuvant chemotherapy has therefore become a major topic. On the basis of T2WI and ADC, Tian et al. [118] have built a radiomic model to predict distant metastasis of locally advanced rectal cancer (LARC) after adjuvant chemotherapy and shown that a higher radiomic signature indicates poorer prognosis. Patients with pN2 and a low radiomic signature could receive adjuvant chemotherapy to achieve better prognosis. Hence, radiomics may aid in the formulation of individual treatment plans by screening potential patients who would benefit from therapy.
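Prognostic models of this kind are usually summarized with a concordance index (C-index), which measures how often the patient with the higher predicted risk is also the patient who experiences the event earlier. The following Python sketch implements Harrell's C-index under simplifying assumptions (right-censored data, pairs with equal follow-up times not compared). The function name `concordance_index` and the toy follow-up data are hypothetical and do not reproduce the analysis of any cited study.

```python
import numpy as np

def concordance_index(times, events, risk):
    """Harrell's C-index for right-censored survival data.

    times:  follow-up time for each patient.
    events: 1 if the event (e.g., distant metastasis) was observed, 0 if censored.
    risk:   model output (e.g., a radiomic signature); higher means higher risk.
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    risk = np.asarray(risk, dtype=float)
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0   # earlier event had the higher risk
                elif risk[i] == risk[j]:
                    concordant += 0.5   # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs in the data")
    return concordant / comparable

# Hypothetical toy cohort of four patients (third patient is censored).
example_c = concordance_index([2, 4, 6, 8], [1, 1, 0, 1], [0.9, 0.7, 0.8, 0.1])
```

As with the AUC, a C-index of 0.5 indicates no prognostic ranking ability and 1.0 indicates perfect ranking.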
The AI studies of CRC are summarized in Table 4.
Reference | Application | No. of cases | Imaging modality | Algorithm | Image feature type | Type of validation | Result |
---|---|---|---|---|---|---|---|
Diagnosis | |||||||
Wang et al. [99] | Clinical stage I–II/III–IV | 633 | CT | BP and LVQ | Artificial neural networks | Tenfold cross-validation test | Accuracy of 59.13–68.85% |
Liu et al. [102] | Prediction of lymph node metastasis | 186 | MRI | SVM | Radiomics | Internal validation | AUC of 0.832 |
Li et al. [8] | Prediction of liver metastasis | 100 | CT | Logistic regression | Radiomics | Cross-validation set and test set | AUC of 0.899 |
Histopathological aspects and genetic detection | |||||||
Guo et al. [106] | Prediction of preoperative PNI | 94 | CT and MRI | Multivariate logistic regression analysis | Radiomics | Internal validation | AUC of 0.884 |
Fan et al. [110] | Prediction of MSI | 119 | CT | Logistic regression model | Radiomics | Internal validation | AUC of 0.752 |
He et al. [112] | Prediction of KRAS mutation status | 157 | CT | ResNet | Deep learning | Internal validation | AUC of 0.90 |
Treatment response | |||||||
Kim et al. [115] | Prediction of pCR | 898 | MRI | Least absolute shrinkage and selection operator | Radiomics | Tenfold cross-validation | AUC of 0.82 |
Tian et al. [15] | Prediction of pCR | 1033 | MRI | RAdioPathomics Integrated preDiction System | Radiomics | External validation | AUC of 0.81 |
Tian et al. [118] | Prediction of adjuvant chemotherapy responders | 629 | MRI | Least absolute shrinkage and selection operator (LASSO)-Cox regression | Radiomics | External validation | C-indices of 0.803–0.848 |
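PNI, perineural invasion; MSI, microsatellite instability; KRAS, Kirsten rat sarcoma viral oncogene homolog; pCR, pathological complete response; BP, back propagation; LVQ, learning vector quantization; SVM, support vector machine.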
6. CHALLENGES AND FUTURE DIRECTIONS
Imaging plays an essential role in the diagnosis and treatment of tumors. Currently, AI is a major research area. However, many problems are becoming apparent as the number of studies increases.
Insufficient data are a bottleneck in building radiomics and deep learning models, particularly in tumor diagnosis and prognostication. Retrospective clinical data are the most readily available, but substantial biases can arise between model training data and validation data. Constructing multicenter datasets or using public datasets is necessary to obtain homogeneous and unbiased clinical data. In the future, models may be trained across multiple institutions through federated learning techniques [119], in which institutions share models rather than exchanging data.
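To illustrate the idea of sharing models rather than data, the following Python sketch shows one common federated scheme, federated averaging, applied to a simple logistic regression model. This is an assumption chosen for illustration and is not necessarily the method of reference [119]. The function names (`local_update`, `federated_average`, `federated_training`), the learning rate and the number of rounds are all hypothetical.

```python
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=20):
    """Train logistic regression on one institution's private data.

    Only the updated weight vector leaves the institution; X and y do not.
    """
    w = weights.copy()
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))   # predicted probabilities
        grad = X.T @ (p - y) / len(y)        # gradient of the log-loss
        w -= lr * grad
    return w

def federated_average(site_weights, site_sizes):
    """Combine site models, weighting each by its number of cases."""
    sizes = np.asarray(site_sizes, dtype=float)
    stacked = np.stack(site_weights)
    return (stacked * sizes[:, None]).sum(axis=0) / sizes.sum()

def federated_training(sites, n_features, rounds=10):
    """sites: list of (X, y) pairs, one per institution (kept local)."""
    global_w = np.zeros(n_features)
    for _ in range(rounds):
        # Each institution starts from the current global model.
        local = [local_update(global_w, X, y) for X, y in sites]
        # The coordinating server sees only weight vectors and case counts.
        global_w = federated_average(local, [len(y) for _, y in sites])
    return global_w
```

In this sketch, the coordinating server never receives images or feature matrices, which is the property that makes the approach attractive for multicenter studies.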
Poor generalization is an inherent limitation of radiomics techniques. Radiomics features depend highly on the scanning protocol, ROI selection and feature extraction method, and many factors can lead to high variability in the extracted features. However, only stable, reproducible and discriminative models can be generalized in clinical practice. A recent review has proposed several strategies for reproducible and generalizable radiomics analysis [120] that can be used to obtain reproducible radiomics features. In addition, the radiomics stability quality score [121], quantitative networks [122] and image biomarker standardization initiatives [123] have gradually become standard for multicenter data applications.
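One simple way to screen for stable features is to extract the same features twice (for example, from two independent ROI delineations or two scans) and keep only those that agree. The following Python sketch uses Lin's concordance correlation coefficient for this purpose. The choice of this coefficient, the threshold of 0.85 and the function names are illustrative assumptions and are not prescribed by references [120–123].

```python
import numpy as np

def concordance_correlation(x, y):
    """Lin's concordance correlation coefficient between two measurements."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    mx, my = x.mean(), y.mean()
    cov = np.mean((x - mx) * (y - my))
    denom = x.var() + y.var() + (mx - my) ** 2
    if denom == 0:
        return 1.0  # both measurements are the same constant
    return float(2.0 * cov / denom)

def stable_features(feats_a, feats_b, names, threshold=0.85):
    """Keep features whose repeated measurements agree.

    feats_a, feats_b: arrays of shape (n_patients, n_features), extracted
    from two delineations (or scans) of the same patients.
    threshold: assumed cut-off for "reproducible"; adjust per study.
    """
    feats_a = np.asarray(feats_a, dtype=float)
    feats_b = np.asarray(feats_b, dtype=float)
    keep = []
    for k, name in enumerate(names):
        if concordance_correlation(feats_a[:, k], feats_b[:, k]) >= threshold:
            keep.append(name)
    return keep
```

Unlike a plain correlation coefficient, this measure also penalizes a systematic offset between the two measurements, so a feature that is consistently shifted by a different protocol is flagged as unstable.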
Poor interpretability is a continuing challenge. Interpretability is increasingly important when imaging biomarkers are used to develop optimal therapy plans, and the need for interpretable methods has received increasing attention as deep learning techniques continue to evolve. Deep learning models are complex: millions of parameters are typically optimized during training, and data patterns associated with target outputs are extracted automatically. Consequently, spurious correlations in the data may be identified and used, thus decreasing the system’s reliability. Many studies [124, 125] have combined visualization schemes for uncertainty assessment with computer prediction, but these schemes require imaging experts to validate the visual assessment results. Interpretability assessments therefore depend highly on human judgment and experience, thus making the methodological design and assessment of AI interpretability studies subjective and prone to cognitive bias. In the future, with the gradual integration of AI systems with multidimensional patient information [126], such as clinical indicators, imaging features, molecular pathways and clinical scores, the interpretability of AI models will offer more research possibilities.
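As one hedged example of a visualization scheme, the following Python sketch computes an occlusion sensitivity map: each image patch is masked in turn, and the resulting drop in the model's output is recorded. This is a generic technique chosen for illustration and is not claimed to be the approach of references [124, 125]. The `predict` argument stands for any hypothetical function that maps a 2D image to a single score.

```python
import numpy as np

def occlusion_map(image, predict, patch=4, fill=0.0):
    """Occlusion sensitivity map for a 2D image.

    image:   2D array (one slice).
    predict: callable returning a scalar score for an image (assumed model).
    patch:   side length of the square region masked at each step.
    fill:    value used to mask the region.
    """
    image = np.asarray(image, dtype=float)
    baseline = predict(image)
    heat = np.zeros_like(image)
    for r in range(0, image.shape[0], patch):
        for c in range(0, image.shape[1], patch):
            occluded = image.copy()
            occluded[r:r + patch, c:c + patch] = fill
            # A large drop means the model relied on this region.
            heat[r:r + patch, c:c + patch] = baseline - predict(occluded)
    return heat
```

A map of this kind still has to be read by an imaging expert, who judges whether the highlighted regions are clinically plausible, which is the dependence on human judgment noted above.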
Automation of the entire AI workflow is another important challenge. Currently, the Picture Archiving and Communication System is widely used to ensure easy retrieval of data. However, to support further AI research, the data must also be managed for labeling, annotation, segmentation and quality assurance, thus requiring substantial labor. Deep learning-based medical image segmentation has demonstrated good performance in many object segmentation tasks, and the models can generally be divided into automatic segmentation models and interactive segmentation models [127]. Deploying the constructed segmentation models among radiologists is challenging. The advantage of fully automatic segmentation models is that all segmentation results are provided at one time, thus making these models highly suitable for images with many slices and large volumes of segmentation targets. Although such models may use advanced attention algorithms to improve segmentation efficiency [128], their efficiency may not be satisfactory for the segmentation of multiple and small targets. Interactive segmentation models can automatically identify 2D or 3D regions of interest from radiologists’ annotations or clicks on key areas of the image [129]. These models are highly effective for small target organs or ROI annotation with few slices but take more time for images with more slices. In summary, AI segmentation models will help radiologists achieve accurate annotation in most scenarios, and different models should be selected according to task requirements. Regardless of the model used for segmentation with the purpose of accurate quantification, secondary confirmation by a radiologist remains necessary.
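Secondary confirmation can be made quantitative by comparing the model's mask with the radiologist's corrected mask. The following Python sketch computes the Dice similarity coefficient, a widely used overlap measure. Its use here is an illustrative assumption rather than a procedure described in references [127–129], and the function name and toy masks are hypothetical.

```python
import numpy as np

def dice_coefficient(pred, ref):
    """Dice similarity between a predicted mask and a reference mask.

    pred, ref: arrays of the same shape; nonzero voxels belong to the ROI.
    Returns a value between 0 (no overlap) and 1 (identical masks).
    """
    pred = np.asarray(pred, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    if pred.shape != ref.shape:
        raise ValueError("masks must have the same shape")
    total = pred.sum() + ref.sum()
    if total == 0:
        return 1.0  # both masks empty: treated as perfect agreement
    return float(2.0 * np.logical_and(pred, ref).sum() / total)

# Hypothetical 2x2 example: the model marks two pixels, the radiologist one.
example_dice = dice_coefficient([[1, 1], [0, 0]], [[1, 0], [0, 0]])
```

Because the Dice coefficient is sensitive to small absolute errors when the target is small, low values for multiple or small targets are consistent with the limitation of fully automatic models noted above.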
AI research continues to deepen, thus bringing hope for practical clinical applications. In addition to glioma, lung cancer, liver cancer and CRC, as described above, AI has numerous applications in breast and cervical cancers [130–132], thus providing substantial advantages for the optimization of tumor management strategies.