Radiogenomics aims to predict key molecular profiles of tumors from clinical images and associated information. However, imaging protocols usually differ between facilities, and it has rarely been examined whether the performance a method achieves on one dataset is sustained on other, independent datasets. We explored machine learning and matrix decomposition methods using preoperative magnetic resonance images (MRIs) of glioma patients to establish a versatile platform that is robust to heterogeneity between datasets.
Preoperative glioma MRIs and clinical information were obtained from the public dataset of The Cancer Imaging Archive (TCIA, N=159) and from the National Cancer Center Hospital (NCC, N=166). More than 16,000 radiomic features were used to predict tumor grade and IDH mutation status. Prediction accuracy was evaluated by the area under the receiver operating characteristic curve (AUROC).
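As an illustration of the evaluation metric only, AUROC can be computed as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The following is a minimal sketch under that definition; the function name `auroc`, the label coding, and the toy scores are hypothetical and are not taken from the study.

```python
def auroc(labels, scores):
    """AUROC as the fraction of positive/negative pairs ranked correctly.

    labels: iterable of 0/1 class labels (1 = positive class)
    scores: iterable of classifier scores, higher = more likely positive
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy example: 1 = IDH-mutant, 0 = IDH-wildtype (assumed coding)
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
value = auroc(labels, scores)
print(round(value, 3))
```

This pairwise form is quadratic in the number of cases, which is acceptable for cohorts of a few hundred patients; a rank-based implementation gives the same value more efficiently.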
Performance was comparable across image features regardless of the dimension reduction method (the best AUROC for tumor grade and IDH status prediction was 0.91 and 0.88, respectively), but it decreased drastically in transfer learning (0.70 and 0.69). On the other hand, applying matrix decomposition and brain embedding improved it (0.86 and 0.79).