42
views
0
recommends
+1 Recommend
0 collections
    0
    shares
      • Record: found
      • Abstract: found
      • Article: found
      Is Open Access

      Deep learning with multimodal representation for pancancer prognosis prediction

      research-article
      1 , 2
      Bioinformatics
      Oxford University Press

      Read this article at

      Bookmark
          There is no author summary for this article yet. Authors can add summaries to their articles on ScienceOpen to make them more accessible to a non-specialist audience.

          Abstract

          Motivation

          Estimating the future course of patients with cancer lesions is invaluable to physicians; however, current clinical methods fail to effectively use the vast amount of multimodal data that is available for cancer patients. To tackle this problem, we constructed a multimodal neural network-based model to predict the survival of patients for 20 different cancer types using clinical data, mRNA expression data, microRNA expression data and histopathology whole slide images (WSIs). We developed an unsupervised encoder to compress these four data modalities into a single feature vector for each patient, handling missing data through a resilient, multimodal dropout method. Encoding methods were tailored to each data type—using deep highway networks to extract features from clinical and genomic data, and convolutional neural networks to extract features from WSIs.

          Results

          We used pancancer data to train these feature encodings and predict single cancer and pancancer overall survival, achieving a C-index of 0.78 overall. This work shows that it is possible to build a pancancer model for prognosis that also predicts prognosis in single cancer sites. Furthermore, our model handles multiple data modalities, efficiently analyzes WSIs and represents patient multimodal data flexibly into an unsupervised, informative representation. We thus present a powerful automated tool to accurately determine prognosis, a key step towards personalized treatment for cancer patients.

          Related collections

          Most cited references20

          • Record: found
          • Abstract: found
          • Article: not found

          Systematic analysis of breast cancer morphology uncovers stromal features associated with survival.

          The morphological interpretation of histologic sections forms the basis of diagnosis and prognostication for cancer. In the diagnosis of carcinomas, pathologists perform a semiquantitative analysis of a small set of morphological features to determine the cancer's histologic grade. Physicians use histologic grade to inform their assessment of a carcinoma's aggressiveness and a patient's prognosis. Nevertheless, the determination of grade in breast cancer examines only a small set of morphological features of breast cancer epithelial cells, which has been largely unchanged since the 1920s. A comprehensive analysis of automatically quantitated morphological features could identify characteristics of prognostic relevance and provide an accurate and reproducible means for assessing prognosis from microscopic image data. We developed the C-Path (Computational Pathologist) system to measure a rich quantitative feature set from the breast cancer epithelium and stroma (6642 features), including both standard morphometric descriptors of image objects and higher-level contextual, relational, and global image features. These measurements were used to construct a prognostic model. We applied the C-Path system to microscopic images from two independent cohorts of breast cancer patients [from the Netherlands Cancer Institute (NKI) cohort, n = 248, and the Vancouver General Hospital (VGH) cohort, n = 328]. The prognostic model score generated by our system was strongly associated with overall survival in both the NKI and the VGH cohorts (both log-rank P ≤ 0.001). This association was independent of clinical, pathological, and molecular factors. Three stromal features were significantly associated with survival, and this association was stronger than the association of survival with epithelial characteristics in the model. These findings implicate stromal morphologic structure as a previously unrecognized prognostic determinant for breast cancer.
            Bookmark
            • Record: found
            • Abstract: found
            • Article: found
            Is Open Access

            Central focused convolutional neural networks: Developing a data-driven model for lung nodule segmentation

            Accurate lung nodule segmentation from computed tomography (CT) images is of great importance for image-driven lung cancer analysis. However, the heterogeneity of lung nodules and the presence of similar visual characteristics between nodules and their surroundings make it difficult for robust nodule segmentation. In this study, we propose a data-driven model, termed the Central Focused Convolutional Neural Networks (CF-CNN), to segment lung nodules from heterogeneous CT images. Our approach combines two key insights: 1) the proposed model captures a diverse set of nodule-sensitive features from both 3-D and 2-D CT images simultaneously; 2) when classifying an image voxel, the effects of its neighbor voxels can vary according to their spatial locations. We describe this phenomenon by proposing a novel central pooling layer retaining much information on voxel patch center, followed by a multi-scale patch learning strategy. Moreover, we design a weighted sampling to facilitate the model training, where training samples are selected according to their degree of segmentation difficulty. The proposed method has been extensively evaluated on the public LIDC dataset including 893 nodules and an independent dataset with 74 nodules from Guangdong General Hospital (GDGH). We showed that CF-CNN achieved superior segmentation performance with average dice scores of 82.15% and 80.02% for the two datasets respectively. Moreover, we compared our results with the inter-radiologists consistency on LIDC dataset, showing a difference in average dice score of only 1.98%.
              Bookmark
              • Record: found
              • Abstract: found
              • Article: not found

              Predicting the prognosis of breast cancer by integrating clinical and microarray data with Bayesian networks.

              Clinical data, such as patient history, laboratory analysis, ultrasound parameters--which are the basis of day-to-day clinical decision support--are often underused to guide the clinical management of cancer in the presence of microarray data. We propose a strategy based on Bayesian networks to treat clinical and microarray data on an equal footing. The main advantage of this probabilistic model is that it allows to integrate these data sources in several ways and that it allows to investigate and understand the model structure and parameters. Furthermore using the concept of a Markov Blanket we can identify all the variables that shield off the class variable from the influence of the remaining network. Therefore Bayesian networks automatically perform feature selection by identifying the (in)dependency relationships with the class variable. We evaluated three methods for integrating clinical and microarray data: decision integration, partial integration and full integration and used them to classify publicly available data on breast cancer patients into a poor and a good prognosis group. The partial integration method is most promising and has an independent test set area under the ROC curve of 0.845. After choosing an operating point the classification performance is better than frequently used indices.
                Bookmark

                Author and article information

                Journal
                Bioinformatics
                Bioinformatics
                bioinformatics
                Bioinformatics
                Oxford University Press
                1367-4803
                1367-4811
                July 2019
                05 July 2019
                05 July 2019
                : 35
                : 14
                : i446-i454
                Affiliations
                [1 ]Monta Vista High School, Cupertino, CA, USA
                [2 ]Department of Medicine and Biomedical Data Science, Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, CA, USA
                Author notes
                To whom correspondence should be addressed. ogevaert@ 123456stanford.edu
                Article
                btz342
                10.1093/bioinformatics/btz342
                6612862
                31510656
                ffcdffb8-47c6-4d18-badd-e9ab71129b05
                © The Author(s) 2019. Published by Oxford University Press.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License ( http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com

                History
                Page count
                Pages: 9
                Funding
                Funded by: National Institutes of Health 10.13039/100000002
                Award ID: R01EB020527
                Funded by: National Institute of Dental and Craniofacial Research 10.13039/100000072
                Award ID: U01DE025188
                Funded by: National Cancer Institute 10.13039/100000054
                Award ID: U01CA199241
                Award ID: U01CA217851
                Categories
                Ismb/Eccb 2019 Conference Proceedings
                Studies of Phenotypes and Clinical Applications

                Bioinformatics & Computational biology
                Bioinformatics & Computational biology

                Comments

                Comment on this article