      Is Open Access

      Explaining decisions of graph convolutional neural networks: patient-specific molecular subnetworks responsible for metastasis prediction in breast cancer

      research-article


          Abstract

          Background

          Contemporary deep learning approaches show cutting-edge performance in a variety of complex prediction tasks. Nonetheless, the application of deep learning in healthcare remains limited because deep learning methods are often regarded as non-interpretable black-box models. However, the machine learning community has recently developed interpretability methods that explain data point-specific decisions of deep learning techniques. We believe that such explanations can support personalized precision medicine decisions by explaining patient-specific predictions.

          Methods

          Layer-wise Relevance Propagation (LRP) is a technique to explain decisions of deep learning methods. It is widely used to interpret Convolutional Neural Networks (CNNs) applied to image data. Recently, CNNs have been extended to non-Euclidean domains such as graphs. Molecular networks are commonly represented as graphs detailing interactions between molecules. Gene expression data can be assigned to the vertices of these graphs. In other words, gene expression data can be structured by utilizing molecular network information as prior knowledge. Graph-CNNs can be applied to structured gene expression data, for example, to predict metastatic events in breast cancer. Therefore, there is a need for explanations showing which part of a molecular network is relevant for predicting an event, e.g., distant metastasis in cancer, for each individual patient.
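The idea behind LRP can be illustrated on a single dense layer. The sketch below uses the commonly described epsilon rule; it is a minimal illustration under stated assumptions, not the authors' implementation. The function name `lrp_epsilon_dense`, the toy weights, and the choice of the epsilon rule are all assumptions made here for illustration.

```python
import numpy as np

def lrp_epsilon_dense(a, W, b, R_out, eps=1e-6):
    """Redistribute output relevance of one dense layer onto its inputs.

    a:     input activations, shape (n_in,)
    W:     weights, shape (n_in, n_out)
    b:     biases, shape (n_out,)
    R_out: relevance assigned to the layer outputs, shape (n_out,)
    """
    z = a @ W + b                               # pre-activations of the layer
    z = z + eps * np.where(z >= 0, 1.0, -1.0)   # stabilizer, avoids division by zero
    s = R_out / z                               # relevance per unit of pre-activation
    return a * (W @ s)                          # each input's share of the relevance

# Toy example (hypothetical numbers): three inputs, two outputs, no bias.
a = np.array([1.0, 2.0, 3.0])
W = np.array([[0.5, -1.0],
              [1.0,  0.5],
              [0.2,  0.3]])
b = np.zeros(2)
R_out = np.array([1.0, 0.0])   # explain the first output only
R_in = lrp_epsilon_dense(a, W, b, R_out)
```

With zero bias and a small `eps`, the input relevances sum approximately to the output relevance, which is the conservation property that makes the scores interpretable as a decomposition of the prediction.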

          Results

          We extended the procedure of LRP to make it available for Graph-CNNs and tested its applicability on a large breast cancer dataset. We present Graph Layer-wise Relevance Propagation (GLRP) as a new method to explain the decisions made by Graph-CNNs. We demonstrate a sanity check of the developed GLRP on a hand-written digits dataset and then apply the method to gene expression data. We show that GLRP provides patient-specific molecular subnetworks that largely agree with clinical knowledge and identify common as well as novel, and potentially druggable, drivers of tumor progression.
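One way to picture how per-gene relevance scores could yield a patient-specific subnetwork is to keep the most relevant genes and the prior-knowledge edges between them. The sketch below assumes a simple top-k selection; the abstract does not state the selection rule, and the function name `top_relevance_subnetwork`, the gene labels, and the scores are hypothetical.

```python
def top_relevance_subnetwork(relevance, edges, k):
    """Return the k most relevant genes and the edges connecting them.

    relevance: dict mapping gene name -> relevance score for one patient
    edges:     iterable of (gene_a, gene_b) pairs from the molecular network
    k:         number of genes to keep (assumed selection rule)
    """
    # Rank genes by relevance, highest first, and keep the top k.
    top = sorted(relevance, key=relevance.get, reverse=True)[:k]
    keep = set(top)
    # Induced subgraph: only edges whose two endpoints were both kept.
    sub_edges = [(u, v) for u, v in edges if u in keep and v in keep]
    return top, sub_edges

# Toy example with placeholder gene labels and made-up scores.
relevance = {"G1": 0.9, "G2": 0.1, "G3": 0.5, "G4": 0.7}
edges = [("G1", "G4"), ("G1", "G2"), ("G3", "G4"), ("G2", "G3")]
genes, sub_edges = top_relevance_subnetwork(relevance, edges, k=3)
```

Because the relevance scores differ from patient to patient, the same molecular network can produce a different subnetwork for each individual.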

          Conclusions

          The developed method could be highly useful for interpreting classification results in the context of different omics data and prior-knowledge molecular networks at the individual patient level, for example in precision medicine approaches or a molecular tumor board.

          Supplementary Information

          The online version contains supplementary material available at (10.1186/s13073-021-00845-7).


                Author and article information

                Contributors
                tim.beissbarth@bioinf.med.uni-goettingen.de
                Journal
                Genome Med
                Genome Medicine
                BioMed Central (London)
                ISSN: 1756-994X
                11 March 2021
                Volume: 13
                Article number: 42
                Affiliations
                [1] GRID grid.411984.1, ISNI 0000 0001 0482 5331, Medical Bioinformatics, University Medical Center Göttingen, Göttingen, Germany
                [2] GRID grid.16149.3b, ISNI 0000 0004 0551 4246, Dept. of Medicine A (Hematology, Oncology, Hemostaseology and Pulmonology), University Hospital Münster, Münster, Germany
                [3] GRID grid.20522.37, ISNI 0000 0004 1767 9005, Hospital del Mar Medical Research Institute (IMIM), Barcelona, Spain
                [4] GRID grid.434682.f, ISNI 0000 0004 7666 5287, geneXplain GmbH, Wolfenbüttel, Germany
                [5] GRID grid.7307.3, ISNI 0000 0001 2108 9006, IT Infrastructure for Translational Medical Research, University of Augsburg, Augsburg, Germany
                [6] GRID grid.411984.1, ISNI 0000 0001 0482 5331, Medical Statistics, University Medical Center Göttingen, Göttingen, Germany
                [7] GRID grid.7450.6, ISNI 0000 0001 2364 4210, Campus-Institute Data Science (CIDAS), University of Göttingen, Göttingen, Germany
                Author information
                http://orcid.org/0000-0001-6509-2143
                Article
                Publisher ID: 845
                DOI: 10.1186/s13073-021-00845-7
                PMCID: PMC7953710
                PMID: 33706810
                8c586e49-3c0f-4a1c-b6e2-867eb7a4f61d
                © The Author(s) 2021

                Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

                History
                Received: 26 August 2020
                Accepted: 5 February 2021
                Funding
                Funded by: FundRef http://dx.doi.org/10.13039/100011937, Niedersächsische Ministerium für Wissenschaft und Kultur;
                Award ID: MTB-report
                Funded by: FundRef http://dx.doi.org/10.13039/501100002347, Bundesministerium für Bildung und Forschung;
                Award ID: 031L0024
                Funded by: FundRef http://dx.doi.org/10.13039/501100001659, Deutsche Forschungsgemeinschaft;
                Award ID: 424252458
                Categories
                Research

                Molecular medicine
                gene expression data, explainable AI, personalized medicine, precision medicine, classification of cancer, deep learning, prior knowledge, molecular networks
