
      Deep Learning to Estimate Human Epidermal Growth Factor Receptor 2 Status from Hematoxylin and Eosin-Stained Breast Tissue Images





Context:

Several therapeutically important mutations in cancers are economically detected using immunohistochemistry (IHC), which highlights the overexpression of specific antigens associated with the mutation. However, IHC panels can be imprecise and relatively expensive in low-income settings. On the other hand, although hematoxylin and eosin (H&E) staining, used to visualize general tissue morphology, is routine and low cost, it does not highlight any specific antigen or mutation.


Aims:

Using the human epidermal growth factor receptor 2 (HER2) mutation in breast cancer as an example, we strengthen the case for cost-effective detection and screening of HER2 protein overexpression in H&E-stained tissue.

          Settings and Design:

          We use computational methods that reliably detect subtle morphological changes associated with the over-expression of mutation-specific proteins directly from H&E images.

          Subjects and Methods:

          We trained a classification pipeline to determine HER2 overexpression status of H&E stained whole slide images. Our training dataset was derived from a single hospital containing 26 (11 HER2+ and 15 HER2–) cases. We tested the classification pipeline on 26 (8 HER2+ and 18 HER2–) held-out cases from the same hospital and 45 independent cases (23 HER2+ and 22 HER2–) from the TCGA-BRCA cohort. The pipeline was composed of a stain separation module and three deep neural network modules in tandem for robustness and interpretability.
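The paper does not detail its stain separation module here; as an illustrative sketch only, classical Ruifrok–Johnston color deconvolution can unmix an H&E image into per-stain concentration maps before any network sees the data. The stain vectors and the `separate_stains` helper below are standard textbook values and a hypothetical implementation, not the authors' code:

```python
import numpy as np

# Ruifrok-Johnston H&E stain vectors in optical-density (OD) space;
# the third row is a residual channel that completes the basis.
STAIN_MATRIX = np.array([
    [0.650, 0.704, 0.286],   # hematoxylin
    [0.072, 0.990, 0.105],   # eosin
    [0.268, 0.570, 0.776],   # residual
])
STAIN_MATRIX /= np.linalg.norm(STAIN_MATRIX, axis=1, keepdims=True)

def separate_stains(rgb):
    """Unmix an RGB image (floats in [0, 1]) into per-stain concentrations."""
    od = -np.log10(np.clip(rgb, 1e-6, 1.0))                  # Beer-Lambert: OD per channel
    conc = od.reshape(-1, 3) @ np.linalg.inv(STAIN_MATRIX)   # solve OD = conc @ M
    return conc.reshape(rgb.shape)

# Synthetic patch "stained" with 0.8 OD units of pure hematoxylin:
h_pixel = 10.0 ** (-0.8 * STAIN_MATRIX[0])
patch = np.tile(h_pixel, (4, 4, 1))
conc = separate_stains(patch)
# conc[..., 0] recovers ~0.8 (hematoxylin); conc[..., 1] is ~0 (eosin)
```

In a multi-stage pipeline of this kind, the downstream networks would then consume the hematoxylin (nuclear) channel rather than raw RGB, which is one way such designs gain robustness to stain variation across hospitals.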

          Statistical Analysis Used:

We evaluate our trained model using the area under the receiver operating characteristic curve (AUC-ROC).


Results:

Our pipeline achieved an AUC of 0.82 (confidence interval [CI]: 0.65–0.98) on held-out cases and an AUC of 0.76 (CI: 0.61–0.89) on the independent dataset from TCGA. We also demonstrate the region-level correspondence of HER2 overexpression between a patient's IHC and H&E serial sections.
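The AUC and a case-level confidence interval of the kind reported above can be computed with a rank-based estimator and a percentile bootstrap. The functions below are a generic sketch (the paper does not state how its CIs were obtained), not the authors' evaluation code:

```python
import numpy as np

def auc(labels, scores):
    """AUC as the Mann-Whitney probability that a random positive case
    outscores a random negative one (ties count as half)."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[labels], scores[~labels]
    wins = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (wins + 0.5 * ties) / (len(pos) * len(neg))

def bootstrap_ci(labels, scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap CI for the AUC, resampling whole cases."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    scores = np.asarray(scores, dtype=float)
    stats = []
    while len(stats) < n_boot:
        idx = rng.integers(0, len(labels), len(labels))
        lb = labels[idx]
        if lb.all() or not lb.any():   # resample must contain both classes
            continue
        stats.append(auc(lb, scores[idx]))
    return np.percentile(stats, [100 * alpha / 2, 100 * (1 - alpha / 2)])
```

For example, `auc([0, 0, 1, 1], [0.1, 0.2, 0.8, 0.9])` returns 1.0 for perfectly separated scores. Resampling whole cases (rather than patches) is the appropriate unit here, since the reported metrics are per-case.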


Conclusions:

Our work strengthens the case for automatically quantifying the overexpression of mutation-specific proteins in H&E-stained digital pathology, and it highlights the importance of multi-stage machine learning pipelines for added robustness and interpretability.


                Author and article information

Journal of Pathology Informatics (J Pathol Inform)
Wolters Kluwer - Medknow (India)
24 July 2020; 11: 19
[1] Department of Electrical Engineering, IIT Bombay, Mumbai, Maharashtra, India
[2] Department of Computing Science, University of Alberta, Edmonton, Canada
[3] Alberta Machine Intelligence Institute, Edmonton, Canada
[4] Department of Pathology, Tata Memorial Centre - ACTREC, HBNI, Navi Mumbai, Maharashtra, India
[5] Department of Pathology, University of Illinois, Chicago, USA
                Author notes
Address for correspondence: Mr. Deepak Anand, Department of Electrical Engineering, IIT Bombay, Powai, Mumbai - 400 076, Maharashtra, India. E-mail: deepakanandece@gmail.com

                Equal Contribution

                Copyright: © 2020 Journal of Pathology Informatics

                This is an open access journal, and articles are distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 License, which allows others to remix, tweak, and build upon the work non-commercially, as long as appropriate credit is given and the new creations are licensed under the identical terms.

Received: 10 February 2020
Revised: 23 March 2020
Accepted: 17 May 2020
                Research Article

breast cancer, convolutional neural networks, histopathology, human epidermal growth factor receptor 2, immunohistochemistry, mutation detection, nucleus detection

