      Substantial batch effects in TCGA exome sequences undermine pan-cancer analysis of germline variants

          Abstract

          Background

          In recent years, research on germline variants that predispose to cancer has become a prominent field. Identifying somatic mutations depends on a reliable mapping of the patient's germline variants. In addition, the frequencies of germline variants in healthy individuals and in cancer patients are the basis for seeking candidate cancer predisposition genes. The Cancer Genome Atlas (TCGA) is one of the main sources of such data, providing a diverse collection of molecular data, including deep sequencing, for more than 30 types of cancer from > 10,000 patients.

          Methods

          We hypothesized that whole-exome sequences from blood samples of cancer patients should not show systematic differences among cancer types. To test this hypothesis, we analyzed common and rare germline variants across six cancer types, covering 2241 samples from TCGA. Our analysis accounted for variables inherent in the data, including the different variant calling protocols, sequencing platforms, and ethnicity.
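One way to picture "accounting for inherent variables" is to compare cancer types only within groups of samples that share the same covariates. The sketch below is illustrative and is not the authors' pipeline: the record fields (`cancer_type`, `caller`, `platform`, `ancestry`, `n_variants`), the function name, and the toy values are all assumptions made for this example.

```python
from collections import defaultdict
from statistics import mean

def stratified_means(samples, strata_keys=("caller", "platform", "ancestry")):
    """Mean variant count per cancer type within each covariate stratum.

    `samples` is a list of dicts; all field names are hypothetical.
    """
    groups = defaultdict(lambda: defaultdict(list))
    for s in samples:
        stratum = tuple(s[k] for k in strata_keys)
        groups[stratum][s["cancer_type"]].append(s["n_variants"])
    return {
        stratum: {ctype: mean(counts) for ctype, counts in by_type.items()}
        for stratum, by_type in groups.items()
    }

# Toy, made-up records (not TCGA data).
toy_samples = [
    {"cancer_type": "type_1", "caller": "c1", "platform": "p1", "ancestry": "x", "n_variants": 100},
    {"cancer_type": "type_1", "caller": "c1", "platform": "p1", "ancestry": "x", "n_variants": 120},
    {"cancer_type": "type_2", "caller": "c1", "platform": "p1", "ancestry": "x", "n_variants": 150},
    {"cancer_type": "type_2", "caller": "c2", "platform": "p1", "ancestry": "x", "n_variants": 90},
]

toy_strata = stratified_means(toy_samples)
```

Under the hypothesis stated above, the per-cancer-type means inside a single stratum should be similar; a large gap that persists within strata points to a remaining source of variation, such as the sequencing center.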

          Results

          We report substantial batch effects in germline variants associated with cancer types, and we attribute them to the specific sequencing centers that produced the data. Specifically, we measured 30% variability in the number of reported germline variants per sample across sequencing centers. The batch effect is also evident in nucleotide composition and variant frequencies. Importantly, the batch effect causes substantial differences in germline variant distribution patterns across numerous genes, including prominent cancer predisposition genes such as BRCA1, RET, MAX, and KRAS. For most known cancer predisposition genes, we found a distinct batch-dependent difference in germline variants.
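The abstract does not define how the variability across centers was computed, so the following is only a minimal sketch of one possible check. It assumes variability is measured as the gap between the highest and lowest per-center mean, divided by the grand mean, and it uses a label-permutation test to ask whether such a gap could arise by chance. The function names, center labels, and counts are hypothetical.

```python
import random
from statistics import mean

def relative_spread(counts_by_center):
    """(max center mean - min center mean) / grand mean over all samples."""
    center_means = [mean(v) for v in counts_by_center.values()]
    all_counts = [x for v in counts_by_center.values() for x in v]
    return (max(center_means) - min(center_means)) / mean(all_counts)

def permutation_p_value(counts_by_center, n_perm=2000, seed=0):
    """Share of random center relabelings with a spread at least as large."""
    rng = random.Random(seed)
    observed = relative_spread(counts_by_center)
    labels = [c for c, v in counts_by_center.items() for _ in v]
    values = [x for v in counts_by_center.values() for x in v]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(values)  # break any link between center and count
        shuffled = {}
        for center, x in zip(labels, values):
            shuffled.setdefault(center, []).append(x)
        if relative_spread(shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)  # add-one correction avoids p = 0

# Toy, made-up per-sample variant counts (not TCGA data).
toy_counts = {
    "center_a": [98, 100, 102],
    "center_b": [128, 130, 132],
    "center_c": [108, 110, 112],
}

toy_spread = relative_spread(toy_counts)
toy_p = permutation_p_value(toy_counts)
```

For this made-up data the spread is about 0.26 of the grand mean. A small permutation p-value indicates that the center label carries information about the variant count, which is the signature of a batch effect.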

          Conclusion

          TCGA germline data are subject to strong batch effects, with substantial variability among TCGA sequencing centers. We argue that these batch effects are consequential for numerous TCGA pan-cancer studies. In particular, they may compromise the reliability of such studies and their power to detect new cancer predisposition genes. Furthermore, interpretations of pan-cancer analyses should be revisited with attention to the source of the genomic data, after accounting for the reported batch effects.

          Electronic supplementary material

          The online version of this article (10.1186/s12885-019-5994-5) contains supplementary material, which is available to authorized users.

                Author and article information

                Contributors
                roni.rasnic@mail.huji.ac.il
                nadav.brandes@mail.huji.ac.il
                or.zuk@mail.huji.ac.il
                michall@mail.huji.ac.il
                Journal
                BMC Cancer
                BioMed Central (London)
                ISSN: 1471-2407
                Published: 7 August 2019
                Volume: 19
                Article number: 783
                Affiliations
                [1] The Rachel and Selim Benin School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel (ISNI 0000 0004 1937 0538, GRID grid.9619.7)
                [2] Department of Statistics, The Hebrew University of Jerusalem, Jerusalem, Israel (ISNI 0000 0004 1937 0538, GRID grid.9619.7)
                [3] Department of Biological Chemistry, Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel (ISNI 0000 0004 1937 0538, GRID grid.9619.7)
                Author information
                http://orcid.org/0000-0002-1200-6325
                Article
                Publisher ID: 5994
                DOI: 10.1186/s12885-019-5994-5
                PMC ID: 6686424
                PubMed ID: 31391007
                aea8decd-c609-4d24-8763-ce1382f9b3de
                © The Author(s). 2019

                Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

                History
                Received: 18 November 2018
                Accepted: 30 July 2019
                Categories
                Research Article

                Subject: Oncology & Radiotherapy
                Keywords: cancer predisposition, TCGA, germline variants, batch effect, somatic mutations, personalized medicine, next generation sequencing, BRCA1, genomic sequencing centers
