      SMuRF: a novel tool to identify regulatory elements enriched for somatic point mutations




          Single Nucleotide Variants (SNVs), including somatic point mutations and Single Nucleotide Polymorphisms (SNPs), in noncoding cis-regulatory elements (CREs) can affect gene regulation and lead to disease development. Several approaches have been developed to identify highly mutated regions, but these do not take into account the specific genomic context, and thus likelihood of mutation, of CREs.


          Here, we present SMuRF (Significantly Mutated Region Finder), a user-friendly command-line tool to identify these significantly mutated regions from user-defined genomic intervals and SNVs. We demonstrate this using publicly available datasets in which SMuRF identifies 72 significantly mutated CREs in liver cancer, including known mutated gene promoters as well as previously unreported regions.
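The general idea of testing user-defined intervals for an excess of SNVs can be sketched as follows. This is a minimal illustration under stated assumptions, not SMuRF's actual implementation: the function names, the pooled per-base background rate, the one-sided binomial test, and the Benjamini-Hochberg correction are all choices made for this example, and the pooled background is a simplification that ignores the region-specific genomic context the abstract emphasizes.

```python
from math import exp, lgamma, log


def binom_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), with each term computed in log space."""
    if k <= 0:
        return 1.0
    if k > n or p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    total = 0.0
    for i in range(k, n + 1):
        log_pmf = (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
                   + i * log(p) + (n - i) * log(1.0 - p))
        total += exp(log_pmf)
    return min(total, 1.0)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each q-value is monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        qvalues[idx] = running_min
    return qvalues


def find_mutated_regions(regions, snvs, alpha=0.05):
    """Hypothetical helper: flag regions with more SNVs than a pooled rate predicts.

    regions: list of (chrom, start, end) half-open intervals.
    snvs:    list of (chrom, position) pairs.
    """
    counts = [sum(1 for c, pos in snvs if c == chrom and start <= pos < end)
              for chrom, start, end in regions]
    total_bp = sum(end - start for _, start, end in regions)
    # Assumption: one background rate pooled over all supplied regions.
    rate = sum(counts) / total_bp
    pvalues = [binom_upper_tail(k, end - start, rate)
               for k, (_, start, end) in zip(counts, regions)]
    qvalues = benjamini_hochberg(pvalues)
    return [region for region, q in zip(regions, qvalues) if q <= alpha]
```

Calling `find_mutated_regions(regions, snvs)` with a list of intervals and a list of SNV positions returns the intervals whose adjusted p-value is at most `alpha`. A real analysis would also need to account for the number of samples and for local variation in mutation rate.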


          SMuRF is a helpful tool to allow the simple identification of significantly mutated regulatory elements. It is open-source and freely available on GitHub (

          Electronic supplementary material

          The online version of this article (10.1186/s12859-018-2501-y) contains supplementary material, which is available to authorized users.



                Author and article information

                BMC Bioinformatics
                BioMed Central (London)
                26 November 2018
                Volume: 19
                [1] ISNI 0000 0004 0474 0428, GRID grid.231844.8, Princess Margaret Cancer Centre, The MaRS Center, University Health Network, 101 College Street, Toronto, ON M5G 1L7, Canada
                [2] ISNI 0000 0001 2157 2938, GRID grid.17063.33, Department of Medical Biophysics, University of Toronto, Toronto, ON, Canada
                [3] ISNI 0000 0004 0626 690X, GRID grid.419890.d, Ontario Institute for Cancer Research, Toronto, ON, Canada
                © The Author(s). 2018

                Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated.

                Funded by: Stand Up To Cancer (CDN)
                Award ID: SU2C-AACR-DT-19-15
                Funded by: Prostate Cancer Canada
                Award ID: RS2014-04
                Funded by: Institute of Cancer Research
                Award ID: MFE 338954

