      Rediscover: an R package to identify mutually exclusive mutations

      Bioinformatics
      Oxford University Press (OUP)


          Abstract

          Motivation

          Discover is an algorithm developed to identify mutually exclusive genomic events. Its main contribution is a statistical analysis based on the Poisson–Binomial (PB) distribution that takes into account the mutation rates of genes and samples. Discover is very effective at identifying mutually exclusive mutations, but at the expense of speed on large datasets: the PB is computationally costly to estimate, and checking all the potential mutually exclusive alterations requires millions of tests.
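
As a rough illustration of why a PB-based test is costly, the following minimal Python sketch computes an exact PB distribution by dynamic programming and uses it for a one-sided exclusivity test. It is not the Discover or Rediscover code. The function names, the choice of test statistic (the number of samples in which both genes are altered) and the per-sample alteration probabilities passed as inputs are all assumptions made for this example.

```python
from typing import List, Sequence


def poisson_binomial_pmf(probs: Sequence[float]) -> List[float]:
    """Exact PMF of the number of successes among independent Bernoulli
    trials with unequal success probabilities (dynamic programming)."""
    pmf = [1.0]  # with zero trials, P(X = 0) = 1
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)  # trial fails: count stays at k
            nxt[k + 1] += mass * p      # trial succeeds: count becomes k + 1
        pmf = nxt
    return pmf


def exclusivity_p_value(p_a: Sequence[float],
                        p_b: Sequence[float],
                        observed_overlap: int) -> float:
    """Hypothetical helper: P(overlap <= observed) if genes A and B were
    altered independently in each sample.

    p_a[i] and p_b[i] are assumed per-sample alteration probabilities for
    the two genes. A small p-value suggests fewer co-occurrences than
    expected, i.e. mutual exclusivity.
    """
    joint = [a * b for a, b in zip(p_a, p_b)]  # P(both altered in sample i)
    pmf = poisson_binomial_pmf(joint)
    return sum(pmf[: observed_overlap + 1])


if __name__ == "__main__":
    # Two samples, each gene altered with probability 0.5 in each sample.
    print(exclusivity_p_value([0.5, 0.5], [0.5, 0.5], observed_overlap=0))
```

The inner loop makes one exact test quadratic in the number of samples, which becomes expensive when the test is repeated for millions of gene pairs.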

          Results

          We have developed a new version of the package, called Rediscover, that implements exact and approximate computations of the PB. Rediscover's exact implementation is slightly faster than Discover for large and medium-sized datasets. For these datasets the approximation is 100–1000 times faster, making it possible to get results in less than a minute on a standard desktop. Rediscover also has a smaller memory footprint. The new package is available on CRAN and provides functions to integrate its usage with other R packages such as maftools and TCGAbiolinks.
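
The abstract does not say which approximation Rediscover uses. Purely to illustrate how an approximate PB computation can trade accuracy for speed, the sketch below uses a normal approximation with continuity correction, which needs only the mean and variance of the PB. The function name is hypothetical and the technique is a stand-in, not the package's method.

```python
import math
from typing import Sequence


def pb_cdf_normal_approx(probs: Sequence[float], k: int) -> float:
    """Approximate P(X <= k) for a Poisson-Binomial variable X using a
    normal approximation with continuity correction (illustrative only)."""
    mu = sum(probs)                          # mean of the PB
    var = sum(p * (1.0 - p) for p in probs)  # variance of the PB
    if var == 0.0:
        # Degenerate case: every trial is certain, so X equals mu exactly.
        return 1.0 if k >= mu else 0.0
    z = (k + 0.5 - mu) / math.sqrt(var)
    # Standard normal CDF expressed through the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


if __name__ == "__main__":
    # 100 samples with equal probability 0.5 (an ordinary binomial case).
    print(pb_cdf_normal_approx([0.5] * 100, 50))
```

Each evaluation takes time linear in the number of samples, since it needs only two sums. An approximation of this kind is least reliable in the far tails and for small sample counts, so an exact computation remains a useful reference.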

          Availability and implementation

          Rediscover is available at CRAN (https://cran.r-project.org/web/packages/Rediscover/index.html).

          Supplementary information

          Supplementary data are available at Bioinformatics online.

          Most cited references (8)

          Maftools: efficient and comprehensive analysis of somatic variants in cancer

          Numerous large-scale genomic studies of matched tumor-normal samples have established the somatic landscapes of most cancer types. However, the downstream analysis of data from somatic mutations entails a number of computational and statistical approaches, requiring usage of independent software and numerous tools. Here, we describe an R Bioconductor package, Maftools, which offers a multitude of analysis and visualization modules that are commonly used in cancer genomic studies, including driver gene identification, pathway, signature, enrichment, and association analyses. Maftools only requires somatic variants in Mutation Annotation Format (MAF) and is independent of larger alignment files. With the implementation of well-established statistical and computational methods, Maftools facilitates data-driven research and comparative analysis to discover novel results from publicly available data sets. In the present study, using three of the well-annotated cohorts from The Cancer Genome Atlas (TCGA), we describe the application of Maftools to reproduce known results. More importantly, we show that Maftools can also be used to uncover novel findings through integrative analysis.

            COSMIC: the Catalogue Of Somatic Mutations In Cancer

            COSMIC, the Catalogue Of Somatic Mutations In Cancer (https://cancer.sanger.ac.uk) is the most detailed and comprehensive resource for exploring the effect of somatic mutations in human cancer. The latest release, COSMIC v86 (August 2018), includes almost 6 million coding mutations across 1.4 million tumour samples, curated from over 26 000 publications. In addition to coding mutations, COSMIC covers all the genetic mechanisms by which somatic mutations promote cancer, including non-coding mutations, gene fusions, copy-number variants and drug-resistance mutations. COSMIC is primarily hand-curated, ensuring quality, accuracy and descriptive data capture. Building on our manual curation processes, we are introducing new initiatives that allow us to prioritize key genes and diseases, and to react more quickly and comprehensively to new findings in the literature. Alongside improvements to the public website and data-download systems, new functionality in COSMIC-3D allows exploration of mutations within three-dimensional protein structures, their protein structural and functional impacts, and implications for druggability. In parallel with COSMIC’s deep and broad variant coverage, the Cancer Gene Census (CGC) describes a curated catalogue of genes driving every form of human cancer. Currently describing 719 genes, the CGC has recently introduced functional descriptions of how each gene drives disease, summarized into the 10 cancer Hallmarks.

              New functionalities in the TCGAbiolinks package for the study and integration of cancer data from GDC and GTEx

              The advent of Next-Generation Sequencing (NGS) technologies has opened new perspectives in deciphering the genetic mechanisms underlying complex diseases. Nowadays, the amount of genomic data is massive and substantial efforts and new tools are required to unveil the information hidden in the data. The Genomic Data Commons (GDC) Data Portal is a platform that contains different genomic studies including the ones from The Cancer Genome Atlas (TCGA) and the Therapeutically Applicable Research to Generate Effective Treatments (TARGET) initiatives, accounting for more than 40 tumor types originating from nearly 30000 patients. Such platforms, although very attractive, must make sure the stored data are easily accessible and adequately harmonized. Moreover, they have the primary focus on the data storage in a unique place, and they do not provide a comprehensive toolkit for analyses and interpretation of the data. To fulfill this urgent need, comprehensive but easily accessible computational methods for integrative analyses of genomic data that do not renounce a robust statistical and theoretical framework are required. In this context, the R/Bioconductor package TCGAbiolinks was developed, offering a variety of bioinformatics functionalities. Here we introduce new features and enhancements of TCGAbiolinks in terms of i) more accurate and flexible pipelines for differential expression analyses, ii) different methods for tumor purity estimation and filtering, iii) integration of normal samples from other platforms, and iv) support for other genomics datasets, exemplified here by the TARGET data. Evidence has shown that accounting for tumor purity is essential in the study of tumorigenesis, as these factors promote confounding behavior regarding differential expression analysis. With this in mind, we implemented these filtering procedures in TCGAbiolinks. Moreover, a limitation of some of the TCGA datasets is the unavailability or paucity of corresponding normal samples. We thus integrated into TCGAbiolinks the possibility to use normal samples from the Genotype-Tissue Expression (GTEx) project, which is another large-scale repository cataloging gene expression from healthy individuals. The new functionalities are available in the TCGAbiolinks version 2.8 and higher released in Bioconductor version 3.7.

                Author and article information

                Journal: Bioinformatics
                Publisher: Oxford University Press (OUP)
                ISSN: 1367-4803, 1460-2059
                Dates: February 01 2022; January 12 2022; October 19 2021
                Volume: 38
                Issue: 3
                Pages: 844-845
                Affiliations
                [1] Department of Biomedical Engineering and Sciences, TECNUN, University of Navarra, San Sebastian, Spain
                Article
                DOI: 10.1093/bioinformatics/btab709
                PMID: 34664620
                © 2021

                https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model
