      Simultaneous Identification of Multiple Driver Pathways in Cancer


          Abstract

          Distinguishing the somatic mutations responsible for cancer (driver mutations) from random, passenger mutations is a key challenge in cancer genomics. Driver mutations generally target cellular signaling and regulatory pathways consisting of multiple genes. This heterogeneity complicates the identification of driver mutations by their recurrence across samples, as different combinations of mutations in driver pathways are observed in different samples. We introduce the Multi-Dendrix algorithm for the simultaneous identification of multiple driver pathways de novo in somatic mutation data from a cohort of cancer samples. The algorithm relies on two combinatorial properties of mutations in a driver pathway: high coverage and mutual exclusivity. We derive an integer linear program that finds sets of mutations exhibiting these properties. We apply Multi-Dendrix to somatic mutations from glioblastoma, breast cancer, and lung cancer samples. Multi-Dendrix identifies sets of mutations in genes that overlap with known pathways (including the Rb, p53, PI(3)K, and cell cycle pathways) as well as novel sets of mutually exclusive mutations, including mutations in several transcription factors or other genes involved in transcriptional regulation. These sets are discovered directly from mutation data with no prior knowledge of pathways or gene interactions. We show that Multi-Dendrix outperforms other algorithms for identifying combinations of mutations and is also orders of magnitude faster on genome-scale data. Software available at: http://compbio.cs.brown.edu/software.
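
          The two properties named in the abstract, high coverage and mutual exclusivity, can be illustrated with a small scoring sketch. It assumes the weight W(M) = 2|Γ(M)| − Σ_g |Γ(g)| associated with the earlier Dendrix method, where Γ(g) is the set of samples mutated in gene g and Γ(M) is the union over the genes in M; the abstract itself does not state this formula. The exhaustive search below is a toy stand-in for the integer linear program, not the authors' implementation, and the gene and sample names are hypothetical.

```python
from itertools import combinations


def coverage_exclusivity_weight(gene_set, mutations):
    """Score a gene set as 2*|covered samples| - sum of per-gene sample counts.

    The score rises with the number of samples covered and falls each time a
    sample carries mutations in more than one gene of the set.
    """
    covered = set()
    total = 0
    for gene in gene_set:
        samples = mutations.get(gene, set())
        covered |= samples
        total += len(samples)
    return 2 * len(covered) - total


def best_gene_set(mutations, k):
    """Exhaustively score every k-gene subset (feasible only at toy scale)."""
    genes = sorted(mutations)
    return max(
        combinations(genes, k),
        key=lambda m: coverage_exclusivity_weight(m, mutations),
    )


# Hypothetical toy data: gene -> set of samples in which it is mutated.
toy = {
    "A": {"s1", "s2"},
    "B": {"s3", "s4"},
    "C": {"s1", "s2", "s3"},
    "D": {"s3"},
}

if __name__ == "__main__":
    for pair in combinations(sorted(toy), 2):
        print(pair, coverage_exclusivity_weight(pair, toy))
    print("best pair:", best_gene_set(toy, 2))
```

          In this toy data, genes A and B cover four samples with no overlap, so the pair scores higher than any pair involving C, which shares samples with A, B, and D. Enumerating subsets grows combinatorially with the number of genes, which is one reason an optimization formulation such as an integer linear program is attractive at genome scale.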

          Author Summary

          Cancer is a disease driven largely by the accumulation of somatic mutations during the lifetime of an individual. The declining costs of genome sequencing now permit the measurement of somatic mutations in hundreds of cancer genomes. A key challenge is to distinguish driver mutations responsible for cancer from random passenger mutations. This challenge is compounded by the observation that different combinations of driver mutations are observed in different patients with the same cancer type. One reason for this heterogeneity is that driver mutations target signaling and regulatory pathways which have multiple points of failure. We introduce an algorithm, Multi-Dendrix, to find these pathways solely from patterns of mutual exclusivity between mutations across a cohort of patients. Unlike earlier approaches, we simultaneously find multiple pathways, an essential feature for analyzing cancer genomes where multiple pathways are typically perturbed. We apply our algorithm to mutation data from hundreds of glioblastoma, breast cancer, and lung adenocarcinoma patients. We identify sets of interacting genes that overlap known pathways, and gene sets containing subtype-specific mutations. These results show that multiple cancer pathways can be identified directly from patterns in mutation data, and provide an approach to analyze the ever-growing cancer mutation datasets.


                Author and article information

                Contributors
                Role: Editor
                Journal
                PLoS Computational Biology (PLoS Comput Biol)
                Public Library of Science (San Francisco, USA)
                ISSN: 1553-734X, 1553-7358
                Published: 23 May 2013
                Volume 9, Issue 5, e1003054
                Affiliations
                [1] Department of Computer Science and Center for Computational Molecular Biology, Brown University, Providence, Rhode Island, United States of America
                [2] Blavatnik School of Computer Science, Tel-Aviv University, Tel-Aviv, Israel
                ETH Zurich, Switzerland
                Author notes

                The authors have declared that no competing interests exist.

                Conceived and designed the experiments: RS BJR. Performed the experiments: MDML. Analyzed the data: MDML RS BJR. Wrote the paper: MDML RS BJR. Implemented the algorithm: MDML DB.

                Article
                Manuscript number: PCOMPBIOL-D-12-01395
                DOI: 10.1371/journal.pcbi.1003054
                PMCID: PMC3662702
                PMID: 23717195
                cc30401c-511c-46de-91e7-3277b49576da
                Copyright © 2013

                This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

                History
                Received: 29 August 2012
                Accepted: 26 March 2013
                Page count
                Pages: 15
                Funding
                This work is supported by NSF grant IIS-1016648. BJR is supported by a Career Award at the Scientific Interface from the Burroughs Wellcome Fund, an Alfred P. Sloan Research Fellowship, and an NSF CAREER Award (CCF-1053753). RS was supported by a research grant from the Israel Science Foundation (grant no. 241/11). MDML was supported by NSF GRFP DGE 0228243. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
                Categories
                Research Article
                Biology
                Computational Biology
                Genomics
                Computer Science
                Algorithms

                Quantitative & Systems biology
