
      Use of Data-Biased Random Walks on Graphs for the Retrieval of Context-Specific Networks from Genomic Data



          Abstract

          Extracting network-based functional relationships within genomic datasets is an important challenge in the computational analysis of large-scale data. Although many methods, both public and commercial, have been developed, identifying the networks of interactions that are most relevant to a given input dataset remains an open problem. Here, we have leveraged the method of random walks on graphs as a powerful platform for scoring network components based on simultaneous assessment of the experimental data and local network connectivity. Using this method, NetWalk, we can calculate a distribution of Edge Flux values, one for each interaction in the network, which reflects the relevance of each interaction to the experimental data. We show that network-based analyses of genomic data are simpler and more accurate using NetWalk than with some of the currently employed methods. We also present NetWalk analysis of microarray gene expression data from MCF7 cells exposed to different doses of doxorubicin, which reveals a switch-like pattern in the p53-regulated network in cell cycle arrest and apoptosis. Our analyses demonstrate the use of NetWalk as a valuable tool for generating high-confidence hypotheses from high-content genomic data.

          Author Summary

          Analysis of high-content genomic data within the context of known networks of gene interactions can lead to a better understanding of the underlying biological processes. However, finding the networks of interactions that are most relevant to the given data is a challenging task. We present a random walk-based algorithm, NetWalk, which integrates genomic data with networks of interactions between genes to score the relevance of each interaction based on both the data values of the genes and their local network connectivity. This results in a distribution of Edge Flux values, which can be used for dynamic reconstruction of user-defined networks. Edge Flux values can be further subjected to statistical analyses such as clustering, allowing for direct numerical comparisons of context-specific networks between different conditions. To test NetWalk performance, we carried out microarray gene expression analysis of MCF7 cells subjected to lethal and sublethal doses of a DNA-damaging agent. We compared NetWalk to other network-based analysis methods and found that NetWalk was superior in identifying coherently altered sub-networks from the genomic data. Using NetWalk, we further identified p53-regulated networks that are differentially involved in cell cycle arrest and apoptosis, and we tested these findings experimentally.
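The idea summarized above, scoring each interaction from both the data values of its genes and the local connectivity, can be illustrated with a minimal sketch. This is not the published NetWalk implementation: the function name `edge_flux_sketch`, the uniform restart step, the power-iteration solver, and the toy graph are all assumptions made for illustration. The sketch biases each step of a random walk toward neighbours with larger data values, then reports a flux per edge as the visit probability of the source node times the transition probability along that edge.

```python
def edge_flux_sketch(edges, data, restart=0.05, tol=1e-12, max_iter=10000):
    """Illustrative data-biased random walk (hypothetical, not the NetWalk code).

    edges: iterable of directed (source, target) pairs; list both directions
           for an undirected interaction.
    data:  dict mapping every node to a positive data value.
    Returns a dict mapping each edge to its flux.
    """
    nodes = sorted(data)
    n = len(nodes)
    targets = {v: [] for v in nodes}
    for s, t in sorted(set(edges)):
        targets[s].append(t)

    # Transition probability s -> t is proportional to the data value of t.
    trans = {}
    for s, ts in targets.items():
        total = sum(data[t] for t in ts)
        for t in ts:
            trans[(s, t)] = data[t] / total

    # Power iteration; the restart step (an assumption) guarantees convergence.
    p = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        nxt = dict.fromkeys(nodes, 0.0)
        leak = 0.0  # probability mass redistributed uniformly
        for s in nodes:
            if targets[s]:
                leak += restart * p[s]
                for t in targets[s]:
                    nxt[t] += (1.0 - restart) * p[s] * trans[(s, t)]
            else:
                leak += p[s]  # dead end: all mass restarts
        for v in nodes:
            nxt[v] += leak / n
        delta = max(abs(nxt[v] - p[v]) for v in nodes)
        p = nxt
        if delta < tol:
            break

    # Flux of an edge: probability of being at the source and taking that edge.
    return {e: (1.0 - restart) * p[e[0]] * w for e, w in trans.items()}


# Toy example: path A - B - C, where C has the largest data value.
toy_edges = [("A", "B"), ("B", "A"), ("B", "C"), ("C", "B")]
toy_data = {"A": 1.0, "B": 1.0, "C": 3.0}
toy_flux = edge_flux_sketch(toy_edges, toy_data)
```

In this toy graph the edge from B to C receives more flux than the edge from B to A, because C carries the larger data value. Connectivity matters as well: flux on any edge also depends on how often the walk reaches its source node.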


                Author and article information

                [1] Department of Systems Biology, University of Texas M.D. Anderson Cancer Center, Houston, Texas, United States of America
                [2] Department of Cell Biology, University of Texas Southwestern Medical Center, Dallas, Texas, United States of America
                University of Illinois at Urbana-Champaign, United States of America
                Author notes

                Conceived and designed the experiments: KK PTR. Performed the experiments: KK. Analyzed the data: KK. Contributed reagents/materials/analysis tools: MAW. Wrote the paper: KK PTR.

                Role: Editor
                PLoS Comput Biol
                PLoS Computational Biology
                Public Library of Science (San Francisco, USA)
                August 2010
                19 August 2010
                Volume: 6
                Issue: 8
                Komurov et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
                Pages: 10
                Research Article
                Computational Biology/Genomics
                Computational Biology/Systems Biology

                Quantitative & Systems biology

