Inferring Regulatory Networks from Expression Data Using Tree-Based Methods

Abstract

One of the pressing open problems of computational systems biology is the elucidation of the topology of genetic regulatory networks (GRNs) using high-throughput genomic data, in particular microarray gene expression data. The Dialogue for Reverse Engineering Assessments and Methods (DREAM) challenge aims to evaluate the success of GRN inference algorithms on benchmarks of simulated data. In this article, we present GENIE3, a new algorithm for the inference of GRNs that was the best performer in the DREAM4 In Silico Multifactorial challenge. GENIE3 decomposes the prediction of a regulatory network between p genes into p different regression problems. In each of the regression problems, the expression pattern of one of the genes (the target gene) is predicted from the expression patterns of all the other genes (the input genes), using the tree-based ensemble methods Random Forests or Extra-Trees. The importance of an input gene in the prediction of the target gene expression pattern is taken as an indication of a putative regulatory link. Putative regulatory links are then aggregated over all genes to provide a ranking of interactions from which the whole network is reconstructed. In addition to performing well on the DREAM4 In Silico Multifactorial challenge simulated data, we show that GENIE3 compares favorably with existing algorithms in deciphering the genetic regulatory network of Escherichia coli. GENIE3 makes no assumption about the nature of gene regulation, can deal with combinatorial and non-linear interactions, produces directed GRNs, and is fast and scalable. In conclusion, we propose a new algorithm for GRN inference that performs well on both synthetic and real gene expression data. The algorithm, based on feature selection with tree-based ensemble methods, is simple and generic, making it adaptable to other types of genomic data and interactions.
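The decomposition described in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' implementation: it uses a simplified Extra-Trees-style ensemble written with the Python standard library only, and the function names (`sse`, `grow`, `genie3_sketch`), the choice of the square root of the number of input genes as candidates per split, the minimum node size, the unit-variance scaling of each target gene, and the toy data are all assumptions made for illustration.

```python
import random

def sse(ys):
    """Sum of squared deviations from the mean (n times the variance)."""
    if not ys:
        return 0.0
    m = sum(ys) / len(ys)
    return sum((v - m) ** 2 for v in ys)

def grow(X, y, rows, feats, k, rng, imp, min_node=4):
    """Grow one randomized tree; add each split's SSE reduction to imp."""
    if len(rows) < min_node:
        return
    parent = sse([y[r] for r in rows])
    best = None
    # Try k random input genes, each with one random cut point.
    for f in rng.sample(feats, min(k, len(feats))):
        lo = min(X[r][f] for r in rows)
        hi = max(X[r][f] for r in rows)
        if lo == hi:
            continue
        t = rng.uniform(lo, hi)
        left = [r for r in rows if X[r][f] < t]
        right = [r for r in rows if X[r][f] >= t]
        if not left or not right:
            continue
        gain = parent - sse([y[r] for r in left]) - sse([y[r] for r in right])
        if best is None or gain > best[0]:
            best = (gain, f, left, right)
    if best is None:
        return
    gain, f, left, right = best
    imp[f] += gain  # importance = accumulated variance reduction
    grow(X, y, left, feats, k, rng, imp, min_node)
    grow(X, y, right, feats, k, rng, imp, min_node)

def genie3_sketch(expr, n_trees=50, seed=0):
    """expr: list of samples, each a list of p expression values.
    Returns (score, regulator, target) triples, highest score first."""
    rng = random.Random(seed)
    n, p = len(expr), len(expr[0])
    links = []
    for j in range(p):  # one regression problem per target gene
        col = [row[j] for row in expr]
        mean = sum(col) / n
        sd = (sum((v - mean) ** 2 for v in col) / n) ** 0.5 or 1.0
        # Assumption: scale the target to unit variance so that scores
        # from different regression problems are comparable.
        y = [(v - mean) / sd for v in col]
        feats = [i for i in range(p) if i != j]  # all other genes
        k = max(1, round(len(feats) ** 0.5))
        imp = [0.0] * p
        for _ in range(n_trees):
            grow(expr, y, list(range(n)), feats, k, rng, imp)
        links += [(imp[i] / (n_trees * n), i, j) for i in feats]
    return sorted(links, reverse=True)

# Toy data (hypothetical): gene 1 depends non-linearly on gene 0,
# gene 2 is independent noise.
demo_rng = random.Random(1)
toy = []
for _ in range(80):
    g0 = demo_rng.uniform(-1, 1)
    g1 = g0 ** 2 + demo_rng.gauss(0, 0.05)
    g2 = demo_rng.uniform(-1, 1)
    toy.append([g0, g1, g2])

ranking = genie3_sketch(toy)
for score, regulator, target in ranking:
    print(f"gene {regulator} -> gene {target}: {score:.3f}")
```

Each score is the fraction of the target gene's variance attributed to splits on the candidate regulator, averaged over the trees. Because every target is regressed separately on all other genes, the ranking is directed: the link from gene 0 to gene 1 and the link from gene 1 to gene 0 receive separate scores.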

Most cited references 25

Cluster analysis and display of genome-wide expression patterns.

P. T. Spellman, P. O. Brown, D Botstein … (1998)

A system of cluster analysis for genome-wide expression data from DNA microarray hybridization is described that uses standard statistical algorithms to arrange genes according to similarity in pattern of gene expression. The output is displayed graphically, conveying the clustering and the underlying expression data simultaneously in a form intuitive for biologists. We have found in the budding yeast Saccharomyces cerevisiae that clustering gene expression data groups together efficiently genes of known similar function, and we find a similar tendency in human data. Thus patterns seen in genome-wide expression experiments can be interpreted as indications of the status of cellular processes. Also, coexpression of genes of known function with poorly characterized or novel genes may provide a simple means of gaining leads to the functions of many genes for which information is not available currently.

Cited 1865 times

Inferring genetic networks and identifying compound mode of action via expression profiling.

Timothy S Gardner, Diego di Bernardo, David Lorenz … (2003)

The complexity of cellular gene, protein, and metabolite networks can hinder attempts to elucidate their structure and function. To address this problem, we used systematic transcriptional perturbations to construct a first-order model of regulatory interactions in a nine-gene subnetwork of the SOS pathway in Escherichia coli. The model correctly identified the major regulatory genes and the transcriptional targets of mitomycin C activity in the subnetwork. This approach, which is experimentally and computationally scalable, provides a framework for elucidating the functional properties of genetic networks and identifying molecular targets of pharmacological compounds.

Cited 311 times

Inferring cellular networks using probabilistic graphical models.

Nir Friedman (2004)

High-throughput genome-wide molecular assays, which probe cellular networks from different perspectives, have become central to molecular biology. Probabilistic graphical models are useful for extracting meaningful biological insights from the resulting data sets. These models provide a concise representation of complex cellular networks by composing simpler submodels. Procedures based on well-understood principles for inferring such models from data facilitate a model-based methodology for analysis and discovery. This methodology and its capabilities are illustrated by several recent applications to gene expression data.

Cited 309 times

Author and article information

Contributors

: Role: Editor

Journal

Journal ID (nlm-ta): PLoS One

Journal ID (publisher-id): plos

Journal ID (pmc): plosone

Title: PLoS ONE

Publisher: Public Library of Science (San Francisco, USA)

ISSN (Electronic): 1932-6203

Publication date Collection: 2010

Publication date (Electronic): 28 September 2010

Volume: 5

Issue: 9

Electronic Location Identifier: e12776

Affiliations

[1] Department of Electrical Engineering and Computer Science, Systems and Modeling, University of Liège, Liège, Belgium

[2] GIGA-Research, Bioinformatics and Modeling, University of Liège, Liège, Belgium

Center for Genomic Regulation, Spain

Author notes

* E-mail: vahuynh@ulg.ac.be

Conceived and designed the experiments: VAHT PG. Performed the experiments: VAHT. Analyzed the data: VAHT AI LW PG. Wrote the paper: VAHT AI LW PG.

Article

Publisher ID: 10-PONE-RA-18635R1

DOI: 10.1371/journal.pone.0012776

PMC ID: 2946910

PubMed ID: 20927193

SO-VID: 8b4d93dc-167f-4fd6-82f4-72a06ad73bb5

Copyright © Huynh-Thu et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

History

Date received: 5 May 2010

Date accepted: 9 August 2010

Page count

Pages: 10

Cited by 594
