Altered networks of gene regulation underlie many complex conditions, including cancer. Inferring gene regulatory networks from high-throughput microarray expression data is a fundamental but challenging task in computational systems biology and its translation to genomic medicine. Although diverse computational and statistical approaches have been brought to bear on the gene regulatory network inference problem, their relative strengths and disadvantages remain poorly understood, largely because comparative analyses usually consider only small subsets of methods, use only synthetic data, and/or fail to adopt a common measure of inference quality.
We report a comprehensive comparative evaluation of nine state-of-the art gene regulatory network inference methods encompassing the main algorithmic approaches (mutual information, correlation, partial correlation, random forests, support vector machines) using 38 simulated datasets and empirical serous papillary ovarian adenocarcinoma expression-microarray data. We then apply the best-performing method to infer normal and cancer networks. We assess the druggability of the proteins encoded by our predicted target genes using the CancerResource and PharmGKB webtools and databases.
We observe large differences in the accuracy with which these methods predict the underlying gene regulatory network depending on features of the data, network size, topology, experiment type, and parameter settings. Applying the best-performing method (the supervised method SIRENE) to the serous papillary ovarian adenocarcinoma dataset, we infer and rank regulatory interactions, some previously reported and others novel. For selected novel interactions we propose testable mechanistic models linking gene regulation to cancer. Using network analysis and visualization, we uncover cross-regulation of angiogenesis-specific genes through three key transcription factors in normal and cancer conditions. Druggabilty analysis of proteins encoded by the 10 highest-confidence target genes, and by 15 genes with differential regulation in normal and cancer conditions, reveals 75% to be potential drug targets.
Our study represents a concrete application of gene regulatory network inference to ovarian cancer, demonstrating the complete cycle of computational systems biology research, from genome-scale data analysis via network inference, evaluation of methods, to the generation of novel testable hypotheses, their prioritization for experimental validation, and discovery of potential drug targets.