Functional genomics screens using multi-parametric assays are powerful approaches for identifying genes involved in particular cellular processes. However, they suffer from problems such as noise and often provide little insight into molecular mechanisms. A bottleneck in addressing these issues is the lack of computational methods for the systematic integration of multi-parametric phenotypic datasets with molecular interactions. Here, we present Integrative Multi Profile Analysis of Cellular Traits (IMPACT). The main goal of IMPACT is to identify the most consistent phenotypic profile among interacting genes. The approach uses two types of external information: sets of related genes (IMPACT-sets) and network information (IMPACT-modules). Based on the notion that interacting genes are more likely to be involved in similar functions than non-interacting genes, this information is used as a prior to select phenotypic profiles that are similar among interacting genes. IMPACT-sets selects the most frequent profile among a set of related genes. IMPACT-modules identifies sub-networks containing genes with similar phenotypic profiles. The statistical significance of these selections is subsequently quantified via permutations of the data. IMPACT (1) handles multiple profiles per gene, (2) rescues genes with weak phenotypes, and (3) accounts for multiple biases, e.g., those caused by the network topology. Application to a genome-wide RNAi screen on endocytosis showed that IMPACT improved the recovery of known endocytosis-related genes, decreased off-target effects, and detected consistent phenotypes. These findings were confirmed by rescreening 468 genes. Additionally, we validated an unexpected influence of the IGF-receptor on EGF-endocytosis. IMPACT facilitates the selection of high-quality phenotypic profiles using different types of independent information, thereby supporting the molecular interpretation of functional screens.
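The IMPACT-sets idea described above (pick the most frequent profile within a gene set, then assess it by permutation) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The discrete profile encoding (tuples of -1/0/1 per parameter), the function names, the null model (resampling random gene sets of equal size from the screen), and the toy data are all assumptions made for illustration only.

```python
import random
from collections import Counter

def most_frequent_profile(gene_set, profiles):
    """Return (profile, count) for the profile shared by the most genes.

    `profiles` maps gene -> list of hashable profiles; a gene may carry
    several profiles and is counted at most once per profile.
    """
    counts = Counter()
    for gene in gene_set:
        counts.update(set(profiles.get(gene, ())))
    if not counts:
        return None, 0
    # Highest count wins; ties are broken by the profile value itself.
    return max(counts.items(), key=lambda kv: (kv[1], kv[0]))

def permutation_p_value(gene_set, profiles, n_perm=1000, seed=0):
    """Empirical p-value for the top profile count (assumed null model:
    random gene sets of the same size drawn from all screened genes)."""
    rng = random.Random(seed)
    universe = sorted(profiles)
    members = [g for g in gene_set if g in profiles]
    _, observed = most_frequent_profile(members, profiles)
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(universe, len(members))
        _, null_count = most_frequent_profile(sample, profiles)
        if null_count >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)

# Hypothetical toy screen: gene -> profiles over two measured parameters.
toy_profiles = {
    "A": [(1, -1), (0, 1)],
    "B": [(1, -1)],
    "C": [(1, -1), (1, 1)],
    "D": [(0, 0)],
    "E": [(0, 1)],
    "F": [(-1, 0)],
    "G": [(1, 1)],
    "H": [(0, 0)],
}
toy_set = ["A", "B", "C"]

print(most_frequent_profile(toy_set, toy_profiles))  # ((1, -1), 3)
print(permutation_p_value(toy_set, toy_profiles))
```

In this toy example, genes A and C each carry two profiles, yet the set still agrees on a single shared profile. A gene with a noisy second profile can therefore still contribute to a consistent set-level call.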
Genome-scale functional genomics screens are important tools for investigating the function of genes. Technological progress allows for the simultaneous measurement of multiple parameters quantifying the response of cells to gene perturbations such as RNA interference. Such multi-dimensional screens provide rich data, but there is a lack of computational methods for interpreting these complex measurements. We have developed two computational methods that combine the data from multi-dimensional functional genomics screens with protein interaction information. These methods search for phenotype patterns that are consistent among interacting genes. In this way, we reduced the noise in the data and facilitated the mechanistic interpretation of the findings. The performance of the methods was demonstrated through application to a genome-wide screen studying endocytosis. Subsequent experimental validation demonstrated the improved detection of phenotypic profiles through the use of protein interaction data. Our analysis revealed unexpected roles of specific network modules and protein complexes in endocytosis. Detailed follow-up experiments investigating the dynamics of endocytosis uncovered crosstalk between the cancer-related EGF and IGF pathways, with previously unknown effects on endocytosis and cargo trafficking.