Network inference with ensembles of bi-clustering trees

Abstract

Background

Network inference is crucial for biomedicine and systems biology. Biological entities and their associations are often modeled as interaction networks; examples include drug-protein interaction networks and gene regulatory networks. Studying and elucidating such networks can lead to the comprehension of complex biological processes. However, we usually have only partial knowledge of these networks, and experimentally identifying all existing associations between biological entities is time-consuming and expensive. Many computational approaches to network inference have been proposed over the years; nonetheless, efficiency and accuracy remain open problems. Here, we propose bi-clustering tree ensembles as a new machine learning method for network inference, extending traditional tree-ensemble models to the global network setting. The proposed approach addresses network inference as a multi-label classification task. More specifically, the nodes of a network (e.g., drugs or proteins in a drug-protein interaction network) are modeled as samples described by features (e.g., chemical structure similarities or protein sequence similarities). The labels represent the presence or absence of links connecting the nodes of the interaction network (e.g., drug-protein interactions in a drug-protein interaction network).
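To make the multi-label framing concrete, the following is a minimal sketch of how a small drug-protein network could be encoded. All names and numbers (`drug_features`, `protein_features`, `known_links`, the similarity values) are hypothetical toy data, not taken from the article. The pairwise encoding is shown only for contrast as one common alternative; the abstract does not say which encoding the authors use internally.

```python
# Toy similarity-based features (made-up values for illustration only).
# Each drug is described by its similarity to every drug, and each protein
# by its similarity to every protein.
drug_features = {
    "d1": [1.0, 0.2],
    "d2": [0.2, 1.0],
}
protein_features = {
    "p1": [1.0, 0.5, 0.1],
    "p2": [0.5, 1.0, 0.3],
    "p3": [0.1, 0.3, 1.0],
}
# Hypothetical set of known drug-protein links.
known_links = {("d1", "p1"), ("d1", "p3"), ("d2", "p2")}


def interaction_matrix(rows, cols, links):
    """Binary label matrix: Y[i][j] is 1 if rows[i] and cols[j] are linked."""
    return [[1 if (r, c) in links else 0 for c in cols] for r in rows]


def multilabel_view(features, rows, Y):
    """One sample per row node; its label vector is that node's row of Y."""
    return [(features[r], Y[i]) for i, r in enumerate(rows)]


def pairwise_view(row_feats, col_feats, rows, cols, Y):
    """One sample per (row, col) pair: concatenated features, single 0/1 label."""
    return [
        (row_feats[r] + col_feats[c], Y[i][j])
        for i, r in enumerate(rows)
        for j, c in enumerate(cols)
    ]


drugs = sorted(drug_features)
proteins = sorted(protein_features)
Y = interaction_matrix(drugs, proteins, known_links)
drug_samples = multilabel_view(drug_features, drugs, Y)
pair_samples = pairwise_view(drug_features, protein_features, drugs, proteins, Y)
```

In the multi-label view each drug carries one label per protein, so a single model can predict a whole row of the interaction matrix at once. The same construction applies with the roles of drugs and proteins swapped.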

Results

We extended traditional tree-ensemble methods, such as extremely randomized trees (ERT) and random forests (RF), to ensembles of bi-clustering trees, integrating background information from both node sets of a heterogeneous network into the same learning framework. We performed an empirical evaluation, comparing the proposed approach with currently used tree-ensemble-based approaches as well as other approaches from the literature. We demonstrated the effectiveness of our approach in different interaction prediction (network inference) settings. For evaluation, we used several benchmark datasets that represent drug-protein and gene regulatory networks. We also applied the proposed method to two versions of a chemical-protein association network extracted from the STITCH database, demonstrating the potential of our model to predict non-reported interactions.
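The idea of using features from both node sets in one tree can be illustrated with a single split step. The sketch below is an assumption-laden illustration, not the authors' implementation: the text does not spell out the split criterion, so this version assumes variance reduction over the interaction submatrix, and the function names (`variance`, `submatrix_values`, `best_split`) and toy data are hypothetical. A candidate split may test a row-node feature, which partitions the matrix horizontally, or a column-node feature, which partitions it vertically.

```python
def variance(values):
    """Population variance of a flat list of numbers (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def submatrix_values(Y, row_idx, col_idx):
    """Flatten the entries of Y restricted to the given rows and columns."""
    return [Y[i][j] for i in row_idx for j in col_idx]


def best_split(Y, row_X, col_X):
    """Search row and column features for the split with the largest
    variance reduction. Returns (axis, feature_index, threshold, gain)."""
    n_rows, n_cols = len(Y), len(Y[0])
    all_rows, all_cols = list(range(n_rows)), list(range(n_cols))
    total = n_rows * n_cols
    parent_var = variance(submatrix_values(Y, all_rows, all_cols))
    best = None
    for axis, X, n in (("row", row_X, n_rows), ("col", col_X, n_cols)):
        for f in range(len(X[0])):
            # Drop the largest value so both sides of the split are non-empty.
            for t in sorted({x[f] for x in X})[:-1]:
                left = [i for i in range(n) if X[i][f] <= t]
                right = [i for i in range(n) if X[i][f] > t]
                if axis == "row":
                    parts = [submatrix_values(Y, left, all_cols),
                             submatrix_values(Y, right, all_cols)]
                else:
                    parts = [submatrix_values(Y, all_rows, left),
                             submatrix_values(Y, all_rows, right)]
                weighted = sum(len(p) / total * variance(p) for p in parts)
                gain = parent_var - weighted
                if best is None or gain > best[3]:
                    best = (axis, f, t, gain)
    return best


# Toy interaction matrix: 3 row nodes x 4 column nodes (made-up data).
Y_toy = [[1, 1, 0, 0],
         [1, 1, 0, 0],
         [1, 1, 0, 0]]
row_X_toy = [[0.1], [0.5], [0.9]]          # one feature per row node
col_X_toy = [[0.2], [0.3], [0.8], [0.9]]   # one feature per column node
axis, feature, threshold, gain = best_split(Y_toy, row_X_toy, col_X_toy)
```

On this toy matrix the column feature separates the interacting from the non-interacting columns, so the search selects a vertical split. An ensemble in the spirit of RF or ERT would grow many such trees recursively, with resampled data or randomized split candidates, and average their predictions.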

Conclusions

Bi-clustering trees outperform existing tree-based strategies as well as machine learning methods based on other algorithms. Since our approach is based on tree ensembles, it inherits the advantages of tree-ensemble learning, such as handling of missing values, scalability, and interpretability.

Author and article information

Contributors

Konstantinos Pliakos:

ORCID: http://orcid.org/0000-0002-1989-357X

konstantinos.pliakos@kuleuven.be

Celine Vens: celine.vens@kuleuven.be

Journal

Journal ID (nlm-ta): BMC Bioinformatics

Journal ID (iso-abbrev): BMC Bioinformatics

Title: BMC Bioinformatics

Publisher: BioMed Central (London)

ISSN (Electronic): 1471-2105

Publication date (Electronic): 28 October 2019

Publication date PMC-release: 28 October 2019

Publication date Collection: 2019

Volume: 20

Electronic Location Identifier: 525

Affiliations

[1] ISNI 0000 0001 0668 7884, GRID grid.5596.f, KU Leuven, Campus KULAK, Department of Public Health and Primary Care, Faculty of Medicine, Kortrijk, Belgium

[2] ITEC, imec research group at KU Leuven, Kortrijk, Belgium

Article

Publisher ID: 3104

DOI: 10.1186/s12859-019-3104-y

PMC ID: 6819564

PubMed ID: 31660848

SO-VID: 91edda90-c03c-4545-b549-a5b78175151e

License:

Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

History

Date received: 28 February 2019

Date accepted: 20 September 2019

Custom metadata

ScienceOpen disciplines: Bioinformatics & Computational biology

Keywords: biomedical networks, network inference, interaction prediction, tree-ensembles, multi-label classification

Related collections

Network and Systems Medicine
