MultiCapsNet: A General Framework for Data Integration and Interpretable Classification

Wang, Lifei; Miao, Xuexia; Nie, Rui; Zhang, Zhang; Zhang, Jiang; Cai, Jun

doi:10.3389/fgene.2021.767602


methods-article

Author(s): Lifei Wang ¹,²,³,⁴; Xuexia Miao ²,³; Rui Nie ²,³,⁴; Zhang Zhang ⁵; Jiang Zhang ⁵,*; Jun Cai ²,³,⁴,*

Publication date (Electronic): 24 November 2021

Journal: Frontiers in Genetics

Publisher: Frontiers Media S.A.

Keywords: capsule network, classification, data integration, interpretability, modular feature

Abstract

Recent progress in experimental biology has generated large amounts of data in different formats and lengths. Deep learning is well suited to complex datasets, but its inherent “black box” nature limits interpretability. Meanwhile, traditional interpretable machine learning methods, such as linear regression or random forest, can only handle numerical features rather than the modular features often encountered in biology. Here, we present MultiCapsNet (https://github.com/wanglf19/MultiCapsNet), a new deep learning model built on CapsNet and scCapsNet that offers easy data integration and high model interpretability. To demonstrate its ability as an interpretable classifier for modular inputs, we test MultiCapsNet on three datasets with different data types and application scenarios. First, on the labeled variant call dataset, MultiCapsNet shows classification performance similar to that of a neural network model and provides importance scores for data sources directly, without the extra importance determination step required by the neural network model. The importance scores generated by the two models are highly correlated. Second, on a single-cell RNA sequencing (scRNA-seq) dataset, MultiCapsNet integrates information about protein-protein interactions (PPI) and protein-DNA interactions (PDI). The classification accuracy of MultiCapsNet is comparable to that of the neural network and random forest models. Meanwhile, MultiCapsNet reveals how each transcription factor (TF) or PPI cluster node contributes to cell type classification. Third, we compare MultiCapsNet with SCENIC. The results show that several cell-type-relevant TFs are identified by both methods, further supporting the validity and interpretability of MultiCapsNet.
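
The abstract does not spell out how modular inputs yield importance scores, so the following is only a minimal NumPy sketch under stated assumptions, not the authors' implementation. It assumes that each data source (module) is encoded by its own dense layer into one fixed-size primary capsule, and that the coupling coefficients from CapsNet-style routing-by-agreement are read as per-source, per-class importance scores. All function names, layer sizes, the ReLU encoder, and the random weights are hypothetical.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-9):
    """Capsule nonlinearity: keeps direction, maps vector length into [0, 1)."""
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def primary_capsules(modules, encoders):
    """Encode each input module (any length) into one fixed-size capsule."""
    # modules[i] has shape (len_i,); encoders[i] has shape (len_i, caps_dim).
    # The ReLU dense encoder is an assumption made for this sketch.
    return np.stack([np.maximum(x @ w, 0.0) for x, w in zip(modules, encoders)])

def route(u, W, n_iter=3):
    """Routing-by-agreement; returns class capsules and coupling coefficients."""
    # u: (n_sources, caps_dim); W: (n_sources, n_classes, caps_dim, out_dim)
    u_hat = np.einsum("id,ijdk->ijk", u, W)        # prediction per source and class
    b = np.zeros(u_hat.shape[:2])                  # routing logits start at zero
    for _ in range(n_iter):
        c = softmax(b, axis=1)                     # each source splits weight over classes
        v = squash(np.einsum("ij,ijk->jk", c, u_hat))
        b = b + np.einsum("ijk,jk->ij", u_hat, v)  # reward agreeing predictions
    return v, c

# Toy example: three data sources of different lengths, four classes.
rng = np.random.default_rng(0)
lengths, caps_dim, out_dim, n_classes = [5, 12, 3], 8, 6, 4
modules = [rng.normal(size=n) for n in lengths]
encoders = [rng.normal(size=(n, caps_dim)) for n in lengths]
W = 0.1 * rng.normal(size=(len(lengths), n_classes, caps_dim, out_dim))

u = primary_capsules(modules, encoders)
v, c = route(u, W)
predicted = int(np.argmax(np.linalg.norm(v, axis=1)))  # longest class capsule wins
```

In this sketch, `c[i, j]` is the share of source `i` routed to class `j`, which is the quantity one would inspect as an importance score under the assumption above; a trained model would learn `encoders` and `W` rather than sampling them.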

Most cited references 42

SCENIC: Single-cell regulatory network inference and clustering

Sara Aibar, Carmen Bravo Gonzalez-Blas, Thomas Moerman … (2017)

Although single-cell RNA-seq is revolutionizing biology, data interpretation remains a challenge. We present SCENIC for the simultaneous reconstruction of gene regulatory networks and identification of cell states. We apply SCENIC to a compendium of single-cell data from tumors and brain, and demonstrate that the genomic regulatory code can be exploited to guide the identification of transcription factors and cell states. SCENIC provides critical biological insights into the mechanisms driving cellular heterogeneity.

The Cancer Genome Atlas (TCGA): an immeasurable source of knowledge

Katarzyna Tomczak, Patrycja Czerwinska, Maciej Wiznerowicz (2015)

The Cancer Genome Atlas (TCGA) is a public funded project that aims to catalogue and discover major cancer-causing genomic alterations to create a comprehensive “atlas” of cancer genomic profiles. So far, TCGA researchers have analysed large cohorts of over 30 human tumours through large-scale genome sequencing and integrated multi-dimensional analyses. Studies of individual cancer types, as well as comprehensive pan-cancer analyses have extended current knowledge of tumorigenesis. A major goal of the project was to provide publicly available datasets to help improve diagnostic methods, treatment standards, and finally to prevent cancer. This review discusses the current status of TCGA Research Network structure, purpose, and achievements.

Brain structure. Cell types in the mouse cortex and hippocampus revealed by single-cell RNA-seq.

Amit Zeisel, Ana Muñoz-Manchado, Simone Codeluppi … (2015)

The mammalian cerebral cortex supports cognitive functions such as sensorimotor integration, memory, and social behaviors. Normal brain function relies on a diverse set of differentiated cell types, including neurons, glia, and vasculature. Here, we have used large-scale single-cell RNA sequencing (RNA-seq) to classify cells in the mouse somatosensory cortex and hippocampal CA1 region. We found 47 molecularly distinct subclasses, comprising all known major cell types in the cortex. We identified numerous marker genes, which allowed alignment with known cell types, morphology, and location. We found a layer I interneuron expressing Pax6 and a distinct postmitotic oligodendrocyte subclass marked by Itpr2. Across the diversity of cortical cell types, transcription factors formed a complex, layered regulatory code, suggesting a mechanism for the maintenance of adult cell type identity. Copyright © 2015, American Association for the Advancement of Science.

Author and article information

Contributors

Lifei Wang: URI: https://loop.frontiersin.org/people/1465897/overview

Xuexia Miao: URI: https://loop.frontiersin.org/people/1530321/overview

Rui Nie: URI: https://loop.frontiersin.org/people/1509617/overview

Jun Cai: URI: https://loop.frontiersin.org/people/1460899/overview

Journal

Journal ID (nlm-ta): Front Genet

Journal ID (iso-abbrev): Front Genet

Journal ID (publisher-id): Front. Genet.

Title: Frontiers in Genetics

Publisher: Frontiers Media S.A.

ISSN (Electronic): 1664-8021

Publication date (Electronic): 24 November 2021

Publication date Collection: 2021

Volume: 12

Electronic Location Identifier: 767602

Affiliations

[¹] Shulan (Hangzhou) Hospital Affiliated to Zhejiang Shuren University Shulan International Medical College, Hangzhou, China

[²] China National Center for Bioinformation, Beijing, China

[³] Key Laboratory of Genomic and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China

[⁴] University of Chinese Academy of Sciences, Beijing, China

[⁵] School of Systems Science, Beijing Normal University, Beijing, China

Author notes

Edited by: Jin Chen, University of Kentucky, United States

Reviewed by: Md Selim, University of Kentucky, United States

Lucas Jing Liu, University of Kentucky, United States

*Correspondence: Jiang Zhang, zhangjiang@bnu.edu.cn; Jun Cai, juncai@big.ac.cn

This article was submitted to Computational Genomics, a section of the journal Frontiers in Genetics

Article

Publisher ID: 767602

DOI: 10.3389/fgene.2021.767602

PMC ID: 8652257

SO-VID: 9ed95177-301c-4a6c-aeeb-7579a990c728

License:

This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

Related collections

Genome Integrity
